Calculate MD5 checksum for a file

Source: Calculate MD5 checksum for a file (original English question)

I'm using iTextSharp to read the text from a PDF file. However, there are times when I cannot extract any text, because the PDF file contains only images. I download the same PDF files every day, and I want to see whether a PDF has been modified. If the text and modification date cannot be obtained, is an MD5 checksum the most reliable way to tell whether the file has changed?

If it is, some code samples would be appreciated, because I don't have much experience with cryptography.


Reply #1

Reference: https://stackoom.com/question/i8kC/计算文件的MD-校验和


Reply #2

It's very simple using System.Security.Cryptography.MD5:

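// Requires: using System.IO; using System.Security.Cryptography;
// Fragment: the body of a method that returns byte[].
using (var md5 = MD5.Create())
{
    using (var stream = File.OpenRead(filename))
    {
        // The stream is hashed as it is read, so the file is never held in memory all at once.
        return md5.ComputeHash(stream);
    }
}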

(I believe that the MD5 implementation used doesn't actually need to be disposed, but I'd probably still do so anyway.)

How you compare the results afterwards is up to you; you can convert the byte array to base64, for example, or compare the bytes directly. (Just be aware that arrays don't override Equals, so calling Equals on two byte arrays compares references, not contents. Using base64 is simpler to get right, but slightly less efficient if you're really only interested in comparing the hashes.)
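The two comparison options mentioned above can be sketched as follows. This is only an illustration: the class and method names (HashComparison, SameHash, SameHashBase64) are made up for this example, and Enumerable.SequenceEqual is one of several ways to compare the bytes directly.

```csharp
using System;
using System.Linq;

static class HashComparison
{
    // Compares two hashes element by element.
    // byte[].Equals would only tell us whether both variables point at the same array.
    public static bool SameHash(byte[] a, byte[] b)
    {
        if (a == null || b == null) return a == b; // equal only if both are null
        return a.SequenceEqual(b);
    }

    // Alternative: compare the base64 text forms (simpler, slightly more work at run time).
    public static bool SameHashBase64(byte[] a, byte[] b)
    {
        return Convert.ToBase64String(a) == Convert.ToBase64String(b);
    }
}
```

With two separately created arrays holding the same bytes, Equals reports false while both helpers report true.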

If you need to represent the hash as a string, you could convert it to hex using BitConverter:

static string CalculateMD5(string filename)
{
    using (var md5 = MD5.Create())
    {
        using (var stream = File.OpenRead(filename))
        {
            var hash = md5.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
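For the daily-download scenario in the question, one possible way to use such a hex string is to keep yesterday's checksum in a small side file and compare against it. The sketch below is an assumption about how that could look, not part of the original answer; ChangeDetector, HasChanged and the checksum file path are hypothetical names.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class ChangeDetector
{
    // Same hex conversion as CalculateMD5 above, repeated so the sketch is self-contained.
    static string HexMd5(string path)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(md5.ComputeHash(stream)).Replace("-", "").ToLowerInvariant();
        }
    }

    // Returns true when the file's hash differs from the one saved in checksumPath
    // (or when no hash has been saved yet), then stores the current hash for next time.
    public static bool HasChanged(string filePath, string checksumPath)
    {
        string current = HexMd5(filePath);
        string previous = File.Exists(checksumPath) ? File.ReadAllText(checksumPath).Trim() : null;
        File.WriteAllText(checksumPath, current);
        return !string.Equals(current, previous, StringComparison.OrdinalIgnoreCase);
    }
}
```

The first call for a given file reports a change because nothing has been stored yet; later calls report a change only when the file contents differ.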

Reply #3

Here is a slightly simpler version that I found. It reads the entire file in one go and only requires a single using statement. Because the whole file is loaded into memory, it is best suited to files that fit there comfortably.

byte[] ComputeHash(string filePath)
{
    using (var md5 = MD5.Create())
    {
        return md5.ComputeHash(File.ReadAllBytes(filePath));
    }
}

Reply #4

This is how I do it:

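using System;
using System.IO;
using System.Security.Cryptography;

public string checkMD5(string filename)
{
    using (var md5 = MD5.Create())
    {
        using (var stream = File.OpenRead(filename))
        {
            // Hash bytes are not valid text, so decoding them with Encoding.Default
            // loses information; a hex string is returned instead.
            return BitConverter.ToString(md5.ComputeHash(stream)).Replace("-", "");
        }
    }
}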

Reply #5

I know this question has already been answered, but this is what I use:

using (FileStream fStream = File.OpenRead(filename)) {
    return GetHash<MD5>(fStream);
}

Where GetHash is:

// Requires: using System; using System.IO; using System.Reflection;
//           using System.Security.Cryptography; using System.Text;
public static String GetHash<T>(Stream stream) where T : HashAlgorithm {
    StringBuilder sb = new StringBuilder();

    // Look up the static, parameterless Create() method of the chosen algorithm type.
    MethodInfo create = typeof(T).GetMethod("Create", new Type[] {});
    using (T crypt = (T) create.Invoke(null, null)) {
        byte[] hashBytes = crypt.ComputeHash(stream);
        // Format each byte as two lowercase hex digits.
        foreach (byte bt in hashBytes) {
            sb.Append(bt.ToString("x2"));
        }
    }
    return sb.ToString();
}

Probably not the best way, but it can be handy.
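If the reflection lookup feels heavy, one alternative is to let the caller pass in a factory for the algorithm. This is a sketch of a different technique, not the answerer's method; HashHelper and the factory parameter are names invented for this example.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

static class HashHelper
{
    // The caller supplies the algorithm, e.g. () => MD5.Create(), so no reflection is needed.
    public static string GetHash(Stream stream, Func<HashAlgorithm> factory)
    {
        using (HashAlgorithm algorithm = factory())
        {
            var sb = new StringBuilder();
            // Format each hash byte as two lowercase hex digits.
            foreach (byte b in algorithm.ComputeHash(stream))
            {
                sb.Append(b.ToString("x2"));
            }
            return sb.ToString();
        }
    }
}
```

A call then looks like HashHelper.GetHash(fStream, () => MD5.Create()), and swapping in SHA256.Create() changes the algorithm without touching the helper.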


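Reply #6

And if you need to calculate the MD5 to determine whether it matches the MD5 of an Azure blob, then this SO question and answer might be helpful: MD5 hash for blob uploaded on Azure doesn't match with same file on local machine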
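One thing worth checking in that situation is the text encoding of the hash: the Content-MD5 value stored with a blob is base64, not hex, so a hex string will never match it. The sketch below assumes this is the comparison being made (it is not a summary of the linked question); BlobMd5 and Base64Md5 are hypothetical names.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class BlobMd5
{
    // Returns the file's MD5 as base64, the form used by the Content-MD5 header.
    public static string Base64Md5(string path)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(path))
        {
            return Convert.ToBase64String(md5.ComputeHash(stream));
        }
    }
}
```

The returned string can be compared with the blob's Content-MD5 value using an ordinary, case-sensitive string comparison.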
