Calculate MD5 checksum for a file
I'm using iTextSharp to read the text from a PDF file. However, there are times I cannot extract text, because the PDF file contains only images. I download the same PDF files every day, and I want to see if a PDF has been modified. If the text and modification date cannot be obtained, is an MD5 checksum the most reliable way to tell if the file has changed?
If it is, some code samples would be appreciated, because I don't have much experience with cryptography.
Reference: https://stackoom.com/question/i8kC/计算文件的MD-校验和
It's very simple using System.Security.Cryptography.MD5:
using (var md5 = MD5.Create())
{
    using (var stream = File.OpenRead(filename))
    {
        return md5.ComputeHash(stream);
    }
}
(I believe that the MD5 implementation used doesn't actually need to be disposed, but I'd probably still do so anyway.)
How you compare the results afterwards is up to you; you can convert the byte array to base64, for example, or compare the bytes directly. (Just be aware that arrays don't override Equals. Using base64 is simpler to get right, but slightly less efficient if you're really only interested in comparing the hashes.)
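As a rough illustration of the two comparison options mentioned above, here is a minimal sketch. The class and method names (HashComparison, ComputeMd5, SameBytes, SameBase64) are made up for this example and are not part of any library; only MD5, File, Convert and the LINQ SequenceEqual extension are real framework APIs.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

static class HashComparison
{
    // Computes the MD5 hash of a file by streaming its contents.
    public static byte[] ComputeMd5(string path)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(path))
        {
            return md5.ComputeHash(stream);
        }
    }

    // Element-by-element comparison. Calling a.Equals(b) on arrays would
    // only test whether both variables refer to the same array object.
    public static bool SameBytes(byte[] a, byte[] b)
    {
        return a.SequenceEqual(b);
    }

    // Alternative: compare the base64 text forms. Strings compare by value,
    // so this is hard to get wrong, at the cost of two extra allocations.
    public static bool SameBase64(byte[] a, byte[] b)
    {
        return Convert.ToBase64String(a) == Convert.ToBase64String(b);
    }
}
```

Either helper should give the same answer for two hashes; the byte comparison simply avoids building the intermediate strings.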
If you need to represent the hash as a string, you could convert it to hex using BitConverter:
static string CalculateMD5(string filename)
{
    using (var md5 = MD5.Create())
    {
        using (var stream = File.OpenRead(filename))
        {
            var hash = md5.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
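To tie this back to the original question (detecting whether a daily download has changed), one possible approach is to keep the previous hex hash in a small text file next to the PDF and compare against it on the next run. The sketch below assumes that setup; ChangeDetector, HasChanged and the hashFile sidecar path are hypothetical names chosen for illustration, and CalculateMD5 is the method shown above.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class ChangeDetector
{
    // Same hex conversion as the CalculateMD5 method above.
    static string CalculateMD5(string filename)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(filename))
        {
            var hash = md5.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // hashFile is a hypothetical sidecar file holding the hash from the last run.
    // Returns true when no previous hash exists or when the hash differs.
    public static bool HasChanged(string pdfPath, string hashFile)
    {
        string current = CalculateMD5(pdfPath);
        string previous = File.Exists(hashFile)
            ? File.ReadAllText(hashFile).Trim()
            : null;

        // Remember the current hash for the next comparison.
        File.WriteAllText(hashFile, current);
        return previous != current;
    }
}
```

Note that this sketch treats the very first run as "changed", since there is nothing to compare against yet; whether that is the right default depends on your workflow.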
Here is a slightly simpler version that I found. It reads the entire file in one go and only requires a single using statement. Because the whole file is loaded into memory, it is best suited to files that are not very large.
byte[] ComputeHash(string filePath)
{
    using (var md5 = MD5.Create())
    {
        return md5.ComputeHash(File.ReadAllBytes(filePath));
    }
}
This is how I do it:
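using System;
using System.IO;
using System.Security.Cryptography;

public string checkMD5(string filename)
{
    using (var md5 = MD5.Create())
    {
        using (var stream = File.OpenRead(filename))
        {
            // Base64 keeps every hash byte intact. Decoding the raw hash with
            // Encoding.Default can be lossy, because the bytes are not text.
            return Convert.ToBase64String(md5.ComputeHash(stream));
        }
    }
}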
I know this question was already answered, but this is what I use:
using (FileStream fStream = File.OpenRead(filename)) {
    return GetHash<MD5>(fStream);
}
Where GetHash is:
public static String GetHash<T>(Stream stream) where T : HashAlgorithm {
    StringBuilder sb = new StringBuilder();
    // Look up the parameterless static Create() method on the algorithm type.
    MethodInfo create = typeof(T).GetMethod("Create", new Type[] {});
    using (T crypt = (T) create.Invoke(null, null)) {
        byte[] hashBytes = crypt.ComputeHash(stream);
        // Format each byte as two lowercase hex digits.
        foreach (byte bt in hashBytes) {
            sb.Append(bt.ToString("x2"));
        }
    }
    return sb.ToString();
}
Probably not the best way, but it can be handy.
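If the reflection lookup feels fragile, one alternative (not what the answer above does, just a variation on the same idea) is to pass in a factory delegate that creates the algorithm. In this sketch, HashHelper and the factory parameter are names invented for illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

static class HashHelper
{
    // factory is a caller-supplied delegate, e.g. () => MD5.Create().
    public static string GetHash(Stream stream, Func<HashAlgorithm> factory)
    {
        using (HashAlgorithm algorithm = factory())
        {
            byte[] hashBytes = algorithm.ComputeHash(stream);
            var sb = new StringBuilder(hashBytes.Length * 2);
            // Format each byte as two lowercase hex digits.
            foreach (byte b in hashBytes)
            {
                sb.Append(b.ToString("x2"));
            }
            return sb.ToString();
        }
    }
}
```

A call would then look like HashHelper.GetHash(fStream, () => MD5.Create()), and switching to another algorithm only means changing the lambda, for example () => SHA256.Create().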
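And if you need to calculate the MD5 to check whether it matches the MD5 of an Azure blob, this SO question and answer may be helpful: MD5 hash of blob uploaded on Azure doesn't match with same file on local machine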