How to Distinguish File Types in Python: Accurately Detecting File Types with filetype

filetype.py

A small, dependency-free Python package that infers a file's type and MIME type by checking the magic-number signature of a file or buffer.

This is a Python port of the filetype Go package. It requires Python 3 or later.
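
To make "checking the magic-number signature" concrete, here is a minimal standard-library sketch of the idea. It is not filetype's actual implementation: the `SIGNATURES` table and the `sniff` function are hypothetical names introduced only for illustration, and the table covers just four well-known signatures.

```python
# Hypothetical signature table for illustration only; this is not
# filetype's internal data, and real detectors cover many more formats.
SIGNATURES = [
    (b'\xff\xd8\xff', 'jpg', 'image/jpeg'),
    (b'\x89PNG\r\n\x1a\n', 'png', 'image/png'),
    (b'GIF8', 'gif', 'image/gif'),  # shared prefix of GIF87a and GIF89a
    (b'%PDF', 'pdf', 'application/pdf'),
]


def sniff(buf):
    """Return (extension, mime) for the first matching signature, else None."""
    for magic, extension, mime in SIGNATURES:
        if buf.startswith(magic):
            return extension, mime
    return None


print(sniff(b'\x89PNG\r\n\x1a\n' + b'\x00' * 16))  # ('png', 'image/png')
print(sniff(b'plain text'))                        # None
```

Because only the leading bytes are compared, the result does not depend on the file name or its extension.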

Features

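• Simple and friendly API

• Supports a wide range of file types

• Infers both the file extension and the MIME type

• File discrimination by extension or MIME type

• File discrimination by kind (image, video, audio…)

• Pluggable: add matchers for new custom types

• Fast, even when processing large files

• Only the first 261 bytes (the maximum file header size) are needed, so you can pass just a byte buffer

• Dependency free (pure Python code, no C extensions, no libmagic bindings)

• Cross-platform file recognition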
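
The 261-byte point above can be sketched as follows: read only the header of a file and hand that buffer to `filetype.guess`, which accepts bytes as well as a path. `HEADER_SIZE`, `read_header`, and `guess_from_header` are hypothetical names introduced for this sketch, and it assumes the package has been installed with pip.

```python
HEADER_SIZE = 261  # maximum header length mentioned in the feature list


def read_header(path, size=HEADER_SIZE):
    """Read at most `size` leading bytes of the file at `path`."""
    with open(path, 'rb') as f:
        return f.read(size)


def guess_from_header(path):
    """Guess the file type from the header bytes only."""
    # Third-party import kept local so read_header works without the package.
    import filetype
    return filetype.guess(read_header(path))
```

This can be useful when the data arrives as a stream or upload, since only the first few hundred bytes have to be buffered before a type can be guessed.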

Installation

pip install filetype

API

See the annotated API reference for details.

Examples

Simple file type checking

import filetype

def main():
    kind = filetype.guess('tests/fixtures/sample.jpg')
    if kind is None:
        print('Cannot guess file type!')
        return

    print('File extension: %s' % kind.extension)
    print('File MIME type: %s' % kind.mime)

if __name__ == '__main__':
    main()
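
Building on the example above, one common use of `kind.extension` is to check whether a file's name agrees with its detected content. The sketch below is only an illustration: `extension_matches` is a hypothetical helper, and its `guess` parameter is an assumption added so the detector can be swapped out (it falls back to `filetype.guess` when omitted).

```python
import os


def extension_matches(path, guess=None):
    """Compare the extension in `path` with the detected one.

    Returns True or False, or None when the type cannot be guessed.
    """
    if guess is None:
        # Third-party import, only needed when no detector is supplied.
        import filetype
        guess = filetype.guess
    kind = guess(path)
    if kind is None:
        return None
    claimed = os.path.splitext(path)[1].lstrip('.').lower()
    return claimed == kind.extension
```

Note that this is a strict string comparison, so a file named photo.jpeg would be reported as a mismatch against the detected extension jpg; a real check would probably need an alias table.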

Supported types

Image

• jpg – image/jpeg

• png – image/png

• gif – image/gif

• webp – image/webp

• cr2 – image/x-canon-cr2

• tif – image/tiff

• bmp – image/bmp

• jxr – image/vnd.ms-photo

• psd – image/vnd.adobe.photoshop

• ico – image/x-icon

Video

• mp4 – video/mp4

• m4v – video/x-m4v

• mkv – video/x-matroska

• webm – video/webm

• mov – video/quicktime

• avi – video/x-msvideo

• wmv – video/x-ms-wmv

• mpg – video/mpeg

• flv – video/x-flv

Audio

• mid – audio/midi

• mp3 – audio/mpeg

• m4a – audio/m4a

• ogg – audio/ogg

• flac – audio/x-flac

• wav – audio/x-wav

• amr – audio/amr

Archive

• epub – application/epub+zip

• zip – application/zip

• tar – application/x-tar

• rar – application/x-rar-compressed

• gz – application/gzip

• bz2 – application/x-bzip2

• 7z – application/x-7z-compressed

• xz – application/x-xz

• pdf – application/pdf

• exe – application/x-msdownload

• swf – application/x-shockwave-flash

• rtf – application/rtf

• eot – application/octet-stream

• ps – application/postscript

• sqlite – application/x-sqlite3

• nes – application/x-nintendo-nes-rom

• crx – application/x-google-chrome-extension

• cab – application/vnd.ms-cab-compressed

• deb – application/x-deb

• ar – application/x-unix-archive

• Z – application/x-compress

• lz – application/x-lzip

Font

• woff – application/font-woff

• woff2 – application/font-woff

• ttf – application/font-sfnt

• otf – application/font-sfnt

Benchmarks

Measured using real files.

Environment: OSX x64, i7 2.7 GHz

---------------------------------------------------------------------------------- benchmark: 7 tests ----------------------------------------------------------------------------------
Name (time in ns)                            Min                  Max                 Mean              StdDev               Median                 IQR  Outliers(*)  Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_infer_image_from_bytes       357.6279 (1.0)    29,166.5395 (1.0)     1,642.3360 (1.0)      380.9934 (1.0)     1,509.9843 (1.0)      158.9457 (1.0)   9095;13752  102301           6
test_infer_audio_from_bytes      953.6743 (2.67)   96,082.6874 (3.29)  16,534.5880 (10.07)   3,002.1143 (7.88)  15,974.0448 (10.58)     953.6743 (6.00)    4514;6051   41528           1
test_infer_video_from_bytes  13,828.2776 (38.67)  272,989.2731 (9.36)   16,151.3144 (9.83)   3,361.2320 (8.82)   15,020.3705 (9.95)     953.6743 (6.00)    2522;2887   22193           1
test_infer_image_from_disk   15,974.0448 (44.67)  108,957.2906 (3.74)  18,621.0844 (11.34)  3,895.4441 (10.22)  17,166.1377 (11.37)   1,192.0929 (7.50)    1528;1804   10206           1
test_infer_video_from_disk   23,841.8579 (66.67)  229,120.2545 (7.86)  28,691.3476 (17.47)  6,242.9901 (16.39)  25,987.6251 (17.21)  4,053.1158 (25.50)    1987;1247   15651           1
test_infer_zip_from_disk     26,941.2994 (75.33)  230,073.9288 (7.89)  32,123.3861 (19.56)  7,524.4988 (19.75)  29,087.0667 (19.26)  4,768.3716 (30.00)    1349;1292   16132           1
test_infer_tar_from_disk     33,855.4382 (94.67)  164,031.9824 (5.62)  36,884.4401 (22.46)  4,489.4443 (11.78)  36,001.2054 (23.84)     953.6743 (6.00)    1036;1828   14666           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


Article URL: http://www.cppcns.com/jiaoben/python/195098.html
