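Contents
- I. Tencent Cloud OCR API: General Invoice Recognition (Advanced), sample recognition result (reproduced from the Tencent Cloud website):
- II. Preparation
  - 1. Install Python; this article uses PyCharm to demonstrate the code.
  - 2. Register a Tencent Cloud account (real-name verification required) and obtain the API access keys, SecretId and SecretKey:
  - 3. Apply for the Tencent Cloud free quota, which covers recognition of 1,000 mixed-paste financial invoices per month:
  - 4. Read the API parameter documentation:
  - 5. Prepare a folder to hold the mixed-paste invoices to be recognized
- III. Debugging the Python code
  - 1. The following code recognizes VAT invoices, that is, items whose top-level category `Type` is 3. The subtypes include paper special VAT invoices, paper general VAT invoices, electronic special VAT invoices, electronic general VAT invoices, blockchain electronic invoices, and electronic general VAT invoices (toll):
  - 2. After the recognized invoice data is saved to Excel, the final result looks like this: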
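I. Tencent Cloud OCR API: General Invoice Recognition (Advanced), sample recognition result (reproduced from the Tencent Cloud website):
![Calling the Tencent Cloud API from Python to batch-recognize mixed-paste financial invoices and export them to an Excel sheet, image 1](http://img.e-com-net.com/image/info8/677e75a6e5b245f6b3557f3f5a73b71d.jpg)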
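II. Preparation
1. Install Python; this article uses PyCharm to demonstrate the code.
2. Register a Tencent Cloud account (real-name verification required) and obtain the API access keys, SecretId and SecretKey:
![Calling the Tencent Cloud API from Python to batch-recognize mixed-paste financial invoices and export them to an Excel sheet, image 2](http://img.e-com-net.com/image/info8/b1d0ee6c6e014b2d9b8d04493ee94767.jpg)
![Calling the Tencent Cloud API from Python to batch-recognize mixed-paste financial invoices and export them to an Excel sheet, image 3](http://img.e-com-net.com/image/info8/5f04a2768552461e80fe56dcf73a5de8.jpg)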
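Since SecretId and SecretKey grant access to your account, it may be safer to keep them out of the script itself. The sketch below reads them from environment variables; the variable names `TENCENTCLOUD_SECRET_ID` and `TENCENTCLOUD_SECRET_KEY` and the helper `load_keys` are assumptions made for illustration, not something this article or the SDK prescribes.

```python
import os

# Assumed variable names; any names work as long as they match your own setup.
ID_VAR = "TENCENTCLOUD_SECRET_ID"
KEY_VAR = "TENCENTCLOUD_SECRET_KEY"


def load_keys(env=None):
    """Return (secret_id, secret_key) from the environment, or raise if either is missing."""
    env = os.environ if env is None else env
    secret_id = env.get(ID_VAR, "").strip()
    secret_key = env.get(KEY_VAR, "").strip()
    if not secret_id or not secret_key:
        raise RuntimeError(f"Set {ID_VAR} and {KEY_VAR} before running the script")
    return secret_id, secret_key


if __name__ == "__main__":
    # Hypothetical demo values, not real keys.
    demo_env = {ID_VAR: "demo-id", KEY_VAR: "demo-key"}
    print(load_keys(demo_env))
```

The returned pair can then be passed to `credential.Credential(secret_id, secret_key)` in place of hard-coded strings.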
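3. Apply for the Tencent Cloud free quota, which covers recognition of 1,000 mixed-paste financial invoices per month:
![Calling the Tencent Cloud API from Python to batch-recognize mixed-paste financial invoices and export them to an Excel sheet, image 4](http://img.e-com-net.com/image/info8/04178e6478fa499697ed2d76ed225a3a.jpg)
4. Read the API parameter documentation:
![Calling the Tencent Cloud API from Python to batch-recognize mixed-paste financial invoices and export them to an Excel sheet, image 5](http://img.e-com-net.com/image/info8/da187d44f2fc4347b5c0d8dd41fabdf3.jpg)
![Calling the Tencent Cloud API from Python to batch-recognize mixed-paste financial invoices and export them to an Excel sheet, image 6](http://img.e-com-net.com/image/info8/79cec0686e6843afa11127f64bc2b418.jpg)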
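The script in this article sends each image through the `ImageBase64` request parameter. As a rough sketch of that step in isolation, the helper below builds the JSON request body for one file. The helper name `build_params` and the size ceiling `MAX_BASE64_BYTES` are assumptions for illustration; take the real size limit from the official parameter table.

```python
import base64
import json
import os
import tempfile

# Assumed ceiling for the encoded payload; check the official documentation for the real limit.
MAX_BASE64_BYTES = 7 * 1024 * 1024


def build_params(image_path, max_bytes=MAX_BASE64_BYTES):
    """Return the JSON string for a request carrying one image as ImageBase64."""
    with open(image_path, "rb") as fp:
        encoded = base64.b64encode(fp.read()).decode("utf-8")
    if len(encoded) > max_bytes:
        raise ValueError(f"{image_path} is too large after Base64 encoding")
    return json.dumps({"ImageBase64": encoded})


if __name__ == "__main__":
    # Demonstrate on a throwaway file instead of a real invoice scan.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".png") as tmp:
        tmp.write(b"abc")
    try:
        print(build_params(tmp.name))
    finally:
        os.remove(tmp.name)
```

The resulting string is what `req.from_json_string(...)` expects, so rejecting oversized files locally avoids spending a request on an image the service would refuse anyway.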
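5. Prepare a folder to hold the mixed-paste invoices to be recognized
![Calling the Tencent Cloud API from Python to batch-recognize mixed-paste financial invoices and export them to an Excel sheet, image 7](http://img.e-com-net.com/image/info8/decd1b7c3317484bb5fb5afb5c607285.jpg)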
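A plain `os.listdir` call returns everything in the folder, including subfolders and stray files such as a previously exported spreadsheet. A minimal sketch of a stricter listing follows; the helper name `list_invoice_files` and the extension set are assumptions, so adjust them to the formats your scans actually use.

```python
import os

# Assumed set of extensions; adjust to the formats of your own scans.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}


def list_invoice_files(folder):
    """Return the sorted names of regular files in folder that look like invoice scans."""
    names = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        extension = os.path.splitext(name)[1].lower()
        # Skip subfolders and anything that is not a recognized image type.
        if os.path.isfile(full_path) and extension in IMAGE_EXTENSIONS:
            names.append(name)
    return names


if __name__ == "__main__":
    # Lists the current directory as a demonstration.
    print(list_invoice_files("."))
```

Sorting the names keeps the rows of the final spreadsheet in a predictable order from one run to the next.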
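III. Debugging the Python code
1. The following code recognizes VAT invoices, that is, items whose top-level category `Type` is 3. The subtypes include paper special VAT invoices, paper general VAT invoices, electronic special VAT invoices, electronic general VAT invoices, blockchain electronic invoices, and electronic general VAT invoices (toll):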
import json
import base64
import os
import pandas
from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.ocr.v20181119 import ocr_client, models
# Folder that holds the invoice images to recognize.
PNG_path = r'C:\Users\Administrator\Desktop\invoices'
list01 = os.listdir(PNG_path)
# Replace the two placeholders with your own SecretId and SecretKey.
cred = credential.Credential("SecretId", "SecretKey")
httpProfile = HttpProfile()
httpProfile.endpoint = "ocr.tencentcloudapi.com"
clientProfile = ClientProfile()
clientProfile.httpProfile = httpProfile
client = ocr_client.OcrClient(cred, "ap-guangzhou", clientProfile)
req = models.RecognizeGeneralInvoiceRequest()
# One dict per recognized VAT invoice; each becomes a row of the Excel sheet.
newary = []
# Send every file in the folder to the API; one image may contain several invoices.
for ids in list01:
    print(ids)
    try:
        # Read the image and Base64-encode it for the ImageBase64 parameter.
        with open(os.path.join(PNG_path, ids), 'rb') as fp:
            file_content = fp.read()
        image_base64 = base64.b64encode(file_content)
        params = {
            "ImageBase64": str(image_base64, encoding="utf-8"),
        }
        req.from_json_string(json.dumps(params))
        resp = client.RecognizeGeneralInvoice(req)
        result1 = json.loads(resp.to_json_string())
        for item in result1['MixedInvoiceItems']:
            # Type 3 is the VAT invoice category; SubType names the key under SingleInvoiceInfos.
            if item['Type'] == 3:
                VatInvoiceName = item['SubType']
                info = item['SingleInvoiceInfos'][VatInvoiceName]
                newary.append({'发票文件名': ids,
                               '发票类别-腾讯云': VatInvoiceName,
                               '发票种类': info['Title'],
                               '发票代码': info['Code'],
                               '发票号码': info['Number'],
                               '开票日期': info['Date'],
                               '校验码': info['CheckCode'],
                               '合计金额': info['PretaxAmount'],
                               '合计税额': info['Tax'],
                               '价税合计(小写)': info['Total'],
                               '价税合计(大写)': info['TotalCn'],
                               '购买方名称': info['Buyer'],
                               '购买方纳税人识别号': info['BuyerTaxID'],
                               '购买方地址及电话': info['BuyerAddrTel'],
                               '购买方银行及账户': info['BuyerBankAccount'],
                               '销售方名称': info['Seller'],
                               '销售方纳税人识别号': info['SellerTaxID'],
                               '销售方地址及电话': info['SellerAddrTel'],
                               '销售方银行及账户': info['SellerBankAccount'],
                               '收款人': info['Receiptor'],
                               '复核人': info['Reviewer'],
                               '开票人': info['Issuer'],
                               '备注': info['Remark']})
    except TencentCloudSDKException as err:
        # Report the API error and continue with the next file.
        print(err)
print(newary)
# Build the table with a fixed column order and write it to an Excel file in the working directory.
newsdf = pandas.DataFrame(newary,
                          columns=['发票文件名', '发票类别-腾讯云', '发票种类', '发票代码', '发票号码', '开票日期', '校验码',
                                   '合计金额', '合计税额', '价税合计(小写)', '价税合计(大写)', '购买方名称',
                                   '购买方纳税人识别号', '购买方地址及电话', '购买方银行及账户', '销售方名称',
                                   '销售方纳税人识别号', '销售方地址及电话', '销售方银行及账户',
                                   '收款人', '复核人', '开票人', '备注'])
newsdf.to_excel('增值税发票信息统计-腾讯云API接口-发票与其他单据混贴.xlsx')
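The loop above indexes every response field directly, so a `KeyError` from one subtype that lacks a field would stop the whole run, because only `TencentCloudSDKException` is caught. One possible way to make the extraction more tolerant is sketched below. The helper `flatten_vat_items` and the shortened `FIELD_MAP` are illustrative assumptions, and the sample response is hand-written for the demo rather than real API output.

```python
# Column header -> response field; a subset of the mapping used in the script above.
FIELD_MAP = {
    "发票代码": "Code",
    "发票号码": "Number",
    "开票日期": "Date",
    "价税合计(小写)": "Total",
    "备注": "Remark",
}


def flatten_vat_items(file_name, result):
    """Return one row per Type-3 item, filling missing fields with an empty string."""
    rows = []
    for item in result.get("MixedInvoiceItems", []):
        if item.get("Type") != 3:
            continue
        sub_type = item.get("SubType", "")
        info = (item.get("SingleInvoiceInfos") or {}).get(sub_type) or {}
        row = {"发票文件名": file_name, "发票类别-腾讯云": sub_type}
        for column, field in FIELD_MAP.items():
            row[column] = info.get(field, "")
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hand-written sample; the values and the second item's type are made up.
    sample = {"MixedInvoiceItems": [
        {"Type": 3, "SubType": "VatSpecialInvoice",
         "SingleInvoiceInfos": {"VatSpecialInvoice": {"Code": "0000000000", "Total": "113.00"}}},
        {"Type": 99, "SubType": "SomethingElse", "SingleInvoiceInfos": {}},
    ]}
    print(flatten_vat_items("demo.png", sample))
```

Rows built this way can be appended to `newary` unchanged; a blank cell in the spreadsheet then marks a field the service did not return.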
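2. After the recognized invoice data is saved to Excel, the final result looks like this:
![Calling the Tencent Cloud API from Python to batch-recognize mixed-paste financial invoices and export them to an Excel sheet, image 8](http://img.e-com-net.com/image/info8/c6a261781adb4d4da5ee8ac8634f8267.jpg)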