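Contents
- I. Tencent Cloud OCR API: General Invoice Recognition (Advanced Edition), with a sample recognition result (reproduced from the Tencent Cloud website):
- II. Preparation
- 1. Install Python. This article uses PyCharm to demonstrate the code.
- 2. Register a Tencent Cloud account (real-name verification is required) and obtain the API access keys: SecretId and SecretKey:
- 3. Apply for the Tencent Cloud free quota, which allows 1000 mixed invoice pages (several invoices pasted on one sheet) to be recognized free of charge each month:
- 4. Read the API parameter documentation:
- 5. Prepare a folder to hold the mixed invoice images to be recognized
- III. Debugging the Python code
- 1. The following code recognizes VAT invoices, that is, invoices whose category Type is 3. The subtypes are paper special VAT invoices, paper general VAT invoices, electronic special VAT invoices, electronic general VAT invoices, blockchain electronic invoices, and electronic general VAT invoices (toll):
- 2. After the recognition data is exported to Excel, the final result looks like this: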
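I. Tencent Cloud OCR API: General Invoice Recognition (Advanced Edition), with a sample recognition result (reproduced from the Tencent Cloud website):
II. Preparation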
1. Install Python. This article uses PyCharm to demonstrate the code.
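Before running the script it can help to confirm that the third-party modules it imports are available. The sketch below is only an illustration: the helper name `missing_packages` is made up for this example, and the pip package names in `REQUIRED` (in particular `tencentcloud-sdk-python` for the `tencentcloud` module, and `openpyxl` as the engine pandas uses to write `.xlsx` files) are assumptions that should be checked against each project's installation instructions.

```python
import importlib.util

# Importable module name -> pip package name.
# The pip names are assumptions; verify them before installing.
REQUIRED = {
    "tencentcloud": "tencentcloud-sdk-python",
    "pandas": "pandas",
    "openpyxl": "openpyxl",  # used by pandas when writing .xlsx files
}


def missing_packages(required=REQUIRED):
    """Return the pip names of required modules that cannot be imported."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install first: pip install " + " ".join(missing))
    else:
        print("All dependencies are available.")
```

In PyCharm the reported packages can also be installed from the project interpreter settings instead of the command line.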
2. Register a Tencent Cloud account (real-name verification is required) and obtain the API access keys: SecretId and SecretKey:
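Because SecretId and SecretKey grant access to the account, one option is to keep them out of the source file and read them from environment variables. The following is a minimal sketch, not part of the original script: the helper `load_credentials` is hypothetical, and the variable names `TENCENTCLOUD_SECRET_ID` and `TENCENTCLOUD_SECRET_KEY` are a convention assumed here.

```python
import os


def load_credentials(env=None):
    """Read SecretId and SecretKey from environment variables.

    The variable names are an assumed convention for this example.
    """
    env = os.environ if env is None else env
    secret_id = env.get("TENCENTCLOUD_SECRET_ID")
    secret_key = env.get("TENCENTCLOUD_SECRET_KEY")
    if not secret_id or not secret_key:
        raise RuntimeError(
            "Set TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY first."
        )
    return secret_id, secret_key
```

The returned pair could then be passed to `credential.Credential(secret_id, secret_key)` in place of the literal placeholder strings.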
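3. Apply for the Tencent Cloud free quota, which allows 1000 mixed invoice pages (several invoices pasted on one sheet) to be recognized free of charge each month:
4. Read the API parameter documentation: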
5. Prepare a folder to hold the mixed invoice images to be recognized
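If the folder may also contain subfolders or unrelated files, the file list can be filtered before anything is sent to the API. This is a sketch under stated assumptions: `list_invoice_images` is a hypothetical helper, and the extension set is only an example to adjust to the formats the API documentation actually lists.

```python
import os

# Example extension set; an assumption, adjust it to the formats you use.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def list_invoice_images(folder, extensions=IMAGE_EXTENSIONS):
    """Return the sorted names of regular files in `folder` with an image extension."""
    names = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        suffix = os.path.splitext(name)[1].lower()
        # Skip subfolders and files whose extension is not in the set
        if os.path.isfile(path) and suffix in extensions:
            names.append(name)
    return names
```

The result could replace a plain `os.listdir` call so that only image files are read and uploaded.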
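III. Debugging the Python code
1. The following code recognizes VAT invoices, that is, invoices whose category Type is 3. The subtypes are paper special VAT invoices, paper general VAT invoices, electronic special VAT invoices, electronic general VAT invoices, blockchain electronic invoices, and electronic general VAT invoices (toll):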
import base64
import json
import os

import pandas
from tencentcloud.common import credential
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.ocr.v20181119 import ocr_client, models

# Folder that holds the invoice images to be recognized
PNG_path = r'C:\Users\Administrator\Desktop\invoices'
list01 = os.listdir(PNG_path)

# Replace the placeholders with your own SecretId and SecretKey
cred = credential.Credential("SecretId", "SecretKey")
httpProfile = HttpProfile()
httpProfile.endpoint = "ocr.tencentcloudapi.com"
clientProfile = ClientProfile()
clientProfile.httpProfile = httpProfile
client = ocr_client.OcrClient(cred, "ap-guangzhou", clientProfile)
req = models.RecognizeGeneralInvoiceRequest()

# (Excel column name, field name in the recognition result)
FIELDS = [
    ('Invoice title', 'Title'),
    ('Invoice code', 'Code'),
    ('Invoice number', 'Number'),
    ('Issue date', 'Date'),
    ('Check code', 'CheckCode'),
    ('Pre-tax amount', 'PretaxAmount'),
    ('Tax amount', 'Tax'),
    ('Total incl. tax (figures)', 'Total'),
    ('Total incl. tax (Chinese words)', 'TotalCn'),
    ('Buyer name', 'Buyer'),
    ('Buyer taxpayer ID', 'BuyerTaxID'),
    ('Buyer address and phone', 'BuyerAddrTel'),
    ('Buyer bank and account', 'BuyerBankAccount'),
    ('Seller name', 'Seller'),
    ('Seller taxpayer ID', 'SellerTaxID'),
    ('Seller address and phone', 'SellerAddrTel'),
    ('Seller bank and account', 'SellerBankAccount'),
    ('Payee', 'Receiptor'),
    ('Reviewer', 'Reviewer'),
    ('Issuer', 'Issuer'),
    ('Remarks', 'Remark'),
]

newary = []  # one dict per recognized VAT invoice
for ids in list01:
    file_path = os.path.join(PNG_path, ids)
    if not os.path.isfile(file_path):
        continue  # skip subfolders
    print(ids)
    try:
        # The API expects the image as a Base64-encoded string
        with open(file_path, 'rb') as fp:
            image_base64 = base64.b64encode(fp.read()).decode('utf-8')
        req.from_json_string(json.dumps({"ImageBase64": image_base64}))
        resp = client.RecognizeGeneralInvoice(req)
        result1 = json.loads(resp.to_json_string())
        # One image may contain several invoices; keep only Type 3 (VAT invoices)
        for item in result1['MixedInvoiceItems']:
            if item['Type'] == 3:
                VatInvoiceName = item['SubType']
                info = item['SingleInvoiceInfos'][VatInvoiceName]
                row = {'Invoice file name': ids,
                       'Invoice subtype (Tencent Cloud)': VatInvoiceName}
                for column, field in FIELDS:
                    # .get leaves the cell empty if a field is absent for this subtype
                    row[column] = info.get(field, '')
                newary.append(row)
    except TencentCloudSDKException as err:
        print(err)

print(newary)
# Write all rows to an Excel file in the current working directory
newsdf = pandas.DataFrame(
    newary,
    columns=['Invoice file name', 'Invoice subtype (Tencent Cloud)'] + [column for column, _ in FIELDS])
newsdf.to_excel('VAT invoice summary - Tencent Cloud API - mixed invoices and other documents.xlsx')
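The filtering step in the loop above can be tried without calling the API by running it on a hand-written dictionary shaped like the parsed response. The sketch below is illustrative only: `vat_rows` is a hypothetical helper, the `sample` data is invented for this example (including the subtype names and every value), and only the keys `MixedInvoiceItems`, `Type`, `SubType` and `SingleInvoiceInfos` are taken from the script.

```python
def vat_rows(result, file_name, fields=("Code", "Number", "Total")):
    """Collect one row per Type 3 (VAT invoice) item in a parsed response."""
    rows = []
    for item in result.get("MixedInvoiceItems", []):
        if item.get("Type") != 3:
            continue  # not a VAT invoice
        sub_type = item.get("SubType")
        # Tolerate a missing or empty SingleInvoiceInfos entry
        info = (item.get("SingleInvoiceInfos") or {}).get(sub_type) or {}
        row = {"file": file_name, "sub_type": sub_type}
        for field in fields:
            row[field] = info.get(field, "")
        rows.append(row)
    return rows


# Invented sample data; real responses contain many more fields.
sample = {
    "MixedInvoiceItems": [
        {
            "Type": 3,
            "SubType": "VatSpecialInvoice",
            "SingleInvoiceInfos": {
                "VatSpecialInvoice": {"Code": "0000000000", "Number": "00000000"}
            },
        },
        {"Type": 2, "SubType": "TrainTicket", "SingleInvoiceInfos": {}},
    ]
}

print(vat_rows(sample, "page1.png"))
```

Only the first item is kept, and the `Total` cell stays empty because the invented sample omits that field.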
2. After the recognition data is exported to Excel, the final result looks like this: