Baidu OCR: Extracting Text from Images

1. Create an application

Open the link below and click "Create Application".

https://console.bce.baidu.com/ai/?_=1579078251518&fromai=1#/ai/ocr/overview/index

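Fill in the application name, application type, and application description.

After the application is created, return to the application list, as shown in the figure below.

[Figure 1: application list]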

2. Obtain an access_token

https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=【API Key】&client_secret=【Secret Key】

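Replace 【API Key】 and 【Secret Key】 with the values shown for your application in the application list.

The request returns a response like the following: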

{
    "refresh_token": "25.4979a30850668de1c396b0807ffa0fa8.315360000.1894433433.282335-18282523",
    "expires_in": 2592000,
    "session_key": "9mzdCy/SVjCt4Tx49sMo7m9EMJumMpu/LoOCbwZOZGi9Jy1tHLI1+2LOmD85L4FCTURQhsXe3Gq3yU0RRpAigO7FP22A8A==",
    "access_token": "24.073a6d247821b17228a40fd66a19c8a1.2592000.1581665433.282335-18282523",
    "scope": "public vis-ocr_ocr brain_ocr_scope brain_ocr_general brain_ocr_general_basic vis-ocr_business_license brain_ocr_webimage brain_all_scope brain_ocr_idcard brain_ocr_driving_license brain_ocr_vehicle_license vis-ocr_plate_number brain_solution brain_ocr_plate_number brain_ocr_accurate brain_ocr_accurate_basic brain_ocr_receipt brain_ocr_business_license brain_solution_iocr brain_qrcode brain_ocr_handwriting brain_ocr_passport brain_ocr_vat_invoice brain_numbers brain_ocr_business_card brain_ocr_train_ticket brain_ocr_taxi_receipt vis-ocr_household_register vis-ocr_vis-classify_birth_certificate vis-ocr_台湾通行证 vis-ocr_港澳通行证 vis-ocr_机动车检验合格证识别 vis-ocr_车辆vin码识别 vis-ocr_定额发票识别 vis-ocr_保单识别 vis-ocr_行程单识别 brain_ocr_vin brain_ocr_quota_invoice brain_ocr_birth_certificate brain_ocr_household_register brain_ocr_HK_Macau_pass brain_ocr_taiwan_pass brain_ocr_vehicle_certificate brain_ocr_air_ticket brain_ocr_insurance_doc wise_adapt lebo_resource_base lightservice_public hetu_basic lightcms_map_poi kaidian_kaidian ApsMisTest_Test权限 vis-classify_flower lpq_开放 cop_helloScope ApsMis_fangdi_permission smartapp_snsapi_base iop_autocar oauth_tp_app smartapp_smart_game_openapi oauth_sessionkey smartapp_swanid_verify smartapp_opensource_openapi smartapp_opensource_recapi fake_face_detect_开放Scope vis-ocr_虚拟人物助理 idl-video_虚拟人物助理",
    "session_secret": "35f2d34377b3359bf5b9912fa679ee23"
}

Of these fields, access_token is the one we need, and expires_in is the token's validity period in seconds (2592000 seconds is 30 days).
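The token request above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names (`build_token_url`, `parse_token_response`, `fetch_access_token`) are hypothetical, and it assumes the endpoint accepts a plain GET of the URL shown above, the same as opening it in a browser.

```python
import json
import time
import urllib.parse
import urllib.request

TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"


def build_token_url(api_key: str, secret_key: str) -> str:
    """Return the token URL with the credentials filled in as query parameters."""
    query = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    })
    return f"{TOKEN_URL}?{query}"


def parse_token_response(payload: dict, now: float) -> tuple[str, float]:
    """Return (access_token, absolute expiry time in epoch seconds)."""
    return payload["access_token"], now + payload["expires_in"]


def fetch_access_token(api_key: str, secret_key: str) -> tuple[str, float]:
    """Request a token over the network (assumes a GET request is accepted)."""
    url = build_token_url(api_key, secret_key)
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    return parse_token_response(payload, time.time())
```

Because the token stays valid for expires_in seconds, it is reasonable to cache the returned pair and call `fetch_access_token` again only once the expiry time has passed.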

3. Extract text from an image

HTTP method: POST

Request URL: https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic

URL parameter:

access_token (required): the access_token obtained in the previous step

Header:

Content-Type: application/x-www-form-urlencoded
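As an illustration of how these pieces fit together, the sketch below assembles the POST request with Python's standard `urllib`. The helper name `build_ocr_request` is hypothetical; the function only constructs the request object and does not send it.

```python
import urllib.parse
import urllib.request

OCR_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic"


def build_ocr_request(access_token: str, body: bytes) -> urllib.request.Request:
    """Build the POST request: the token goes in the URL, the form data in the body."""
    query = urllib.parse.urlencode({"access_token": access_token})
    return urllib.request.Request(
        f"{OCR_URL}?{query}",
        data=body,  # already form-encoded bytes, e.g. b"url=..."
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(request, timeout=10)` and the response body parsed as JSON.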

[Figure 2]

Response parameters:

[Figure 3: response parameters]

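4. Test

We use a screenshot of the day's Baidu hot-search list as the test image, shown below:

[Figure 4: test image, a Baidu hot-search screenshot]

Convert the image to Base64 (sent as the request's image parameter), or upload it to an image host and use the resulting URL (sent as the request's url parameter), then send the request.

The response is as follows: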
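Both ways of supplying the image can be sketched as small helpers that produce the form-encoded request body. The function names `image_body` and `url_body` are hypothetical. Note that Base64 output can contain `+`, `/`, and `=`, so it is URL-encoded before being placed in the body.

```python
import base64
import urllib.parse


def image_body(image_bytes: bytes) -> bytes:
    """Form-encode raw image bytes as the `image` parameter (Base64, then URL-encoded)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return urllib.parse.urlencode({"image": encoded}).encode("ascii")


def url_body(image_url: str) -> bytes:
    """Form-encode a publicly reachable image URL as the `url` parameter."""
    return urllib.parse.urlencode({"url": image_url}).encode("ascii")
```

For a local file, the bytes could be read with `pathlib.Path(path).read_bytes()` and passed to `image_body`; the result is the body of the POST request to the general_basic endpoint.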

{
    "log_id": 7888675245507784087,
    "words_result_num": 23,
    "words_result": [
        {
            "words": "搜索热点"
        },
        {
            "words": "换一换υ"
        },
        {
            "words": "1张纪中劲歌热舞"
        },
        {
            "words": "281万↑"
        },
        {
            "words": "2阿根廷53级地震"
        },
        {
            "words": "250万會"
        },
        {
            "words": "博士被纹冒"
        },
        {
            "words": "195万會"
        },
        {
            "words": "4热刺2-1迎新年首胜断"
        },
        {
            "words": "188万"
        },
        {
            "words": "5郝云方律师声明"
        },
        {
            "words": "184万"
        },
        {
            "words": "6陈冠希表白爱妻"
        },
        {
            "words": "153万"
        },
        {
            "words": "女王支持哈里决定"
        },
        {
            "words": "145"
        },
        {
            "words": "5万"
        },
        {
            "words": "8北大清华开放课程"
        },
        {
            "words": "124万t"
        },
        {
            "words": "9湖人大胜骑士"
        },
        {
            "words": "124万"
        },
        {
            "words": "⑩躁翟天临即将复出"
        },
        {
            "words": "121万"
        }
    ]
}

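The recognition accuracy is fairly good: most entries are read correctly, although a few icons appear to be misread as characters (for example 會 and υ) and some rank numbers are missing.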
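To work with the result programmatically, the recognized lines can be pulled out of the words_result array. The following is a minimal sketch: the function name `extract_lines` is hypothetical, and the sample string is a shortened stand-in for the response shown above.

```python
import json


def extract_lines(response_text: str) -> list[str]:
    """Return the recognized text lines, or an empty list if words_result is absent."""
    payload = json.loads(response_text)
    return [item["words"] for item in payload.get("words_result", [])]


# Shortened stand-in for the response above, for illustration only.
sample = (
    '{"log_id": 1, "words_result_num": 2, '
    '"words_result": [{"words": "搜索热点"}, {"words": "281万↑"}]}'
)
print("\n".join(extract_lines(sample)))
```

Each element of words_result corresponds to one recognized line of text, so joining the list with newlines gives a plain-text version of the image content.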
