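I started out doing phone automation with appium. Its advantage is that it supports iPhones, but the drawbacks are just as obvious: the environment is complicated to set up and initialization is slow. Later I found python-uiautomator2, which is really pleasant to use and easy to understand. For the basic operations, see the post "手机自动化测试(准备篇)" (Phone Automation Testing: Preparation).
However, I found that the library does not support xpath particularly well, and it offers no API for image recognition, so you have to build that yourself. This post therefore walks through my extension code for the uiautomator2 library.
The file is named "appmi.py" and defines the dxpath class. First put appmi.py somewhere Python can find it: either add its directory through an environment variable such as PYTHONPATH, or drop it into the Python installation directory, for example F:\Anaconda3. Then import it with: from appmi import dxpath
The sections below go through its contents one by one.
Importing modules
Import the libraries that will be used: PIL for image file handling; lxml for xpath lookup; re for regular expressions; os for calling system interfaces; tencentOCR is a module I wrote myself to call Tencent's text OCR API and is described later; aircv for image matching. One thing to watch out for: aircv depends on the cv2 library, and the trap is that pip install cv2 does not work; the package to install is pip install opencv-python. threading is for multithreading; decorate provides a decorator that reports a function's running time, which helps with optimization and debugging.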
from PIL import Image
from lxml import etree
import os
import re
import time
import tencentOCR
import aircv as ac
import threading
from decorate import timer
import uiautomator2 as u2
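The decorate module is not listed in this post. A minimal sketch of a timer decorator with the behavior described above, assuming it only prints the elapsed wall-clock time of each call, could look like this (the body is a hypothetical stand-in, not the actual decorate.py):

```python
import functools
import time


def timer(func):
    """Print how long each call to func takes (hypothetical stand-in for decorate.timer)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether func returned normally or raised.
            elapsed = time.perf_counter() - start
            print('{} took {:.3f}s'.format(func.__name__, elapsed))
    return wrapper


@timer
def add(a, b):
    return a + b
```

Because of functools.wraps, the decorated function keeps its own name and docstring, which keeps tracebacks readable while debugging.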
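Initializing the class: d is the connected device
class dxpath():
    def __init__(self, d):
        self.d = d
Getting the underlying layout information
Fetches the underlying xml hierarchy, writes it to a file and opens that file automatically. Pass your own path in file; otherwise dump_hierarchy.xml is created in the current working directory. If you do not want the file opened, set open_file to False, and the xml file is only created at that path. The function returns the underlying xml as a str.
    def code(self, file='dump_hierarchy.xml',
             open_file=True):
        xml_content = self.d.dump_hierarchy()
        with open(file, 'w', encoding='utf-8') as f:
            f.write(xml_content)
        if open_file:
            # os.system(file) hands the path to the shell, which on Windows opens it
            # with the associated program; the thread keeps the caller from blocking
            threadObj = threading.Thread(target=os.system, args=[file])
            threadObj.start()
        return xml_content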
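Code for re-locating with xpath
Locates widgets by xpath. arg is the xpath expression. The result is the list of matching lxml elements, and each element can in turn be searched with xpath.
    def dxpath(self, arg):
        # Returns the lxml elements matched by the xpath; each one can be searched
        # again with xpath, which is useful for continuing a search inside one region
        xml_content = self.d.dump_hierarchy()
        root = etree.fromstring(xml_content.encode('utf-8'))
        return root.xpath(arg)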
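To see the two-step lookup without a phone attached, here is a small sketch on a hand-written hierarchy. The xml and the resource-id values are invented for illustration, and it uses the standard library's xml.etree.ElementTree rather than lxml, so only a subset of xpath is available and attributes are read with .get() instead of /@text.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for d.dump_hierarchy(); the ids and texts are made up.
SAMPLE = """<hierarchy>
  <node resource-id="com.demo:id/list" text="">
    <node resource-id="com.demo:id/item" text="Wi-Fi" bounds="[0,100][1080,220]" />
    <node resource-id="com.demo:id/item" text="Bluetooth" bounds="[0,220][1080,340]" />
  </node>
  <node resource-id="com.demo:id/footer" text="About" bounds="[0,1800][1080,1920]" />
</hierarchy>"""

root = ET.fromstring(SAMPLE)

# Step 1: locate a region of the screen.
regions = root.findall(".//node[@resource-id='com.demo:id/list']")

# Step 2: search again, but only inside that region.
texts = [child.get('text')
         for region in regions
         for child in region.findall('./node')]
print(texts)
```

The footer node is never visited in step 2, which is the point of searching from a region instead of from the root.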
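Continues the xpath lookup from an element found by dxpath and reads the widgets' text. t is a target element returned by dxpath and arg is the xpath expression. With One=True the first matching text is returned as a str (an IndexError is raised if nothing matches); with One=False all matching texts are returned as a list.
    def dxpath_text(self, t, arg, One=True):
        # t is an element obtained by iterating over a dxpath result
        # for t in dxpath('')
        args = '{}/@text'.format(arg)
        text = []
        for txt in t.xpath(args):
            text.append(str(txt))
        if One:
            return text[0]
        else:
            return text
Continues the xpath lookup from an element found by dxpath to check whether a widget exists below it. t is the target element from the first dxpath lookup and arg is the xpath expression. Returns the list of matching elements, which is empty when nothing matches.
    def dxpath_exist(self, t, arg):
        return t.xpath(arg)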
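Main part
Locates by xpath to check whether a widget exists. arg is the xpath expression. Returns the list of matching elements, which is empty when the widget is not on screen.
    def exist(self, arg):
        return self.dxpath(arg)
Waits for the given widget to appear. timeout is the maximum wait in seconds, 10 by default, and an exception is raised once it is exceeded; arg is the xpath expression; with always=True the function keeps waiting until the widget appears. Once the widget appears, the list of matching elements is returned.
    def wait(self, arg, timeout=10, always=False):
        deadline = time.time() + timeout
        while time.time() < deadline or always:
            data = self.dxpath(arg)
            if data:
                return data
            time.sleep(0.1)
        raise Exception("widget not found")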
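The same polling pattern can be pulled out into a standalone helper. The sketch below is a hypothetical wait_until function (it is not part of appmi.py); it assumes time.monotonic is preferable for deadlines because it is unaffected by system clock changes, and it raises TimeoutError instead of a bare Exception.

```python
import time


def wait_until(probe, timeout=10.0, interval=0.1):
    """Call probe() until it returns something truthy or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within {}s'.format(timeout))
        time.sleep(interval)


# Fake probe that only succeeds on the third call, standing in for mi.dxpath(arg).
calls = {'n': 0}


def fake_probe():
    calls['n'] += 1
    return ['node'] if calls['n'] >= 3 else []


found = wait_until(fake_probe, timeout=2, interval=0.01)
print(found, calls['n'])
```

With the class above, a call such as wait_until(lambda: mi.dxpath(arg)) would behave like wait(arg), with the probe always run at least once.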
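Checks whether a widget exists: as soon as it no longer exists, the loop ends and the (empty) xpath result is returned; while it still exists, the function keeps waiting. arg is the xpath expression. There is no timeout, so the call blocks for as long as the widget stays on screen.
    def not_exists(self, arg):
        while True:
            data = self.dxpath(arg)
            if not data:
                return data
            time.sleep(0.1)
Gets the text of the widgets matched by an xpath. arg is the xpath expression. With One=True the first matching text is returned as a str; with One=False all matching texts are returned as a list.
    def get_text(self, arg, One=True):
        args = '{}/@text'.format(arg)
        text = []
        for txt in self.dxpath(args):
            text.append(str(txt))
        if One:
            return text[0]
        else:
            return text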
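Gets the four bounds coordinates of the widget located by xpath. arg is the xpath expression. With the default t=None the widget is looked up with a fresh xpath query on the current screen; t can also be an element returned by dxpath, in which case the lookup continues from that element. The function returns the top-left corner (lx, ly) and the bottom-right corner (rx, ry) of the widget's rectangle, from which the centre can be computed, and raises an exception if the widget does not exist.
    def center(self, arg, t=None):
        bounds = '{}/@bounds'.format(arg)
        try:
            if t is not None:
                coord = str(t.xpath(bounds)[0])
            else:
                coord = str(self.dxpath(bounds)[0])
            # bounds looks like "[lx,ly][rx,ry]": top-left and bottom-right corners
            lx, ly, rx, ry = map(int, re.findall(r"\d+", coord))
            return lx, ly, rx, ry
        except Exception:
            raise Exception("widget not found")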
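As a quick illustration of the bounds format, the sketch below parses a bounds string and computes the centre point the same way center and click do. The helper names parse_bounds and center_of are hypothetical and only used here.

```python
import re


def parse_bounds(bounds):
    """Turn a bounds string such as '[0,100][1080,220]' into (lx, ly, rx, ry)."""
    numbers = [int(n) for n in re.findall(r'\d+', bounds)]
    if len(numbers) != 4:
        raise ValueError('unexpected bounds string: {!r}'.format(bounds))
    return tuple(numbers)


def center_of(bounds):
    """Return the centre point of the rectangle described by a bounds string."""
    lx, ly, rx, ry = parse_bounds(bounds)
    return (lx + rx) // 2, (ly + ry) // 2


print(parse_bounds('[0,100][1080,220]'))
print(center_of('[0,100][1080,220]'))
```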
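Clicks the centre of the widget located by xpath. arg is the xpath expression; timeout is the maximum time to keep trying, after which an exception is raised; at_once=True means click immediately without waiting and raise an exception if the widget does not exist, while at_once=False means keep retrying for up to timeout seconds; set_x and set_y are offsets added to the x and y coordinates; picture=False takes no screenshot, picture=True clicks and also saves a cropped screenshot of the widget in the current working directory; picture_name is the name of that image, 'crop' by default, so the file is crop.jpg; with t=None the widget is looked up on the current screen, and t can also be an element returned by dxpath to continue the lookup from that element.
    def click(self, arg, timeout=10, at_once=False, set_x=0, set_y=0,
              picture=False, picture_name='crop', t=None):
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                lx, ly, rx, ry = self.center(arg, t)
                x, y = (lx + rx) // 2, (ly + ry) // 2
                x = set_x + x
                y = set_y + y
                if picture:
                    # capture the screen before clicking, while the widget is still visible
                    self.d.screenshot('screenshot.jpg')
                self.d.click(x, y)
                if picture:
                    catIm = Image.open('screenshot.jpg')
                    # (rx, ry) is the bottom-right corner, so the crop box is the
                    # widget rectangle shifted by the offsets
                    croppedIm = catIm.crop((lx + set_x, ly + set_y,
                                            rx + set_x, ry + set_y))
                    croppedIm.save('%s.jpg' % picture_name)
                return x, y
            except Exception:
                if at_once:
                    raise Exception("widget not found")
                else:
                    time.sleep(0.1)
        raise Exception("widget not found")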
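Text OCR and image recognition
Calls Tencent's text OCR API to get the text on the screen together with its position. Returns a dict that maps each recognized string to the centre (x, y) of its box.
    def ocr(self):
        self.d.screenshot('screenshot.jpg')
        connect = tencentOCR.TencentOCR('screenshot.jpg')
        ocr = {}
        for key in connect:
            # the arithmetic below treats the four values as x, y, width, height, in that order
            lx, ly, rx, ry = connect[key]['itemcoord'].values()
            x, y = lx + rx // 2, ly + ry // 2
            ocr[key] = (x, y)
        return ocr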
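Unpacking .values() depends on the order in which the keys arrive. A slightly more defensive variant reads the fields by name. The sketch below assumes the box is a dict with the keys x, y, width and height; that naming is an assumption about the OCR response and should be checked against an actual reply. box_center is a hypothetical helper.

```python
def box_center(itemcoord):
    """Centre of an OCR box given as a dict with x, y, width and height."""
    x = itemcoord['x'] + itemcoord['width'] // 2
    y = itemcoord['y'] + itemcoord['height'] // 2
    return x, y


# Keys deliberately out of order: unpacking .values() would mix them up here.
sample = {'width': 200, 'height': 60, 'x': 100, 'y': 400}
print(box_center(sample))
```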
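OCR click: clicks according to the text recognized in a screenshot, with support for clicking several strings and for saving a cropped image. char holds the strings to look for and must be a list. timeout is the maximum time to keep trying, after which an exception is raised; at_once=True means try once without waiting and raise an exception if nothing is found, while at_once=False means keep retrying for up to timeout seconds; repetition is the number of times each hit is clicked, 1 by default; set_x and set_y are offsets added to the x and y coordinates; picture=False saves no image, picture=True clicks and also saves a crop of the matched text in the current working directory; picture_name is the name of that image, 'crop' by default, so the file is crop.jpg. Return values: connect is a dict with the text information of the screenshot, and success is a list of the clicked coordinates, each one a tuple.
    def ocr_click(self, char=None, timeout=10, at_once=False, repetition=1,
                  set_x=0, set_y=0, picture=False, picture_name='crop'):
        connect, success = {}, []
        deadline = time.time() + timeout
        while time.time() < deadline:
            success = []
            self.d.screenshot('screenshot.jpg')
            connect = tencentOCR.TencentOCR('screenshot.jpg')
            for single in char:
                box = None  # reset for every string so an earlier hit is never reused
                if single in connect:  # exact match first
                    box = connect[single]['itemcoord']
                else:
                    for key in connect:  # otherwise look for a recognized string that contains it
                        if single in key:
                            box = connect[key]['itemcoord']
                            break
                if box is None:
                    continue
                lx, ly, rx, ry = box.values()
                x, y = set_x + lx + rx // 2, set_y + ly + ry // 2
                for i in range(repetition):
                    if i != 0:
                        time.sleep(0.3)
                    self.d.click(x, y)
                success.append((x, y))
                if picture:
                    catIm = Image.open('screenshot.jpg')
                    croppedIm = catIm.crop((lx + set_x, ly + set_y,
                                            lx + set_x + rx, ly + set_y + ry))
                    croppedIm.save('%s.jpg' % picture_name)
                if len(char) > 1:
                    time.sleep(0.3)
            if success or at_once:
                break
        if not success:
            raise Exception("widget not found")
        return connect, success  # connect is the data sent back by Tencent OCR, success holds the clicked coordinates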
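The lookup order used in ocr_click (exact key first, then the first recognized string that contains the target) can be tried on its own. The sketch below uses a made-up OCR result, and find_box is a hypothetical helper rather than a method of the class.

```python
def find_box(results, wanted):
    """Return the entry for wanted: exact key first, then the first key containing it."""
    if wanted in results:
        return results[wanted]
    for key, value in results.items():
        if wanted in key:
            return value
    return None


# Made-up OCR output: recognized string -> box.
results = {
    'QQ Browser download': {'x': 10, 'y': 20, 'width': 300, 'height': 40},
    'Settings': {'x': 10, 'y': 80, 'width': 100, 'height': 40},
}
print(find_box(results, 'Settings'))    # exact hit
print(find_box(results, 'QQ Browser'))  # substring hit
print(find_box(results, 'WeChat'))      # no hit
```

Returning None for a miss makes it explicit that nothing should be clicked for that string.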
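Image matching: checks whether the image to find appears on the phone screen and, if it does, returns the match information; if it does not, None is returned. imgobj is the image to look for; imsrc is the source image, by default a screenshot of the phone; confidence is the match threshold, best kept at 0.8 or above; click=True clicks the matched coordinates.
    def locatonScreen(self, imgobj, confidence=0.9, click=True, imsrc=None):
        # imgobj is the image to find; imsrc is the source image, the phone screenshot by default
        if not imsrc:
            self.d.screenshot('screenshot.jpg')
            imsrc = ac.imread('screenshot.jpg')
        else:
            imsrc = ac.imread(imsrc)
        imobj = ac.imread(imgobj)
        match_result = ac.find_template(imsrc, imobj, confidence)
        if match_result is not None:
            match_result['shape'] = (imsrc.shape[1], imsrc.shape[0])  # shape[0] is the height, shape[1] the width
            if click:
                x, y = match_result['result']
                self.d.click(x, y)
        return match_result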
Debugging the program and displaying its running time
@timer
def main():
    # =============================================================================
    #     xml = mi.code()  # get the underlying layout xml file
    #     ocr = mi.ocr()
    #     mi.click('//*[@text="发现"]/preceding-sibling::node')
    #     connect, [(x, y)] = mi.ocr_click([x for x in '15806527017'])
    #     connect, [(x, y)] = mi.ocr_click(['QQ浏览器'], picture=True)
    #     a = mi.locatonScreen(r'C:\Users\Administrator\Desktop\1.jpg')
    # =============================================================================
    connect, success = mi.ocr_click(['QQ浏览器'], picture=True)
if __name__ == '__main__':
    os.chdir(r'C:\Users\Administrator\Desktop')
    d = u2.connect_usb('c00c166c')
    # d = u2.connect_usb('d5212f2c')
    mi = dxpath(d)
    main()
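Next, the tencentOCR.py file
The uiautomator2 library has OCR support built in, but it does not come with an OCR API of its own, so you have to find one yourself. I found text OCR on Tencent's developer platform, only to discover that there was no Python 3 SDK for it, which was a real letdown, so I had to write my own program to fetch the data. In the code, the three variables appid, secret_id and secret_key need to be changed to your own values: apply for an account at Tencent's AI center and you will see them in the console; every account has its own IDs. Then save the code below as tencentOCR.py and put it on the Python path so that it can be imported as a module; it is used by appmi.py. If Python reports that a module it imports cannot be found, install that module with pip install [packname]. The parameter picture is the image path, and the function returns a dict with the text information of the image, keyed by the recognized string.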
import requests
import hmac
import hashlib
import base64
import time
import random
import json
def TencentOCR(picture):
    appid = 'xxxx'
    secret_id = 'xxxx'
    secret_key = 'xxxxx'
    bucket = 'BUCKET'
    current = int(time.time())  # whole seconds, so the signature contains integer timestamps
    expired = current + 2592000  # valid for 30 days
    rdm = ''.join(random.choice("0123456789") for i in range(10))
    info = ("a=" + appid + "&b=" + bucket + "&k=" + secret_id + "&e=" + str(expired)
            + "&t=" + str(current) + "&r=" + str(rdm) + "&u=0&f=")
    signindex = hmac.new(bytes(secret_key, 'utf-8'), bytes(info, 'utf-8'), hashlib.sha1).digest()  # HMAC-SHA1 signature
    sign = base64.b64encode(signindex + bytes(info, 'utf-8'))  # base64 encoding; the line below is equivalent
    # sign = base64.b64encode(signindex + info.encode('utf-8'))
    url = "http://recognition.image.myqcloud.com/ocr/general"
    headers = {'Host': 'recognition.image.myqcloud.com',
               "Authorization": sign,
               }
    with open(picture, 'rb') as image:  # close the file once the upload is done
        files = {'appid': (None, appid),
                 'bucket': (None, bucket),
                 'image': ('1.jpg', image, 'image/jpeg')
                 }
        r = requests.post(url, files=files, headers=headers)
    responseinfo = r.content
    data = responseinfo.decode('utf-8')
    json_data = json.loads(data)
    datas = json_data['data']['items']
    recognise = {}
    for obj in datas:
        recognise[obj['itemstring']] = obj  # key each item by its recognized text
    return recognise
if __name__ == '__main__':
    beg = time.time()
    connect = TencentOCR(r'C:\Users\Administrator\Desktop\1.jpg')  # image path
    end = time.time()
    print(end - beg)
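The signing step can be checked offline, without calling the API. The sketch below rebuilds the Authorization value the same way as the code above, using dummy credentials and fixed timestamps (all values are made up), then decodes it again to show its layout: a 20-byte HMAC-SHA1 digest followed by the plain-text parameter string. build_sign is a hypothetical helper.

```python
import base64
import hashlib
import hmac


def build_sign(appid, bucket, secret_id, secret_key, current, expired, rdm):
    """Build the Authorization value the same way TencentOCR does above."""
    info = 'a={}&b={}&k={}&e={}&t={}&r={}&u=0&f='.format(
        appid, bucket, secret_id, expired, current, rdm)
    digest = hmac.new(secret_key.encode('utf-8'), info.encode('utf-8'),
                      hashlib.sha1).digest()
    return base64.b64encode(digest + info.encode('utf-8'))


# Dummy credentials and fixed timestamps so the result is reproducible.
sign = build_sign('1000', 'BUCKET', 'id', 'key',
                  current=1557000000, expired=1559592000, rdm='0123456789')

# Decode again: the first 20 bytes are the SHA-1 digest, the rest is the parameter string.
raw = base64.b64decode(sign)
digest, info = raw[:20], raw[20:].decode('utf-8')
print(len(digest), info)
```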
Finally, the complete appmi.py code
# -*- coding: utf-8 -*-
"""
Created on Sat May 4 11:04:19 2019
@author: Administrator
"""
from PIL import Image
from lxml import etree
import os
import re
import time
import tencentOCR
import aircv as ac
import threading
from decorate import timer
import uiautomator2 as u2
class dxpath():
    def __init__(self, d):
        self.d = d
    def code(self, file=r'C:\Users\Administrator\Desktop\dump_hierarchy.xml',
             open_file=True):
        xml_content = self.d.dump_hierarchy()
        with open(file, 'w', encoding='utf-8') as f:
            f.write(xml_content)
        if open_file:
            # os.system(file) hands the path to the shell, which on Windows opens it
            # with the associated program; the thread keeps the caller from blocking
            threadObj = threading.Thread(target=os.system, args=[file])
            threadObj.start()
        return xml_content
    def dxpath(self, arg):
        # Returns the lxml elements matched by the xpath; each one can be searched
        # again with xpath, which is useful for continuing a search inside one region
        xml_content = self.d.dump_hierarchy()
        root = etree.fromstring(xml_content.encode('utf-8'))
        return root.xpath(arg)
    def dxpath_text(self, t, arg, One=True):
        # t is an element obtained by iterating over a dxpath result
        # for t in dxpath('')
        args = '{}/@text'.format(arg)
        text = []
        for txt in t.xpath(args):
            text.append(str(txt))
        if One:
            return text[0]
        else:
            return text
    def dxpath_exist(self, t, arg):
        return t.xpath(arg)
    def exist(self, arg):
        return self.dxpath(arg)
    def wait(self, arg, timeout=10, always=False):
        # Waits 10 seconds by default and then raises; with always=True it keeps
        # waiting until the widget appears
        deadline = time.time() + timeout
        while time.time() < deadline or always:
            data = self.dxpath(arg)
            if data:
                return data
            time.sleep(0.1)
        raise Exception("widget not found")
    def not_exists(self, arg):
        while True:
            data = self.dxpath(arg)
            if not data:
                return data
            time.sleep(0.1)
    def get_text(self, arg, One=True):
        args = '{}/@text'.format(arg)
        text = []
        for txt in self.dxpath(args):
            text.append(str(txt))
        if One:
            return text[0]
        else:
            return text
    def center(self, arg, t=None):
        bounds = '{}/@bounds'.format(arg)
        try:
            if t is not None:
                coord = str(t.xpath(bounds)[0])
            else:
                coord = str(self.dxpath(bounds)[0])
            # bounds looks like "[lx,ly][rx,ry]": top-left and bottom-right corners
            lx, ly, rx, ry = map(int, re.findall(r"\d+", coord))
            return lx, ly, rx, ry
        except Exception:
            raise Exception("widget not found")
    def click(self, arg, timeout=10, at_once=False, set_x=0, set_y=0,
              picture=False, picture_name='crop', t=None):
        # arg is the xpath expression, timeout the maximum time to keep trying,
        # at_once means try only once, set_x/set_y are coordinate offsets,
        # picture saves a cropped image, picture_name is its file name,
        # t is an element obtained by iterating over a dxpath result
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                lx, ly, rx, ry = self.center(arg, t)
                x, y = (lx + rx) // 2, (ly + ry) // 2
                x = set_x + x
                y = set_y + y
                if picture:
                    # capture the screen before clicking, while the widget is still visible
                    self.d.screenshot('screenshot.jpg')
                self.d.click(x, y)
                if picture:
                    catIm = Image.open('screenshot.jpg')
                    # (rx, ry) is the bottom-right corner, so the crop box is the
                    # widget rectangle shifted by the offsets
                    croppedIm = catIm.crop((lx + set_x, ly + set_y,
                                            rx + set_x, ry + set_y))
                    croppedIm.save('%s.jpg' % picture_name)
                return x, y
            except Exception:
                if at_once:
                    raise Exception("widget not found")
                else:
                    time.sleep(0.1)
        raise Exception("widget not found")
    def ocr(self):
        self.d.screenshot('screenshot.jpg')
        connect = tencentOCR.TencentOCR('screenshot.jpg')
        ocr = {}
        for key in connect:
            # the arithmetic below treats the four values as x, y, width, height, in that order
            lx, ly, rx, ry = connect[key]['itemcoord'].values()
            x, y = lx + rx // 2, ly + ry // 2
            ocr[key] = (x, y)
        return ocr
    def ocr_click(self, char=None, timeout=10, at_once=False, repetition=1,
                  set_x=0, set_y=0, picture=False, picture_name='crop'):
        # char is the list of strings to look for, timeout the maximum time to keep trying,
        # at_once means try only once, repetition is the number of clicks per hit,
        # set_x/set_y are coordinate offsets, picture saves a cropped image,
        # picture_name is its file name
        connect, success = {}, []
        deadline = time.time() + timeout
        while time.time() < deadline:
            success = []
            self.d.screenshot('screenshot.jpg')
            connect = tencentOCR.TencentOCR('screenshot.jpg')
            for single in char:
                box = None  # reset for every string so an earlier hit is never reused
                if single in connect:  # exact match first
                    box = connect[single]['itemcoord']
                else:
                    for key in connect:  # otherwise look for a recognized string that contains it
                        if single in key:
                            box = connect[key]['itemcoord']
                            break
                if box is None:
                    continue
                lx, ly, rx, ry = box.values()
                x, y = set_x + lx + rx // 2, set_y + ly + ry // 2
                for i in range(repetition):
                    if i != 0:
                        time.sleep(0.3)
                    self.d.click(x, y)
                success.append((x, y))
                if picture:
                    catIm = Image.open('screenshot.jpg')
                    croppedIm = catIm.crop((lx + set_x, ly + set_y,
                                            lx + set_x + rx, ly + set_y + ry))
                    croppedIm.save('%s.jpg' % picture_name)
                if len(char) > 1:
                    time.sleep(0.3)
            if success or at_once:
                break
        if not success:
            raise Exception("widget not found")
        return connect, success  # connect is the data sent back by Tencent OCR, success holds the clicked coordinates
    def locatonScreen(self, imgobj, confidence=0.9, click=True, imsrc=None):
        # imgobj is the image to find; imsrc is the source image, the phone screenshot by default
        if not imsrc:
            self.d.screenshot('screenshot.jpg')
            imsrc = ac.imread('screenshot.jpg')
        else:
            imsrc = ac.imread(imsrc)
        imobj = ac.imread(imgobj)
        match_result = ac.find_template(imsrc, imobj, confidence)
        if match_result is not None:
            match_result['shape'] = (imsrc.shape[1], imsrc.shape[0])  # shape[0] is the height, shape[1] the width
            if click:
                x, y = match_result['result']
                self.d.click(x, y)
        return match_result
@timer
def main():
    # =============================================================================
    #     xml = mi.code()  # get the underlying layout xml file
    #     ocr = mi.ocr()
    #     mi.click('//*[@text="发现"]/preceding-sibling::node')
    #     connect, [(x, y)] = mi.ocr_click([x for x in '15806527017'])
    #     connect, [(x, y)] = mi.ocr_click(['QQ浏览器'], picture=True)
    #     a = mi.locatonScreen(r'C:\Users\Administrator\Desktop\1.jpg')
    # =============================================================================
    connect, success = mi.ocr_click(['QQ浏览器'], picture=True)
if __name__ == '__main__':
    os.chdir(r'C:\Users\Administrator\Desktop')
    d = u2.connect_usb('c00c166c')
    # d = u2.connect_usb('d5212f2c')
    mi = dxpath(d)
    main()