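About a month ago, a group member ran into a problem while trying to collect the profile information of everyone in a WeChat group.
Testing confirmed that collecting member data for an entire group is indeed a bit tricky, so here is how I solved it.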
For the basics, see:
https://blog.csdn.net/as604049322/article/details/121391639
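After manually opening the chat window of the group you want to scrape, first get the WeChat window and click the "聊天信息" (chat info) button: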
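import uiautomation as auto

# Give up on a control search after 1 second instead of the default timeout
auto.uiautomation.SetGlobalSearchTimeout(1)
# Locate the main WeChat window and bring it to the foreground
wechatWindow = auto.WindowControl(
    Depth=1, Name="微信", ClassName='WeChatMainWndForPC')
wechatWindow.SetActive()
# Open the chat info side panel
mes_b = wechatWindow.ButtonControl(Name="聊天信息")
mes_b.Click()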
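Then immediately click the "查看更多群成员" (view more members) button to expand the member list.
Some groups are too small to have this button, so check that it exists first: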
more_b = wechatWindow.ButtonControl(Name="查看更多群成员")
# Exists(0, 0) checks once, without waiting or retrying
if more_b.Exists(0, 0):
    more_b.Click()
At this point, getting the list of member nicknames is straightforward:
member_list = wechatWindow.ListControl(Name="聊天成员")
rect = member_list.GetParentControl().BoundingRectangle
for listItem, d in auto.WalkControl(member_list, maxDepth=1):
    # Only list items represent members
    if not isinstance(listItem, auto.ListItemControl):
        continue
    # Skip unnamed items and the "添加" (add) / "删除" (remove) entries
    if not listItem.Name or listItem.Name in ("添加", "删除"):
        continue
    print(listItem.Name, end=" ")
As the output shows, the nickname of every group member is retrieved correctly.
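The name filter in the loop above can be tried on its own, without a WeChat window. The following is a minimal sketch; the helper name `filter_member_names` and the sample names are made up for illustration, and it assumes that "添加" and "删除" are the add/remove entries shown alongside the members, which is what the loop above appears to skip.

```python
from typing import Iterable, List

# Entries in the member list that are not real members (assumption:
# these are the add/remove entries that the loop above skips)
ACTION_ENTRIES = ("添加", "删除")


def filter_member_names(names: Iterable[str]) -> List[str]:
    """Keep only non-empty names that are not add/remove entries."""
    return [name for name in names if name and name not in ACTION_ENTRIES]


# Hypothetical names as they might be read from the list items
sample_names = ["小明", "", "小红", "添加", "删除"]
print(filter_member_names(sample_names))
```

One side effect of filtering by name is that a member whose nickname is literally "添加" or "删除" would be skipped as well.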
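Things get more troublesome, however, if we need to click each member to open their detail panel and collect what it shows.
After carefully analyzing the controls in that panel, I ended up with the following reading code: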
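# The "聊天信息" (chat info) window; the member detail pane is read from its first child
mes_p = wechatWindow.WindowControl(Name="聊天信息")


def get_user_info():
    """Read the visible fields of the currently open member detail panel."""
    user_info_pane = mes_p.GetFirstChildControl()
    result = []
    company = None
    # Only edit and text controls are read (the original list named EditControl twice)
    wanted_types = (auto.ControlType.EditControl, auto.ControlType.TextControl)
    for c, d in auto.WalkControl(user_info_pane, maxDepth=7):
        if c.ControlType not in wanted_types:
            continue
        text = c.Name
        # Skip the "备注" (remark) and "企业" (company) labels themselves
        if text and text.replace(" ", "") not in ["备注", "企业"]:
            if text.startswith("@"):
                # Text starting with "@" is treated as the company name
                company = text[1:]
            else:
                result.append(text)
    # The first text is the nickname; the rest alternate between label and value
    data = {"昵称": result[0]}
    if company:
        data["企业"] = company
    keys = [k.replace(" ", "").replace(":", "") for k in result[1::2]]
    values = result[2::2]
    data.update(dict(zip(keys, values)))
    return data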
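The code above is a little more complex than it would otherwise be, because it also tries to handle WeCom (企业微信) contacts reasonably well.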
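The parsing half of `get_user_info` can be tried without WeChat at all. The sketch below is a minimal standalone version that takes a plain list of strings instead of walking UI controls. The function name `parse_profile_texts`, the sample texts, the empty-dict fallback, and stripping the full-width colon as well as the ASCII one are all assumptions made for illustration; they are not part of the original script.

```python
from typing import Dict, List, Optional

# Labels that are dropped rather than treated as field names
SKIPPED_LABELS = {"备注", "企业"}


def parse_profile_texts(texts: List[str]) -> Dict[str, str]:
    """Turn the flat list of panel texts into a dict of profile fields."""
    fields: List[str] = []
    company: Optional[str] = None
    for text in texts:
        if not text or text.replace(" ", "") in SKIPPED_LABELS:
            continue
        if text.startswith("@"):
            # Text starting with "@" is treated as the company name
            company = text[1:]
        else:
            fields.append(text)
    if not fields:
        # Assumption: return an empty dict instead of failing on an empty panel
        return {}
    data = {"昵称": fields[0]}  # the first remaining text is the nickname
    if company:
        data["企业"] = company
    # The remaining texts alternate: label, value, label, value, ...
    keys = [
        k.replace(" ", "").replace(":", "").replace(":", "")
        for k in fields[1::2]
    ]
    values = fields[2::2]
    data.update(zip(keys, values))
    return data


# Hypothetical texts in the order they might be read from a detail panel
sample = ["小明", "备 注", "@示例公司", "微信号:", "wxid_demo", "地 区:", "广东 深圳"]
print(parse_profile_texts(sample))
```

Pairing labels with values by position is simple, but it relies on every label being followed by exactly one value. If a panel shows a label with no value, the following pairs shift by one, so it is worth spot-checking a few rows against the real panel.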
Next, we iterate over the members, scrolling the mouse wheel when needed:
member_list = wechatWindow.ListControl(Name="聊天成员")
# Visible area of the list, used to decide when to scroll
rect = member_list.GetParentControl().BoundingRectangle
for listItem, d in auto.WalkControl(member_list, maxDepth=1):
    if not isinstance(listItem, auto.ListItemControl):
        continue
    if not listItem.Name or listItem.Name in ("添加", "删除"):
        continue
    # Scroll down when the item reaches the bottom edge of the visible area
    if listItem.BoundingRectangle.bottom >= rect.bottom:
        auto.WheelDown(waitTime=0.01)
    # Open the member's detail panel and read it
    listItem.Click(waitTime=0.01)
    data = get_user_info()
    data["备注"] = listItem.Name
    print(data)
    # Click again to close the detail panel
    listItem.Click(0, 0, waitTime=0.01)
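Testing shows that the program can now walk through every member. We can also read the state of a key inside the loop, so that collection stops when a chosen key is pressed: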
    # Inside the loop: stop as soon as F12 is held down
    if auto.IsKeyPressed(auto.Keys.VK_F12):
        print("F12 pressed, stopping collection")
        break
Finally, I decided to store the results in a pandas DataFrame. The complete code is:
import uiautomation as auto
import pandas as pd

auto.uiautomation.SetGlobalSearchTimeout(1)
# Locate the main WeChat window and open the chat info panel
wechatWindow = auto.WindowControl(
    Depth=1, Name="微信", ClassName='WeChatMainWndForPC')
wechatWindow.SetActive()
mes_b = wechatWindow.ButtonControl(Name="聊天信息")
mes_b.Click()
# Expand the full member list if the button is present
more_b = wechatWindow.ButtonControl(Name="查看更多群成员")
if more_b.Exists(0, 0):
    more_b.Click()
mes_p = wechatWindow.WindowControl(Name="聊天信息")


def get_user_info():
    """Read the visible fields of the currently open member detail panel."""
    user_info_pane = mes_p.GetFirstChildControl()
    result = []
    company = None
    # Only edit and text controls are read (the original list named EditControl twice)
    wanted_types = (auto.ControlType.EditControl, auto.ControlType.TextControl)
    for c, d in auto.WalkControl(user_info_pane, maxDepth=7):
        if c.ControlType not in wanted_types:
            continue
        text = c.Name
        # Skip the "备注" (remark) and "企业" (company) labels themselves
        if text and text.replace(" ", "") not in ["备注", "企业"]:
            if text.startswith("@"):
                company = text[1:]
            else:
                result.append(text)
    # The first text is the nickname; the rest alternate between label and value
    data = {"昵称": result[0]}
    if company:
        data["企业"] = company
    keys = [k.replace(" ", "").replace(":", "") for k in result[1::2]]
    values = result[2::2]
    data.update(dict(zip(keys, values)))
    return data


member_list = wechatWindow.ListControl(Name="聊天成员")
rect = member_list.GetParentControl().BoundingRectangle
data = []
for listItem, d in auto.WalkControl(member_list, maxDepth=1):
    if not isinstance(listItem, auto.ListItemControl):
        continue
    if not listItem.Name or listItem.Name in ("添加", "删除"):
        continue
    # Scroll down when the item reaches the bottom edge of the visible area
    if listItem.BoundingRectangle.bottom >= rect.bottom:
        auto.WheelDown(waitTime=0.01)
    listItem.Click(waitTime=0.01)
    row = get_user_info()
    row["备注"] = listItem.Name
    data.append(row)
    # Click again to close the detail panel
    listItem.Click(0, 0, waitTime=0.01)
    # Hold F12 to stop early
    if auto.IsKeyPressed(auto.Keys.VK_F12):
        print("F12 pressed, stopping collection")
        break
df = pd.DataFrame(data)
df  # displays the DataFrame when run in a notebook
The automation in action:
Collected results:
As shown, the visible data of all 169 group members has been saved to a pandas DataFrame, which can easily be exported to Excel.
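With pandas, the export is usually a single call such as `df.to_excel(...)` or `df.to_csv(...)`. As an alternative that needs only the standard library, the sketch below writes the collected rows to CSV, which Excel can open. The helper name `rows_to_csv` and the sample rows are hypothetical; the one real concern it illustrates is that rows do not all have the same keys (for example, "企业" only appears for some members), so the column list is built from the union of the keys.

```python
import csv
import io
from typing import Dict, Iterable, List, TextIO


def rows_to_csv(rows: Iterable[Dict[str, str]], fileobj: TextIO) -> List[str]:
    """Write rows with differing keys to CSV; return the column order used."""
    rows = list(rows)
    # Build the column list as the union of all keys, in first-seen order
    columns: List[str] = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    # restval fills in fields that a given row does not have
    writer = csv.DictWriter(fileobj, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns


# Hypothetical rows in the shape produced by the collection loop
rows = [
    {"昵称": "小明", "微信号": "wxid_demo", "备注": "小明"},
    {"昵称": "小红", "企业": "示例公司", "备注": "小红"},
]
buffer = io.StringIO()
columns = rows_to_csv(rows, buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, open it with `open("members.csv", "w", newline="", encoding="utf-8-sig")`; the `utf-8-sig` encoding adds a byte order mark, which helps Excel recognize the Chinese text as UTF-8. The file name here is only an example.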
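Note: this article only collects data that is already visible to the human eye. The point is simply to save it conveniently and free up your hands.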