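1. Find the API endpoint in the browser's Network panel
2. Analyze the request parameters
3. Refresh the page several times to work out which parameters are fixed and which change
i: the string to translate
from: source language
to: target language
smartresult: smart result, fixed value
client: client name, fixed value
salt: salt used for the signature, to be determined
sign: signature string, to be determined
ts: millisecond timestamp
bv: an md5 value of unknown origin, fixed value
doctype: document type, fixed value
version: version, fixed value
keyfrom: key source, fixed value
action: action type, fixed value
typoResult: typo-correction flag, fixed value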
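The comparison in step 3 can also be done mechanically. The following is a minimal sketch, assuming the form data of two requests has been copied out of the Network panel into dictionaries; the helper name split_params and the sample values are hypothetical and only serve as illustration.

```python
def split_params(first, second):
    """Split two captured form payloads into fixed values and changing keys."""
    # A key counts as fixed when both captures carry the same value for it.
    fixed = {k: v for k, v in first.items() if second.get(k) == v}
    changing = sorted(k for k in first if second.get(k) != first[k])
    return fixed, changing


# Hypothetical captures of the same translation sent twice (values are made up).
capture_a = {"i": "hello", "client": "fanyideskweb", "doctype": "json",
             "salt": "15590000000001", "ts": "1559000000000"}
capture_b = {"i": "hello", "client": "fanyideskweb", "doctype": "json",
             "salt": "15590000012347", "ts": "1559000001234"}

fixed, changing = split_params(capture_a, capture_b)
print(changing)  # keys whose values differ between the two captures
```

Keys that land in changing are the ones whose generation logic has to be traced in the site's JavaScript.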
4. Search for sign and locate http://shared.ydstatic.com/fanyi/newweb/v1.0.18/scripts/newweb/fanyi.min.js; the relevant code is:
var r = function(e) {
    var t = n.md5(navigator.appVersion)
      , r = "" + (new Date).getTime()
      , i = r + parseInt(10 * Math.random(), 10);
    return {
        ts: r,
        bv: t,
        salt: i,
        sign: n.md5("fanyideskweb" + e + i + "97_3(jkMYg@T[KZQmqjTK")
    }
};
t.recordUpdate = function(e) {
    var t = e.i
      , i = r(t);
    n.ajax({
        type: "POST",
        contentType: "application/x-www-form-urlencoded; charset=UTF-8",
        url: "/bettertranslation",
        data: {
            i: e.i,
            client: "fanyideskweb",
            salt: i.salt,
            sign: i.sign,
            ts: i.ts,
            bv: i.bv,
            tgt: e.tgt,
            modifiedTgt: e.modifiedTgt,
            from: e.from,
            to: e.to
        },
        success: function(e) {},
        error: function(e) {}
    })
}
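5. Work out how the changing parameters are generated
i: the first 5000 characters of the string to translate
salt: the current millisecond timestamp concatenated with one random digit (0-9)
sign: the md5 of "fanyideskweb" + i + salt + "97_3(jkMYg@T[KZQmqjTK"
ts: the current millisecond timestamp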
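Before writing the scraper, the sign rule can be checked against a request captured in the browser. The following is a minimal sketch, assuming the i, salt and sign values are copied by hand from the Network panel; the helper names expected_sign and matches_capture are hypothetical.

```python
import hashlib

CLIENT = "fanyideskweb"
SECRET = "97_3(jkMYg@T[KZQmqjTK"  # constant taken from fanyi.min.js v1.0.18


def expected_sign(text, salt):
    """Recompute sign as md5(client + i + salt + secret)."""
    raw = CLIENT + text + salt + SECRET
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def matches_capture(text, salt, captured_sign):
    """Return True when the rule reproduces a sign captured in the browser."""
    return expected_sign(text, salt) == captured_sign.lower()
```

If matches_capture returns False for a fresh capture, the secret string or the concatenation order has probably changed in a newer version of the script.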
6. Implement the Youdao scraper
import hashlib
import random
import time

import requests


def generateSaltSign(e):
    # Mirrors r() in fanyi.min.js: bv is the md5 of navigator.appVersion.
    navigator_appVersion = "5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36"
    t = hashlib.md5(navigator_appVersion.encode("utf-8")).hexdigest()
    r = str(int(time.time() * 1000))  # millisecond timestamp
    # parseInt(10 * Math.random(), 10) yields a digit in 0-9, not 1-10.
    i = r + str(random.randint(0, 9))
    return {
        "ts": r,
        "bv": t,
        "salt": i,
        "sign": hashlib.md5(("fanyideskweb" + e + i + "97_3(jkMYg@T[KZQmqjTK").encode("utf-8")).hexdigest(),
    }


def spider(i):
    url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
    r = generateSaltSign(i)
    data = {
        "i": i,
        "from": "AUTO",
        "to": "AUTO",
        "smartresult": "dict",
        "client": "fanyideskweb",
        "salt": r["salt"],
        "sign": r["sign"],
        "ts": r["ts"],
        "bv": r["bv"],
        "doctype": "json",
        "version": "2.1",
        "keyfrom": "fanyi.web",
        "action": "FY_BY_REALTlME",
    }
    # requests form-encodes the dict itself, so no manual urlencode is needed.
    headers = {
        # The cookie value is obscured in the source; copy a real one from the browser request.
        "Cookie": "[email protected];",
        "Referer": "http://fanyi.youdao.com/",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
    }
    response = requests.post(url=url, data=data, headers=headers)
    print(response.text)


if __name__ == '__main__':
    i = "你好,有道!"
    spider(i)
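The script above only prints the raw response text. The following is a minimal sketch for pulling the translated text out of it, assuming the endpoint returns JSON with an errorCode field and a translateResult list of paragraphs, each a list of segments carrying a tgt field. That shape is an assumption and is not confirmed by the code shown here; the function name extract_translation and the sample string are hypothetical.

```python
import json


def extract_translation(response_text):
    """Join the translated segments from an assumed translate_o JSON body."""
    payload = json.loads(response_text)
    # Assumed convention: errorCode 0 means the request succeeded.
    if payload.get("errorCode") != 0:
        raise ValueError(f"translation failed: errorCode={payload.get('errorCode')}")
    # One output line per paragraph, segments concatenated within a paragraph.
    return "\n".join(
        "".join(segment["tgt"] for segment in paragraph)
        for paragraph in payload["translateResult"]
    )


# Hand-written sample in the assumed shape, not a real server response.
sample = '{"errorCode": 0, "translateResult": [[{"src": "你好", "tgt": "hello"}]]}'
print(extract_translation(sample))  # prints: hello
```

A nonzero errorCode usually points at a rejected sign, salt or cookie, so raising early keeps that case from being mistaken for an empty translation.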