Scraping Online Streamer Info from Douyin

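Scraping information about streamers (anchors) who are currently live on Douyin is a feature I wrote early on. The key piece is still the X-Gorgon signing algorithm; for its code, see the other posts on this blog.
Scraping endpoint: http://47.107.249.230:8080/mydouyin/servlet/GetOnlineAnchor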

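As usual, start by capturing traffic: open Douyin and go to the online streamers screen.
[Screenshot 1: the online streamers screen in Douyin]
The request captured with Fiddler:
[Screenshot 2: the request captured in Fiddler]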
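
Once a request has been captured, it can help to split its query string into individual parameters to see which ones change between requests (in the code in this post, `show_location`, `ts` and `_rticket`) and which are fixed device values. The following is a minimal sketch using only the JDK; `QueryParams` and `parse` are hypothetical names that do not appear in the original code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for inspecting a captured URL; not part of the original post.
class QueryParams {

    /**
     * Splits the query string of a URL into decoded name/value pairs,
     * preserving the order in which they appear. Returns an empty map
     * if the URL has no '?'.
     */
    static Map<String, String> parse(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = url.indexOf('?');
        if (q < 0 || q == url.length() - 1) {
            return params;
        }
        for (String pair : url.substring(q + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(decode(name), decode(value));
        }
        return params;
    }

    // Percent-decodes one component as UTF-8.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `QueryParams.parse(capturedUrl)` on two captures and comparing the maps shows which parameters must be regenerated per request.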

Write the code based on the captured request.
Because scraping takes time, only 100 records are collected here for demonstration.

	public String getOnline() {
		// Collect the pages in a StringBuilder instead of repeated string concatenation.
		StringBuilder result = new StringBuilder();
		try {
			// Request show_location 0..9 for the demo; stop early once has_more is false.
			for (int show_location = 0; show_location < 10; show_location++) {
				// GetAnchor is static, so call it on the class rather than on an instance.
				String[] page = GetAnchorOnline.GetAnchor(show_location);
				result.append(page[1]);
				if (!"true".equals(page[0])) {
					break;
				}
				// Pause 400 or 600 ms between requests.
				int randInt = new Random().nextInt(2);
				Thread.sleep((randInt + 2) * 200);
			}
		} catch (InterruptedException e) {
			// Restore the interrupt flag so callers can see the interruption.
			Thread.currentThread().interrupt();
		} catch (Exception e) {
			e.printStackTrace();
		}

		System.out.println(result);
		return result.toString();
	}

	public static String[] GetAnchor(int show_location) throws Exception {
		int ts = (int) (System.currentTimeMillis() / 1000);
		String _rticket = System.currentTimeMillis() + "";

		String http = "https://webcast3-normal-c-hl.amemv.com/webcast/feed/?cate_id=0&channel_id=21&content_type=0&req_type=0&show_location="
				+ show_location
				+ "&style=2&sub_channel_id=0&sub_type=live_merge&tab_id=1&type=live&max_time=0&req_from=enter_auto_from_room&webcast_sdk_version=1430&webcast_language=zh&webcast_locale=zh_CN&os_api=22&device_type=HUAWEI%20MLA-AL10&ssmix=a&manifest_version_code=100201&dpi=240&uuid=863064010762377&app_name=aweme&version_name=10.2.0&ts="
				+ ts
				+ "&app_type=normal&ac=wifi&update_version_code=10209900&channel=tengxun_new&_rticket="
				+ _rticket
				+ "&device_platform=android&iid=108560829049&version_code=100200&cdid=36db7bfc-b670-4fa8-815f-6a352b973cd1&openudid=4cedfb75190b8733&device_id=69244350688&resolution=720*1280&os_version=5.1.1&language=zh&device_brand=HUAWEI&aid=1128&mcc_mnc=46007";
		// The query string (everything after '?') is the input to the signature.
		String params = http.substring(http.indexOf('?') + 1);
		String STUB = "";
		String s = getXGon(params, STUB, cookies);

		// Sign with ts, the same timestamp that is sent below as X-Khronos.
		String Gorgon = xGorgon(ts, StrToByte(s));

		String result = "";
		URL url = new URL(http);
		HttpURLConnection conn = (HttpURLConnection) url.openConnection();
		conn.setRequestProperty("Host", "api3-normal-c-hl.amemv.com");
		conn.setRequestProperty("Connection", "keep-alive");
		conn.setRequestProperty("Cookie", cookies);
		conn.setRequestProperty("X-SS-REQ-TICKET", _rticket);
		conn.setRequestProperty(
				"X-Tt-Token",
				"00d96187a3556c02090bf038dbf72ed3a3ad665069ee9dcf71b8b97e3459a142c87a979d0650c24acd989578b8203dfafd2e");
		conn.setRequestProperty("sdk-version", "1");
		conn.setRequestProperty("X-SS-DP", "1128");
		conn.setRequestProperty("x-tt-trace-id",
				"00-4ee05bbb0a101f48f0e07bf495af0468-4ee05bbb0a101f48-01");
		conn.setRequestProperty(
				"User-Agent",
				"com.ss.android.ugc.aweme/100201 (Linux; U; Android 5.1.1; zh_CN; HUAWEI MLA-AL10; Build/HUAWEIMLA-AL10; Cronet/TTNetVersion:79d23018 2020-02-03 QuicVersion:ac58aac6 2020-01-20)");
		// conn.setRequestProperty("Accept-Encoding","gzip, deflate, br");
		conn.setRequestProperty("X-Gorgon", Gorgon);
		conn.setRequestProperty("X-Khronos", ts + "");
		conn.setRequestProperty(
				"x-common-params-v2",
				"com.ss.android.ugc.aweme/100501 (Linux; U; Android 5.1.1; zh_CN; HUAWEI MLA-AL10; Build/HUAWEIMLA-AL10; Cronet/TTNetVersion:3154e555 2020-03-04 QuicVersion:8fc8a2f3 2020-03-02)");

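		// Read the whole body as bytes and decode it once (assuming a UTF-8 response),
		// so multi-byte characters split across buffer boundaries are not corrupted.
		java.io.ByteArrayOutputStream body = new java.io.ByteArrayOutputStream();
		try (InputStream in = conn.getInputStream()) {
			byte[] buffer = new byte[1024];
			int len;
			while ((len = in.read(buffer)) != -1) {
				body.write(buffer, 0, len);
			}
		}
		result = body.toString("UTF-8");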
		JsonRootBean parseObject = JSON.parseObject(result, JsonRootBean.class);
		boolean has_more = parseObject.getExtra().getHas_more();
		List<?> data = parseObject.getData();

		// Build one line per room: the room ID and the streamer's Douyin ID.
		StringBuilder info = new StringBuilder();
		for (Object item : data) {
			Data data2 = (Data) item;
			info.append("Room ID: ").append(data2.getData().getId_str())
					.append(" | Douyin ID: ")
					.append(data2.getData().getOwner().getDisplay_id()).append("\n");
		}

		// [0] = has_more flag as a string, [1] = the formatted lines.
		String[] dataStrings = new String[2];
		dataStrings[0] = String.valueOf(has_more);
		dataStrings[1] = info.toString();
		return dataStrings;

	}
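
The paging logic (increment `show_location`, stop when `has_more` is false or a page cap is reached, pause a random interval between requests) can also be sketched separately from the network call, which makes it easy to try without hitting the server. This is only an illustration under assumptions: `FeedPager`, `Page`, `PageFetcher` and `crawl` are hypothetical names, and a real fetcher would wrap `GetAnchor` and convert its checked exceptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical names throughout; only the show_location / has_more idea comes from the post.
class FeedPager {

    /** One page of results: the formatted lines plus the server's has_more flag. */
    static final class Page {
        final String info;
        final boolean hasMore;

        Page(String info, boolean hasMore) {
            this.info = info;
            this.hasMore = hasMore;
        }
    }

    /** Fetches the page for one show_location value. */
    interface PageFetcher {
        Page fetch(int showLocation);
    }

    /**
     * Fetches pages 0, 1, 2, ... until has_more is false or maxPages pages
     * have been fetched. Between requests it sleeps for a random time in
     * [baseDelayMs, 2 * baseDelayMs); a delay of 0 disables sleeping.
     */
    static List<String> crawl(PageFetcher fetcher, int maxPages, long baseDelayMs) {
        List<String> pages = new ArrayList<>();
        Random random = new Random();
        for (int showLocation = 0; showLocation < maxPages; showLocation++) {
            Page page = fetcher.fetch(showLocation);
            pages.add(page.info);
            if (!page.hasMore) {
                break;
            }
            boolean anotherRequestFollows = showLocation + 1 < maxPages;
            if (anotherRequestFollows && baseDelayMs > 0) {
                try {
                    Thread.sleep(baseDelayMs + (long) (random.nextDouble() * baseDelayMs));
                } catch (InterruptedException e) {
                    // Restore the flag and stop crawling.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return pages;
    }
}
```

With a stub such as `FeedPager.crawl(i -> new FeedPager.Page("page " + i, i < 2), 10, 0)`, the loop fetches three pages and stops, because the third page reports that there are no more results.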

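Here is the scraped streamer info. Only each streamer's room ID and Douyin ID are extracted; the response contains many more fields, but it is enough to pull out just the ones you need.
[Screenshot 3: the scraped room IDs and Douyin IDs]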
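
The post does not show the bean classes that `JSON.parseObject` fills in. Judging only from the getters called in `GetAnchor` (`getExtra().getHas_more()`, `getData()`, `getData().getId_str()`, `getData().getOwner().getDisplay_id()`), they presumably look something like the sketch below. The class names `Extra`, `Room` and `Owner` are guesses, as is the assumption that the JSON keys are `extra.has_more`, `data[].data.id_str` and `data[].data.owner.display_id`; further fields from the response would be added in the same way.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the response beans, inferred from the getters used in GetAnchor.
// Class names Extra, Room and Owner are assumptions; the post does not show them.
class JsonRootBean {
    private Extra extra;
    private List<Data> data = new ArrayList<>();

    public Extra getExtra() { return extra; }
    public void setExtra(Extra extra) { this.extra = extra; }
    public List<Data> getData() { return data; }
    public void setData(List<Data> data) { this.data = data; }
}

class Extra {
    private boolean has_more;

    public boolean getHas_more() { return has_more; }
    public void setHas_more(boolean has_more) { this.has_more = has_more; }
}

// One feed item; its nested "data" object describes the live room.
class Data {
    private Room data;

    public Room getData() { return data; }
    public void setData(Room data) { this.data = data; }
}

class Room {
    private String id_str;
    private Owner owner;

    public String getId_str() { return id_str; }
    public void setId_str(String id_str) { this.id_str = id_str; }
    public Owner getOwner() { return owner; }
    public void setOwner(Owner owner) { this.owner = owner; }
}

class Owner {
    private String display_id;

    public String getDisplay_id() { return display_id; }
    public void setDisplay_id(String display_id) { this.display_id = display_id; }
}
```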

Pick any Douyin ID from the results to test:

[Screenshot 4: looking up the chosen Douyin ID]
As you can see, this streamer is indeed live.

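That completes scraping info on Douyin streamers who are currently live.

The above is for learning and discussion only. My WeChat: lb87626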
