I have been playing around with Python network programming and built a simple example by following an online tutorial. The link is:
https://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/001386832511628f1fe2c65534a46aa86b8e654b6d3567c000
The example ran without any problems, but I found this line hard to understand:
sock.send('GET / HTTP/1.1\r\nHost: www.sina.com.cn\r\nConnection: close\r\n\r\n')
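I tried changing the address to Host: www.baidu.com and still got a 200 response, so no issue there. Then I changed it to Host: www.2298.com and the problem appeared: the response status was 301. The code was: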
# -*- coding: utf-8 -*-
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(('www.2298.com', 80))
sock.send(b'GET / HTTP/1.1\r\nHost: www.2298.com\r\nConnection: close\r\n\r\n')
# Keep reading until the server closes the connection (recv returns b'').
buffer = []
while True:
    d = sock.recv(1024)
    if d:
        buffer.append(d)
    else:
        break
data = b''.join(buffer)
print(data.decode('utf-8'))
sock.close()
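After a lot of searching I finally understood what sock.send(b'GET / HTTP/1.1\r\nHost: www.2298.com......) means: it sends the HTTP request line and request headers to the server, the same kind of headers a browser sends.
Based on the headers my own browser sends, I appended \r\n to the end of each line and adjusted the code as follows: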
sock.send('GET / HTTP/1.1\r\n'.encode())
sock.send('Host: www.2298.com\r\n'.encode())
sock.send('Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\n'.encode())
sock.send('Upgrade-Insecure-Requests: 1\r\n'.encode())
sock.send('User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36\r\n'.encode())
sock.send('Cookie:Hm_lvt_9f5dfe5de2393d254b0527c81e9b1bf9=1532226004,1532477712,1532570533,1532650365; Hm_lvt_b78dbd3bc3e520b7455189750ea8c8db=1532217434,1532477712,1532570536,1532650365; Hm_lpvt_9f5dfe5de2393d254b0527c81e9b1bf9=1532657564; Hm_lpvt_b78dbd3bc3e520b7455189750ea8c8db=1532657564\r\n'.encode())
sock.send('Connection: close\r\n\r\n'.encode())
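Sending everything as one block never worked for me, and I still do not know exactly why. Also, the last line sock.send('Connection: close\r\n\r\n'.encode()) cannot be omitted, otherwise the program prints nothing. The trailing blank line (\r\n\r\n) marks the end of the headers, and without Connection: close an HTTP/1.1 server may keep the connection open, so the recv loop never sees the end of the data.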
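One possible reason a single-block send fails is a missing or wrong line separator somewhere in the hand-concatenated string; that is only a guess, since the failing version is not shown. A minimal sketch, assuming a hypothetical helper named build_request (not part of the tutorial or of any library), that joins the request line and headers with \r\n and appends the terminating blank line:

```python
def build_request(host, path="/", extra_headers=None):
    """Build a complete HTTP/1.1 GET request as bytes (hypothetical helper)."""
    headers = {"Host": host}
    headers.update(extra_headers or {})
    # Ask the server to close the connection after responding, so that a
    # recv() loop eventually gets an empty read and can stop.
    headers["Connection"] = "close"

    lines = ["GET {} HTTP/1.1".format(path)]
    lines += ["{}: {}".format(name, value) for name, value in headers.items()]
    # Every line ends with \r\n; one extra empty line ends the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request("www.2298.com", extra_headers={"User-Agent": "Mozilla/5.0"})
print(request)
```

With a socket that is already connected, sock.sendall(build_request('www.2298.com')) could then stand in for the sequence of send calls. sendall keeps writing until the whole buffer has been sent, whereas send may write only part of it.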
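Running the adjusted code still returned 301. After more digging, the cause turned out to be that www.2298.com automatically redirects to https://www.2298.com, while the code above speaks plain HTTP. The fix is to wrap the socket in SSL and change the port to 443: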
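# ssl.wrap_socket() was removed in Python 3.12; an SSLContext does the same job.
sock = ssl.create_default_context().wrap_socket(socket.socket(socket.AF_INET, socket.SOCK_STREAM), server_hostname='www.2298.com')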
sock.connect(('www.2298.com', 443))
This time it ran without problems.
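To see where a 301 points without reading the raw dump by eye, the status line and the Location header can be pulled out of the response bytes. A rough sketch, assuming a hypothetical helper parse_response_head and a made-up sample response (not captured from the real site):

```python
def parse_response_head(raw):
    """Return (status_code, headers) parsed from raw HTTP response bytes."""
    # Headers end at the first blank line; anything after it is the body.
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like: HTTP/1.1 301 Moved Permanently
    status_code = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


# Made-up sample for illustration only.
sample = (
    b"HTTP/1.1 301 Moved Permanently\r\n"
    b"Location: https://www.2298.com/\r\n"
    b"Content-Length: 0\r\n"
    b"Connection: close\r\n"
    b"\r\n"
)
status, headers = parse_response_head(sample)
print(status, headers.get("location"))
```

Passing the joined recv data to this helper instead of the sample would show whether the server answered with a redirect and which URL it wants the client to use.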
The complete code:
# -*- coding: utf-8 -*-
import socket
import ssl

# ssl.wrap_socket() was removed in Python 3.12; an SSLContext does the same job.
context = ssl.create_default_context()
sock = context.wrap_socket(socket.socket(socket.AF_INET, socket.SOCK_STREAM), server_hostname='www.2298.com')
sock.connect(('www.2298.com', 443))
sock.send('GET / HTTP/1.1\r\n'.encode())
sock.send('Host: www.2298.com\r\n'.encode())
#sock.send('Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\n'.encode())
#sock.send('Upgrade-Insecure-Requests: 1\r\n'.encode())
sock.send('User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36\r\n'.encode())
#sock.send('Cookie:Hm_lvt_9f5dfe5de2393d254b0527c81e9b1bf9=1532226004,1532477712,1532570533,1532650365; Hm_lvt_b78dbd3bc3e520b7455189750ea8c8db=1532217434,1532477712,1532570536,1532650365; Hm_lpvt_9f5dfe5de2393d254b0527c81e9b1bf9=1532657564; Hm_lpvt_b78dbd3bc3e520b7455189750ea8c8db=1532657564\r\n'.encode())
# The blank line after the last header ends the request.
sock.send('Connection: close\r\n\r\n'.encode())
# Keep reading until the server closes the connection (recv returns b'').
buffer = []
while True:
    d = sock.recv(1024)
    if d:
        buffer.append(d)
    else:
        break
data = b''.join(buffer)
print(data.decode('utf-8'))
sock.close()
Some of the request headers do not need to be sent, so I commented a few of them out. Adjust them yourself as needed.