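I wanted to practice Go syntax by writing a crawler, but I am still not comfortable with type conversions.
Indexing straight into a nested map fails with the error type interface {} does not support indexing
// Sample string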
resString := `
{
"args": {},
"headers": {
"Accept-Encoding": "gzip",
"Host": "httpbin.org",
"User-Agent": "GRequests/0.10",
"X-Amzn-Trace-Id": "Root=1-5f3f3xxxxxxccdc4068"
},
"origin": "11.11.11.22",
"url": "http://httpbin.org/get"
}
`
// StringToMap decodes a JSON string into a generic map.
func StringToMap(content string) map[string]interface{} {
	var resMap map[string]interface{}
	err := json.Unmarshal([]byte(content), &resMap)
	if err != nil {
		fmt.Println("failed to convert string to map:", err)
	}
	return resMap
}
resMap := StringToMap(resString)
fmt.Println("url value:", resMap["url"], reflect.TypeOf(resMap["url"]))
resMap["url"] is an interface{} value. Convert it to a string with a type assertion: resMap["url"].(string)
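The single-value assertion above panics if the value is not actually a string. A minimal sketch of the two-value ("comma ok") form, which reports failure instead of panicking, might look like this. The helper name asString is hypothetical and only used for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// asString is a hypothetical helper: it returns v as a string,
// and false if v holds any other type (or is nil).
func asString(v interface{}) (string, bool) {
	s, ok := v.(string)
	return s, ok
}

func main() {
	var resMap map[string]interface{}
	raw := `{"url": "http://httpbin.org/get", "args": {}}`
	if err := json.Unmarshal([]byte(raw), &resMap); err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}

	// "url" is a JSON string, so the assertion succeeds.
	if url, ok := asString(resMap["url"]); ok {
		fmt.Println("url:", url)
	}
	// "args" is a JSON object, so the assertion fails without panicking.
	if _, ok := asString(resMap["args"]); !ok {
		fmt.Println("args is not a string")
	}
}
```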
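Continuing the example above, suppose we want the Host value inside headers.
Indexing directly with
resMap["headers"]["Host"]
fails with the error type interface {} does not support indexing, because resMap["headers"] has the static type interface{}, which cannot be indexed.
// The nested map must be converted with a type assertion too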
innerMap := resMap["headers"].(map[string]interface{})
fmt.Println("Host value:", innerMap["Host"], reflect.TypeOf(innerMap["Host"]))
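When the nesting gets deeper, repeating the assertion at every level becomes noisy. As a rough sketch, a small helper can walk a path of keys and do the assertion at each step. The function name lookup and its signature are assumptions made for this example, not part of any library:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lookup is a hypothetical helper that follows path through nested
// JSON objects. It returns false if a key is missing or if an
// intermediate value is not a map[string]interface{}.
func lookup(m map[string]interface{}, path ...string) (interface{}, bool) {
	var cur interface{} = m
	for _, key := range path {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		cur, ok = obj[key]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

func main() {
	raw := `{"headers": {"Host": "httpbin.org"}, "url": "http://httpbin.org/get"}`
	var resMap map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &resMap); err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}

	if host, ok := lookup(resMap, "headers", "Host"); ok {
		fmt.Println("Host value:", host)
	}
	// "url" is a string, not an object, so descending into it fails cleanly.
	if _, ok := lookup(resMap, "url", "Host"); !ok {
		fmt.Println("url has no nested Host")
	}
}
```

The returned value is still an interface{}, so a final assertion such as .(string) is needed before using it as a string.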
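I used grequests, a wrapper library around Go's HTTP client: https://github.com/levigross/grequests
It sends a request to the target URL, which returns a JSON string.
Target URL, GET request: http://httpbin.org/get
The response is a JSON string: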
{
"args": {},
"headers": {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"Accept-Encoding": "gzip, deflate",
"Accept-Language": "zh-CN,zh;q=0.9",
"Host": "httpbin.org",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (Macintosh; xx) Appit/537.3xxxxx6 (KxxxxHTML, like Gecko) Chrome/84.0.414xxx25 xxxxx37.36",
"X-Amzn-Trace-Id": "Root=1-5f3f3bae-cxxxxxxdb4121b44de07769"
},
"origin": "111.111.111.111",
"url": "http://httpbin.org/get"
}
package main
import (
"encoding/json"
"fmt"
"github.com/levigross/grequests"
"reflect"
)
func main() {
	// You can modify the request by passing an optional RequestOptions struct
	//resp, err := grequests.Get("http://httpbin.org/get", nil)
	//
	//if err != nil {
	//	fmt.Println("Unable to make request: ", err)
	//}
	//resString := resp.String()
	//
	//fmt.Println(resString)
	resString := `
{
"args": {},
"headers": {
"Accept-Encoding": "gzip",
"Host": "httpbin.org",
"User-Agent": "GRequests/0.10",
"X-Amzn-Trace-Id": "Root=1-5f3f3e21-44e7f0e4cec2d98cccdc4068"
},
"origin": "116.233.234.60",
"url": "http://httpbin.org/get"
}
`
	var resMap map[string]interface{}
	err := json.Unmarshal([]byte(resString), &resMap)
	if err != nil {
		fmt.Println("failed to convert string to map:", err)
		return
	}
	// Indexing resMap["headers"]["Host"] directly fails with:
	// type interface {} does not support indexing
	fmt.Println("args value:", resMap["args"], reflect.TypeOf(resMap["args"]))
	fmt.Println("origin value:", resMap["origin"], reflect.TypeOf(resMap["origin"]))
	// The nested map must be converted with a type assertion too
	innerMap := resMap["headers"].(map[string]interface{})
	fmt.Println("Host value:", innerMap["Host"], reflect.TypeOf(innerMap["Host"]))
}
Output
args value: map[] map[string]interface {}
origin value: 116.233.234.60 string
Host value: httpbin.org string
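There is another wrapper request library that also looks good:
github.com/imroc/req
These data-type operations in Go are still somewhat tedious. The same operations in Python are much more concise by comparison.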
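One way to cut down on the assertions, when the shape of the JSON is known in advance, is to decode into a struct instead of map[string]interface{}. The sketch below assumes the httpbin.org/get response shape shown earlier. The type name httpbinResponse and the function parseResponse are made up for this example:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// httpbinResponse is a hypothetical type mirroring the fields used in
// this post. Fields not listed here (such as "args") are ignored.
type httpbinResponse struct {
	Headers map[string]string `json:"headers"`
	Origin  string            `json:"origin"`
	URL     string            `json:"url"`
}

// parseResponse decodes content into the typed struct and returns
// the decoding error to the caller instead of printing it.
func parseResponse(content string) (httpbinResponse, error) {
	var r httpbinResponse
	err := json.Unmarshal([]byte(content), &r)
	return r, err
}

func main() {
	raw := `{
		"args": {},
		"headers": {"Accept-Encoding": "gzip", "Host": "httpbin.org"},
		"origin": "11.11.11.22",
		"url": "http://httpbin.org/get"
	}`
	r, err := parseResponse(raw)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	// No type assertions are needed: the fields already have concrete types.
	fmt.Println("Host value:", r.Headers["Host"])
	fmt.Println("origin value:", r.Origin)
}
```

The trade-off is that the struct has to be kept in sync with the response. For JSON whose shape is not known ahead of time, the map[string]interface{} approach still applies.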