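Hi everyone, I'm Xie Wei, a programmer.
Drawing on my own experience and what I have seen, here is how a programmer without a computer science degree typically grows: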
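- Learn a programming language
- Write as much code as you can
- Fill in as many fundamentals as you can
- After a certain stage (you are given development tasks and finish them on time), start thinking about architecture, that is, how to design a project better
- Read source code, especially the source of popular projects
- Focus on the overall flow of the source code rather than the details
- Borrow ideas from good source code when writing your own programs
- Learn more about software design
- Architect: technology selection and design
- ...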
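A beginner usually picks a direction first, such as web back end or front end, and then goes deep into one language, for example Java, Python, or Go for the back end. Project work builds fluency in the language and in programming thinking, until you know how to turn a requirement into working code. After a while you may hit a plateau and want to write better code. Besides doing more projects, a better way forward is to read the source code of a library or project, learn how it handles things, and apply those ideas to your own projects. That is how you break through the plateau and keep improving.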
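A typical software construction process looks like this:
- Design: settle on an approach
- Writing code
- Coding style
- Technology selection
- Packages
- Classes
- Routines
- Statements
- Testing
- Integration testing
- Iteration: keep improving the code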
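The topic of this section is: how do you read source code?
1. Be clear about your problem
The open-source world has far more worth learning than anyone has time for. You need to know exactly which problem you want to solve before you can read the source of a particular project or library in a targeted way.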
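2. An example
go-restful is a Go package for building REST-style web services.
Before diving in, we need some background on the HTTP protocol and on web clients and servers.
This knowledge is directly tied to what happens when we visit a URL and get information back.
When we enter a URL (www.baidu.com) in the browser, the overall process is: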
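- The browser (client) queries DNS (the Domain Name System) to get an IP address
- The IP address locates the corresponding server
- A TCP connection is established
- The server processes the request message (HTTP Request)
- The server returns an HTTP Response
- The browser (client) receives the response and renders the body of the Response
- The connection is closed and the browser displays the page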
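We are interested in two parts of this process: the HTTP Request and the HTTP Response.
Pick any web page and inspect its request and response, for example in the browser's developer tools:
HTTP protocol: HTTP Request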
GET /u/58f0817209aa HTTP/1.1
Host: www.jianshu.com
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Referer: https://www.jianshu.com/
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
It mainly consists of:
- Request line: request method, request URI, and HTTP protocol version
- Request headers: Host, ...
- Message body
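To see these parts programmatically, the sketch below feeds a shortened copy of the request above to `http.ReadRequest` from the standard library. `parseRequest` is a hypothetical helper name, and only two of the headers are kept for brevity.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// parseRequest (hypothetical helper) parses the raw text of an HTTP request.
func parseRequest(raw string) (*http.Request, error) {
	return http.ReadRequest(bufio.NewReader(strings.NewReader(raw)))
}

func main() {
	raw := "GET /u/58f0817209aa HTTP/1.1\r\n" +
		"Host: www.jianshu.com\r\n" +
		"Connection: keep-alive\r\n" +
		"\r\n"
	req, err := parseRequest(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Request line: method, URI, protocol version.
	fmt.Println(req.Method, req.RequestURI, req.Proto)
	// Headers: Host is promoted to its own field, the rest stay in Header.
	fmt.Println(req.Host, req.Header.Get("Connection"))
}
```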
HTTP protocol: HTTP Response
HTTP/1.1 200 OK
Date: Sun, 20 May 2018 03:19:36 GMT
Server: Tengine
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Content-Security-Policy: script-src 'self' 'unsafe-inline' 'unsafe-eval' *.jianshu.com *.jianshu.io api.geetest.com static.geetest.com dn-staticdown.qbox.me zz.bdstatic.com *.google-analytics.com hm.baidu.com push.zhanzhang.baidu.com res.wx.qq.com qzonestyle.gtimg.cn as.alipayobjects.com ;style-src 'self' 'unsafe-inline' *.jianshu.com *.jianshu.io api.geetest.com static.geetest.com ;
ETag: W/"4d22fb2fcef7cdb3f874a6b4960ff2ae"
Cache-Control: max-age=0, private, must-revalidate
Set-Cookie: locale=zh-CN; path=/
Set-Cookie: _m7e_session=708ecf714930ebc19da67ae3141bd6c0; path=/; expires=Sun, 20 May 2018 09:19:36 -0000; secure; HttpOnly
X-Request-Id: c61a268c-896f-4e03-afbe-2547db04943d
X-Runtime: 0.137573
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Content-Encoding: gzip
X-Via: 1.1 PSfjfzdx2wn96:6 (Cdn Cache Server V2.0), 1.1 jsyz89:1 (Cdn Cache Server V2.0)
Connection: keep-alive
X-Dscp-Value: 0
It mainly consists of:
- Status line: HTTP protocol version, status code, and reason phrase
- Response headers
- Message body
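The same three parts can be read back with `http.ReadResponse`. This is a small sketch on a made-up, shortened response; `parseResponse` is a hypothetical helper name and the "hello" body is illustrative only.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// parseResponse (hypothetical helper) parses raw HTTP response text and
// returns the parsed response together with its body as a string.
func parseResponse(raw string) (*http.Response, string, error) {
	resp, err := http.ReadResponse(bufio.NewReader(strings.NewReader(raw)), nil)
	if err != nil {
		return nil, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, "", err
	}
	return resp, string(body), nil
}

func main() {
	raw := "HTTP/1.1 200 OK\r\n" +
		"Server: Tengine\r\n" +
		"Content-Type: text/html; charset=utf-8\r\n" +
		"Content-Length: 5\r\n" +
		"\r\n" +
		"hello"
	resp, body, err := parseResponse(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Status line.
	fmt.Println(resp.Proto, resp.StatusCode, resp.Status)
	// Response headers and message body.
	fmt.Println(resp.Header.Get("Server"), body)
}
```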
So the main pieces involved in designing a RESTful API are:
- HTTP methods: GET, POST, PUT, DELETE
- HTTP Request: URI path, path parameters, query parameters
- HTTP Response: status code (2XX, 3XX, 4XX, 5XX) and message body
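With that background, here is how we might handle HTTP using only Go's standard library:
import (
	"errors"
	"io"
	"net/http"
)

// Sentinel errors returned by Downloader. The original post uses these names
// without showing their definitions; they are defined here so the snippet compiles.
var (
	ErrorHttpRequest  = errors.New("failed to build HTTP request")
	ErrorHttpResponse = errors.New("failed to get HTTP response")
)

// Downloader sends a GET request to url and returns the response body.
func Downloader(url string) ([]byte, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return nil, ErrorHttpRequest
	}
	req.Header.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, ErrorHttpResponse
	}
	// The caller must close the body so the connection can be reused.
	defer resp.Body.Close()
	// io.ReadAll (Go 1.16+) replaces the deprecated ioutil.ReadAll.
	return io.ReadAll(resp.Body)
}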
Now look at the net/http source: what do http.Request and http.Response actually contain?
type Request struct {
// Method specifies the HTTP method (GET, POST, PUT, etc.).
// For client requests an empty string means GET.
Method string
// URL specifies either the URI being requested (for server
// requests) or the URL to access (for client requests).
//
// For server requests the URL is parsed from the URI
// supplied on the Request-Line as stored in RequestURI. For
// most requests, fields other than Path and RawQuery will be
// empty. (See RFC 2616, Section 5.1.2)
//
// For client requests, the URL's Host specifies the server to
// connect to, while the Request's Host field optionally
// specifies the Host header value to send in the HTTP
// request.
URL *url.URL
// The protocol version for incoming server requests.
//
// For client requests these fields are ignored. The HTTP
// client code always uses either HTTP/1.1 or HTTP/2.
// See the docs on Transport for details.
Proto string // "HTTP/1.0"
ProtoMajor int // 1
ProtoMinor int // 0
// Header contains the request header fields either received
// by the server or to be sent by the client.
//
// If a server received a request with header lines,
//
// Host: example.com
// accept-encoding: gzip, deflate
// Accept-Language: en-us
// fOO: Bar
// foo: two
//
// then
//
// Header = map[string][]string{
// "Accept-Encoding": {"gzip, deflate"},
// "Accept-Language": {"en-us"},
// "Foo": {"Bar", "two"},
// }
//
// For incoming requests, the Host header is promoted to the
// Request.Host field and removed from the Header map.
//
// HTTP defines that header names are case-insensitive. The
// request parser implements this by using CanonicalHeaderKey,
// making the first character and any characters following a
// hyphen uppercase and the rest lowercase.
//
// For client requests, certain headers such as Content-Length
// and Connection are automatically written when needed and
// values in Header may be ignored. See the documentation
// for the Request.Write method.
Header Header
// Body is the request's body.
//
// For client requests a nil body means the request has no
// body, such as a GET request. The HTTP Client's Transport
// is responsible for calling the Close method.
//
// For server requests the Request Body is always non-nil
// but will return EOF immediately when no body is present.
// The Server will close the request body. The ServeHTTP
// Handler does not need to.
Body io.ReadCloser
// GetBody defines an optional func to return a new copy of
// Body. It is used for client requests when a redirect requires
// reading the body more than once. Use of GetBody still
// requires setting Body.
//
// For server requests it is unused.
GetBody func() (io.ReadCloser, error)
// ContentLength records the length of the associated content.
// The value -1 indicates that the length is unknown.
// Values >= 0 indicate that the given number of bytes may
// be read from Body.
// For client requests, a value of 0 with a non-nil Body is
// also treated as unknown.
ContentLength int64
// TransferEncoding lists the transfer encodings from outermost to
// innermost. An empty list denotes the "identity" encoding.
// TransferEncoding can usually be ignored; chunked encoding is
// automatically added and removed as necessary when sending and
// receiving requests.
TransferEncoding []string
// Close indicates whether to close the connection after
// replying to this request (for servers) or after sending this
// request and reading its response (for clients).
//
// For server requests, the HTTP server handles this automatically
// and this field is not needed by Handlers.
//
// For client requests, setting this field prevents re-use of
// TCP connections between requests to the same hosts, as if
// Transport.DisableKeepAlives were set.
Close bool
// For server requests Host specifies the host on which the
// URL is sought. Per RFC 2616, this is either the value of
// the "Host" header or the host name given in the URL itself.
// It may be of the form "host:port". For international domain
// names, Host may be in Punycode or Unicode form. Use
// golang.org/x/net/idna to convert it to either format if
// needed.
//
// For client requests Host optionally overrides the Host
// header to send. If empty, the Request.Write method uses
// the value of URL.Host. Host may contain an international
// domain name.
Host string
// Form contains the parsed form data, including both the URL
// field's query parameters and the POST or PUT form data.
// This field is only available after ParseForm is called.
// The HTTP client ignores Form and uses Body instead.
Form url.Values
// PostForm contains the parsed form data from POST, PATCH,
// or PUT body parameters.
//
// This field is only available after ParseForm is called.
// The HTTP client ignores PostForm and uses Body instead.
PostForm url.Values
// MultipartForm is the parsed multipart form, including file uploads.
// This field is only available after ParseMultipartForm is called.
// The HTTP client ignores MultipartForm and uses Body instead.
MultipartForm *multipart.Form
// Trailer specifies additional headers that are sent after the request
// body.
//
// For server requests the Trailer map initially contains only the
// trailer keys, with nil values. (The client declares which trailers it
// will later send.) While the handler is reading from Body, it must
// not reference Trailer. After reading from Body returns EOF, Trailer
// can be read again and will contain non-nil values, if they were sent
// by the client.
//
// For client requests Trailer must be initialized to a map containing
// the trailer keys to later send. The values may be nil or their final
// values. The ContentLength must be 0 or -1, to send a chunked request.
// After the HTTP request is sent the map values can be updated while
// the request body is read. Once the body returns EOF, the caller must
// not mutate Trailer.
//
// Few HTTP clients, servers, or proxies support HTTP trailers.
Trailer Header
// RemoteAddr allows HTTP servers and other software to record
// the network address that sent the request, usually for
// logging. This field is not filled in by ReadRequest and
// has no defined format. The HTTP server in this package
// sets RemoteAddr to an "IP:port" address before invoking a
// handler.
// This field is ignored by the HTTP client.
RemoteAddr string
// RequestURI is the unmodified Request-URI of the
// Request-Line (RFC 2616, Section 5.1) as sent by the client
// to a server. Usually the URL field should be used instead.
// It is an error to set this field in an HTTP client request.
RequestURI string
// TLS allows HTTP servers and other software to record
// information about the TLS connection on which the request
// was received. This field is not filled in by ReadRequest.
// The HTTP server in this package sets the field for
// TLS-enabled connections before invoking a handler;
// otherwise it leaves the field nil.
// This field is ignored by the HTTP client.
TLS *tls.ConnectionState
// Cancel is an optional channel whose closure indicates that the client
// request should be regarded as canceled. Not all implementations of
// RoundTripper may support Cancel.
//
// For server requests, this field is not applicable.
//
// Deprecated: Use the Context and WithContext methods
// instead. If a Request's Cancel field and context are both
// set, it is undefined whether Cancel is respected.
Cancel <-chan struct{}
// Response is the redirect response which caused this request
// to be created. This field is only populated during client
// redirects.
Response *Response
// ctx is either the client or server context. It should only
// be modified via copying the whole Request using WithContext.
// It is unexported to prevent people from using Context wrong
// and mutating the contexts held by callers of the same request.
ctx context.Context
}
type Response struct {
Status string // e.g. "200 OK"
StatusCode int // e.g. 200
Proto string // e.g. "HTTP/1.0"
ProtoMajor int // e.g. 1
ProtoMinor int // e.g. 0
// Header maps header keys to values. If the response had multiple
// headers with the same key, they may be concatenated, with comma
// delimiters. (Section 4.2 of RFC 2616 requires that multiple headers
// be semantically equivalent to a comma-delimited sequence.) Values
// duplicated by other fields in this struct (e.g., ContentLength) are
// omitted from Header.
//
// Keys in the map are canonicalized (see CanonicalHeaderKey).
Header Header
// Body represents the response body.
//
// The http Client and Transport guarantee that Body is always
// non-nil, even on responses without a body or responses with
// a zero-length body. It is the caller's responsibility to
// close Body. The default HTTP client's Transport does not
// attempt to reuse HTTP/1.0 or HTTP/1.1 TCP connections
// ("keep-alive") unless the Body is read to completion and is
// closed.
//
// The Body is automatically dechunked if the server replied
// with a "chunked" Transfer-Encoding.
Body io.ReadCloser
// ContentLength records the length of the associated content. The
// value -1 indicates that the length is unknown. Unless Request.Method
// is "HEAD", values >= 0 indicate that the given number of bytes may
// be read from Body.
ContentLength int64
// Contains transfer encodings from outer-most to inner-most. Value is
// nil, means that "identity" encoding is used.
TransferEncoding []string
// Close records whether the header directed that the connection be
// closed after reading Body. The value is advice for clients: neither
// ReadResponse nor Response.Write ever closes a connection.
Close bool
// Uncompressed reports whether the response was sent compressed but
// was decompressed by the http package. When true, reading from
// Body yields the uncompressed content instead of the compressed
// content actually set from the server, ContentLength is set to -1,
// and the "Content-Length" and "Content-Encoding" fields are deleted
// from the responseHeader. To get the original response from
// the server, set Transport.DisableCompression to true.
Uncompressed bool
// Trailer maps trailer keys to values in the same
// format as Header.
//
// The Trailer initially contains only nil values, one for
// each key specified in the server's "Trailer" header
// value. Those values are not added to Header.
//
// Trailer must not be accessed concurrently with Read calls
// on the Body.
//
// After Body.Read has returned io.EOF, Trailer will contain
// any trailer values sent by the server.
Trailer Header
// Request is the request that was sent to obtain this Response.
// Request's Body is nil (having already been consumed).
// This is only populated for Client requests.
Request *Request
// TLS contains information about the TLS connection on which the
// response was received. It is nil for unencrypted responses.
// The pointer is shared between responses and should not be
// modified.
TLS *tls.ConnectionState
}
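Both structs contain the pieces we analyzed above.
How do we start a web service using only the built-in net/http package?
package main

import (
	"fmt"
	"log"
	"net/http"
)

// Say logs the request's host, path, and parsed form, then replies "hello world".
func Say(resp http.ResponseWriter, req *http.Request) {
	if err := req.ParseForm(); err != nil {
		http.Error(resp, "bad request", http.StatusBadRequest)
		return
	}
	// For server requests req.URL.Host is usually empty; req.Host holds the Host header.
	fmt.Println(req.Host, "-", req.URL.Path, "-", req.Form)
	fmt.Fprint(resp, "hello world")
}

func main() {
	http.HandleFunc("/user/hello", Say)
	// ListenAndServe only returns on failure, so report the error.
	log.Fatal(http.ListenAndServe(":8080", nil))
}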
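Request: localhost:8080/user/hello
Response: "hello world"
In the code above we handled the URL and the response ourselves. Likewise, when we visit a real site such as https://www.baidu.com, it is Baidu's server-side code that does the handling.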
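A handler like this can also be exercised without a browser or an open port. The sketch below uses net/http/httptest for that; `newMux` and `get` are hypothetical helper names, and the handler is a trimmed-down copy of the one above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux registers the /user/hello route on a private ServeMux instead of
// the package-level default one.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/user/hello", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello world")
	})
	return mux
}

// get (hypothetical helper) sends an in-memory GET request to h and returns
// the status code and body.
func get(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newMux()
	code, body := get(mux, "/user/hello")
	fmt.Println(code, body)
	// A path that was never registered falls through to the mux's 404 handler.
	code, _ = get(mux, "/user/other")
	fmt.Println(code)
}
```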
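3. Copy and run the examples
The previous section covered the HTTP knowledge relevant to building a RESTful API and the basic usage of the built-in net/http library.
But remember our real topic: reading the go-restful source code.
First, learn the basic usage from the official documentation: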
package main

import (
	"log"
	"net/http"

	"github.com/emicklei/go-restful"
)

type User struct {
	Name string
	Age  string
	ID   []int
}

type UserResource struct {
	// normally one would use DAO (data access object)
	users map[string]User
}

// WebService creates a new service that can handle REST requests for User resources.
func (u UserResource) WebService() *restful.WebService {
	ws := new(restful.WebService)
	ws.
		Path("/users").
		Consumes(restful.MIME_XML, restful.MIME_JSON).
		Produces(restful.MIME_JSON, restful.MIME_XML) // you can specify this per route as well

	ws.Route(ws.GET("/").To(u.findAllUsers).
		// docs
		Doc("get all users").
		Writes([]User{}).
		Returns(200, "OK", []User{}))

	ws.Route(ws.GET("/{user-id}").To(u.findUser).
		// docs
		Doc("get a user").
		Param(ws.PathParameter("user-id", "identifier of the user").DataType("integer").DefaultValue("1")).
		Writes(User{}). // on the response
		Returns(200, "OK", User{}).
		Returns(404, "Not Found", nil))

	return ws
}

// GET http://localhost:9990/users
func (u UserResource) findAllUsers(request *restful.Request, response *restful.Response) {
	list := []User{}
	for _, each := range u.users {
		list = append(list, each)
	}
	response.WriteEntity(list)
}

// GET http://localhost:9990/users/{user-id}
func (u UserResource) findUser(request *restful.Request, response *restful.Response) {
	id := request.PathParameter("user-id")
	usr := u.users[id]
	// A missing key yields the zero User, whose ID slice is empty.
	if len(usr.ID) == 0 {
		response.WriteErrorString(http.StatusNotFound, "User could not be found.")
	} else {
		response.WriteEntity(usr)
	}
}

func main() {
	type APIServer struct {
		Container *restful.Container
	}
	u := UserResource{map[string]User{}}
	u.users["xiewei"] = User{
		Name: "xiewei",
		Age:  "20",
		ID:   []int{1, 2, 3, 4},
	}
	apiServer := &APIServer{
		Container: restful.DefaultContainer.Add(u.WebService()),
	}
	log.Printf("start listening on localhost:9990")
	log.Fatal(http.ListenAndServe(":9990", apiServer.Container))
}
Request: localhost:9990/users
HTTP/1.1 200 OK
Content-Type: application/json
Date: Sun, 20 May 2018 04:21:29 GMT
Content-Length: 92
[
{
"Name": "xiewei",
"Age": "20",
"ID": [
1,
2,
3,
4
]
}
]
Request: localhost:9990/users/xiewei
HTTP/1.1 200 OK
Content-Type: application/json
Date: Sun, 20 May 2018 04:21:29 GMT
{
"Name": "xiewei",
"Age": "20",
"ID": [
1,
2,
3,
4
]
}
Request: localhost:9990/users/xiewei2
HTTP/1.1 404 Not Found
Date: Sun, 20 May 2018 04:22:59 GMT
Content-Length: 24
Content-Type: text/plain; charset=utf-8
User could not be found.
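With this simple example we can now more or less use go-restful.
In the end it still comes down to working with http.Request and http.Response. The core of the example is the two functions findAllUsers and findUser: they determine the response body, the status code, and so on. Everything else defines routes, declares the MIME types the service consumes and produces, and starts a web service on the given port.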
4. Trace the flow
1. Start an HTTP service and listen on the given port
func ListenAndServe(addr string, handler Handler) error {
	server := &Server{Addr: addr, Handler: handler}
	return server.ListenAndServe()
}
As the signature shows, the entry point is the Handler interface:
type Handler interface {
	ServeHTTP(ResponseWriter, *Request)
}
The HTTP server is given the Container as its handler:
log.Fatal(http.ListenAndServe(":9990", apiServer.Container))
A Container holds multiple WebServices:
type Container struct {
	webServicesLock        sync.RWMutex
	webServices            []*WebService
	ServeMux               *http.ServeMux
	isRegisteredOnRoot     bool
	containerFilters       []FilterFunction
	doNotRecover           bool // default is true
	recoverHandleFunc      RecoverHandleFunction
	serviceErrorHandleFunc ServiceErrorHandleFunction
	router                 RouteSelector // default is a CurlyRouter (RouterJSR311 is a slower alternative)
	contentEncodingEnabled bool          // default is false
}
Container implements the Handler interface:
func (c *Container) ServeHTTP(httpwriter http.ResponseWriter, httpRequest *http.Request) {
	c.ServeMux.ServeHTTP(httpwriter, httpRequest)
}
A WebService holds multiple Routes:
type WebService struct {
	rootPath           string
	pathExpr           *pathExpression // cached compilation of rootPath as RegExp
	routes             []Route
	produces           []string
	consumes           []string
	pathParameters     []*Parameter
	filters            []FilterFunction
	documentation      string
	apiVersion         string
	typeNameHandleFunc TypeNameHandleFunction
	dynamicRoutes      bool
	// protects 'routes' if dynamic routes are enabled
	routesLock sync.RWMutex
}
A Route holds what is needed to handle the HTTP side of a request: the method, the HTTP Request, the HTTP Response, and so on:
type Route struct {
	Method   string
	Produces []string
	Consumes []string
	Path     string // webservice root path + described path
	Function RouteFunction
	Filters  []FilterFunction
	If       []RouteSelectionConditionFunction
	// cached values for dispatching
	relativePath string
	pathParts    []string
	pathExpr     *pathExpression // cached compilation of relativePath as RegExp
	// documentation
	Doc                     string
	Notes                   string
	Operation               string
	ParameterDocs           []*Parameter
	ResponseErrors          map[int]ResponseError
	ReadSample, WriteSample interface{} // structs that model an example request or response payload
	// Extra information used to store custom information about the route.
	Metadata map[string]interface{}
	// marks a route as deprecated
	Deprecated bool
}
The function that does the actual handling is a RouteFunction:
type RouteFunction func(*Request, *Response)
Now look back at how our own code fits this flow:
- Start the HTTP service and listen on a port: this takes an address and a Handler
log.Fatal(http.ListenAndServe(":9990", apiServer.Container))
- Define a Container; the Container type implements the Handler interface
apiServer := &APIServer{
	Container: restful.DefaultContainer.Add(u.WebService()),
}
- Inside the Container, define one or more WebServices, each holding Routes with their RouteFunction handlers
func (u UserResource) WebService() *restful.WebService {
	ws := new(restful.WebService)
	ws.
		Path("/users").
		Consumes(restful.MIME_XML, restful.MIME_JSON).
		Produces(restful.MIME_JSON, restful.MIME_XML) // you can specify this per route as well
	ws.Route(ws.GET("/").To(u.findAllUsers).
		// docs
		Doc("get all users").
		Writes([]User{}).
		Returns(200, "OK", []User{}))
	ws.Route(ws.GET("/{user-id}").To(u.findUser).
		// docs
		Doc("get a user").
		Param(ws.PathParameter("user-id", "identifier of the user").DataType("integer").DefaultValue("1")).
		Writes(User{}). // on the response
		Returns(200, "OK", User{}).
		Returns(404, "Not Found", nil))
	return ws
}
Good, we now have the overall flow sorted out.
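5. Borrow and apply
- How real-world concepts are abstracted into entities such as Route, WebService, and Container
- How methods are defined on Route, WebService, and Container
- How the project is organized
- How methods are reused
The standard library defines many interfaces. By implementing those interfaces and steadily extending the standard library, you may well end up inventing a popular new wheel.
go-restful is exactly such an extension of the built-in net/http library.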
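Summary:
To read source code, first be clear about the problem you want to solve. Next, learn to use the project through its demo or several of its examples. Then trace the flow through the source. Finally, move from copying to borrowing the ideas in your own code.
Goodbye for now. I hope this gives you some ideas. I'm Xie Wei.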