Troubleshooting a + sign turning into a space when reading URL parameter values in Gin

Problem description:

A few days ago I ran into a problem while using the Gin framework: when a URL query parameter value contained a + sign, the backend received a space in its place.

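Investigation showed that if the client sends a literal + in a URL parameter without encoding it, Go's net/url package replaces it with a space when decoding the query. Gin only parses the value back as a + sign when the client sends it URL-encoded as %2b.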

Source code

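Take a look at net/url/url.go. When it processes parameters, it assumes that what it receives has already been URL-encoded.
Right below the QueryUnescape function there is PathUnescape. Its doc comment explains that PathUnescape is identical to QueryUnescape except that it does not unescape '+' to ' ' (space).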

// QueryUnescape does the inverse transformation of QueryEscape,
// converting each 3-byte encoded substring of the form "%AB" into the
// hex-decoded byte 0xAB.
// It returns an error if any % is not followed by two hexadecimal
// digits.
func QueryUnescape(s string) (string, error) {
	return unescape(s, encodeQueryComponent)
}

// PathUnescape does the inverse transformation of PathEscape,
// converting each 3-byte encoded substring of the form "%AB" into the
// hex-decoded byte 0xAB. It returns an error if any % is not followed
// by two hexadecimal digits.
//
// PathUnescape is identical to QueryUnescape except that it does not
// unescape '+' to ' ' (space).
func PathUnescape(s string) (string, error) {
	return unescape(s, encodePathSegment)
}
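To see the difference between the two exported functions side by side, here is a small sketch. The helper decodeBoth is hypothetical; url.QueryUnescape and url.PathUnescape are the standard library functions quoted above.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeBoth is a hypothetical helper that decodes the same input with
// QueryUnescape and with PathUnescape so the results can be compared.
func decodeBoth(s string) (query, path string, err error) {
	if query, err = url.QueryUnescape(s); err != nil {
		return "", "", err
	}
	if path, err = url.PathUnescape(s); err != nil {
		return "", "", err
	}
	return query, path, nil
}

func main() {
	q, p, err := decodeBoth("a+b%2Bc")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("QueryUnescape: %q\n", q) // "a b+c"
	fmt.Printf("PathUnescape:  %q\n", p) // "a+b+c"
}
```

Both functions decode %2B to a plus sign; only QueryUnescape also turns the literal + into a space.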

The unescape function deserves a close look:

// unescape unescapes a string; the mode specifies
// which section of the URL string is being unescaped.
func unescape(s string, mode encoding) (string, error) {
	// Count %, check that they're well-formed.
	n := 0
	hasPlus := false
	for i := 0; i < len(s); {
		switch s[i] {
		case '%':
			n++
			if i+2 >= len(s) || !ishex(s[i+1]) || !ishex(s[i+2]) {
				s = s[i:]
				if len(s) > 3 {
					s = s[:3]
				}
				return "", EscapeError(s)
			}
			// Per https://tools.ietf.org/html/rfc3986#page-21
			// in the host component %-encoding can only be used
			// for non-ASCII bytes.
			// But https://tools.ietf.org/html/rfc6874#section-2
			// introduces %25 being allowed to escape a percent sign
			// in IPv6 scoped-address literals. Yay.
			if mode == encodeHost && unhex(s[i+1]) < 8 && s[i:i+3] != "%25" {
				return "", EscapeError(s[i : i+3])
			}
			if mode == encodeZone {
				// RFC 6874 says basically "anything goes" for zone identifiers
				// and that even non-ASCII can be redundantly escaped,
				// but it seems prudent to restrict %-escaped bytes here to those
				// that are valid host name bytes in their unescaped form.
				// That is, you can use escaping in the zone identifier but not
				// to introduce bytes you couldn't just write directly.
				// But Windows puts spaces here! Yay.
				v := unhex(s[i+1])<<4 | unhex(s[i+2])
				if s[i:i+3] != "%25" && v != ' ' && shouldEscape(v, encodeHost) {
					return "", EscapeError(s[i : i+3])
				}
			}
			i += 3
		case '+':
			hasPlus = mode == encodeQueryComponent
			i++
		default:
			if (mode == encodeHost || mode == encodeZone) && s[i] < 0x80 && shouldEscape(s[i], mode) {
				return "", InvalidHostError(s[i : i+1])
			}
			i++
		}
	}

	if n == 0 && !hasPlus {
		return s, nil
	}

	var t strings.Builder
	t.Grow(len(s) - 2*n)
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case '%':
			t.WriteByte(unhex(s[i+1])<<4 | unhex(s[i+2]))
			i += 2
		case '+':
			if mode == encodeQueryComponent {
				t.WriteByte(' ')
			} else {
				t.WriteByte('+')
			}
		default:
			t.WriteByte(s[i])
		}
	}
	return t.String(), nil
}

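During the first pass over the string, whenever a + is encountered, hasPlus is set to true if the mode is encodeQueryComponent (and to false otherwise). It serves as a flag that the string contains a plus sign that must be converted. This case exists purely to detect the + sign: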

		case '+':
			hasPlus = mode == encodeQueryComponent
			i++

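If there are no percent-escapes and no plus sign was flagged, the function returns the input unchanged at this point (url.go, line 249):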

	if n == 0 && !hasPlus {
		return s, nil
	}

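If a plus sign was flagged, execution continues. In the second pass, which decodes the percent-escapes into bytes, every + encountered (line 259) is checked against the mode, and in encodeQueryComponent mode it is replaced with a space: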

	var t strings.Builder
	t.Grow(len(s) - 2*n)
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case '%':
			t.WriteByte(unhex(s[i+1])<<4 | unhex(s[i+2]))
			i += 2
		case '+':
			if mode == encodeQueryComponent {
				t.WriteByte(' ') // the plus sign is replaced with a space here
			} else {
				t.WriteByte('+')
			}
		default:
			t.WriteByte(s[i])
		}
	}
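The decoding logic can be condensed into a standalone model that handles only the query-component case. This is a simplified sketch for illustration, not the standard library implementation: the names simpleQueryUnescape, isHex, and fromHex are made up, validation and decoding are merged into a single pass, and the host and zone modes are left out.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// isHex reports whether c is a hexadecimal digit.
func isHex(c byte) bool {
	return '0' <= c && c <= '9' || 'a' <= c && c <= 'f' || 'A' <= c && c <= 'F'
}

// fromHex converts a hexadecimal digit to its numeric value.
func fromHex(c byte) byte {
	switch {
	case '0' <= c && c <= '9':
		return c - '0'
	case 'a' <= c && c <= 'f':
		return c - 'a' + 10
	default:
		return c - 'A' + 10
	}
}

// simpleQueryUnescape is a hypothetical, simplified model of query decoding:
// "%AB" becomes the byte 0xAB and a literal '+' becomes a space.
func simpleQueryUnescape(s string) (string, error) {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case '%':
			// A percent sign must be followed by two hex digits.
			if i+2 >= len(s) || !isHex(s[i+1]) || !isHex(s[i+2]) {
				return "", errors.New("invalid percent escape")
			}
			b.WriteByte(fromHex(s[i+1])<<4 | fromHex(s[i+2]))
			i += 2
		case '+':
			b.WriteByte(' ') // query mode: plus means space
		default:
			b.WriteByte(s[i])
		}
	}
	return b.String(), nil
}

func main() {
	out, err := simpleQueryUnescape("1+2%2B3")
	fmt.Printf("%q %v\n", out, err) // "1 2+3" <nil>
}
```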

Fix:

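On the backend, one option is to replace every + in the URL query with %2b before it is parsed, since a correctly URL-encoded plus sign is %2b. Note that this also turns any + that a client meant as an encoded space (as HTML forms produce) into a literal plus sign, so it only suits interfaces whose clients never encode spaces that way.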

c.Request.URL.RawQuery = strings.ReplaceAll(c.Request.URL.RawQuery, "+", "%2b")
