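Background
On iOS we ran into failing network requests in production. The cause turned out to be a customer userId that contained the character "|". The request requires this parameter, and the iOS client did not URL-encode the character, so the request failed.
I then tested and investigated the Android client and found it was not affected: the networking stack, Retrofit + OkHttp, encodes the parameter automatically.
Symptom
The userId passed in is: "哈哈|ABC"
The userId after encoding is: %E5%93%88%E5%93%88%7CABC
https://a.b.c/ef/v1/werqwe/csdfwef/xcvweg?userId=%E5%93%88%E5%93%88%7CABC&origin=android-SDK
How it works
First, URL encoding is also called percent-encoding. For how the encoding itself works, see the article "percent-encode 百分号编码".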
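As a quick check of the encoded value shown above, here is a minimal sketch that uses only the JDK's java.net.URLEncoder, not OkHttp. The class and method names (PercentEncodeDemo, encode) are invented for this illustration. URLEncoder implements form encoding, so it differs from OkHttp in some details (for example, it turns a space into "+"), but for this particular input the result should match.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class PercentEncodeDemo {
    /** Percent-encodes a value as UTF-8 using the JDK's form encoder. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "\u54C8\u54C8|ABC" is the userId from the article, written with
        // Unicode escapes so the source file stays ASCII-only.
        String userId = "\u54C8\u54C8|ABC";
        System.out.println(encode(userId)); // prints %E5%93%88%E5%93%88%7CABC
    }
}
```

Each of the two Chinese characters becomes three UTF-8 bytes (E5 93 88), and "|" becomes the single byte 7C.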
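It is clear from the output that the parameter we passed in was URL-encoded automatically inside the framework.
The rest of this post tracks down where in the networking stack that encoding happens.
Starting from Retrofit
Retrofit request sample code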
@GET("users/{user}/repos")
suspend fun listReposKt(
    @Path("user") user: String,
    @Query("uid") uid: String
): List<GithubUserReposVO>
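We only look at the GET request here. It uses three annotations: @GET, @Path, and @Query.
@GET marks this as a GET request.
@Path fills in part of the request URL path.
@Query sets a query parameter of the request.
What we observed is that the parameter gets URL-encoded, so let's look at the Query annotation: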
@Documented
@Target(PARAMETER)
@Retention(RUNTIME)
public @interface Query {
  /** The query parameter name. */
  String value();

  /**
   * Specifies whether the parameter {@linkplain #value() name} and value are already URL encoded.
   */
  boolean encoded() default false;
}
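encoded() defaults to false, meaning the name and value are assumed not to be URL-encoded yet, so the framework will URL-encode them later.
If it is set to true, the developer is declaring that the value has already been URL-encoded by custom code, and the framework will not encode it again.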
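To make the effect of the flag concrete, here is a hedged sketch that mimics it with JDK classes only. addQuery is a hypothetical helper, not Retrofit's API, and java.net.URLEncoder stands in for the framework's encoder. It also shows the pitfall in the other direction: a value that was encoded by hand but is still declared with encoded = false gets encoded a second time, so "%7C" turns into "%257C".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class EncodedFlagSketch {
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Hypothetical stand-in for the framework: encode the value unless the
     * caller declares that it is already encoded.
     */
    static String addQuery(String name, String value, boolean encoded) {
        return name + "=" + (encoded ? value : encode(value));
    }

    public static void main(String[] args) {
        String raw = "a|b";
        String preEncoded = encode(raw); // "a%7Cb"

        System.out.println(addQuery("userId", raw, false));        // userId=a%7Cb
        System.out.println(addQuery("userId", preEncoded, true));  // userId=a%7Cb
        System.out.println(addQuery("userId", preEncoded, false)); // userId=a%257Cb
    }
}
```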
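In fact, every annotation that identifies a request parameter has an encoded() method, including Query, Field, QueryMap, FieldMap, and so on.
GET request
My first guess was that Retrofit handles parameter annotations according to the request type, that is, it only processes the Query annotation once it has seen the GET annotation. Following this guess, let's look at how the GET annotation is handled. (The guess turned out to be wrong.)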
// RequestFactory.java
private void parseMethodAnnotation(Annotation annotation) {
  if (annotation instanceof DELETE) {
    parseHttpMethodAndPath("DELETE", ((DELETE) annotation).value(), false);
  } else if (annotation instanceof GET) {
    parseHttpMethodAndPath("GET", ((GET) annotation).value(), false);
  } else if (annotation instanceof HEAD) {
    parseHttpMethodAndPath("HEAD", ((HEAD) annotation).value(), false);
  } else if (annotation instanceof PATCH) {
    parseHttpMethodAndPath("PATCH", ((PATCH) annotation).value(), true);
  } else if (annotation instanceof POST) {
    parseHttpMethodAndPath("POST", ((POST) annotation).value(), true);
  } else if (annotation instanceof PUT) {
    parseHttpMethodAndPath("PUT", ((PUT) annotation).value(), true);
  } else if (annotation instanceof OPTIONS) {
    parseHttpMethodAndPath("OPTIONS", ((OPTIONS) annotation).value(), false);
  } else if (annotation instanceof HTTP) {
    HTTP http = (HTTP) annotation;
    parseHttpMethodAndPath(http.method(), http.path(), http.hasBody());
  } else if (annotation instanceof retrofit2.http.Headers) {
    String[] headersToParse = ((retrofit2.http.Headers) annotation).value();
    if (headersToParse.length == 0) {
      throw methodError(method, "@Headers annotation is empty.");
    }
    headers = parseHeaders(headersToParse);
  } else if (annotation instanceof Multipart) {
    if (isFormEncoded) {
      throw methodError(method, "Only one encoding annotation is allowed.");
    }
    isMultipart = true;
  } else if (annotation instanceof FormUrlEncoded) {
    if (isMultipart) {
      throw methodError(method, "Only one encoding annotation is allowed.");
    }
    isFormEncoded = true;
  }
}
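Look at the GET branch: the annotation's value is passed to parseHttpMethodAndPath. The value of a GET annotation is normally a relative path. With Retrofit you supply a baseUrl when constructing the client, and each request then appends its own relative path.
Continuing: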
// RequestFactory.java
private void parseHttpMethodAndPath(String httpMethod, String value, boolean hasBody) {
  if (this.httpMethod != null) {
    throw methodError(method, "Only one HTTP method is allowed. Found: %s and %s.",
        this.httpMethod, httpMethod);
  }
  this.httpMethod = httpMethod;
  this.hasBody = hasBody;

  if (value.isEmpty()) {
    return;
  }

  // Get the relative URL path and existing query string, if present.
  int question = value.indexOf('?');
  if (question != -1 && question < value.length() - 1) {
    // Ensure the query string does not have any named parameters.
    String queryParams = value.substring(question + 1);
    Matcher queryParamMatcher = PARAM_URL_REGEX.matcher(queryParams);
    if (queryParamMatcher.find()) {
      throw methodError(method, "URL query string \"%s\" must not have replace block. "
          + "For dynamic query parameters use @Query.", queryParams);
    }
  }

  this.relativeUrl = value;
  this.relativeUrlParamNames = parsePathParameters(value);
}
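This stores the relative path in relativeUrl. If the value contains a "?", the query string after it is checked to make sure it contains no {name} replace blocks, because dynamic query parameters must go through @Query instead. Finally, the result of parsePathParameters(value), judging by its name the placeholders found in the path, is stored in relativeUrlParamNames.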
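The query-string check can be reproduced in isolation. The sketch below is only an illustration under assumptions: the pattern is written to match {name} replace blocks as the error message describes them and may differ from Retrofit's actual PARAM_URL_REGEX, and the class and method names are invented.

```java
import java.util.regex.Pattern;

class ReplaceBlockCheck {
    // Assumed pattern for a {name} placeholder; Retrofit's real constant may differ.
    static final Pattern PLACEHOLDER =
            Pattern.compile("\\{([a-zA-Z][a-zA-Z0-9_-]*)\\}");

    /** Returns true if the part after '?' contains a {placeholder}. */
    static boolean queryHasReplaceBlock(String relativeUrl) {
        int question = relativeUrl.indexOf('?');
        if (question == -1 || question >= relativeUrl.length() - 1) {
            return false; // no query string, or nothing after '?'
        }
        String queryParams = relativeUrl.substring(question + 1);
        return PLACEHOLDER.matcher(queryParams).find();
    }

    public static void main(String[] args) {
        // A placeholder in the path is fine.
        System.out.println(queryHasReplaceBlock("users/{user}/repos"));           // false
        // A static query string is fine too.
        System.out.println(queryHasReplaceBlock("users/{user}/repos?sort=desc")); // false
        // A placeholder inside the query string is what gets rejected.
        System.out.println(queryHasReplaceBlock("users/repos?uid={uid}"));        // true
    }
}
```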
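There is no trace of Query here, so the guess above (that Retrofit handles parameter annotations according to the request type and only processes Query once it has seen GET) is wrong.
When you read source code with a specific question in mind, it is hard to analyze the overall architecture. You can only guess and jump through the code straight to the part you care about. That way you do not fully digest the source from the perspective of overall architecture and design, but it is a fast way to find the mechanism behind a problem, it saves time, and the specific problem sticks in memory better.
So let's jump to where the Query annotation is handled: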
// RequestFactory.java
@Nullable
private ParameterHandler<?> parseParameterAnnotation(
    int p, Type type, Annotation[] annotations, Annotation annotation) {
  if (annotation instanceof Url) {
    //...
  } else if (annotation instanceof Path) {
    //...
  } else if (annotation instanceof Query) {
    validateResolvableType(p, type);
    Query query = (Query) annotation;
    String name = query.value();
    boolean encoded = query.encoded();

    Class<?> rawParameterType = Utils.getRawType(type);
    gotQuery = true;
    if (Iterable.class.isAssignableFrom(rawParameterType)) {
      if (!(type instanceof ParameterizedType)) {
        throw parameterError(method, p, rawParameterType.getSimpleName()
            + " must include generic type (e.g., "
            + rawParameterType.getSimpleName()
            + "<String>)");
      }
      ParameterizedType parameterizedType = (ParameterizedType) type;
      Type iterableType = Utils.getParameterUpperBound(0, parameterizedType);
      Converter<?, String> converter =
          retrofit.stringConverter(iterableType, annotations);
      return new ParameterHandler.Query<>(name, converter, encoded).iterable();
    } else if (rawParameterType.isArray()) {
      Class<?> arrayComponentType = boxIfPrimitive(rawParameterType.getComponentType());
      Converter<?, String> converter =
          retrofit.stringConverter(arrayComponentType, annotations);
      return new ParameterHandler.Query<>(name, converter, encoded).array();
    } else {
      Converter<?, String> converter =
          retrofit.stringConverter(type, annotations);
      return new ParameterHandler.Query<>(name, converter, encoded);
    }
  } else if (annotation instanceof QueryName) {
    //...
  } else if (...) {
    //...
  }
  //...
}
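The Query handling lives in parseParameterAnnotation. It is a long function, more than 400 lines, so we only look at the Query branch.
The encoded flag is read from the annotation (boolean encoded = query.encoded();) and then passed to the constructor of ParameterHandler.Query.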
// ParameterHandler.java
static final class Query<T> extends ParameterHandler<T> {
  private final String name;
  private final Converter<T, String> valueConverter;
  private final boolean encoded;

  Query(String name, Converter<T, String> valueConverter, boolean encoded) {
    this.name = checkNotNull(name, "name == null");
    this.valueConverter = valueConverter;
    this.encoded = encoded;
  }

  @Override void apply(RequestBuilder builder, @Nullable T value) throws IOException {
    if (value == null) return; // Skip null values.

    String queryValue = valueConverter.convert(value);
    if (queryValue == null) return; // Skip converted but null values

    builder.addQueryParam(name, queryValue, encoded);
  }
}
The encoded field is used in the apply function, where it is passed on to builder.addQueryParam. builder is a RequestBuilder, the class that builds the OkHttp Request object.
Here is its addQueryParam function:
// RequestBuilder.java
void addQueryParam(String name, @Nullable String value, boolean encoded) {
  if (relativeUrl != null) {
    // Do a one-time combination of the built relative URL and the base URL.
    urlBuilder = baseUrl.newBuilder(relativeUrl);
    if (urlBuilder == null) {
      throw new IllegalArgumentException(
          "Malformed URL. Base: " + baseUrl + ", Relative: " + relativeUrl);
    }
    relativeUrl = null;
  }

  if (encoded) {
    //noinspection ConstantConditions Checked to be non-null by above 'if' block.
    urlBuilder.addEncodedQueryParameter(name, value);
  } else {
    //noinspection ConstantConditions Checked to be non-null by above 'if' block.
    urlBuilder.addQueryParameter(name, value);
  }
}
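Depending on encoded, it calls either addEncodedQueryParameter or addQueryParameter on urlBuilder.
urlBuilder is an HttpUrl.Builder, the class used to construct a URL.
HttpUrl comes from OkHttp, so from here on we need to read OkHttp.
Into OkHttp
Here are the definitions of the two methods: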
// HttpUrl.Builder

/** Encodes the query parameter using UTF-8 and adds it to this URL's query string. */
fun addQueryParameter(name: String, value: String?) = apply {
  if (encodedQueryNamesAndValues == null) encodedQueryNamesAndValues = mutableListOf()
  encodedQueryNamesAndValues!!.add(name.canonicalize(
      encodeSet = QUERY_COMPONENT_ENCODE_SET,
      plusIsSpace = true
  ))
  encodedQueryNamesAndValues!!.add(value?.canonicalize(
      encodeSet = QUERY_COMPONENT_ENCODE_SET,
      plusIsSpace = true
  ))
}

/** Adds the pre-encoded query parameter to this URL's query string. */
fun addEncodedQueryParameter(encodedName: String, encodedValue: String?) = apply {
  if (encodedQueryNamesAndValues == null) encodedQueryNamesAndValues = mutableListOf()
  encodedQueryNamesAndValues!!.add(encodedName.canonicalize(
      encodeSet = QUERY_COMPONENT_REENCODE_SET,
      alreadyEncoded = true,
      plusIsSpace = true
  ))
  encodedQueryNamesAndValues!!.add(encodedValue?.canonicalize(
      encodeSet = QUERY_COMPONENT_REENCODE_SET,
      alreadyEncoded = true,
      plusIsSpace = true
  ))
}
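The logic is simple: add the name processed by canonicalize to the encodedQueryNamesAndValues list, then add the value processed by canonicalize.
"Canonical" means normalized, so the parameter name and value are normalized here. Could that be the URL encoding? Let's keep reading.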
/**
 * Returns a substring of `input` on the range `[pos..limit)` with the following
 * transformations:
 *
 *  * Tabs, newlines, form feeds and carriage returns are skipped.
 *
 *  * In queries, ' ' is encoded to '+' and '+' is encoded to "%2B".
 *
 *  * Characters in `encodeSet` are percent-encoded.
 *
 *  * Control characters and non-ASCII characters are percent-encoded.
 *
 *  * All other characters are copied without transformation.
 *
 * @param alreadyEncoded true to leave '%' as-is; false to convert it to '%25'.
 * @param strict true to encode '%' if it is not the prefix of a valid percent encoding.
 * @param plusIsSpace true to encode '+' as "%2B" if it is not already encoded.
 * @param unicodeAllowed true to leave non-ASCII codepoint unencoded.
 * @param charset which charset to use, null equals UTF-8.
 */
internal fun String.canonicalize(
  pos: Int = 0,
  limit: Int = length,
  encodeSet: String,
  alreadyEncoded: Boolean = false,
  strict: Boolean = false,
  plusIsSpace: Boolean = false,
  unicodeAllowed: Boolean = false,
  charset: Charset? = null
): String {
  var codePoint: Int
  var i = pos
  while (i < limit) {
    codePoint = codePointAt(i)
    if (codePoint < 0x20 ||
        codePoint == 0x7f ||
        codePoint >= 0x80 && !unicodeAllowed ||
        codePoint.toChar() in encodeSet ||
        codePoint == '%'.toInt() &&
        (!alreadyEncoded || strict && !isPercentEncoded(i, limit)) ||
        codePoint == '+'.toInt() && plusIsSpace) {
      // Slow path: the character at i requires encoding!
      val out = Buffer()
      out.writeUtf8(this, pos, i)
      out.writeCanonicalized(
          input = this,
          pos = i,
          limit = limit,
          encodeSet = encodeSet,
          alreadyEncoded = alreadyEncoded,
          strict = strict,
          plusIsSpace = plusIsSpace,
          unicodeAllowed = unicodeAllowed,
          charset = charset
      )
      return out.readUtf8()
    }
    i += Character.charCount(codePoint)
  }

  // Fast path: no characters in [pos..limit) required encoding.
  return substring(pos, limit)
}
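The guess was right: what this function does is URL encoding.
We won't walk through the algorithm in detail. Note that the function takes an alreadyEncoded: Boolean parameter, which says whether the input has already been encoded.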
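The per-character decision in canonicalize can be sketched in a few lines of plain Java. This is a simplified illustration, not OkHttp's implementation. It leaves out strict, plusIsSpace, unicodeAllowed, charset, the skipping of tabs and newlines, and the fast path. ENCODE_SET is a hypothetical set chosen for the demo, not OkHttp's QUERY_COMPONENT_ENCODE_SET.

```java
import java.nio.charset.StandardCharsets;

class CanonicalizeSketch {
    // Hypothetical encode set for illustration only.
    static final String ENCODE_SET = "\"'<>#&=|";

    static String canonicalize(String input, boolean alreadyEncoded) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            int codePoint = input.codePointAt(i);
            boolean needsEncoding =
                    codePoint < 0x20                           // control characters
                    || codePoint == 0x7f                       // DEL
                    || codePoint >= 0x80                       // non-ASCII
                    || ENCODE_SET.indexOf(codePoint) != -1     // reserved characters
                    || (codePoint == '%' && !alreadyEncoded);  // literal percent sign
            if (needsEncoding) {
                // Write each UTF-8 byte of the code point as %XX.
                byte[] bytes = new String(Character.toChars(codePoint))
                        .getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append(String.format("%%%02X", b & 0xff));
                }
            } else {
                out.appendCodePoint(codePoint);
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The userId from the article, written with Unicode escapes.
        System.out.println(canonicalize("\u54C8\u54C8|ABC", false)); // %E5%93%88%E5%93%88%7CABC
        System.out.println(canonicalize("%7C", true));               // %7C
        System.out.println(canonicalize("%7C", false));              // %257C
    }
}
```

The last two calls show what alreadyEncoded controls: with true, an existing "%" is left alone; with false, it is itself encoded as "%25".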
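We saw that all the URL-encoded parameters are stored in the encodedQueryNamesAndValues list. How is that list used?
HttpUrl.Builder is used to build an HttpUrl. It joins all the query parameters together and hands the result to HttpUrl.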
// HttpUrl.Builder
fun build(): HttpUrl {
  @Suppress("UNCHECKED_CAST") // percentDecode returns either List<String?> or List<String>.
  return HttpUrl(
      scheme = scheme ?: throw IllegalStateException("scheme == null"),
      username = encodedUsername.percentDecode(),
      password = encodedPassword.percentDecode(),
      host = host ?: throw IllegalStateException("host == null"),
      port = effectivePort(),
      pathSegments = encodedPathSegments.percentDecode() as List<String>,
      queryNamesAndValues = encodedQueryNamesAndValues?.percentDecode(plusIsSpace = true),
      fragment = encodedFragment?.percentDecode(),
      url = toString()
  )
}
Look at the toString() call passed as the last argument, url:
override fun toString(): String {
  return buildString {
    if (scheme != null) {
      append(scheme)
      append("://")
    } else {
      append("//")
    }

    if (encodedUsername.isNotEmpty() || encodedPassword.isNotEmpty()) {
      append(encodedUsername)
      if (encodedPassword.isNotEmpty()) {
        append(':')
        append(encodedPassword)
      }
      append('@')
    }

    if (host != null) {
      if (':' in host!!) {
        // Host is an IPv6 address.
        append('[')
        append(host)
        append(']')
      } else {
        append(host)
      }
    }

    if (port != -1 || scheme != null) {
      val effectivePort = effectivePort()
      if (scheme == null || effectivePort != defaultPort(scheme!!)) {
        append(':')
        append(effectivePort)
      }
    }

    encodedPathSegments.toPathString(this)

    if (encodedQueryNamesAndValues != null) {
      append('?')
      encodedQueryNamesAndValues!!.toQueryString(this)
    }

    if (encodedFragment != null) {
      append('#')
      append(encodedFragment)
    }
  }
}
Look at the encodedQueryNamesAndValues branch: it appends "?" and then the parameters:
// HttpUrl.companion object

/** Returns a string for this list of query names and values. */
internal fun List<String?>.toQueryString(out: StringBuilder) {
  for (i in 0 until size step 2) {
    val name = this[i]
    val value = this[i + 1]
    if (i > 0) out.append('&')
    out.append(name)
    if (value != null) {
      out.append('=')
      out.append(value)
    }
  }
}
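The loop steps by 2 and appends each name followed by its value: the list is flat, with names at even indices and values at odd indices.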
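The same flat-list layout can be tried out in plain Java. This is a sketch that mirrors the loop shown above, with the class name invented for illustration. The sample list reuses the two parameters from the URL at the start of the article.

```java
import java.util.Arrays;
import java.util.List;

class QueryStringSketch {
    /** Joins a flat [name0, value0, name1, value1, ...] list into a query string. */
    static String toQueryString(List<String> namesAndValues) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < namesAndValues.size(); i += 2) {
            String name = namesAndValues.get(i);
            String value = namesAndValues.get(i + 1);
            if (i > 0) out.append('&');
            out.append(name);
            if (value != null) {
                // A null value yields a bare name with no '='.
                out.append('=').append(value);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> encoded = Arrays.asList(
                "userId", "%E5%93%88%E5%93%88%7CABC",
                "origin", "android-SDK");
        // prints userId=%E5%93%88%E5%93%88%7CABC&origin=android-SDK
        System.out.println(toQueryString(encoded));
    }
}
```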
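So who uses the HttpUrl that gets built?
It is used directly both inside and outside OkHttp.
Inside, it is used by the connection pool at the lower level and by the interceptors above it. Outside, it is used by third-party libraries such as Retrofit, and application code can also use it directly.
Summary
OkHttp is seriously impressive.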
