This blog (http://blog.csdn.net/livelylittlefish) posts the notes that the author (阿波) has written up from research and study. Corrections are welcome!
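Contents
0. Preface
1. Hash structures
1.1 The ngx_hash_t structure
1.2 The ngx_hash_init_t structure
1.3 The ngx_hash_key_t structure
1.4 Logical structure of the hash
2. Hash operations
2.1 The NGX_HASH_ELT_SIZE macro
2.2 Hash functions
2.3 Hash initialization
2.4 Hash lookup
3. An example
3.1 Code
3.2 How to compile
3.3 Results
3.3.1 bucket_size = 64 bytes
3.3.2 bucket_size = 256 bytes
4. Summary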
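0. Preface
This article continues the series on nginx data structures with the hash structure.
Hash implementation files: ./src/core/ngx_hash.h and ./src/core/ngx_hash.c, where . is the nginx-1.0.4 source directory (/usr/src/nginx-1.0.4 in this article).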
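1. Hash structures
The nginx hash structure is slightly more complex than its list, array and queue structures. The figure below shows the hash-related data structures, which are introduced one by one in the following sections.
1.1 The ngx_hash_t structure
The nginx hash structure is ngx_hash_t and its element structure is ngx_hash_elt_t. They are defined as follows.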
    typedef struct {                // hash element
        void             *value;    // the value stored for a key, i.e. the value of <key,value>
        u_short           len;      // length of name
        u_char            name[1];  // the data being hashed (a string in nginx), i.e. the key of <key,value>
    } ngx_hash_elt_t;

    typedef struct {                // hash table
        ngx_hash_elt_t  **buckets;  // the hash buckets (there are size of them)
        ngx_uint_t        size;     // number of buckets
    } ngx_hash_t;
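Here sizeof(ngx_hash_t) = 8 and sizeof(ngx_hash_elt_t) = 8 on the 32-bit platform used in this article. In fact, the name field of ngx_hash_elt_t holds the key of an ngx_hash_key_t structure, as can be seen in ngx_hash_init(); see the analysis below. This structure is used frequently when module configuration is parsed.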
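To see where the two 8s come from, the sketch below mirrors both structures with fixed-width fields so that the 32-bit layout can be checked on any machine. The names elt32_t, hash32_t and the two functions are hypothetical stand-ins, not nginx types, and the 4-byte pointer width is an assumption taken from the 32-bit platform used in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit mirror of ngx_hash_elt_t: pointers become uint32_t. */
typedef struct {
    uint32_t value;    /* stands in for void *value (4 bytes on 32-bit) */
    uint16_t len;      /* stands in for u_short len */
    uint8_t  name[1];  /* first byte of the key; the rest follows in memory */
} elt32_t;

/* Hypothetical 32-bit mirror of ngx_hash_t. */
typedef struct {
    uint32_t buckets;  /* stands in for ngx_hash_elt_t **buckets */
    uint32_t size;     /* stands in for ngx_uint_t size */
} hash32_t;

/* Bytes of trailing padding the compiler adds after name[1]. */
size_t elt32_tail_padding(void)
{
    return sizeof(elt32_t) - (offsetof(elt32_t, name) + sizeof(uint8_t));
}

/* The fields take 4 + 2 + 1 = 7 bytes; 4-byte alignment rounds that up to 8. */
void check_32bit_layout(void)
{
    assert(offsetof(elt32_t, name) == 6);
    assert(sizeof(elt32_t) == 8);
    assert(sizeof(hash32_t) == 8);
}
```

The one byte of tail padding is why a key of length n does not simply cost 8 + n bytes; the real per-element size is computed by the NGX_HASH_ELT_SIZE macro described in section 2.1.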
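1.2 The ngx_hash_init_t structure
The hash initialization structure, ngx_hash_init_t, bundles the data needed for initialization so that it can be passed as one argument to ngx_hash_init() or ngx_hash_wildcard_init(). These two functions are used mainly by the http modules, for example in ngx_http_server_names() (optimizing http server names), ngx_http_merge_types() (merging http types) and ngx_http_fastcgi_merge_loc_conf() (merging the FastCGI location configuration), where ngx_hash_init_t appears as a parameter or a local variable. These will be covered in later articles.
The ngx_hash_init_t structure is shown below; sizeof(ngx_hash_init_t) = 28.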
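    typedef struct {                   // hash initialization structure
        ngx_hash_t       *hash;        // the hash structure to initialize
        ngx_hash_key_pt   key;         // pointer to the hash function
        ngx_uint_t        max_size;    // maximum number of buckets
        ngx_uint_t        bucket_size; // space available to each bucket
        char             *name;        // name of this hash (used only in error logs)
        ngx_pool_t       *pool;        // memory pool the hash is allocated from
        ngx_pool_t       *temp_pool;   // memory pool for temporary data
    } ngx_hash_init_t;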
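1.3 The ngx_hash_key_t structure
This structure holds the data to be hashed, that is, a <key,value> pair. In practice several pairs are stored in an array of ngx_hash_key_t structures, which is passed to ngx_hash_init() or ngx_hash_wildcard_init(). It is defined as follows.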
    typedef struct {              // hash key
        ngx_str_t    key;         // the key, an nginx string
        ngx_uint_t   key_hash;    // hash value computed from key (by a hash function such as ngx_hash_key_lc())
        void        *value;       // the value for this key; together they form the pair <key,value>
    } ngx_hash_key_t;

    typedef struct {              // string
        size_t       len;         // string length
        u_char      *data;        // string content
    } ngx_str_t;
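Here sizeof(ngx_hash_key_t) = 16. In practice the value pointer may point into static storage (a global array or a string constant, for example) or into the heap (a dynamically allocated area that holds the value). See the example later in this article.
The ngx_table_elt_t and ngx_hash_keys_arrays_t structures contribute little to the hash structure itself. They are designed mainly for module configuration, referer validation and the like, and are used by the http core, map, referer and SSI filter modules. They are not discussed here and will be introduced in later articles.
1.4 Logical structure of the hash
ngx_hash_init_t refers to ngx_pool_t, so the logical diagram of the related structures below builds on the article "nginx-1.0.4 source analysis: memory pool structure ngx_pool_t and memory management". Note: the diagram is drawn in UML style.
2. Hash operations
2.1 The NGX_HASH_ELT_SIZE macro
The NGX_HASH_ELT_SIZE macro computes the size of an ngx_hash_elt_t element as described above. It is defined as follows.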
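    // name is a pointer to an ngx_hash_key_t (the macro reads name->key.len);
    // the result is aligned to sizeof(void *), i.e. to 4 bytes on a 32-bit platform
    #define NGX_HASH_ELT_SIZE(name)                                           \
        (sizeof(void *) + ngx_align((name)->key.len + 2, sizeof(void *)))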
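On a 32-bit platform sizeof(void *) = 4. (name)->key.len is the length of the content that will be stored in the name array of ngx_hash_elt_t, and the "+2" accounts for the len field (of type u_short) of that structure.
Therefore NGX_HASH_ELT_SIZE(name) = 4 + ngx_align((name)->key.len + 2, 4), where the second term is (name)->key.len + 2 rounded up to a multiple of 4 bytes.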
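The formula can be tried out without the nginx headers. The following is a minimal sketch, not nginx code: align_up and hash_elt_size are hypothetical names, and the pointer size is passed in as a parameter (an assumption of this sketch) so that the 32-bit figures can be reproduced on a 64-bit machine as well.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of a; a must be a power of two (as with ngx_align). */
size_t align_up(size_t n, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (n + (a - 1)) & ~(a - 1);
}

/*
 * Size of one hash element for a key of key_len bytes.
 * ptr_size plays the role of sizeof(void *); pass 4 to reproduce
 * the 32-bit figures used in this article.
 */
size_t hash_elt_size(size_t key_len, size_t ptr_size)
{
    /* value pointer + (key bytes + 2-byte len field, padded to pointer size) */
    return ptr_size + align_up(key_len + 2, ptr_size);
}
```

For example, hash_elt_size(13, 4) gives 20 for "www.baidu.com" and hash_elt_size(15, 4) gives 24 for "www.sina.com.cn", the same figures that appear in the table of section 3.3.2.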
2.2 Hash functions
nginx-1.0.4 provides the following hash functions.
    #define ngx_hash(key, c)   ((ngx_uint_t) key * 31 + c)   // hash macro

    ngx_uint_t ngx_hash_key(u_char *data, size_t len);
    ngx_uint_t ngx_hash_key_lc(u_char *data, size_t len);    // lc = lower case: the string is lowercased before hashing
    ngx_uint_t ngx_hash_strlow(u_char *dst, u_char *src, size_t n);
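The hash functions are all simple. All three functions above use the ngx_hash macro, which yields an integer of type ngx_uint_t. Only the first function is described here; it is defined as follows.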
    ngx_uint_t
    ngx_hash_key(u_char *data, size_t len)
    {
        ngx_uint_t  i, key;

        key = 0;

        for (i = 0; i < len; i++) {
            key = ngx_hash(key, data[i]);
        }

        return key;
    }
The computation performed by ngx_hash_key can therefore be written as the following recurrence.

    Key[0] = data[0]
    Key[1] = data[0]*31 + data[1]
    Key[2] = (data[0]*31 + data[1])*31 + data[2]
    ...
    Key[len-1] = (...((data[0]*31 + data[1])*31 + data[2])*31 + ... + data[len-2])*31 + data[len-1]

Key[len-1] is the hash value of the data argument.
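The recurrence is easy to check in isolation. The sketch below is a stand-alone re-implementation, not the nginx functions themselves: hash_step, hash_key32 and hash_key_lc32 are hypothetical names, and the result is fixed at 32 bits (an assumption made here so that the values match the 32-bit run shown in section 3.3 regardless of the machine).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of the recurrence: key * 31 + c, wrapping modulo 2^32. */
uint32_t hash_step(uint32_t key, unsigned char c)
{
    return key * 31u + c;
}

/* Same loop as ngx_hash_key, with the result fixed at 32 bits. */
uint32_t hash_key32(const char *data, size_t len)
{
    uint32_t key = 0;
    size_t   i;

    assert(data != NULL);
    for (i = 0; i < len; i++) {
        key = hash_step(key, (unsigned char) data[i]);
    }
    return key;
}

/* Lowercasing variant in the spirit of ngx_hash_key_lc (ASCII letters only). */
uint32_t hash_key_lc32(const char *data, size_t len)
{
    uint32_t key = 0;
    size_t   i;

    assert(data != NULL);
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) data[i];
        if (c >= 'A' && c <= 'Z') {
            c = (unsigned char) (c | 0x20);   /* fold to lower case */
        }
        key = hash_step(key, c);
    }
    return key;
}
```

With these definitions hash_key32("www.baidu.com", 13) evaluates to 270263191, which is the key_hash printed for that URL in the run output of section 3.3, and the lowercasing variant returns the same value for "WWW.Baidu.COM".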
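2.3 Hash initialization
Hash initialization is done by ngx_hash_init(). Its names parameter is an array of ngx_hash_key_t structures, that is, an array of <key,value> pairs, and nelts is the number of elements in that array. The array must therefore be ready before the function is called; one way to build it is with nginx's ngx_array_t, as in the example later in this article.
The function takes the <key,value> pairs stored in names and distributes them by hashing into one or more hash buckets (buckets in the code); the hash function used is typically ngx_hash_key_lc. Each bucket holds a pointer to an ngx_hash_elt_t (a hash element pointer), which points to a mostly contiguous data area. That area stores the hashed pairs <key',value'>, i.e. the <name,value> fields of ngx_hash_elt_t, and each area may hold one or more such pairs.
A few questions need to be answered here.
Question 1: Why "mostly" contiguous?
Answer: NGX_HASH_ELT_SIZE pads the total length of a hash element to a multiple of sizeof(void *). When the key of a <key,value> pair in names is copied into the name[1] array of ngx_hash_elt_t, the space reserved for that element is therefore not always used up completely, so the data area is only mostly contiguous. See also the structure diagram later in this section and the example later in this article.
Question 2: Where are these mostly contiguous data areas allocated from?
Answer: From the memory pool that the pool field of the function's first argument (ngx_hash_init_t) points to.
Question 3: How does <key',value'> differ from <key,value>?
Answer: key holds only a pointer, whereas key' is a copy of key stored in name[1]. value and value' are both pointers. As noted in section 1.3, the value pointer may point into static storage (a global array or a string constant, for example) or into the heap (a dynamically allocated area that holds the value). See the example later in this article.
Question 4: How do we know which bucket a <key,value> pair goes into?
Answer: From the statement key = names[n].key_hash % size; in the code. The result key is the index (from 0 to size-1) of the bucket the pair is placed in.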
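The answer to question 4 can be checked against the run output in section 3.3.1, where size = 3. The sketch below is illustrative only: bucket_index and sample_hashes are hypothetical names, and the hash values are copied from that output. Negative numbers there are unsigned 32-bit values printed with a signed format, so they are cast back to uint32_t here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bucket index for one key, as in: key = names[n].key_hash % size; */
size_t bucket_index(uint32_t key_hash, size_t size)
{
    assert(size > 0);
    return key_hash % size;
}

/* key_hash values as printed in the 32-bit run of section 3.3. */
const uint32_t sample_hashes[7] = {
    270263191u,              /* www.baidu.com   */
    1528635686u,             /* www.sina.com.cn */
    (uint32_t) -702889725,   /* www.google.com  */
    203430122u,              /* www.qq.com      */
    (uint32_t) -640386838,   /* www.163.com     */
    1313636595u,             /* www.sohu.com    */
    1884209457u              /* www.abo321.org  */
};
```

With size = 3 the seven indices come out as 1, 2, 1, 2, 0, 0, 0, which is the bucket assignment shown in the dump of section 3.3.1; with size = 1 every key lands in bucket 0.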
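The code of the function follows. For the tricky parts, see the comments the author has added after //, and the hash structure diagram in this section.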
    // nelts is the (actual) number of elements in the names array
    ngx_int_t
    ngx_hash_init(ngx_hash_init_t *hinit, ngx_hash_key_t *names, ngx_uint_t nelts)
    {
        u_char          *elts;
        size_t           len;
        u_short         *test;
        ngx_uint_t       i, n, key, size, start, bucket_size;
        ngx_hash_elt_t  *elt, **buckets;

        // check every element of names: is one bucket large enough to hold it?
        for (n = 0; n < nelts; n++) {
            if (hinit->bucket_size < NGX_HASH_ELT_SIZE(&names[n]) + sizeof(void *))
            {
                // a bucket cannot hold even this single element, so give up
                ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                              "could not build the %s, you should "
                              "increase %s_bucket_size: %i",
                              hinit->name, hinit->name, hinit->bucket_size);
                return NGX_ERROR;
            }
        }

        // allocate 2 * max_size bytes of scratch space for the per-bucket lengths
        // (not taken from the nginx memory pool, because test is only temporary)
        test = ngx_alloc(hinit->max_size * sizeof(u_short), hinit->pool->log);
        if (test == NULL) {
            return NGX_ERROR;
        }

        bucket_size = hinit->bucket_size - sizeof(void *);   // sizeof(void *) = 4 on 32-bit

        start = nelts / (bucket_size / (2 * sizeof(void *)));
        start = start ? start : 1;                           // try at least one bucket

        if (hinit->max_size > 10000 && hinit->max_size / nelts < 100) {
            start = hinit->max_size - 1000;
        }

        for (size = start; size < hinit->max_size; size++) {

            ngx_memzero(test, size * sizeof(u_short));

            // mark 1: check whether the buckets are large enough for the hash data
            for (n = 0; n < nelts; n++) {
                if (names[n].key.data == NULL) {
                    continue;
                }

                // compute key and accumulate the element sizes of each bucket in test[key]
                key = names[n].key_hash % size;              // if size = 1, key is always 0
                test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));

                if (test[key] > (u_short) bucket_size) {     // a bucket overflows: try the next size
                    goto next;
                }
            }

            goto found;

        next:

            continue;
        }

        // no suitable number of buckets was found: give up
        ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                      "could not build the %s, you should increase "
                      "either %s_max_size: %i or %s_bucket_size: %i",
                      hinit->name, hinit->name, hinit->max_size,
                      hinit->name, hinit->bucket_size);

        ngx_free(test);

        return NGX_ERROR;

    found:   // a suitable number of buckets was found

        // initialize the first size entries of test to sizeof(void *) (4 here)
        for (i = 0; i < size; i++) {
            test[i] = sizeof(void *);
        }

        /**
         * mark 2: almost the same as the code at mark 1, but this block recomputes
         * the total length of the hash data in each bucket (the check at mark 1 has
         * already passed). test[i] starts at 4 here, so each total includes the size
         * of one extra void pointer.
         */
        for (n = 0; n < nelts; n++) {
            if (names[n].key.data == NULL) {
                continue;
            }

            // compute key and accumulate the element sizes of each bucket in test[key]
            key = names[n].key_hash % size;                  // if size = 1, key is always 0
            test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
        }

        // compute the total length of the hash data
        len = 0;

        for (i = 0; i < size; i++) {
            if (test[i] == sizeof(void *)) {   // still the initial value 4: the bucket is empty
                continue;
            }

            // align test[i] to ngx_cacheline_size (32 on this 32-bit platform)
            test[i] = (u_short) (ngx_align(test[i], ngx_cacheline_size));

            len += test[i];
        }

        if (hinit->hash == NULL) {
            // allocate the hash header and the buckets array
            // (size pointers of type ngx_hash_elt_t *) from the pool
            hinit->hash = ngx_pcalloc(hinit->pool, sizeof(ngx_hash_wildcard_t)
                                                 + size * sizeof(ngx_hash_elt_t *));
            if (hinit->hash == NULL) {
                ngx_free(test);
                return NGX_ERROR;
            }

            // buckets starts right after the ngx_hash_wildcard_t structure
            buckets = (ngx_hash_elt_t **)
                          ((u_char *) hinit->hash + sizeof(ngx_hash_wildcard_t));

        } else {
            // allocate only the buckets array (size pointers of type ngx_hash_elt_t *) from the pool
            buckets = ngx_pcalloc(hinit->pool, size * sizeof(ngx_hash_elt_t *));
            if (buckets == NULL) {
                ngx_free(test);
                return NGX_ERROR;
            }
        }

        // then allocate elts with len + ngx_cacheline_size bytes;
        // the extra 32 bytes leave room for the 32-byte alignment below
        elts = ngx_palloc(hinit->pool, len + ngx_cacheline_size);
        if (elts == NULL) {
            ngx_free(test);
            return NGX_ERROR;
        }

        // align the address of elts to ngx_cacheline_size = 32
        elts = ngx_align_ptr(elts, ngx_cacheline_size);

        // point each non-empty bucket at its region of elts
        for (i = 0; i < size; i++) {
            if (test[i] == sizeof(void *)) {
                continue;
            }

            buckets[i] = (ngx_hash_elt_t *) elts;
            elts += test[i];
        }

        // reset the test array to 0
        for (i = 0; i < size; i++) {
            test[i] = 0;
        }

        // store every element passed in into the hash table
        for (n = 0; n < nelts; n++) {
            if (names[n].key.data == NULL) {
                continue;
            }

            // compute key, i.e. the bucket this element belongs to, and its position in elts
            key = names[n].key_hash % size;
            elt = (ngx_hash_elt_t *) ((u_char *) buckets[key] + test[key]);

            // fill in the ngx_hash_elt_t
            elt->value = names[n].value;
            elt->len = (u_short) names[n].key.len;

            ngx_strlow(elt->name, names[n].key.data, names[n].key.len);

            // advance the offset for the next element of this bucket
            test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
        }

        for (i = 0; i < size; i++) {
            if (buckets[i] == NULL) {
                continue;
            }

            // test[i] is now the total length of the elements in bucket i;
            // a NULL value there marks the end of the bucket
            elt = (ngx_hash_elt_t *) ((u_char *) buckets[i] + test[i]);

            elt->value = NULL;
        }

        ngx_free(test);   // release the temporary space

        hinit->hash->buckets = buckets;
        hinit->hash->size = size;

        return NGX_OK;
    }
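The search for size at mark 1 is the core of the function, and it can be simulated on its own. The sketch below is a simplified model rather than the nginx code: find_bucket_count, sample_bucket_count and MAX_BUCKETS are hypothetical names, the data comes from the example in section 3 (32-bit key hashes and element sizes), 4-byte pointers are assumed, and the max_size > 10000 shortcut and the u_short truncation of the original are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_BUCKETS 1024   /* hypothetical cap for this sketch */

/*
 * Return the smallest bucket count in [start, max_size) for which no bucket
 * holds more than room bytes, or 0 if there is none. hashes[i] and sizes[i]
 * describe element i (its key hash and its NGX_HASH_ELT_SIZE).
 */
size_t find_bucket_count(const uint32_t *hashes, const size_t *sizes, size_t n,
                         size_t room, size_t start, size_t max_size)
{
    size_t used[MAX_BUCKETS];
    size_t size, i;

    assert(max_size <= MAX_BUCKETS);
    if (start == 0) {
        start = 1;                              /* try at least one bucket */
    }

    for (size = start; size < max_size; size++) {
        int fits = 1;

        memset(used, 0, size * sizeof(used[0]));
        for (i = 0; i < n; i++) {
            size_t b = hashes[i] % size;        /* bucket of element i */
            used[b] += sizes[i];
            if (used[b] > room) {               /* overflow: try the next size */
                fits = 0;
                break;
            }
        }
        if (fits) {
            return size;
        }
    }
    return 0;
}

/* Data of the seven URLs from section 3, in insertion order (32-bit run). */
const uint32_t sample_hashes[7] = {
    270263191u, 1528635686u, (uint32_t) -702889725, 203430122u,
    (uint32_t) -640386838, 1313636595u, 1884209457u
};
const size_t sample_sizes[7] = { 20, 24, 20, 16, 20, 20, 20 };

/* Bucket count chosen for the sample data, assuming 4-byte pointers. */
size_t sample_bucket_count(size_t bucket_size)
{
    size_t room, start;

    assert(bucket_size >= 12);
    room = bucket_size - 4;                     /* bucket_size - sizeof(void *) */
    start = 7 / (room / (2 * 4));               /* same first guess as ngx_hash_init */
    return find_bucket_count(sample_hashes, sample_sizes, 7, room, start, 1024);
}
```

Under these assumptions sample_bucket_count(64) returns 3 and sample_bucket_count(256) returns 1, the two values of size reported in sections 3.3.1 and 3.3.2. With 64-byte buckets, size = 2 is rejected because four 20-byte elements (80 bytes) would share a bucket that has only 60 bytes of room.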
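The "length of the hash data" means the length of an ngx_hash_elt_t after it has been filled in. The nelts elements live in the names array; after this function has initialized the hash, they are stored in the ngx_hash_elt_t data areas that the size buckets point to, and together these areas hold nelts hash elements. In other words, each bucket stores the start address of an ngx_hash_elt_t data area, and that area holds the elements hashed into the bucket. Each element ends with the string that starts at name[0], which is the key of some element of names (the key of <key,value>), and the string may be followed by a few bytes of alignment padding.
A typical physical layout of an initialized hash is shown below. See the example later in this article for details.
2.4 Hash lookup
Hash lookup is done by ngx_hash_find(); the code follows. The comments after // were added by the author.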
    // look up the value for the given key, name and len in the hash table that hash points to
    void *
    ngx_hash_find(ngx_hash_t *hash, ngx_uint_t key, u_char *name, size_t len)
    {
        ngx_uint_t       i;
        ngx_hash_elt_t  *elt;

        // key selects the bucket (which holds the address of its elts region)
        elt = hash->buckets[key % hash->size];

        if (elt == NULL) {
            return NULL;
        }

        while (elt->value) {
            if (len != (size_t) elt->len) {       // compare the lengths first
                goto next;
            }

            for (i = 0; i < len; i++) {
                if (name[i] != elt->name[i]) {    // then compare name byte by byte
                    goto next;
                }
            }

            return elt->value;   // match: return the value field of this ngx_hash_elt_t

        next:

            // the offset starts at &elt->name[0], so adding this element's len and
            // then aligning to sizeof(void *) (4 bytes here) reaches the next element
            elt = (ngx_hash_elt_t *) ngx_align_ptr(&elt->name[0] + elt->len,
                                                   sizeof(void *));
            continue;
        }

        return NULL;
    }
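Lookup is quite simple. key directly gives the bucket, which holds the start address of the ngx_hash_elt_t data area. The elements in that area are then compared first by length and then by name content; on a match, the value field of that ngx_hash_elt_t is the result.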
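The walk through one bucket can be reproduced without nginx. The following is a minimal sketch under stated assumptions: elt_t is an illustrative copy of ngx_hash_elt_t, and bucket_put, bucket_find and demo_lookup are hypothetical helpers. It handles a single bucket only (the key % size step is omitted), and it works with byte offsets from an aligned bucket start instead of ngx_align_ptr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for ngx_hash_elt_t (an illustrative copy, not the nginx header). */
typedef struct {
    void           *value;    /* NULL marks the end of a bucket */
    unsigned short  len;      /* length of the key */
    unsigned char   name[1];  /* first byte of the key; the rest follows */
} elt_t;

#define NAME_OFF        offsetof(elt_t, name)
#define ALIGN_UP(n, a)  (((n) + ((a) - 1)) & ~((a) - 1))

/* Write one element at byte offset off; return the offset of the next one. */
size_t bucket_put(unsigned char *bucket, size_t off, const char *key,
                  const char *value)
{
    elt_t  *e = (elt_t *) (bucket + off);
    size_t  len = strlen(key);

    e->value = (void *) value;
    e->len = (unsigned short) len;
    memcpy(bucket + off + NAME_OFF, key, len);

    return ALIGN_UP(off + NAME_OFF + len, sizeof(void *));
}

/* Walk one bucket the way ngx_hash_find does: length first, then bytes. */
void *bucket_find(const unsigned char *bucket, const char *name, size_t len)
{
    const unsigned char *p = bucket;

    for ( ;; ) {
        const elt_t *e = (const elt_t *) p;

        if (e->value == NULL) {
            return NULL;                          /* terminator reached */
        }
        if (e->len == len && memcmp(p + NAME_OFF, name, len) == 0) {
            return e->value;
        }
        /* skip the name, then round up to pointer alignment */
        p = bucket + ALIGN_UP((size_t) (p - bucket) + NAME_OFF + e->len,
                              sizeof(void *));
    }
}

/* Build a two-element bucket on the heap and look name up in it. */
const char *demo_lookup(const char *name)
{
    unsigned char *bucket = calloc(1, 128);
    const char    *found;
    size_t         off = 0;

    assert(bucket != NULL);
    off = bucket_put(bucket, off, "www.sina.com.cn", "58.63.236.35");
    off = bucket_put(bucket, off, "www.qq.com", "60.28.14.190");
    ((elt_t *) (bucket + off))->value = NULL;     /* end-of-bucket marker */

    found = bucket_find(bucket, name, strlen(name));
    free(bucket);
    return found;   /* points at a string literal, so it stays valid */
}
```

Here demo_lookup("www.qq.com") returns "60.28.14.190", while demo_lookup("www.csdn.net") returns NULL because the walk reaches the NULL value that terminates the bucket.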
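3. An example
This section gives a simple example that creates a memory pool, allocates the hash structure, the hash buckets and the hash elements from it, and adds <key,value> pairs to the hash.
The example takes several URLs and their IP addresses as pairs <url,ip>, which serve as <key,value>. It hashes these pairs at initialization and then looks up the IP address of a given URL, printing a message when nothing is found. This shows how the nginx hash is used.
3.1 Code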
    /**
     * ngx_hash_t test
     * in this example, it will first save URLs into the memory pool, and IPs saved in static memory.
     * then, give some examples to find IP according to a URL.
     */

    #include <stdio.h>
    #include "ngx_config.h"
    #include "ngx_conf_file.h"
    #include "nginx.h"
    #include "ngx_core.h"
    #include "ngx_string.h"
    #include "ngx_palloc.h"
    #include "ngx_array.h"
    #include "ngx_hash.h"

    #define Max_Num 7
    #define Max_Size 1024
    #define Bucket_Size 64  //256, 64

    #define NGX_HASH_ELT_SIZE(name) \
        (sizeof(void *) + ngx_align((name)->key.len + 2, sizeof(void *)))

    /* for hash test */
    static ngx_str_t urls[Max_Num] = {
        ngx_string("www.baidu.com"),    //220.181.111.147
        ngx_string("www.sina.com.cn"),  //58.63.236.35
        ngx_string("www.google.com"),   //74.125.71.105
        ngx_string("www.qq.com"),       //60.28.14.190
        ngx_string("www.163.com"),      //123.103.14.237
        ngx_string("www.sohu.com"),     //219.234.82.50
        ngx_string("www.abo321.org")    //117.40.196.26
    };

    static char* values[Max_Num] = {
        "220.181.111.147",
        "58.63.236.35",
        "74.125.71.105",
        "60.28.14.190",
        "123.103.14.237",
        "219.234.82.50",
        "117.40.196.26"
    };

    #define Max_Url_Len 15
    #define Max_Ip_Len 15

    #define Max_Num2 2

    /* for finding test */
    static ngx_str_t urls2[Max_Num2] = {
        ngx_string("www.china.com"),    //60.217.58.79
        ngx_string("www.csdn.net")      //117.79.157.242
    };

    ngx_hash_t* init_hash(ngx_pool_t *pool, ngx_array_t *array);
    void dump_pool(ngx_pool_t* pool);
    void dump_hash_array(ngx_array_t* a);
    void dump_hash(ngx_hash_t *hash, ngx_array_t *array);
    ngx_array_t* add_urls_to_array(ngx_pool_t *pool);
    void find_test(ngx_hash_t *hash, ngx_str_t addr[], int num);

    /* for passing compiling */
    volatile ngx_cycle_t *ngx_cycle;
    void ngx_log_error_core(ngx_uint_t level, ngx_log_t *log, ngx_err_t err,
                            const char *fmt, ...)
    {
    }

    int main(/* int argc, char **argv */)
    {
        ngx_pool_t *pool = NULL;
        ngx_array_t *array = NULL;
        ngx_hash_t *hash;

        printf("--------------------------------\n");
        printf("create a new pool:\n");
        printf("--------------------------------\n");
        pool = ngx_create_pool(1024, NULL);

        dump_pool(pool);

        printf("--------------------------------\n");
        printf("create and add urls to it:\n");
        printf("--------------------------------\n");
        array = add_urls_to_array(pool);  //in fact, here should validate array
        dump_hash_array(array);

        printf("--------------------------------\n");
        printf("the pool:\n");
        printf("--------------------------------\n");
        dump_pool(pool);

        hash = init_hash(pool, array);
        if (hash == NULL) {
            printf("Failed to initialize hash!\n");
            return -1;
        }

        printf("--------------------------------\n");
        printf("the hash:\n");
        printf("--------------------------------\n");
        dump_hash(hash, array);
        printf("\n");

        printf("--------------------------------\n");
        printf("the pool:\n");
        printf("--------------------------------\n");
        dump_pool(pool);

        //find test
        printf("--------------------------------\n");
        printf("find test:\n");
        printf("--------------------------------\n");
        find_test(hash, urls, Max_Num);
        printf("\n");

        find_test(hash, urls2, Max_Num2);

        //release
        ngx_array_destroy(array);
        ngx_destroy_pool(pool);

        return 0;
    }

    ngx_hash_t* init_hash(ngx_pool_t *pool, ngx_array_t *array)
    {
        ngx_int_t result;
        ngx_hash_init_t hinit;

        ngx_cacheline_size = 32;  //here this variable for nginx must be defined
        hinit.hash = NULL;        //if hinit.hash is NULL, it will alloc memory for it in ngx_hash_init
        hinit.key = &ngx_hash_key_lc;  //hash function
        hinit.max_size = Max_Size;
        hinit.bucket_size = Bucket_Size;
        hinit.name = "my_hash_sample";
        hinit.pool = pool;        //the hash table exists in the memory pool
        hinit.temp_pool = NULL;

        result = ngx_hash_init(&hinit, (ngx_hash_key_t*)array->elts, array->nelts);
        if (result != NGX_OK)
            return NULL;

        return hinit.hash;
    }

    void dump_pool(ngx_pool_t* pool)
    {
        while (pool) {
            printf("pool = 0x%x\n", pool);
            printf(" .d\n");
            printf(" .last = 0x%x\n", pool->d.last);
            printf(" .end = 0x%x\n", pool->d.end);
            printf(" .next = 0x%x\n", pool->d.next);
            printf(" .failed = %d\n", pool->d.failed);
            printf(" .max = %d\n", pool->max);
            printf(" .current = 0x%x\n", pool->current);
            printf(" .chain = 0x%x\n", pool->chain);
            printf(" .large = 0x%x\n", pool->large);
            printf(" .cleanup = 0x%x\n", pool->cleanup);
            printf(" .log = 0x%x\n", pool->log);
            printf("available pool memory = %d\n\n", pool->d.end - pool->d.last);
            pool = pool->d.next;
        }
    }

    void dump_hash_array(ngx_array_t* a)
    {
        char prefix[] = " ";

        if (a == NULL)
            return;

        printf("array = 0x%x\n", a);
        printf(" .elts = 0x%x\n", a->elts);
        printf(" .nelts = %d\n", a->nelts);
        printf(" .size = %d\n", a->size);
        printf(" .nalloc = %d\n", a->nalloc);
        printf(" .pool = 0x%x\n", a->pool);

        printf(" elements:\n");
        ngx_hash_key_t *ptr = (ngx_hash_key_t*)(a->elts);
        for (; ptr < (ngx_hash_key_t*)(a->elts + a->nalloc * a->size); ptr++) {
            printf(" 0x%x: {key = (\"%s\"%.*s, %d), key_hash = %-10ld, value = \"%s\"%.*s}\n",
                   ptr, ptr->key.data, Max_Url_Len - ptr->key.len, prefix, ptr->key.len,
                   ptr->key_hash, ptr->value, Max_Ip_Len - strlen(ptr->value), prefix);
        }
        printf("\n");
    }

    /**
     * pass array pointer to read elts[i].key_hash, then for getting the position - key
     */
    void dump_hash(ngx_hash_t *hash, ngx_array_t *array)
    {
        int loop;
        char prefix[] = " ";
        u_short test[Max_Num] = {0};
        ngx_uint_t key;
        ngx_hash_key_t* elts;
        int nelts;

        if (hash == NULL)
            return;

        printf("hash = 0x%x: **buckets = 0x%x, size = %d\n", hash, hash->buckets, hash->size);

        for (loop = 0; loop < hash->size; loop++) {
            ngx_hash_elt_t *elt = hash->buckets[loop];
            printf(" 0x%x: buckets[%d] = 0x%x\n", &(hash->buckets[loop]), loop, elt);
        }
        printf("\n");

        elts = (ngx_hash_key_t*)array->elts;
        nelts = array->nelts;
        for (loop = 0; loop < nelts; loop++) {
            char url[Max_Url_Len + 1] = {0};

            key = elts[loop].key_hash % hash->size;
            ngx_hash_elt_t *elt = (ngx_hash_elt_t *) ((u_char *) hash->buckets[key] + test[key]);

            ngx_strlow(url, elt->name, elt->len);
            printf(" buckets %d: 0x%x: {value = \"%s\"%.*s, len = %d, name = \"%s\"%.*s}\n",
                   key, elt, (char*)elt->value, Max_Ip_Len - strlen((char*)elt->value), prefix,
                   elt->len, url, Max_Url_Len - elt->len, prefix);  //replace elt->name with url

            test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&elts[loop]));
        }
    }

    ngx_array_t* add_urls_to_array(ngx_pool_t *pool)
    {
        int loop;
        char prefix[] = " ";
        ngx_array_t *a = ngx_array_create(pool, Max_Num, sizeof(ngx_hash_key_t));

        for (loop = 0; loop < Max_Num; loop++) {
            ngx_hash_key_t *hashkey = (ngx_hash_key_t*)ngx_array_push(a);
            hashkey->key = urls[loop];
            hashkey->key_hash = ngx_hash_key_lc(urls[loop].data, urls[loop].len);
            hashkey->value = (void*)values[loop];
            /** for debug
            printf("{key = (\"%s\"%.*s, %d), key_hash = %-10ld, value = \"%s\"%.*s}, added to array\n",
                   hashkey->key.data, Max_Url_Len - hashkey->key.len, prefix, hashkey->key.len,
                   hashkey->key_hash, hashkey->value, Max_Ip_Len - strlen(hashkey->value), prefix);
            */
        }

        return a;
    }

    void find_test(ngx_hash_t *hash, ngx_str_t addr[], int num)
    {
        ngx_uint_t key;
        int loop;
        char prefix[] = " ";

        for (loop = 0; loop < num; loop++) {
            key = ngx_hash_key_lc(addr[loop].data, addr[loop].len);
            void *value = ngx_hash_find(hash, key, addr[loop].data, addr[loop].len);
            if (value) {
                printf("(url = \"%s\"%.*s, key = %-10ld) found, (ip = \"%s\")\n",
                       addr[loop].data, Max_Url_Len - addr[loop].len, prefix, key, (char*)value);
            } else {
                printf("(url = \"%s\"%.*s, key = %-10d) not found!\n",
                       addr[loop].data, Max_Url_Len - addr[loop].len, prefix, key);
            }
        }
    }
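3.2 How to compile
See the article "nginx-1.0.4 source analysis: memory pool structure ngx_pool_t and memory management". The makefile written for this article is as follows.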
    CXX = gcc
    CXXFLAGS += -g -Wall -Wextra

    NGX_ROOT = /usr/src/nginx-1.0.4

    TARGETS = ngx_hash_t_test
    TARGETS_C_FILE = $(TARGETS).c

    CLEANUP = rm -f $(TARGETS) *.o

    all: $(TARGETS)

    clean:
    	$(CLEANUP)

    CORE_INCS = -I. \
        -I$(NGX_ROOT)/src/core \
        -I$(NGX_ROOT)/src/event \
        -I$(NGX_ROOT)/src/event/modules \
        -I$(NGX_ROOT)/src/os/unix \
        -I$(NGX_ROOT)/objs

    NGX_PALLOC = $(NGX_ROOT)/objs/src/core/ngx_palloc.o
    NGX_STRING = $(NGX_ROOT)/objs/src/core/ngx_string.o
    NGX_ALLOC = $(NGX_ROOT)/objs/src/os/unix/ngx_alloc.o
    NGX_ARRAY = $(NGX_ROOT)/objs/src/core/ngx_array.o
    NGX_HASH = $(NGX_ROOT)/objs/src/core/ngx_hash.o

    $(TARGETS): $(TARGETS_C_FILE)
    	$(CXX) $(CXXFLAGS) $(CORE_INCS) $(NGX_PALLOC) $(NGX_STRING) $(NGX_ALLOC) $(NGX_ARRAY) $(NGX_HASH) $^ -o $@
3.3 Results
3.3.1 bucket_size = 64 bytes
With bucket_size = 64 bytes, the output is as follows.
    # ./ngx_hash_t_test
    --------------------------------
    create a new pool:
    --------------------------------
    pool = 0x8870020
      .d
        .last = 0x8870048
        .end = 0x8870420
        .next = 0x0
        .failed = 0
      .max = 984
      .current = 0x8870020
      .chain = 0x0
      .large = 0x0
      .cleanup = 0x0
      .log = 0x0
    available pool memory = 984

    --------------------------------
    create and add urls to it:
    --------------------------------
    array = 0x8870048
      .elts = 0x887005c
      .nelts = 7
      .size = 16
      .nalloc = 7
      .pool = 0x8870020
    elements:
      0x887005c: {key = ("www.baidu.com"  , 13), key_hash = 270263191 , value = "220.181.111.147"}
      0x887006c: {key = ("www.sina.com.cn", 15), key_hash = 1528635686, value = "58.63.236.35"   }
      0x887007c: {key = ("www.google.com" , 14), key_hash = -702889725, value = "74.125.71.105"  }
      0x887008c: {key = ("www.qq.com"     , 10), key_hash = 203430122 , value = "60.28.14.190"   }
      0x887009c: {key = ("www.163.com"    , 11), key_hash = -640386838, value = "123.103.14.237" }
      0x88700ac: {key = ("www.sohu.com"   , 12), key_hash = 1313636595, value = "219.234.82.50"  }
      0x88700bc: {key = ("www.abo321.org" , 14), key_hash = 1884209457, value = "117.40.196.26"  }

    --------------------------------
    the pool:
    --------------------------------
    pool = 0x8870020
      .d
        .last = 0x88700cc
        .end = 0x8870420
        .next = 0x0
        .failed = 0
      .max = 984
      .current = 0x8870020
      .chain = 0x0
      .large = 0x0
      .cleanup = 0x0
      .log = 0x0
    available pool memory = 852

    --------------------------------
    the hash:
    --------------------------------
    hash = 0x88700cc: **buckets = 0x88700d8, size = 3
      0x88700d8: buckets[0] = 0x8870100
      0x88700dc: buckets[1] = 0x8870140
      0x88700e0: buckets[2] = 0x8870180

      buckets 1: 0x8870140: {value = "220.181.111.147", len = 13, name = "www.baidu.com"  }
      buckets 2: 0x8870180: {value = "58.63.236.35"   , len = 15, name = "www.sina.com.cn"}
      buckets 1: 0x8870154: {value = "74.125.71.105"  , len = 14, name = "www.google.com" }
      buckets 2: 0x8870198: {value = "60.28.14.190"   , len = 10, name = "www.qq.com"     }
      buckets 0: 0x8870100: {value = "123.103.14.237" , len = 11, name = "www.163.com"    }
      buckets 0: 0x8870114: {value = "219.234.82.50"  , len = 12, name = "www.sohu.com"   }
      buckets 0: 0x8870128: {value = "117.40.196.26"  , len = 14, name = "www.abo321.org" }

    --------------------------------
    the pool:
    --------------------------------
    pool = 0x8870020
      .d
        .last = 0x88701c4
        .end = 0x8870420
        .next = 0x0
        .failed = 0
      .max = 984
      .current = 0x8870020
      .chain = 0x0
      .large = 0x0
      .cleanup = 0x0
      .log = 0x0
    available pool memory = 604

    --------------------------------
    find test:
    --------------------------------
    (url = "www.baidu.com"  , key = 270263191 ) found, (ip = "220.181.111.147")
    (url = "www.sina.com.cn", key = 1528635686) found, (ip = "58.63.236.35")
    (url = "www.google.com" , key = -702889725) found, (ip = "74.125.71.105")
    (url = "www.qq.com"     , key = 203430122 ) found, (ip = "60.28.14.190")
    (url = "www.163.com"    , key = -640386838) found, (ip = "123.103.14.237")
    (url = "www.sohu.com"   , key = 1313636595) found, (ip = "219.234.82.50")
    (url = "www.abo321.org" , key = 1884209457) found, (ip = "117.40.196.26")

    (url = "www.china.com"  , key = -1954599725) not found!
    (url = "www.csdn.net"   , key = -1667448544) not found!
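The output above is for bucket_size = 64 bytes. It shows that the 7 given URLs were distributed over 3 buckets; see the output for details. The physical layout of the hash in this example is shown in the figure below.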
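The element addresses in this dump can be reproduced from the key hashes and the element sizes alone. The sketch below is illustrative: elt_offset, key_hashes and elt_sizes are hypothetical names, the numbers are copied from the output above, and the sizes are the 32-bit NGX_HASH_ELT_SIZE values of the seven keys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_KEYS     7
#define N_BUCKETS  3

/* key_hash values and 32-bit element sizes of the seven URLs, in insertion order. */
const uint32_t key_hashes[N_KEYS] = {
    270263191u, 1528635686u, (uint32_t) -702889725, 203430122u,
    (uint32_t) -640386838, 1313636595u, 1884209457u
};
const size_t elt_sizes[N_KEYS] = { 20, 24, 20, 16, 20, 20, 20 };

/*
 * Offset of element n from the start of its bucket, following the last
 * insertion loop of ngx_hash_init(): each bucket keeps a running offset
 * that grows by the size of every element placed in it.
 */
size_t elt_offset(size_t n)
{
    size_t used[N_BUCKETS] = { 0 };
    size_t i;

    assert(n < N_KEYS);
    for (i = 0; i < n; i++) {
        used[key_hashes[i] % N_BUCKETS] += elt_sizes[i];
    }
    return used[key_hashes[n] % N_BUCKETS];
}
```

For instance, elt_offset(3) gives 24 for www.qq.com, which matches its address 0x8870198, 0x18 bytes past buckets[2] = 0x8870180; elt_offset(6) gives 40 for www.abo321.org at 0x8870128, 0x28 bytes past buckets[0] = 0x8870100.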
3.3.2 bucket_size = 256 bytes
With bucket_size = 256 bytes, the output is as follows.

    # ./ngx_hash_t_test
    --------------------------------
    create a new pool:
    --------------------------------
    pool = 0x8b74020
      .d
        .last = 0x8b74048
        .end = 0x8b74420
        .next = 0x0
        .failed = 0
      .max = 984
      .current = 0x8b74020
      .chain = 0x0
      .large = 0x0
      .cleanup = 0x0
      .log = 0x0
    available pool memory = 984

    --------------------------------
    create and add urls to it:
    --------------------------------
    array = 0x8b74048
      .elts = 0x8b7405c
      .nelts = 7
      .size = 16
      .nalloc = 7
      .pool = 0x8b74020
    elements:
      0x8b7405c: {key = ("www.baidu.com"  , 13), key_hash = 270263191 , value = "220.181.111.147"}
      0x8b7406c: {key = ("www.sina.com.cn", 15), key_hash = 1528635686, value = "58.63.236.35"   }
      0x8b7407c: {key = ("www.google.com" , 14), key_hash = -702889725, value = "74.125.71.105"  }
      0x8b7408c: {key = ("www.qq.com"     , 10), key_hash = 203430122 , value = "60.28.14.190"   }
      0x8b7409c: {key = ("www.163.com"    , 11), key_hash = -640386838, value = "123.103.14.237" }
      0x8b740ac: {key = ("www.sohu.com"   , 12), key_hash = 1313636595, value = "219.234.82.50"  }
      0x8b740bc: {key = ("www.abo321.org" , 14), key_hash = 1884209457, value = "117.40.196.26"  }

    --------------------------------
    the pool:
    --------------------------------
    pool = 0x8b74020
      .d
        .last = 0x8b740cc
        .end = 0x8b74420
        .next = 0x0
        .failed = 0
      .max = 984
      .current = 0x8b74020
      .chain = 0x0
      .large = 0x0
      .cleanup = 0x0
      .log = 0x0
    available pool memory = 852

    --------------------------------
    the hash:
    --------------------------------
    hash = 0x8b740cc: **buckets = 0x8b740d8, size = 1
      0x8b740d8: buckets[0] = 0x8b740e0

      buckets 0: {value = "220.181.111.147", len = 13, name = "www.baidu.com"  }
      buckets 0: {value = "58.63.236.35"   , len = 15, name = "www.sina.com.cn"}
      buckets 0: {value = "74.125.71.105"  , len = 14, name = "www.google.com" }
      buckets 0: {value = "60.28.14.190"   , len = 10, name = "www.qq.com"     }
      buckets 0: {value = "123.103.14.237" , len = 11, name = "www.163.com"    }
      buckets 0: {value = "219.234.82.50"  , len = 12, name = "www.sohu.com"   }
      buckets 0: {value = "117.40.196.26"  , len = 14, name = "www.abo321.org" }

    --------------------------------
    the pool:
    --------------------------------
    pool = 0x8b74020
      .d
        .last = 0x8b7419c
        .end = 0x8b74420
        .next = 0x0
        .failed = 0
      .max = 984
      .current = 0x8b74020
      .chain = 0x0
      .large = 0x0
      .cleanup = 0x0
      .log = 0x0
    available pool memory = 644

    --------------------------------
    find test:
    --------------------------------
    (url = "www.baidu.com"  , key = 270263191 ) found, (ip = "220.181.111.147")
    (url = "www.sina.com.cn", key = 1528635686) found, (ip = "58.63.236.35")
    (url = "www.google.com" , key = -702889725) found, (ip = "74.125.71.105")
    (url = "www.qq.com"     , key = 203430122 ) found, (ip = "60.28.14.190")
    (url = "www.163.com"    , key = -640386838) found, (ip = "123.103.14.237")
    (url = "www.sohu.com"   , key = 1313636595) found, (ip = "219.234.82.50")
    (url = "www.abo321.org" , key = 1884209457) found, (ip = "117.40.196.26")

    (url = "www.china.com"  , key = -1954599725) not found!
    (url = "www.csdn.net"   , key = -1667448544) not found!
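The output above is for bucket_size = 256 bytes. It shows that the 7 given URLs were all placed in a single bucket, i.e. size = 1 in ngx_hash_init(): the hash elements of the 7 URLs total only 140 bytes, so one bucket, buckets[0], is enough.
The table below lists some of the values computed inside ngx_hash_init(). The physical layout diagram is omitted here; refer to the figure above.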
url | computed size | value of test[0]
www.baidu.com | 4+ngx_align(13+2,4)=20 | 20
www.sina.com.cn | 4+ngx_align(15+2,4)=24 | 44
www.google.com | 4+ngx_align(14+2,4)=20 | 64
www.qq.com | 4+ngx_align(10+2,4)=16 | 80
www.163.com | 4+ngx_align(11+2,4)=20 | 100
www.sohu.com | 4+ngx_align(12+2,4)=20 | 120
www.abo321.org | 4+ngx_align(14+2,4)=20 | 140
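The running total of 140 can be carried one step further to the amount of pool memory the hash consumes. The sketch below is an estimate under stated assumptions, not nginx code: bucket_bytes, pool_bytes and the two totals arrays are hypothetical names, pointers are 4 bytes, ngx_cacheline_size is 32 as set in the example, the header size of 12 bytes is read off the dump (buckets starts 0xc bytes after hash), and no extra alignment padding inside the pool is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define PTR_SIZE     4u   /* sizeof(void *) on the 32-bit platform (assumption) */
#define CACHE_LINE  32u   /* ngx_cacheline_size as set in the example */
#define HEADER_SIZE 12u   /* hash header size, read off the dump addresses */

/* Round n up to a multiple of a; a must be a power of two. */
size_t align_up(size_t n, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (n + (a - 1)) & ~(a - 1);
}

/* Bytes reserved for one non-empty bucket: its elements plus one
 * terminating pointer, rounded up to a cache line. */
size_t bucket_bytes(size_t elts_total)
{
    return align_up(PTR_SIZE + elts_total, CACHE_LINE);
}

/*
 * Pool bytes taken by ngx_hash_init() when hinit->hash is NULL: the header,
 * the bucket pointer array, and the element area plus one cache line of
 * slack for aligning its start.
 */
size_t pool_bytes(const size_t *totals, size_t nbuckets)
{
    size_t len = 0, i;

    for (i = 0; i < nbuckets; i++) {
        if (totals[i] != 0) {
            len += bucket_bytes(totals[i]);
        }
    }
    return HEADER_SIZE + nbuckets * PTR_SIZE + len + CACHE_LINE;
}

/* Per-bucket element totals for the two runs of section 3.3. */
const size_t totals_256[1] = { 140 };
const size_t totals_64[3]  = { 60, 40, 40 };
```

Under these assumptions pool_bytes(totals_256, 1) is 208 and pool_bytes(totals_64, 3) is 248. Both agree with the drop in "available pool memory" between the second and third pool dumps of each run: 852 - 644 = 208 and 852 - 604 = 248.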
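4. Summary
This article has analyzed the hash structure of nginx-1.0.4 fairly thoroughly. It covered the hash structure, the hash element structure and the hash initialization structure, as well as the main hash operations, initialization and lookup. It closed with a simple example that shows how the nginx hash is used, gives the detailed output and draws the physical layout of the hash, in order to show readers its design and principles; along the way it also shows how to compile and test nginx code.
Stay tuned for further analysis. Thank you!
Reference
Nginx code study plan (RainX1982)
nginx-1.0.4 source analysis: modules and their initialization (阿波)
nginx-1.0.4 source analysis: memory pool structure ngx_pool_t and memory management (阿波)
nginx-1.0.4 source analysis: array structure ngx_array_t (阿波)
nginx-1.0.4 source analysis: list structure ngx_list_t (阿波)
nginx-1.0.4 source analysis: queue structure ngx_queue_t (阿波)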