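Most people who build an array of strings with vector use vector<string>, but quite a few use vector<char*> instead. vector<char*> works, but because you are using C facilities you have to manage many things yourself, unlike C++'s string: std::string is a string class, encapsulating all the required data that makes up a string, along with allocation and deallocation functionality.
Take a look at the following code: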
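```cpp
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <vector>
using namespace std;

int main(int argc, const char *argv[])
{
    vector<char*> vctTemp;
    char* pchA = (char*)malloc(5);
    char* pchB = (char*)malloc(5);
    memcpy(pchA, "hello", 5);   // copies 5 characters, no terminating '\0'
    memcpy(pchB, "world", 5);
    vctTemp.push_back(pchA);
    vctTemp.push_back(pchB);
    vector<char*>::iterator iter;
    for (iter = vctTemp.begin(); iter != vctTemp.end(); iter++) {
        // The buffers are not null-terminated, so print exactly 5 bytes
        // instead of streaming the char* directly.
        cout.write(*iter, 5) << endl;
    }
    return 0;   // pchA and pchB are never freed
}
```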
Use valgrind to check for memory leaks:
```
==3317== LEAK SUMMARY:
==3317==    definitely lost: 10 bytes in 2 blocks
==3317==    indirectly lost: 0 bytes in 0 blocks
==3317==      possibly lost: 0 bytes in 0 blocks
==3317==    still reachable: 0 bytes in 0 blocks
==3317==         suppressed: 0 bytes in 0 blocks
```
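Clearly 10 bytes were leaked: the vector destroys only the pointers it holds, not the buffers they point to. The buffers can only be released by calling free manually.
The loop above can be changed to: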
```cpp
for (iter = vctTemp.begin(); iter != vctTemp.end(); iter++) {
    cout.write(*iter, 5) << endl;   // buffers hold 5 bytes with no '\0'
    free(*iter);                    // free each buffer manually
}
```
Check again:
```
==3343== HEAP SUMMARY:
==3343==     in use at exit: 0 bytes in 0 blocks
==3343==   total heap usage: 4 allocs, 4 frees, 22 bytes allocated
```
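For comparison, here is a minimal sketch of the same list built with vector<string>. The helper names `make_words` and `print_words` are made up for this illustration. Each std::string releases its own buffer in its destructor when the vector is destroyed, so no free call appears anywhere.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Builds the same two-element list as the malloc version,
// but with std::string elements that own their memory.
std::vector<std::string> make_words() {
    std::vector<std::string> words;
    words.push_back("hello");
    words.push_back("world");
    return words;
}

// Prints each element on its own line.
void print_words(const std::vector<std::string>& words) {
    for (const std::string& w : words) {
        std::cout << w << '\n';
    }
}
```

Calling `print_words(make_words())` from main prints the two words and leaves nothing to clean up by hand.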
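Clearly the string class is more convenient: it takes care of everything we should not have to do ourselves. If you care a great deal about performance, though, consider C-style strings. Some people will say that string is more efficient than C-style strings. That is true in some respects, but C code can avoid the overhead entirely, because std::string is essentially a layer of data-structure encapsulation over the same kind of character buffer. Some of the code circulating online that shows string beating C-style code is flawed, or at least the benchmark is unfair.
For example (unfair test code found on many forums):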
C++ string:
```cpp
#include <string>
using namespace std;

// INIT_TIME, START_TIME, STOP_TIME and PRINT_TIME are timing macros
// whose definitions are not shown here.
int main(int argc, const char *argv[])
{
    INIT_TIME;
    START_TIME;
    string str("a very larg literal string");
    int i;
    int errors = 0;
    for (i = 0; i < 10000000; i++) {
        int len = str.size();
        string str2 = str;
        if (str != str2) {
            ++errors;
        }
    }
    STOP_TIME;
    PRINT_TIME;
    return 0;
}
```
C-style string:
```c
#include <stdlib.h>
#include <string.h>

int main(int argc, const char *argv[])
{
    INIT_TIME;
    START_TIME;
    int i;
    const char* pc = "a very larg literal string";
    int errors = 0;
    for (i = 0; i < 10000000; i++) {
        int len = strlen(pc);
        char* p2 = malloc(len + 1);
        strcpy(p2, pc);
        if (strcmp(p2, pc)) {
            ++errors;
        }
        free(p2);
    }
    STOP_TIME;
    PRINT_TIME;
    return 0;
}
```
Now let's look at what makes the C code unfair:
```c
#include <stdlib.h>
#include <string.h>

int main(int argc, const char *argv[])
{
    INIT_TIME;
    START_TIME;
    int i;
    const char* pc = "a very larg literal string";
    int errors = 0;
    for (i = 0; i < 10000000; i++) {
        // Does the length need to be computed inside the loop? string stores
        // its size() explicitly, while strlen has to walk the whole string!
        int len = strlen(pc);
        char* p2 = malloc(len + 1);   // malloc has even less reason to be inside the loop
        strcpy(p2, pc);
        if (strcmp(p2, pc)) {
            ++errors;
        }
        free(p2);                     // nor does this need to be inside the loop
    }
    STOP_TIME;
    PRINT_TIME;
    return 0;
}
```
After the change:
```c
#include <stdlib.h>
#include <string.h>

int main(int argc, const char *argv[])
{
    INIT_TIME;
    START_TIME;
    int i;
    const char* pc = "a very larg literal string";
    int errors = 0;
    int len = strlen(pc);         // computed once
    char* p2 = malloc(len + 1);   // allocated once
    for (i = 0; i < 10000000; i++) {
        strcpy(p2, pc);
        if (strcmp(p2, pc)) {
            ++errors;
        }
    }
    free(p2);
    STOP_TIME;
    PRINT_TIME;
    return 0;
}
```
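The listings use INIT_TIME, START_TIME, STOP_TIME and PRINT_TIME without showing how they are defined. The sketch below is only one possible definition, assumed for illustration and built on std::chrono; the extra `ELAPSED_MS` macro and the `time_string_copies` function are hypothetical names, not part of the original tests.

```cpp
#include <chrono>
#include <cstdio>
#include <string>

// Hypothetical definitions: the original timing macros are not shown.
#define INIT_TIME  std::chrono::steady_clock::time_point t_start, t_stop
#define START_TIME t_start = std::chrono::steady_clock::now()
#define STOP_TIME  t_stop = std::chrono::steady_clock::now()
#define ELAPSED_MS \
    std::chrono::duration<double, std::milli>(t_stop - t_start).count()
#define PRINT_TIME std::printf("elapsed: %.3f ms\n", ELAPSED_MS)

// Runs `iterations` copy-and-compare rounds on a std::string, prints the
// elapsed time, and returns it in milliseconds (or -1.0 if a copy ever
// compared unequal to its source).
double time_string_copies(int iterations) {
    INIT_TIME;
    START_TIME;
    std::string str("a very larg literal string");
    int errors = 0;
    for (int i = 0; i < iterations; i++) {
        std::string str2 = str;
        if (str != str2) {
            ++errors;
        }
    }
    STOP_TIME;
    PRINT_TIME;
    return errors == 0 ? ELAPSED_MS : -1.0;
}
```

steady_clock is used because it is monotonic, so the measured interval cannot go negative if the system clock is adjusted during the run. An optimizing compiler may remove part of such a loop, so the numbers are only a rough guide.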
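OK, now for the advantages of C++ string: reference counting and COW (copy-on-write). Both are implementation details rather than guarantees. The results below match a reference-counted std::string such as the one in libstdc++ before GCC 5; C++11 and later effectively rule out COW for std::string, so current standard libraries copy the buffer instead.
Reference counting: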
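```cpp
#include <cstdio>
#include <string>
using namespace std;

int main(int argc, const char *argv[])
{
    string str1 = "hello";
    string str2(str1);

    // %p expects a void*, so cast the buffer addresses
    printf("&str1=%p\n", (const void*)str1.c_str());
    printf("&str2=%p\n", (const void*)str2.c_str());

    return 0;
}
```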
Result:
```
&str1=0x8cb3014
&str2=0x8cb3014
```
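The two strings have the same buffer address. In this implementation the string copy constructor does not actually copy the characters; it only increments the reference count of the string, and str1 and str2 point to the same character data. In other words, the for loop in the earlier benchmark never performed a real copy, only a reference-count increment. Then why was its running time about the same as the C-style version? In fact the gap can be large; it did not show because the string in our test was too short. Consider, for instance, a string like this: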
char* p = "hello";
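Some will ask: if every copy of a string refers to the same address, what happens when one of them is modified? Doesn't changing one change all the others? Of course not. Here the implementation uses COW, that is, copy-on-write: the writer gets its own private copy of the buffer just before the modification.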
```cpp
string str1 = "hello";
string str2(str1);
str1[1] = 'w';   // the write gives str1 its own buffer
printf("%p\n", (const void*)str1.c_str());
printf("%p\n", (const void*)str2.c_str());
```
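To make the mechanism concrete, here is a small sketch of reference counting plus copy-on-write. `CowString` is a hypothetical class written only for this illustration, using std::shared_ptr for the count; it is not how any real std::string is implemented, and it assumes single-threaded use.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical illustration class, not a real library type.
class CowString {
public:
    explicit CowString(const char* s)
        : data_(std::make_shared<std::string>(s)) {}

    // The implicit copy constructor copies only the shared_ptr,
    // so a copy shares the buffer and bumps the reference count.

    // Read access never copies the buffer.
    char get(std::size_t i) const { return (*data_)[i]; }
    const char* c_str() const { return data_->c_str(); }

    // Write access first detaches from the shared buffer if anyone
    // else still refers to it, then modifies the private copy.
    void set(std::size_t i, char c) {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
        (*data_)[i] = c;
    }

    // Number of CowString objects currently sharing this buffer.
    long use_count() const { return data_.use_count(); }

private:
    std::shared_ptr<std::string> data_;
};
```

Copying a `CowString` is cheap regardless of length, and the cost of a real copy is paid only by the first write to a shared buffer.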
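Taking a non-const pointer into a string has a similar effect, because the implementation can no longer tell when the characters are written through it:
```cpp
string str1 = "hello";
string str2(str1);
char* p = &str1[1];   // non-const access: str1 gets its own buffer and stops sharing
string str3(str1);    // so this copy is a real copy
printf("%p\n", (const void*)str1.c_str());   // 0x8bbf02c
printf("%p\n", (const void*)str3.c_str());   // 0x8bbf044
```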
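Since buffer sharing depends on the standard library in use, it is safer to check than to assume. The sketch below uses two hypothetical helpers, `shares_buffer` and `copies_share_buffer`, to report what the current implementation does. It only compares addresses through const access, which does not trigger a copy even on a COW implementation.

```cpp
#include <string>

// Returns true if the two strings currently point at the same
// character buffer. Both are taken by const reference so that
// the check itself cannot cause a copy-on-write detach.
bool shares_buffer(const std::string& a, const std::string& b) {
    return a.data() == b.data();
}

// Reports whether a freshly copy-constructed string shares its
// buffer with the original on this standard library.
bool copies_share_buffer() {
    const std::string str1 = "hello";
    const std::string str2(str1);
    return shares_buffer(str1, str2);
}
```

Whatever `copies_share_buffer()` returns, the visible behaviour is the same: modifying one string never changes its copies, and after a write the two strings no longer share a buffer.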