Summary: commonly used functions of the string class in the standard library
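In C, a string is a sequence of characters terminated by '\0'. For convenience, the C standard library provides the str family of functions, but these functions are separate from the string data itself, which does not fit the OOP (object-oriented programming) style. The user must also manage the underlying storage, and a small slip can cause an out-of-bounds access.
In online-judge (OJ) problems, strings almost always appear as the string class, and in everyday work most people use string because it is simple, convenient and quick. The C library string functions are rarely used.
(Strictly speaking, the string class belongs to the standard library rather than the STL; see 【C++】-7- STL简介.) This post introduces the most commonly used and important functions of the string class. The interface is large, so look up the documentation when you need something else: cplusplus.com/reference/string/string/
The constructors are not covered in detail here; the reference documents them clearly. → https://cplusplus.com/reference/string/string/string/
Note: npos is a static member constant of the string class with an unsigned integer type: static const size_t npos = -1; Converting -1 to an unsigned type yields the maximum value of that type. With a 32-bit size_t this is 1111 1111 1111 1111 1111 1111 1111 1111 in binary, i.e. 4,294,967,295; with a 64-bit size_t it is 18,446,744,073,709,551,615.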
operator[] | Get character of string (public member function) |
at | Get character in string (public member function) |
back | Access last character (public member function) |
front | Access first character (public member function) |
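Accessing the characters of a string object with [index], just like an ordinary array, is the most common and convenient approach. This form of access also allows modification: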
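#include <cstddef>
#include <iostream>
#include <string>
int main()
{
    std::string s1("Hello!");
    for (std::size_t i = 0; i < s1.size(); ++i)
    {
        std::cout << s1[i] << " "; // read
    }
    std::cout << std::endl;
    for (std::size_t i = 0; i < s1.size(); ++i)
    {
        std::cout << ++s1[i] << " "; // modify: each character is incremented in place
    }
    std::cout << std::endl;
    return 0;
}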
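Note: an out-of-range index passed to operator[] is undefined behavior. Some implementations catch it with an assert and terminate the program (MSVC debug builds, for example).
at: an out-of-range index throws an exception (std::out_of_range).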
#include <cstddef>
#include <iostream>
#include <string>
int main()
{
    std::string s1("Hello!");
    for (std::size_t i = 0; i < s1.size(); ++i)
    {
        std::cout << s1.at(i) << " "; // bounds-checked access
    }
    std::cout << std::endl;
    return 0;
}
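Iterators are the more general, mainstream way to traverse a container, because not every container supports operator[]. A linked list, for example, does not store its elements at contiguous addresses. To make iterators easier to understand, you can think of them as pointers (although the underlying implementation may or may not be a pointer).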
begin | Return iterator to beginning (public member function) |
end | Return iterator to end (public member function) |
rbegin | Return reverse iterator to reverse beginning (public member function) |
rend | Return reverse iterator to reverse end (public member function) |
cbegin | Return const_iterator to beginning (public member function) |
cend | Return const_iterator to end (public member function) |
crbegin | Return const_reverse_iterator to reverse beginning (public member function) |
crend | Return const_reverse_iterator to reverse end (public member function) |
#include <iostream>
#include <string>
int main()
{
    std::string s2("Hello World!");
    std::string::iterator it = s2.begin();
    while (it != s2.end()) // end() points one past the last character
    {
        std::cout << *it << " ";
        ++it;
    }
    std::cout << std::endl;
    return 0;
}
#include <iostream>
#include <string>
int main()
{
    std::string s2("Hello World!");
    for (auto e : s2) // e is a copy of each character in turn
    {
        std::cout << e << " ";
    }
    return 0;
}
A range-based for loop is essentially an iterator loop: the compiler rewrites it in terms of begin() and end() at compile time.
#include <iostream>
#include <string>
int main()
{
    const std::string s3("RoundBottle");
    // forward traversal of a const string
    std::string::const_iterator c_it = s3.cbegin();
    while (c_it != s3.cend())
    {
        std::cout << *c_it << " ";
        ++c_it;
    }
    std::cout << std::endl;
    // reverse traversal: ++ moves towards the front of the string
    std::string::const_reverse_iterator cr_it = s3.crbegin();
    while (cr_it != s3.crend())
    {
        std::cout << *cr_it << " ";
        ++cr_it;
    }
    std::cout << std::endl;
    return 0;
}
As the code above shows, a const object is traversed with std::string::const_iterator and std::string::const_reverse_iterator.
P.S. auto can deduce the type for you: auto cr_it = s3.crbegin();
size | Return length of string (public member function) |
length | Return length of string (public member function) |
max_size | Return maximum size of string (public member function) |
resize | Resize string (public member function) |
capacity | Return size of allocated storage (public member function) |
reserve | Request a change in capacity (public member function) |
clear | Clear string (public member function) |
empty | Test if string is empty (public member function) |
shrink_to_fit | Shrink to fit (public member function) |
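reserve: reserves space in advance. Frequent reallocation has a cost, so reserving up front can improve efficiency (reserve generally does not shrink the capacity). Implementations differ between platforms: the VS compiler applies some alignment rules, so the space actually allocated is usually a little larger than the amount requested, while g++ generally allocates the amount requested.
Growth rules also differ between platforms: VS typically grows the capacity by about 1.5x, and g++ typically by 2x.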
operator+= | Append to string (public member function) |
append | Append to string (public member function) |
push_back | Append character to string (public member function) |
assign | Assign content to string (public member function) |
insert | Insert into string (public member function) |
erase | Erase characters from string (public member function) |
replace | Replace portion of string (public member function) |
swap | Swap string values (public member function) |
pop_back | Delete last character (public member function) |
Usage examples:
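#include <iostream>
#include <string>
int main()
{
    std::string s2("Hello,Round Bottle");
    s2 += 'x';      // append a single character
    s2 += "llllll"; // append a C string
    s2 += "321";
    s2 += '!';
    std::cout << s2 << std::endl;
    s2.push_back('7'); // append a single character
    std::cout << s2 << std::endl;
    s2.append("aaaaaa");
    s2.append(3, '0');      // append three '0' characters
    s2.append("alison", 2); // append the first 2 characters: "al"
    std::cout << s2 << std::endl;
    return 0;
}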
#include <iostream>
#include <string>
int main()
{
    std::string s2("Hello,Round Bottle");
    std::string s3 = s2 + "777"; // operator+ builds a new string; s2 is unchanged
    std::cout << s3 << std::endl;
    return 0;
}
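① The std library provides a generic swap function template. Implemented with copies, as it was before C++11, it costs three deep copies for strings (one copy construction plus two assignments), which is inefficient. Since C++11 it uses moves instead.
② To avoid that cost, the std library also provides a ready-made swap overload specifically for string objects (the swap function listed in the Non-member function overloads table).
③ The string class has its own swap member function (the swap function listed in the Modifiers table).
Usage example of the swap member function: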
#include <iostream>
#include <string>
int main()
{
    std::string s1("nothing");
    std::string s2("Hello,Round Bottle");
    std::cout << "s1:" << s1 << std::endl;
    std::cout << "s2:" << s2 << std::endl;
    s1.swap(s2); // exchanges the contents of the two strings
    std::cout << "--------------swap---------------" << std::endl;
    std::cout << "s1:" << s1 << std::endl;
    std::cout << "s2:" << s2 << std::endl;
    return 0;
}
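In short, when swapping string objects, prefer the string class's own swap member function, which is more efficient than a copy-based generic swap. The non-member overload for string has the same effect as the member function.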
c_str | Get C string equivalent (public member function) |
data | Get string data (public member function) |
get_allocator | Get allocator (public member function) |
copy | Copy sequence of characters from string (public member function) |
find | Find content in string (public member function) |
rfind | Find last occurrence of content in string (public member function) |
find_first_of | Find character in string (public member function) |
find_last_of | Find character in string from the end (public member function) |
find_first_not_of | Find absence of character in string (public member function) |
find_last_not_of | Find non-matching character in string from the end (public member function) |
substr | Generate substring (public member function) |
compare | Compare strings (public member function) |
c_str: provides compatibility with C interfaces. A usage example follows:
#include <cstdio>
#include <string>
int main()
{
    std::string s1("nothing");
    std::printf("%s\n", s1.c_str()); // c_str() returns a '\0'-terminated const char*
    return 0;
}
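Encoding: a one-to-one mapping between numeric values and symbols, recorded in a code table (e.g. ASCII).
Unicode: a universal character set covering the world's scripts ⇨ UTF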
string | String class (class) ⇨ UTF-8 |
u16string | String of 16-bit characters (class) |
u32string | String of 32-bit characters (class) |
wstring | Wide string (class) |
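These types accommodate different encodings so that the world's languages can be represented properly.
Mojibake (garbled text): the same numeric values map to different symbols under different code tables, so text comes out garbled when the way it was stored does not match the way it is interpreted.
GBK: see the Baidu Baike entry "GBK字库" (baidu.com).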
END