[C++ Basics 7: string] So Convenient, So Comfortable

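Preface

This post takes a first look at the STL string.

Overview:

  • STL
  • string
    • What it is
    • Why it exists
    • How to use it (interfaces and usage)

My knowledge is limited, so corrections are welcome!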

Background

STL

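STL (Standard Template Library) is a very important part of C++ and belongs to the C++ standard library. It mainly contains

  • a library of reusable components
  • a software framework of data structures and algorithms

It is usually divided into six parts:

  • containers (data structures)
  • algorithms
  • iterators
  • function objects
  • adapters
  • allocators

Almost all of its code is written as class templates and function templates, which greatly improves code reuse.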

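The main versions are:

  • Original version
    • HP version: developed by Alexander Stepanov and Meng Lee at HP Labs.
  • Derived versions
    • SGI version: developed by Alexander Stepanov and Matt Austern at SGI. The code style is good, it is open source, and anyone may modify and sell it. Used by g++ on Linux.
    • P.J. version: developed by P.J. Plauger at his own three-person company. The implementation is fairly complex, not especially readable, and not open source. Used by Microsoft's VS series.
    • RW version: developed by Rogue Wave. Used by C++Builder.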

string

What it is


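string is the class obtained by instantiating the basic_string class template with char. In essence it is a dynamically growing array of characters.

Hm? Aren't strings always made of char?

[Why bother with a template that has to be instantiated with a character type?]

This is where encodings come in: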

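Encoding

What it is

An encoding maps the characters we want to put into a computer to binary values (a computer only stores 0s and 1s).

We already met one when learning C: ASCII. Early computers were developed largely in English-speaking countries, so English was all they needed. How, then, does a computer display English?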


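English needs only a small set of letters and symbols, so the values 0 to 127 are enough to map all of them.

For example, to store "abc", what actually goes into the computer is the binary form of 97 98 99.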

#include <cstdio>

int main()
{
    char str[] = "abc";
    printf("%d %d %d\n", str[0], str[1], str[2]);
    return 0;
}
97 98 99

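When the value is printed as a character, the computer sees that the type is char and looks the value up in the table: 97 corresponds to 'a', and so on.

So what about other countries? Chinese alone has a huge number of characters.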

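Unicode

Unicode represents the writing systems of many languages under one unified standard.

It has three common encoding forms: UTF-8, UTF-16 and UTF-32. UTF-8 is the most widely used day to day. It is variable-length, using 1 to 4 bytes per character: ASCII characters take 1 byte (so it is ASCII-compatible) and common Chinese characters take 3 bytes.

China also has its own national standard, GBK, which represents each Chinese character with 2 bytes.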

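#include <iostream>
using namespace std;

// Assumes the source file and the terminal both use GBK (2 bytes per Chinese character).
int main()
{
    char str[] = "培根";
    
    cout << str << endl;
    cout << "size:" << sizeof(str) << endl;
    // What happens if we keep incrementing str[2], the first byte of "根"?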
    ++str[2];
    cout << str << endl;
    ++str[2];
    cout << str << endl;
    ++str[2];
    cout << str << endl;
    --str[2];
    cout << str << endl;
    --str[2];
    cout << str << endl;
    --str[2];
    cout << str << endl;
    
    return 0;
}
培根
size:5
培郭
培葫
培基
培葫
培郭
培根

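As you can see, the characters in this encoding are ordered by pronunciation (gen, guo, hu, ji). One use case is keeping online content clean: homophones of sensitive words can also be turned into "***".

Now we can also see why string is written as a template: different encodings are handled by instantiating it with different character types.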
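
For comparison with the GBK output above, here is a minimal sketch of how the same two characters look in UTF-8. The bytes are written as explicit escapes so the result does not depend on the source file encoding; the names utf8_code_points and bacon_utf8 are made up for illustration, and the counting function assumes its input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 encoded std::string by skipping
// continuation bytes (those of the form 10xxxxxx). Assumes valid UTF-8.
std::size_t utf8_code_points(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char byte : s)
    {
        if ((byte & 0xC0) != 0x80)
        {
            ++count;
        }
    }
    return count;
}

// "培根" as explicit UTF-8 bytes: 3 bytes per character.
const std::string bacon_utf8 = "\xE5\x9F\xB9\xE6\xA0\xB9";
// bacon_utf8.size() is 6 (bytes); utf8_code_points(bacon_utf8) is 2.
```

So size() on a string counts bytes (char units), not characters as a human reads them.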

To store a UTF-16 string, use u16string.

To store a UTF-32 string, use u32string.

To store a wide-character string, use wstring.

string stores char and is the usual choice for UTF-8 (and ASCII) text, so it is the most commonly used.

So, one possible skeleton:

template <class T>
class basic_string
{
public:
    //...
private:
    T* _str;
    size_t _size;
    size_t _capacity;
};

basic_string<char> s_8;
basic_string<char16_t> s_16;
basic_string<char32_t> s_32;
basic_string<wchar_t> s_w;
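
As a quick check that the standard string types really are instantiations of basic_string, the following sketch uses only std::is_same and the standard typedefs. The helper code_unit_size is a made-up name for illustration.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// The standard string types are typedefs of basic_string instantiations.
static_assert(std::is_same<std::string,    std::basic_string<char>>::value,     "string");
static_assert(std::is_same<std::u16string, std::basic_string<char16_t>>::value, "u16string");
static_assert(std::is_same<std::u32string, std::basic_string<char32_t>>::value, "u32string");
static_assert(std::is_same<std::wstring,   std::basic_string<wchar_t>>::value,  "wstring");

// Hypothetical helper: size in bytes of one code unit of a given string type.
template <class String>
constexpr std::size_t code_unit_size()
{
    return sizeof(typename String::value_type);
}
```

On common platforms code_unit_size<std::string>() is 1, while the u16string and u32string variants give 2 and 4.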

*A recommended site for looking up C++ documentation (unofficial, but easy to use)


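Why it exists

In C, a string is a sequence of characters terminated by \0, and the C standard library provides functions to operate on it. But those functions are separate from the string itself and work through char* pointers, which does not fit OOP (object-oriented programming). You also have to manage the underlying memory yourself, and a careless mistake leads to out-of-bounds access. So C++ wraps all of this in a string class.


How to use it

Some interfaces are skipped here; it is enough to know they exist and to check the documentation when you need them. The snippets below assume #include <iostream>, #include <string> and using namespace std;.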

Constructor

Description          Interface
default (1)          string();
copy (2)             string (const string& str);
substring (3)        string (const string& str, size_t pos, size_t len = npos);
from c-string (4)    string (const char* s);
from sequence (5)    string (const char* s, size_t n);
fill (6)             string (size_t n, char c);
range (7)            template <class InputIterator> string (InputIterator first, InputIterator last);

(1) default: an empty string by default.

int main()
{
    string s;
    cout << s << endl << "111" << endl;
    return 0;
}

111

(2) copy: copy construction.

int main()
{
    string s1("bacon");
    string s2(s1);
    cout << s2 << endl;
    return 0;
}
bacon

(3) substring: constructs from the substring of str that starts at pos and spans len characters.

int main()
{
    string s1("123 456 bacon", 0, 7);
    cout << s1 << endl;
    return 0;
}
123 456

(4) from c-string: constructs from a C-style string.

int main()
{
    string s("bacon");
    cout << s << endl;
    return 0;
}
bacon

(5) from sequence: constructs from the first n characters of the character array s.
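
Form (5) has no example of its own, so here is a small sketch. The wrapper name first_n is made up; the only library call is the (const char*, size_t) constructor from the table.

```cpp
#include <cstddef>
#include <string>

// Builds a string from the first n characters of a character array.
std::string first_n(const char* s, std::size_t n)
{
    return std::string(s, n);
}

// first_n("bacon!!!", 5)      -> "bacon"
// first_n("ab\0cd", 5).size() -> 5, the embedded '\0' is kept as ordinary data
```

Unlike form (4), this constructor does not stop at '\0', which is why it is handy for raw buffers.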

(6) fill: constructs from n copies of c.

int main()
{
    string s(10, '!');
    cout << s << endl;
    return 0;
}
!!!!!!!!!!

(7) range: constructs from an iterator range (iterators are covered later).

int main()
{
    string s1(10, '!');
    string s2(s1.begin(), s1.end());
    cout << s2 << endl;
    return 0;
}
!!!!!!!!!!

Destructor

Releases the storage held by the string.

operator= (assignment)

Description      Interface
string (1)       string& operator= (const string& str);
c-string (2)     string& operator= (const char* s);
character (3)    string& operator= (char c);

int main()
{
    string s1, s2, s3, s4;
    s1 = "it's ";     // c-string
    s2 = "a string."; // c-string
    s3 = s1 + s2;     // string
    s4 = '!';         // single character
    
    cout << s3 << s4 << endl;
    return 0;
}

Non-member function overloads

(1)operator+

Concatenate strings (function)

Description      Interface
string (1)       string operator+ (const string& lhs, const string& rhs);
c-string (2)     string operator+ (const string& lhs, const char* rhs);
                 string operator+ (const char* lhs, const string& rhs);
character (3)    string operator+ (const string& lhs, char rhs);
                 string operator+ (char lhs, const string& rhs);

(2)relational operators

Relational operators for string (function)

These are the comparison operators. A string can be compared with another string or with a C-style string (const char*).
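
A minimal sketch of the comparison operators in use. The function names sorts_before and is_bacon are made up for illustration; the comparisons themselves are the standard operators for string.

```cpp
#include <string>

// Returns true when a sorts strictly before b in lexicographic order.
bool sorts_before(const std::string& a, const std::string& b)
{
    return a < b;
}

// Comparing against a string literal uses the const char* overloads.
bool is_bacon(const std::string& s)
{
    return s == "bacon";
}

// sorts_before("abc", "abd") -> true
// sorts_before("ab", "abc")  -> true (a prefix sorts first)
// is_bacon("Bacon")          -> false (comparison is case-sensitive)
```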

(3)swap

​ Exchanges the values of two strings (function )

void swap (string& x, string& y);

(4)operator>>

​ Extract string from stream (function )

(5)operator<<

​ Insert string into stream (function )

(6)getline

Get line from stream into string (function )

int main()
{
    string s1;
    string s2;
    getline(cin, s1);
    getline(cin, s2);
    
    string s3 = s1 + s2;
    cout << s3 << endl;
    
    string s4 = "!!!";
    s4.swap(s3);
    cout << s3 << endl;
    cout << s4 << endl;
    
    return 0;
}
123 // input
abc // input
123abc
!!!
123abc

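Wait, doesn't the library already have a generic swap?

It does. Before C++11 it was implemented like this (since C++11 the generic version moves instead of copying):

template <class T> void swap ( T& a, T& b )
{
  T c(a); a=b; b=c;
}

Notice that it has to construct a temporary object c and then assign twice. For our dynamically growing string, each of those steps is a deep copy.

So what does void swap (string& x, string& y); do instead?

The swap written for string simply exchanges the members of the two strings, so no deep copy is needed at all.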
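
To make "only exchanges the members" concrete, here is a toy sketch. MiniString is a hypothetical class modeled on the skeleton shown earlier (_str, _size, _capacity); it is not how any real std::string is implemented.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical toy class for illustration only.
class MiniString
{
public:
    explicit MiniString(const char* s = "")
    {
        _size = std::strlen(s);
        _capacity = _size;
        _str = new char[_capacity + 1];
        std::memcpy(_str, s, _size + 1); // copy the characters and the '\0'
    }
    MiniString(const MiniString&) = delete;            // copying left out of the sketch
    MiniString& operator=(const MiniString&) = delete;
    ~MiniString() { delete[] _str; }

    // Exchanges pointer, size and capacity; no characters are copied.
    void swap(MiniString& other) noexcept
    {
        std::swap(_str, other._str);
        std::swap(_size, other._size);
        std::swap(_capacity, other._capacity);
    }

    const char* c_str() const { return _str; }
    std::size_t size() const { return _size; }

private:
    char* _str;
    std::size_t _size;
    std::size_t _capacity;
};
```

After a.swap(b), b.c_str() returns the very pointer a held before, which shows that the buffers changed owners instead of being copied.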

Iterators

What is an iterator?

Member type               Description
iterator                  a random access iterator to char (convertible to const_iterator)
const_iterator            a random access iterator to const char
reverse_iterator          reverse_iterator<iterator>
const_reverse_iterator    reverse_iterator<const_iterator>

Under the hood, a string iterator is roughly a char pointer. At this stage it is fine to think of an iterator simply as a pointer.

(1)begin

​ Return iterator to beginning (public member function )

iterator begin();

const_iterator begin() const;

(2)end

​ Return iterator to end (public member function )

iterator end();

const_iterator end() const;

(3)rbegin

​ Return reverse iterator to reverse beginning (public member function )

reverse_iterator rbegin();
const_reverse_iterator rbegin() const;

(4)rend

​ Return reverse iterator to reverse end (public member function )

reverse_iterator rend();
const_reverse_iterator rend() const;

(5)cbegin

​ Return const_iterator to beginning (public member function )

const_iterator cbegin() const noexcept; (ignore noexcept for now; it will be covered later)

(6)cend

​ Return const_iterator to end (public member function )

const_iterator cend() const noexcept;

(7)crbegin

​ Return const_reverse_iterator to reverse beginning (public member function )

const_reverse_iterator crbegin() const noexcept;

(8)crend

​ Return const_reverse_iterator to reverse end (public member function )

const_reverse_iterator crend() const noexcept;

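  • r: reverse
  • c: always returns a const iterator, even when the string itself is not const

Do not mix these iterator kinds (for example, do not compare an iterator from rbegin() with end()).

All of these interfaces return an iterator, so the return value has to be stored in a variable of the same type. iterator is a member type, so it must be qualified with the class scope (string::iterator).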

int main()
{
    string s("it's a string.");
    string::iterator it = s.begin();
    while(it != s.end())
    {
        cout << *it << ' ';
        ++it;
    }
    cout << endl;
    return 0;
}
i t ' s   a   s t r i n g . 

int main()
{
    string s("it's a string.");
    string::reverse_iterator it = s.rbegin();
    while(it != s.rend())
    {
        cout << *it << ' ';
        ++it;
    }
    cout << endl;
    return 0;
}
. g n i r t s   a   s ' t i 

int main()
{
    const string s("it's a string.");
    string::const_reverse_iterator it = s.crbegin();
    while(it != s.crend())
    {
        cout << *it << ' ';
        ++it;
    }
    cout << endl;
    return 0;
}
. g n i r t s   a   s ' t i 

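In fact, a range-based for loop is built on iterators: the compiler rewrites it into code that traverses with iterators. That is also why, as mentioned before, a user-defined type needs begin and end methods to work with range-based for.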

int main()
{
    string s("it's a string.");
    
    for(char ch : s)
    {
        cout << ch << ' ';
    }
    cout << endl;
    return 0;
}
i t ' s   a   s t r i n g . 
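
The traversals so far only read characters. As a sketch of the same equivalence for writing, the two functions below upper-case a string, once with an explicit iterator loop and once with range-based for. The function names are made up for illustration.

```cpp
#include <cctype>
#include <string>

// Upper-cases a string in place through an explicit iterator loop.
void to_upper_iter(std::string& s)
{
    for (std::string::iterator it = s.begin(); it != s.end(); ++it)
    {
        *it = static_cast<char>(std::toupper(static_cast<unsigned char>(*it)));
    }
}

// The same operation with range-based for; char& makes ch refer to the
// element itself instead of a copy.
void to_upper_range(std::string& s)
{
    for (char& ch : s)
    {
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
}
```

Both turn "it's a string." into "IT'S A STRING."; with plain char ch instead of char& ch the second version would modify only a copy.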

Capacity

(1)size

​ Return length of string (public member function )

size_t size() const;

(2)length

​ Return length of string (public member function )

size_t length() const;

(3)max_size

​ Return maximum size of string (public member function )

size_t max_size() const;

(4)resize

​ Resize string (public member function )

void resize (size_t n);

void resize (size_t n, char c);

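If n < size, size shrinks and the extra characters are removed (capacity is not reduced).
If n > size, size grows; an optional c can be passed to fill the new part (otherwise it is filled with '\0').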
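
The two cases above can be sketched as follows. The wrapper name resized is made up; it just calls resize(n, c) on a copy.

```cpp
#include <cstddef>
#include <string>

// Returns a copy of s resized to n characters, padding with fill when it grows.
std::string resized(std::string s, std::size_t n, char fill)
{
    s.resize(n, fill);
    return s;
}

// resized("123456", 3, '*') -> "123"
// resized("123456", 9, '*') -> "123456***"
```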

(5)capacity

​ Return size of allocated storage (public member function )

size_t capacity() const;

(6)reserve

Request a change in capacity (public member function )

void reserve (size_t n = 0);
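
reserve is most useful when the final length is known in advance. The sketch below counts how often capacity() changes while appending characters, with and without an up-front reserve. The function name count_reallocations is made up, and the exact count without reserve depends on the library's growth policy.

```cpp
#include <cstddef>
#include <string>

// Appends n characters one by one and counts how many times capacity() changes.
// If reserve_first is true, the space is requested up front with reserve(n).
std::size_t count_reallocations(std::size_t n, bool reserve_first)
{
    std::string s;
    if (reserve_first)
    {
        s.reserve(n);
    }

    std::size_t changes = 0;
    std::size_t last = s.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        s.push_back('x');
        if (s.capacity() != last)
        {
            ++changes;
            last = s.capacity();
        }
    }
    return changes;
}
```

With reserve_first set to true the count is 0, because the capacity is already at least n before the loop starts.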

(7)clear

​ Clear string (public member function )

void clear();

(8)empty

​ Test if string is empty (public member function )

bool empty() const;

(9)shrink_to_fit

Shrink to fit (public member function). Frequent use is not recommended, because it usually has to allocate a new buffer and copy the data over.

void shrink_to_fit();

size + capacity + length + max_size + clear + empty:

int main()
{
    string s = "123456";
    cout << "size: " << s.size() << endl;
    cout << "length: " << s.length() << endl;
    cout << "capacity: " << s.capacity() << endl;
    cout << "max_size: " << s.max_size() << endl;

    s.clear();
    cout << "----string cleared----" << endl;
    cout << "size: " << s.size() << endl;
    cout << "capacity: " << s.capacity() << endl;
    return 0;
}
size: 6
length: 6
capacity: 22
max_size: 18446744073709551599
----string cleared----
size: 0
capacity: 22

resize + reserve + shrink_to_fit:

int main()
{
    string s = "123456";
    cout << "size: " << s.size() << endl;
    cout << "capacity: " << s.capacity() << endl;

    s.resize(3);
    cout << "----string resized to 3----" << endl;
    cout << "size: " << s.size() << endl;
    cout << "capacity: " << s.capacity() << endl;
    
    s.reserve(20);
    cout << "----string reserved to 20----" << endl;
    cout << "size: " << s.size() << endl;
    cout << "capacity: " << s.capacity() << endl;
    
    s.shrink_to_fit();
    cout << "----string shrinked_to_fit----" << endl;
    cout << "size: " << s.size() << endl;
    cout << "capacity: " << s.capacity() << endl;
    
    return 0;
}
size: 6
capacity: 22
----string resized to 3----
size: 3
capacity: 22
----string reserved to 20----
size: 3
capacity: 22
----string shrinked_to_fit----
size: 3
capacity: 22

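Hm? Isn't shrink_to_fit supposed to fit the capacity to the size? Why didn't it move here? Why won't it let me shrink?

Because the standard library implementation (rather than the compiler itself) keeps a minimum capacity for string, and shrink_to_fit is only a non-binding request. With the library that ships with my Xcode, for example, the capacity of a string is never less than 22.

This applies to every shrinking operation: the implementation has its own lower bound on the capacity of a string.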

int main()
{
    string s(100, '?');
    cout << s.capacity() << endl;
    
    s.resize(10);
    s.shrink_to_fit();
    cout << s.capacity() << endl;
    return 0;
}
111
22 // size is 10, but capacity can only shrink down to 22
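
One way to see the lower bound on your own platform is to look at the capacity of an empty string. This is a sketch under the assumption that the implementation keeps a small buffer inside the string object, as common implementations do; the function names are made up and the concrete numbers (22 here) vary between libraries.

```cpp
#include <cstddef>
#include <string>

// Capacity of a default-constructed string. On implementations with a small
// internal buffer this is the floor that shrinking cannot go below.
std::size_t empty_string_capacity()
{
    return std::string().capacity();
}

// Builds a long string, cuts it down, asks for a shrink and reports the capacity.
std::size_t capacity_after_shrink(std::size_t initial, std::size_t final_size)
{
    std::string s(initial, '?');
    s.resize(final_size);
    s.shrink_to_fit(); // non-binding request
    return s.capacity();
}
```

Printing empty_string_capacity() next to capacity_after_shrink(100, 10) shows whether the two values coincide on a given library.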

Element access

(1) operator[]

​ Get character of string (public member function )

char& operator[] (size_t pos);
const char& operator[] (size_t pos) const;

(2)at

​ Get character in string (public member function )

char& at (size_t pos);

const char& at (size_t pos) const;

(3)back

​ Access last character (public member function )

char& back();
const char& back() const;

(4)front

​ Access first character (public member function )

char& front();

const char& front() const;

int main()
{
    string s("it's a string.");
    
    for(size_t i = 0; i < s.size(); ++i)
    {
//        cout << s[i] << ' ';
        cout << s.at(i) << ' ';
    }
    cout << endl;
    
    cout << "front:" << s.front() << endl;
    cout << "back:" << s.back() << endl;
    
    return 0;
}
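
The example above uses at and operator[] interchangeably, but they differ on invalid positions: at throws std::out_of_range, while operator[] performs no bounds check. A minimal sketch, with try_get as a made-up helper name:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns true and stores the character in out when pos is valid;
// returns false when at() reports an out-of-range position.
bool try_get(const std::string& s, std::size_t pos, char& out)
{
    try
    {
        out = s.at(pos);
        return true;
    }
    catch (const std::out_of_range&)
    {
        return false;
    }
}
```

For "bacon", position 0 yields 'b' and position 5 is rejected, since valid positions for at are 0 to size() - 1.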

Modifiers

(1)operator+=

​ Append to string (public member function )

Description      Interface
string (1)       string& operator+= (const string& str);
c-string (2)     string& operator+= (const char* s);
character (3)    string& operator+= (char c);

(2)append

​ Append to string (public member function )

Description      Interface
string (1)       string& append (const string& str);
substring (2)    string& append (const string& str, size_t subpos, size_t sublen);
c-string (3)     string& append (const char* s);
buffer (4)       string& append (const char* s, size_t n);
fill (5)         string& append (size_t n, char c);
range (6)        template <class InputIterator> string& append (InputIterator first, InputIterator last);

(3)push_back

​ Append character to string (public member function )

void push_back (char c);

(4)assign

​ Assign content to string (public member function )

Description      Interface
string (1)       string& assign (const string& str);
substring (2)    string& assign (const string& str, size_t subpos, size_t sublen);
c-string (3)     string& assign (const char* s);
buffer (4)       string& assign (const char* s, size_t n);
fill (5)         string& assign (size_t n, char c);
range (6)        template <class InputIterator> string& assign (InputIterator first, InputIterator last);

(5)insert

​ Insert into string (public member function )

Description             Interface
string (1)              string& insert (size_t pos, const string& str);
substring (2)           string& insert (size_t pos, const string& str, size_t subpos, size_t sublen);
c-string (3)            string& insert (size_t pos, const char* s);
buffer (4)              string& insert (size_t pos, const char* s, size_t n);
fill (5)                string& insert (size_t pos, size_t n, char c);
                        void insert (iterator p, size_t n, char c);
single character (6)    iterator insert (iterator p, char c);
range (7)               template <class InputIterator> void insert (iterator p, InputIterator first, InputIterator last);

(6)erase

​ Erase characters from string (public member function )

Description      Interface
sequence (1)     string& erase (size_t pos = 0, size_t len = npos);
character (2)    iterator erase (iterator p);
range (3)        iterator erase (iterator first, iterator last);

(7)replace

​ Replace portion of string (public member function )

string& replace (size_t pos, size_t len, const string& str);

Not used very often.

(8)swap

​ Swap string values (public member function )

void swap (string& str);

(9)pop_back

​ Delete last character (public member function )

void pop_back();

+= + push_back + pop_back:

int main()
{
    string s = "Jay chou is ";
    cout << s << endl;
    
    s += "cool";
    cout << s << endl;
    
    s.push_back('!');
    cout << s << endl;
    
    s.pop_back();
    cout << s << endl;
    
    
    return 0;
}
Jay chou is 
Jay chou is cool
Jay chou is cool!
Jay chou is cool

append + assign + insert + erase:

int main()
{
    string s = "Jay chou is ";
    cout << s << endl;
    
    string tmp = "very handsome";
    s.append(tmp);
    cout << s << endl;
    
    string s2;
    s2.assign(s);
    cout << s2 << endl;
    
    size_t pos = 0;
    s2.insert(pos, "----");
    cout << s2 << endl;
    
    s2.erase(pos, 7);
    cout << s2 << endl;
    
    return 0;
}
Jay chou is 
Jay chou is very handsome
Jay chou is very handsome
----Jay chou is very handsome
 chou is very handsome
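
replace is not part of the example above, so here is a small sketch that combines it with find. The helper name replace_first is made up; replace(pos, len, str) is the overload listed earlier.

```cpp
#include <cstddef>
#include <string>

// Replaces the first occurrence of from in s with to; returns s unchanged
// when from does not occur.
std::string replace_first(std::string s, const std::string& from, const std::string& to)
{
    std::size_t pos = s.find(from);
    if (pos != std::string::npos)
    {
        s.replace(pos, from.size(), to);
    }
    return s;
}

// replace_first("Jay chou is cool", "cool", "very handsome")
//   -> "Jay chou is very handsome"
```

Like insert and erase in the middle of a string, replace may have to shift the characters after pos, which is one reason to use it sparingly.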

String operations

(1)c_str

​ Get C string equivalent (public member function )

const char* c_str() const;

This interface is quite important: many interfaces, such as Linux system calls, only accept C-style strings.

(2)data

​ Get string data (public member function )

const char* data() const;
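
One thing to keep in mind when handing c_str() or data() to a C function: the C side stops at the first '\0', while the string knows its real size. A minimal sketch, with c_length as a made-up helper name and std::strlen standing in for any C API:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length as seen by a C API: counts up to the first '\0'.
std::size_t c_length(const std::string& s)
{
    return std::strlen(s.c_str());
}

// c_length("bacon")                   -> 5, same as size()
// c_length(std::string("ab\0cd", 5)) -> 2, although size() is 5
```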

(3)get_allocator

Get allocator (public member function) (covered later)

(4)copy

​ Copy sequence of characters from string (public member function )

(5)find

​ Find content in string (public member function )

Description      Interface
string (1)       size_t find (const string& str, size_t pos = 0) const;
c-string (2)     size_t find (const char* s, size_t pos = 0) const;
buffer (3)       size_t find (const char* s, size_t pos, size_t n) const;
character (4)    size_t find (char c, size_t pos = 0) const;

(6)rfind

​ Find last occurrence of content in string (public member function )

(7)find_first_of

​ Find character in string (public member function )

(8)find_last_of

​ Find character in string from the end (public member function )

(9)find_first_not_of

​ Find absence of character in string (public member function )

(10)find_last_not_of

​ Find non-matching character in string from the end (public member function )

(11)substr

​ Generate substring (public member function )

(12)compare

​ Compare strings (public member function )

c_str + find + rfind:

int main()
{
    string s = "code is beautiful.";
    
    size_t pos = s.find(' ');
    printf("%s\n", s.c_str() + pos);
    
    pos = s.rfind(' ');
    printf("%s\n", s.c_str() + pos);
    
    return 0;
}
 is beautiful.
 beautiful.
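
find pairs naturally with substr, which is listed above without an example. The sketch below splits a string on a delimiter; split is a made-up helper, not a standard library function.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits s on a single-character delimiter using find and substr.
std::vector<std::string> split(const std::string& s, char delim)
{
    std::vector<std::string> parts;
    std::size_t start = 0;
    std::size_t pos = s.find(delim, start);
    while (pos != std::string::npos)
    {
        parts.push_back(s.substr(start, pos - start));
        start = pos + 1;
        pos = s.find(delim, start);
    }
    parts.push_back(s.substr(start)); // last piece, possibly empty
    return parts;
}

// split("code is beautiful.", ' ') -> {"code", "is", "beautiful."}
```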

Member constants

npos

Maximum value for size_t (public static member constant )

int main()
{
    string s = "code is beautiful.";
    
    cout << string::npos << endl;
    
    return 0;
}
18446744073709551615
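
In practice npos mostly shows up as the "not found" result of the find family. A minimal sketch, with contains as a made-up helper name:

```cpp
#include <string>

// find returns string::npos when nothing matches, so a "contains" check
// compares against npos rather than testing for a negative value.
bool contains(const std::string& s, const std::string& needle)
{
    return s.find(needle) != std::string::npos;
}

// contains("code is beautiful.", "is")   -> true
// contains("code is beautiful.", "ugly") -> false
```

Because size_t is unsigned, a check such as pos < 0 can never be true; comparing with npos is the reliable form.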

That's all for today.

This is Bacon's (培根) blog, making progress together with you!

See you next time~
