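References:
stl-iterators
[The C++ Standard Library: Tutorial & Reference]
Iterator categories
Before digging in, I was confused about the following questions:
- ++i vs. i++
- The legal directions in which an iterator can move
- Iterator invalidation
- Iterators vs. the subscript operator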
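Phase_1: a systematic walkthrough
- Principles
The figure above already shows quite clearly how the different iterator categories relate to one another and which operations are legal for each.
(The definitions above yield five levels of iterators, each with its own set of supported operations.)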
- Input Iterator
- Output Iterator
- Forward Iterator
- Bidirectional Iterator
- Random Access Iterator
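To make the five categories concrete, here is a small sketch that asks the standard library which category a few well-known iterators report. The helpers has_category and at_least are names I made up for this note; only std::iterator_traits and the *_iterator_tag types come from the standard library.

```cpp
#include <cassert>
#include <forward_list>
#include <iostream>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical helper: true when It reports exactly the category Tag.
template <typename It, typename Tag>
constexpr bool has_category() {
    return std::is_same<typename std::iterator_traits<It>::iterator_category,
                        Tag>::value;
}

// Hypothetical helper: true when It's category is Tag or a stronger one.
// This works because the standard tags inherit from one another:
// random_access -> bidirectional -> forward -> input.
template <typename It, typename Tag>
constexpr bool at_least() {
    return std::is_base_of<Tag,
        typename std::iterator_traits<It>::iterator_category>::value;
}

static_assert(has_category<std::istream_iterator<int>, std::input_iterator_tag>(), "input");
static_assert(has_category<std::ostream_iterator<int>, std::output_iterator_tag>(), "output");
static_assert(has_category<std::forward_list<int>::iterator, std::forward_iterator_tag>(), "forward");
static_assert(has_category<std::list<int>::iterator, std::bidirectional_iterator_tag>(), "bidirectional");
static_assert(has_category<std::vector<int>::iterator, std::random_access_iterator_tag>(), "random access");

// A stronger iterator can be used wherever a weaker one is required.
static_assert(at_least<std::vector<int>::iterator, std::forward_iterator_tag>(), "vector >= forward");
static_assert(!at_least<std::forward_list<int>::iterator, std::bidirectional_iterator_tag>(), "forward_list < bidirectional");
```

All checks run at compile time, so simply compiling the snippet is the test.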
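- Implementation
The categories above are only definitions. C++ tells these iterators apart with so-called traits classes (Effective C++, Item 47), a generic-programming technique for carrying information about a type.
Put simply, you can write an Iterator for your own class (name it whatever you like) and explicitly add a few typedefs to its definition, such as value_type and iterator_category. These alias names are a convention fixed by std, and some std algorithms behave differently depending on them. For example, to declare that your Iterator is a random access iterator, add typedef std::random_access_iterator_tag iterator_category; to it. Inside the STL algorithms, these typedefs select which implementation gets used.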
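As a rough illustration of those typedefs, here is a minimal sketch of a hand-written iterator over a char buffer. CharIter and char_iter_distance are hypothetical names, not taken from any library, and the class implements only the few operators this example needs; a complete random access iterator would also need operator<, operator[], and others.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

// Hypothetical minimal iterator; the five typedefs are the std convention.
class CharIter {
public:
    typedef std::random_access_iterator_tag iterator_category;
    typedef char                            value_type;
    typedef std::ptrdiff_t                  difference_type;
    typedef char*                           pointer;
    typedef char&                           reference;

    explicit CharIter(char* p = nullptr) : p_(p) {}

    reference operator*() const { return *p_; }
    CharIter& operator++() { ++p_; return *this; }
    CharIter& operator--() { --p_; return *this; }
    CharIter& operator+=(difference_type n) { p_ += n; return *this; }

    friend difference_type operator-(const CharIter& a, const CharIter& b) {
        return a.p_ - b.p_;
    }
    friend bool operator==(const CharIter& a, const CharIter& b) { return a.p_ == b.p_; }
    friend bool operator!=(const CharIter& a, const CharIter& b) { return a.p_ != b.p_; }

private:
    char* p_;
};

// std::iterator_traits simply reads the nested typedefs back out.
static_assert(std::is_same<std::iterator_traits<CharIter>::value_type, char>::value,
              "value_type is picked up from the nested typedef");
static_assert(std::is_same<std::iterator_traits<CharIter>::iterator_category,
                           std::random_access_iterator_tag>::value,
              "category is picked up from the nested typedef");

// Hypothetical helper: std::distance accepts CharIter because the typedefs
// tell it which category (and therefore which implementation) applies.
inline std::ptrdiff_t char_iter_distance(char* first, char* last) {
    return std::distance(CharIter(first), CharIter(last));
}
```

For a buffer char buf[] = "hello", char_iter_distance(buf, buf + 5) returns 5.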
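One important reason for using tag dispatching is that the choice of which do_algorithm to call is made at compile time, through ordinary function overload resolution.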
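The overload-based selection can be sketched as follows. This is not the STL's actual source; my_advance, do_advance and the Path enum are hypothetical names. Each overload returns a value naming itself, so you can observe which one the compiler picked.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical marker so the chosen overload is observable.
enum Path { kInput, kBidirectional, kRandomAccess };

// Random access: jump in O(1).
template <typename It>
Path do_advance(It& it, typename std::iterator_traits<It>::difference_type n,
                std::random_access_iterator_tag) {
    it += n;
    return kRandomAccess;
}

// Bidirectional: step forwards or backwards one element at a time.
template <typename It>
Path do_advance(It& it, typename std::iterator_traits<It>::difference_type n,
                std::bidirectional_iterator_tag) {
    for (; n > 0; --n) ++it;
    for (; n < 0; ++n) --it;
    return kBidirectional;
}

// Input (and forward, whose tag derives from input): forwards only.
template <typename It>
Path do_advance(It& it, typename std::iterator_traits<It>::difference_type n,
                std::input_iterator_tag) {
    for (; n > 0; --n) ++it;
    return kInput;
}

// The public entry point builds a tag object from the iterator's category;
// overload resolution then picks the matching do_advance at compile time.
template <typename It>
Path my_advance(It& it, typename std::iterator_traits<It>::difference_type n) {
    typedef typename std::iterator_traits<It>::iterator_category category;
    return do_advance(it, n, category());
}
```

Calling my_advance on a std::vector iterator takes the kRandomAccess overload, while a std::list iterator takes the kBidirectional one. No runtime branch on the category is involved.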
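This may be a little confusing at this point. The goal of Appendix 2 is to make the Iterator implemented in Appendix 1 STL-portable. (It may be a bit ugly, but it illustrates the point.)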
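How the STL implements tag dispatching will be covered in detail in another post (once you understand it, there is really not much to say; it is quite simple). Some references: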
An-Introduction-to-Iterator-Traits
algorithmSelection
how-does-iterator-category-in-c-work
how-do-traits-classes-work - NO MAGIC!!!
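The "no magic" point can be shown with a toy traits class. my_iterator_traits below is a hypothetical, simplified stand-in for std::iterator_traits (the real one has more members and more special cases). It only forwards nested typedefs and adds one partial specialization so that raw pointers, which cannot carry typedefs, also work.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Primary template: forward whatever the iterator class declares.
template <typename It>
struct my_iterator_traits {
    typedef typename It::iterator_category iterator_category;
    typedef typename It::value_type        value_type;
    typedef typename It::difference_type   difference_type;
};

// Partial specialization: a raw pointer has no nested typedefs,
// so the traits class supplies them on its behalf.
template <typename T>
struct my_iterator_traits<T*> {
    typedef std::random_access_iterator_tag iterator_category;
    typedef T                               value_type;
    typedef std::ptrdiff_t                  difference_type;
};

static_assert(std::is_same<my_iterator_traits<int*>::value_type, int>::value,
              "pointer case goes through the specialization");
static_assert(std::is_same<my_iterator_traits<std::vector<char>::iterator>::value_type,
                           char>::value,
              "class-type iterators go through the nested typedefs");

// Hypothetical helper: generic code names the element type via the traits.
template <typename It>
typename my_iterator_traits<It>::value_type first_value(It it) {
    return *it;
}
```

first_value works the same for an int array and for a std::vector<char> iterator, which is the whole purpose of routing type information through a traits class.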
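-------------------- rambling divider --------------------
After a whole evening of focused study, I finally have a rough grasp of how iterators work and how they are implemented (sort of).
The description above is at best conceptual; real implementations are far more flexible and odd.
The Iterator in Appendix 1 is at best a base_Iterator. To make it compatible with the STL algorithms (find, distance, copy, and so on), it still needs to be wrapped in a wrapper-like template class (so that the Iterator supports so-called tag dispatching), which really comes down to compile-time binding. To be continued tomorrow, when I will follow the STLPort implementation to do the wrapping and make it work with the STL algorithms.
(Keywords: Effective C++ Items 42/46, typedef, overloading, iterator_traits)
In reality it took almost a week before this got updated.
-------------------- rambling divider --------------------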
Phase_2: back to the original questions
- ++i vs. i++
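++i is better.
Make a habit of using pre-increment with iterators in loops, i.e. use ++it and not it++. The pre-increment will not create unnecessary temporaries.
- I think the root cause lies in operator overloading: a@ calls a.operator@(0) (the postfix form, which takes a dummy int argument), while @a calls a.operator@(). The postfix form has to create a temporary object to hold the old value, so it tends to be less efficient.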
(See Appendix 1.)
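To see the extra temporary, here is a small sketch with a hypothetical CountingIter type that counts its own copy constructions. It is only an instrumented toy; for pointers and simple inlined iterators a compiler will often optimize the difference away.

```cpp
#include <cassert>

// Hypothetical iterator-like type that counts how often it is copied.
struct CountingIter {
    int pos;
    static int copies;  // number of copy constructions so far

    explicit CountingIter(int p = 0) : pos(p) {}
    CountingIter(const CountingIter& other) : pos(other.pos) { ++copies; }

    // Prefix form, called for ++it: modify in place, return *this.
    CountingIter& operator++() {
        ++pos;
        return *this;
    }

    // Postfix form, called for it++: the unnamed int parameter only
    // distinguishes the overload. The old value must be saved and
    // returned by value, which costs at least one copy.
    CountingIter operator++(int) {
        CountingIter old(*this);
        ++(*this);
        return old;
    }
};

int CountingIter::copies = 0;
```

After ++it the counter is still 0. After it++ it is at least 1; whether the return adds a second copy depends on whether the compiler elides it.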
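<The following three questions are about usage; I will look into them in a separate post.>
The (legal) directions in which an iterator can move
Iterator invalidation
Iterators vs. the subscript operator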
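Appendix 1. A string class with an iterator
#include <iostream> // std::cout, std::ostream
#include <string>   // std::string (used in test1)
#include <cstring>  // strlen, memcpy, memcmp
#include <cassert>  // assert
#include <cstdio>   // printf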
class MyString{
public:
MyString() : data_(NULL), size_(0){ }
MyString(const char* s);
MyString(const MyString &b) : size_(b.size_), data_(NULL){
//deep copy
if(size_){
data_ = new char [BufSize(size_)];
memcpy(data_, b.data_, BufSize(size_));
}
}
~MyString(){
if(data_){
delete [] data_;
}
}
MyString& operator=(const MyString &);
bool operator!=(const MyString &);
bool operator==(const MyString &);
MyString operator+(const MyString &);
MyString& operator+=(const MyString &);
char& operator[](const int);
class Iterator{
public:
Iterator() : ptr(NULL){ }
Iterator& operator++(){
ptr++;
return *this;
}
Iterator& operator--(){
ptr--;
return *this;
}
//operator+ / operator- must leave *this unchanged and return a new iterator
Iterator operator+(int x) const{
return Iterator(ptr + x);
}
Iterator operator-(int x) const{
return Iterator(ptr - x);
}
bool operator!=(const Iterator &s){
return s.ptr != ptr;
}
char& operator*(){
return *ptr;
}
private:
friend MyString;
Iterator(char *s) : ptr(s){ }
char* ptr;
};
Iterator begin(){
return Iterator(data_);
}
Iterator end(){
return Iterator(data_ + size_);
}
int size(){ return size_; }
private:
char* data_;//end with '\0'
int size_;
int BufSize(const int s) const{ return s + 1; }
char* ReSizeBuf(int s){
// std::cout << "[DBG]\n";
// std::cout << s << size_ << std::endl;
if(s > size_){
if(data_){ delete [] data_; }
data_ = new char [BufSize(s)];
}
size_ = s;
return data_;
}
friend std::ostream & operator<<(std::ostream &out, const MyString& s);
};
MyString::MyString(const char* s)
: size_(strlen(s)),
data_(NULL)
{
if(size_){
data_ = new char [BufSize(size_)];
memcpy(data_, s, BufSize(size_));
}
}
MyString& MyString::operator=(const MyString &b)
{
//deep copy; the original data is overwritten
if(&b != this){
ReSizeBuf(b.size_);
if(data_){
if(b.data_){ memcpy(data_, b.data_, BufSize(size_)); }
else{ data_[0] = '\0'; } //b is empty and owns no buffer
}
}
return *this;
}
bool MyString::operator!=(const MyString & b)
{
return !(*this == b);
}
bool MyString::operator==(const MyString & b)
{
if(b.size_ == size_){
return memcmp(b.data_, data_, size_) == 0;
}
return false;
}
//Not ideal: returning by value can cost an extra allocation and deallocation for the temporary.
MyString MyString::operator+(const MyString &b)//concatenates the two strings
{
MyString tmp;
if(size_ + b.size_ == 0){ return tmp; } //both empty: nothing to copy
char* buf = tmp.ReSizeBuf(size_ + b.size_);
if(size_){ memcpy(buf, data_, size_); }
if(b.size_){ memcpy(buf + size_, b.data_, b.size_); }
buf[size_ + b.size_] = '\0';
return tmp;
}
MyString& MyString::operator+=(const MyString &b)
{
if(b.size_ == 0){ return *this; } //nothing to append (b.data_ may be NULL)
char* tmp = new char [BufSize(size_ + b.size_)];
if(size_){ memcpy(tmp, data_, size_); }
memcpy(tmp + size_, b.data_, BufSize(b.size_)); //also copies the '\0'
delete [] data_; //safe even when b is *this: b.data_ was read above
data_ = tmp;
size_ = size_ + b.size_;
return *this;
}
char& MyString::operator[](const int idx)
{
assert(idx >= 0 && idx < size_);
return *(data_ + idx);
}
std::ostream & operator<<(std::ostream &out, const MyString& s)
{
return s.data_ ? out << s.data_ : out << "";
}
void test1(){
std::string ss = "12345";
std::string::iterator itr;
for(itr = ss.begin(); itr != ss.end(); itr++)
{
printf("%c ", *itr);
}
printf("\n");
}
void test_MyString()
{
MyString s1;
MyString s2("Hello");
MyString s3 = "1234";
MyString s4(s3);
std::cout << s1 << s2 << s3 << s4 << std::endl;
}
void test_operator()
{
MyString s1 = "Hello";
MyString s2 = "Hi";
MyString s3 = ", there";
MyString s4, s5("fff");
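std::cout << s4 << " " << s4.size() << ";" << s5 << " " << s5.size() << std::endl;
//NOTE: the rest of test_operator() and the opening lines of test_iterator()
//were lost from the source; the string literal below is a placeholder,
//not the author's original value.
}
void test_iterator()
{
MyString ssx = "Hello";
/*
itr++ ==> would call MyString::Iterator::operator++(int), which is not defined:
pre-increment.cc:195:6: error: no ‘operator++(int)’ declared for postfix ‘++’ [-fpermissive]
itr++){
^
for(MyString::Iterator itr = ssx.begin();
itr != ssx.end();
itr++){
std::cout << *itr << " " << std::endl;
}
*/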
for(MyString::Iterator itr = ssx.begin();
itr != ssx.end();
++itr){
std::cout << *itr << " ";
}
std::cout << std::endl;
}
int main()
{
// char s[4] = {'1','2','4','3'};
// std::cout << s << std::endl;
test_iterator();
return 0;
}
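Appendix 2. Making the Appendix 1 code work with std::copy(), std::distance(), and so on
The code is ugly and needs cleanup, but the main goal, letting a hand-written Iterator work with the STL algorithms, has been reached.
#include <iostream>  // std::cout, std::ostream
#include <string>    // std::string (used in test1)
#include <cstring>   // strlen, memcpy, memcmp
#include <cassert>   // assert
#include <cstdio>    // printf
#include <iterator>  //for the tags
#include <algorithm> //we need to support, for example: std::find() std::copy()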
class MyString{
public:
MyString() : data_(NULL), size_(0){ }
MyString(const char* s);
MyString(const MyString &b) : size_(b.size_), data_(NULL){
//deep copy
if(size_){
data_ = new char [BufSize(size_)];
memcpy(data_, b.data_, BufSize(size_));
}
}
~MyString(){
if(data_){
delete [] data_;
}
}
MyString& operator=(const MyString &);
bool operator!=(const MyString &);
bool operator==(const MyString &);
MyString operator+(const MyString &);
MyString& operator+=(const MyString &);
char& operator[](const int);
class Iterator{
public:
typedef std::random_access_iterator_tag iterator_category; //no typename needed outside a template
typedef char value_type;
typedef int difference_type;
typedef char* pointer;
typedef char& reference;
Iterator() : ptr(NULL){ printf("[DBG] Iterator() is called\n"); }
Iterator& operator++(){
ptr++;
return *this;
}
Iterator& operator--(){
ptr--;
return *this;
}
//operator+ / operator- must leave *this unchanged and return a new iterator
Iterator operator+(int x) const{
return Iterator(ptr + x);
}
Iterator operator-(int x) const{
return Iterator(ptr - x);
}
//error: no match for ‘operator-’ (operand types are ‘MyString::Iterator’ and ‘MyString::Iterator’)
//so operator-(Iterator) has to be implemented as well
difference_type operator-(const Iterator& rhs) const{
return static_cast<difference_type>(ptr - rhs.ptr);
}
bool operator!=(const Iterator &s) const{
return s.ptr != ptr;
}
bool operator==(const Iterator &s) const{
return !(*this != s);
}
//NOTE: std::find() compares *it == value, so it does not need this overload;
//it is kept from the original code.
bool operator==(value_type c) const{
return *ptr == c;
}
char& operator*(){
return *ptr;
}
private:
friend MyString;
Iterator(char *s) : ptr(s){
// printf("[DBG] Iterator(char*) is called\n");
}
protected:
char* ptr;
};
class OuputIterator : public Iterator{
public:
char& operator*(){
//grow the buffer when writing at the current end of the string
if(ptr == mptr_->data_ + mptr_->size_){
int offset = mptr_->size_;
mptr_->ReSizeCopyBuf((mptr_->size_ + 1) * 2);
ptr = mptr_->data_ + offset;
}
return *ptr;
}
void print(){
printf("[DBG] %p\n", static_cast<void*>(ptr));
}
private:
friend class MyString;//friendship is not inherited
MyString* mptr_;
//mptr_ must always be set, because operator*() dereferences it
OuputIterator(MyString* me, char *s) : Iterator(s), mptr_(me){ }
};
Iterator begin(){
return Iterator(data_);
}
Iterator end(){
return Iterator(data_ + size_);
}
OuputIterator obegin(){
return OuputIterator(this, data_);
}
OuputIterator oend(){
return OuputIterator(this, data_ + size_);
}
int size(){ return size_; }
private:
char* data_;//end with '\0'
int size_;
int BufSize(const int s) const{ return s + 1; }
char* ReSizeBuf(int s){
// std::cout << "[DBG]\n";
// std::cout << s << size_ << std::endl;
if(s > size_){
if(data_){ delete [] data_; }
data_ = new char [BufSize(s)];
}
size_ = s;
return data_;
}
char* ReSizeCopyBuf(int s){
if(s > size_){
//value-initialize so the unused tail is all '\0' and the string stays terminated
char* new_data_ = new char [BufSize(s)]();
if(data_){
memcpy(new_data_, data_, BufSize(size_));
delete [] data_;
}
data_ = new_data_;
}
size_ = s;
return data_;
}
friend OuputIterator;
friend std::ostream & operator<<(std::ostream &out, const MyString& s);
};
MyString::MyString(const char* s)
: size_(strlen(s)),
data_(NULL)
{
if(size_){
data_ = new char [BufSize(size_)];
memcpy(data_, s, BufSize(size_));
}
}
MyString& MyString::operator=(const MyString &b)
{
//deep copy; the original data is overwritten
if(&b != this){
ReSizeBuf(b.size_);
if(data_){
if(b.data_){ memcpy(data_, b.data_, BufSize(size_)); }
else{ data_[0] = '\0'; } //b is empty and owns no buffer
}
}
return *this;
}
bool MyString::operator!=(const MyString & b)
{
return !(*this == b);
}
bool MyString::operator==(const MyString & b)
{
if(b.size_ == size_){
return memcmp(b.data_, data_, size_) == 0;
}
return false;
}
//Not ideal: returning by value can cost an extra allocation and deallocation for the temporary.
MyString MyString::operator+(const MyString &b)//concatenates the two strings
{
MyString tmp;
if(size_ + b.size_ == 0){ return tmp; } //both empty: nothing to copy
char* buf = tmp.ReSizeBuf(size_ + b.size_);
if(size_){ memcpy(buf, data_, size_); }
if(b.size_){ memcpy(buf + size_, b.data_, b.size_); }
buf[size_ + b.size_] = '\0';
return tmp;
}
MyString& MyString::operator+=(const MyString &b)
{
if(b.size_ == 0){ return *this; } //nothing to append (b.data_ may be NULL)
char* tmp = new char [BufSize(size_ + b.size_)];
if(size_){ memcpy(tmp, data_, size_); }
memcpy(tmp + size_, b.data_, BufSize(b.size_)); //also copies the '\0'
delete [] data_; //safe even when b is *this: b.data_ was read above
data_ = tmp;
size_ = size_ + b.size_;
return *this;
}
char& MyString::operator[](const int idx)
{
assert(idx >= 0 && idx < size_);
return *(data_ + idx);
}
std::ostream & operator<<(std::ostream &out, const MyString& s)
{
return s.data_ ? out << s.data_ : out << "";
}
void test1(){
std::string ss = "12345";
std::string::iterator itr;
for(itr = ss.begin(); itr != ss.end(); itr++)
{
printf("%c ", *itr);
}
printf("\n");
}
void test_MyString()
{
MyString s1;
MyString s2("Hello");
MyString s3 = "1234";
MyString s4(s3);
std::cout << s1 << s2 << s3 << s4 << std::endl;
}
void test_operator()
{
MyString s1 = "Hello";
MyString s2 = "Hi";
MyString s3 = ", there";
MyString s4, s5("fff");
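std::cout << s4 << " " << s4.size() << ";" << s5 << " " << s5.size() << std::endl;
//NOTE: the rest of test_operator() and the opening lines of test_iterator()
//were lost from the source; the string literal below is a placeholder,
//not the author's original value.
}
void test_iterator()
{
MyString ssx = "Hello";
/*
itr++ ==> would call MyString::Iterator::operator++(int), which is not defined:
pre-increment.cc:195:6: error: no ‘operator++(int)’ declared for postfix ‘++’ [-fpermissive]
itr++){
^
for(MyString::Iterator itr = ssx.begin();
itr != ssx.end();
itr++){
std::cout << *itr << " " << std::endl;
}
*/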
for(MyString::Iterator itr = ssx.begin();
itr != ssx.end();
++itr){
std::cout << *itr << " ";
}
std::cout << std::endl;
}
void test_STL_algor()
{
MyString testss = "Hi, My Name is REM.";
std::cout << std::distance(testss.begin(), testss.end()) << std::endl;
std::cout << std::find(testss.begin(), testss.end(), 'R') - testss.begin() << std::endl;
//std::find() is declared in <algorithm>
MyString dst;
std::copy(testss.begin(), testss.end(), dst.obegin());
std::cout<< dst << std::endl;
//... other STL algorithms can be tested here in the same way
}
void testobegin()
{
MyString testss = "Hi, I am Rem.Nice to meet you.";
MyString::OuputIterator oitr = testss.obegin();
oitr.print();
}
int main()
{
test_STL_algor();
return 0;
}
TODO
- The three remaining questions
- Clean up Appendix 2
- Implement a mock of the STL's tag dispatching.