[STL deep dive]迭代器的实现探究1

资料:
stl-iterators
[The C++ Standard Library: Tutorial & Reference]
迭代器分类


[STL deep dive]迭代器的实现探究1_第1张图片
from [here](http://www.cplusplus.com/reference/iterator/)
[STL deep dive]迭代器的实现探究1_第2张图片
from C++STL Tutorial & Ref.

深入学习之前,我对下面这几个问题感到迷茫:

  1. ++i vs. i++
  1. 迭代器的合法增长方向
  2. 迭代器的失效
  3. 迭代器 vs. 下标运算符

Phase_1 系统地过一遍

  • 原理
    其实上面那个图已经很清楚地显示了不同分类(categories)的迭代器之间的关系和其合法操作了.
    (根据上面的一系列定义定义的5个级别的迭代器,每种都有自己支持的操作)
  1. Input Iterator
  2. Output Iterator
  3. Forward Iterator
  4. Bidirectional Iterator
  5. Random Access Iterator
  • 实现
    上面那些只是一些定义,C++使用所谓的trait class来实现区分这几种迭代器的.(effective c++ RULE no. 47),一种泛型上的概念,即类型的信息.

简单地说,你可以自己实现自己的类的Iterator(随便取你的迭代器名字),然后在你的迭代器定义中显式地加入一些typedef, 比如value_typeiterator_catagory;这几种别名是std里约定好了,在std里的一些算法将根据这些别名做不同的处理,比如你想说明自己的Iterator是个随机Acess的iterator,则在你的iterator中加入一个typedef std::random_access_iterator_tag iterator_category;到了STL的算法内部会根据你的Iterator中的这些typedef来指定不同的算法实现.

利用tag-dispatching的原因有个比较重要的就是,在编译阶段利用函数的重载机制决定采用哪个do_algorthm来实现.

这里可能会有点晕,附2的目标就是把附1实现的Iterator变成STL-portable的.(可能有点丑,但可以说明问题)

STL如何实现tag-dispatching的,在另一篇里细细来讲(其实也没啥好讲的一但理解了,就很简单).这里有些资料:

  • An-Introduction-to-Iterator-Traits

  • algorithmSelection

  • how-does-iterator-category-in-c-work

  • how-do-traits-classes-work - NO MAGIC!!!
    --------------------废话的分割线--------------------
    经过一整晚的潜心研究,我终于大致搞懂了iterator的原理和相关的实现(雾)。
    上面的描述顶多只能算概念上的描述,实际的实现灵活并奇怪得多.
    附1中的Iterator顶多只能算个base_Iterator, 为了让它能够与STL的算法库兼容(如find, distance, copy等),还需要一个wapper-like的模板类包装一下(使这个Iterator实现所谓的tag-dispatching),其原理实际上就是编译期间绑定,明天继续!明天将参考STLPort实现来进行wrap并使其能支持STL算法库.
    (关键词:effective c++_42/46, typedef, overloading, iterator_traits)

    --------------Will be Updated SOON---------------

    实际上几乎拖了一周才更新
    --------------------废话的分割线--------------------

Phase_2 回过头来解决最初的问题

  1. ++i vs. i++

++i is better.
make sure you create a habit of using pre-increment with iterators in loops. i.e. use ++it and not it++. The pre-increment will not create unnecessary temporaries.

  • 我认为其实根本原因还是 operator重载中,a@将调用a.operator(0),而@a将调用a.operator@(),前者将增加一个临时对象, 因此效率会比较低.
    (附1.)

<下面3个问题应该是属于使用方面的问题,在另一篇专门来研究如何使用>

  1. 迭代器的(合法)增长方向

  2. 迭代器的失效

  3. 迭代器 vs. 下标运算符

附1. 实现了一个带迭代器的字符串类

#include 
#include 
#include 
#include 
class MyString{
    public:
        MyString() : data_(NULL), size_(0){ }
        MyString(const char* s);
        MyString(const MyString &b) : size_(b.size_), data_(NULL){
            //deep copy
            if(size_){
                data_ = new char [BufSize(size_)];
                memcpy(data_, b.data_, BufSize(size_));
            }
        }
        ~MyString(){
            if(data_){
                delete [] data_;
            }
        }

        MyString& operator=(const MyString &);
        bool operator!=(const MyString &);
        bool operator==(const MyString &);
        MyString operator+(const MyString &);
        MyString& operator+=(const MyString &);
        char& operator[](const int);
        
        class Iterator{
            public:
                Iterator() : ptr(NULL){ }
                Iterator& operator++(){
                    ptr++;
                    return *this;
                }
                Iterator& operator--(){
                    ptr--;
                    return *this;
                }
                Iterator& operator+(int x){
                    ptr += x;
                    return *this;
                }
                Iterator& operator-(int x){
                    ptr -= x;
                    return *this;
                }
                bool operator!=(const Iterator &s){
                    return s.ptr != ptr;
                }
                char& operator*(){
                    return *ptr;
                }

            private:
                friend MyString;
                Iterator(char *s) :  ptr(s){ }
                char* ptr;
        };
        Iterator begin(){
            return Iterator(data_);
        }
        Iterator end(){
            return Iterator(data_ + size_);
        }
        
        int size(){ return size_; }
    private:
        char* data_;//end with '\0'
        int size_;
        int BufSize(const int s) const{ return s + 1; }
        char* ReSizeBuf(int s){
//          std::cout << "[DBG]\n";
//          std::cout << s << size_ << std::endl;
            if(s > size_){
                if(data_){ delete [] data_; }
                data_ = new char [BufSize(s)];
            }
            size_ = s;
            return data_;
        }
        friend std::ostream & operator<<(std::ostream &out, const MyString& s);
};
MyString::MyString(const char* s) 
 : size_(strlen(s)),
 data_(NULL)
{
    if(size_){
        data_ = new char [BufSize(size_)];
        memcpy(data_, s, BufSize(size_));
    }
}

MyString& MyString::operator=(const MyString &b)
{
    //deep copy
    //origin data is overwrote
    if(&b != this){
        ReSizeBuf(b.size_);
        memcpy(data_, b.data_, BufSize(size_));
    }
    return *this;
}
bool MyString::operator!=(const MyString & b)
{
    return !(*this == b);
}
bool MyString::operator==(const MyString & b)
{
    if(b.size_ == size_){
        return memcmp(b.data_, data_, size_) == 0;
    }
    return false;
}
//It's not good to do this because it will do 2 alloc.s & dealloc.s(temp. var in c++)
MyString MyString::operator+(const MyString &b)//will concat the two string. 
{
    MyString tmp;
    memcpy(tmp.ReSizeBuf(size_ + b.size_), data_, size_);
    memcpy(tmp.data_ + size_, b.data_, BufSize(b.size_));
    return tmp;
}
MyString& MyString::operator+=(const MyString &b)
{
    char* tmp = BufSize(size_) < BufSize(size_ + b.size_) ? 
        new char [BufSize(size_ + b.size_)] : data_ ;
    if(tmp != data_){
        memcpy(tmp, data_, size_);
    }
    memcpy(tmp + size_, b.data_, BufSize(b.size_));
    if(tmp != data_){
        delete [] data_;
    }
    data_ = tmp;
    size_ = size_ + b.size_;
    return *this;
}

char& MyString::operator[](const int idx)
{
    assert(idx < size_);
    return *(data_ + idx);
}
std::ostream & operator<<(std::ostream &out, const MyString& s)
{
    return s.data_ ? out << s.data_ : out << "";
}


void test1(){
    std::string ss = "12345";
    std::string::iterator itr;
    for(itr = ss.begin(); itr != ss.end(); itr++)
    {
        printf("%c ", *itr);
    }
    printf("\n");
}
void test_MyString()
{
    MyString s1;
    MyString s2("Hello");
    MyString s3 = "1234";
    MyString s4(s3);
    std::cout << s1 << s2 << s3 << s4 << std::endl;
}

void test_operator()
{ 
    MyString s1 = "Hello";
    MyString s2 = "Hi";
    MyString s3 = ", there";
    MyString s4, s5("fff");
    std::cout << s4 << " " << s4.size() << ";" << s5 << " " << s5.size() < itr++ ==> will call MyString::Iterator::operator++(0) 
pre-increment.cc:195:6: error: no ‘operator++(int)’ declared for postfix ‘++’ [-fpermissive]
   itr++){
      ^ 
    for(MyString::Iterator itr = ssx.begin();
        itr != ssx.end();
        itr++){
            std::cout << *itr << " " << std::endl;
        }
*/
    for(MyString::Iterator itr = ssx.begin(); 
        itr != ssx.end();
        ++itr){
            std::cout << *itr << " ";
        }
        std::cout << std::endl;
}


int main()
{
//  char s[4] = {'1','2','4','3'};
//  std::cout << s << std::endl;
    test_iterator();
    return 0;
}

附2. 令附1的代码能支持std::copy(),std::distance()等

代码丑陋 有待优化,但主要目的,令自己写的Iterator可以支持STL的algorithm的目的已经达到了.

#include 
#include 
#include 
#include 
#include //for the tags
#include //we need to support, for exmaple: std::find() std::copy()

class MyString{
    public:
        MyString() : data_(NULL), size_(0){ }
        MyString(const char* s);
        MyString(const MyString &b) : size_(b.size_), data_(NULL){
            //deep copy
            if(size_){
                data_ = new char [BufSize(size_)];
                memcpy(data_, b.data_, BufSize(size_));
            }
        }
        ~MyString(){
            if(data_){
                delete [] data_;
            }
        }

        MyString& operator=(const MyString &);
        bool operator!=(const MyString &);
        bool operator==(const MyString &);
        MyString operator+(const MyString &);
        MyString& operator+=(const MyString &);
        char& operator[](const int);
        

        
        class Iterator{
            public:
                typedef typename std::random_access_iterator_tag iterator_category;
                typedef char value_type;
                typedef int difference_type;
                typedef char* pointer;
                typedef char& reference;
                
                
                Iterator() : ptr(NULL){ printf("[DBG] Iterator() is called\n"); }
                Iterator& operator++(){
                    ptr++;
                    return *this;
                }
                Iterator& operator--(){
                    ptr--;
                    return *this;
                }
                Iterator& operator+(int x){
                    ptr += x;
                    return *this;
                }
                Iterator& operator-(int x){
                    ptr -= x;
                    return *this;
                }
                //error: no match for ‘operator-’ (operand types are ‘MyString::Iterator’ and ‘MyString::Iterator’)
                //we have to impl the operator-(Iterator)
                difference_type operator-(const Iterator& rhs){
                    return ptr - rhs.ptr;
                }
                bool operator!=(const Iterator &s){
                    return s.ptr != ptr;
                }
                bool operator==(const Iterator &s){
                    return !(*this != s);
                }
                //std::find() will use this operator
                bool operator==(value_type c){
                    return *ptr == c;
                }
                char& operator*(){
                    return *ptr;
                }

            private:
                friend MyString;    
                Iterator(char *s) :  ptr(s){
//                  printf("[DBG] Iterator(char*) is called\n");
                }
            protected:
                char* ptr;
        };
        
        class OuputIterator : public Iterator{
            public:
                char& operator*(){
                    if(ptr == mptr_->end().ptr){
                        int offset = mptr_->size_;
                        mptr_->ReSizeCopyBuf((mptr_->size_ + 1) * 2);
                        ptr = mptr_->data_ + offset;
                    }
                    return *ptr;
                }
                void print(){
                    printf("[DBG] %p\n", ptr);
                }
            private:
                friend MyString;//friend is not inherited
                MyString* mptr_;
                OuputIterator(MyString* me) : mptr_(me){ }
                OuputIterator(char *s) : Iterator(s) { /*printf("[DBG] OuputIterator(char*) is called\n"); */}
        };
        
        Iterator begin(){
            return Iterator(data_);
        }
        Iterator end(){
            return Iterator(data_ + size_);
        }
        OuputIterator obegin(){
            return OuputIterator(data_);
        }
        OuputIterator oend(){
            return OuputIterator(data_ + size_);
        }
        
        int size(){ return size_; }
    private:
        char* data_;//end with '\0'
        int size_;
        int BufSize(const int s) const{ return s + 1; }
        char* ReSizeBuf(int s){
//          std::cout << "[DBG]\n";
//          std::cout << s << size_ << std::endl;
            if(s > size_){
                if(data_){ delete [] data_; }
                data_ = new char [BufSize(s)];
            }
            size_ = s;
            return data_;
        }
        char* ReSizeCopyBuf(int s){
            if(s > size_){
                char* new_data_ = new char [BufSize(s)];
                if(data_){ 
                    memcpy(new_data_, data_, BufSize(size_)); 
                    delete [] data_;
                }
                data_ = new_data_;
            }
            size_ = s;
            return data_;
        }
        friend OuputIterator;
        friend std::ostream & operator<<(std::ostream &out, const MyString& s);
};
MyString::MyString(const char* s) 
 : size_(strlen(s)),
 data_(NULL)
{
    if(size_){
        data_ = new char [BufSize(size_)];
        memcpy(data_, s, BufSize(size_));
    }
}

MyString& MyString::operator=(const MyString &b)
{
    //deep copy
    //origin data is overwrote
    if(&b != this){
        ReSizeBuf(b.size_);
        memcpy(data_, b.data_, BufSize(size_));
    }
    return *this;
}
bool MyString::operator!=(const MyString & b)
{
    return !(*this == b);
}
bool MyString::operator==(const MyString & b)
{
    if(b.size_ == size_){
        return memcmp(b.data_, data_, size_) == 0;
    }
    return false;
}
//It's not good to do this because it will do 2 alloc.s & dealloc.s(temp. var in c++)
MyString MyString::operator+(const MyString &b)//will concat the two string. 
{
    MyString tmp;
    memcpy(tmp.ReSizeBuf(size_ + b.size_), data_, size_);
    memcpy(tmp.data_ + size_, b.data_, BufSize(b.size_));
    return tmp;
}
MyString& MyString::operator+=(const MyString &b)
{
    char* tmp = BufSize(size_) < BufSize(size_ + b.size_) ? 
        new char [BufSize(size_ + b.size_)] : data_ ;
    if(tmp != data_){
        memcpy(tmp, data_, size_);
    }
    memcpy(tmp + size_, b.data_, BufSize(b.size_));
    if(tmp != data_){
        delete [] data_;
    }
    data_ = tmp;
    size_ = size_ + b.size_;
    return *this;
}

char& MyString::operator[](const int idx)
{
    assert(idx < size_);
    return *(data_ + idx);
}
std::ostream & operator<<(std::ostream &out, const MyString& s)
{
    return s.data_ ? out << s.data_ : out << "";
}


void test1(){
    std::string ss = "12345";
    std::string::iterator itr;
    for(itr = ss.begin(); itr != ss.end(); itr++)
    {
        printf("%c ", *itr);
    }
    printf("\n");
}
void test_MyString()
{
    MyString s1;
    MyString s2("Hello");
    MyString s3 = "1234";
    MyString s4(s3);
    std::cout << s1 << s2 << s3 << s4 << std::endl;
}

void test_operator()
{ 
    MyString s1 = "Hello";
    MyString s2 = "Hi";
    MyString s3 = ", there";
    MyString s4, s5("fff");
    std::cout << s4 << " " << s4.size() << ";" << s5 << " " << s5.size() < itr++ ==> will call MyString::Iterator::operator++(0) 
pre-increment.cc:195:6: error: no ‘operator++(int)’ declared for postfix ‘++’ [-fpermissive]
   itr++){
      ^ 
    for(MyString::Iterator itr = ssx.begin();
        itr != ssx.end();
        itr++){
            std::cout << *itr << " " << std::endl;
        }
*/
    for(MyString::Iterator itr = ssx.begin(); 
        itr != ssx.end();
        ++itr){
            std::cout << *itr << " ";
        }
        std::cout << std::endl;
}
void test_STL_algor()
{
    MyString testss = "Hi, My Name is REM.";
    std::cout << std::distance(testss.begin(), testss.end()) < 
    
    std::cout << std::find(testss.begin(), testss.end(), 'R') - testss.begin() << std::endl;
    //std::find() is included at algorithm 
    
    MyString dst;
    std::copy(testss.begin(), testss.end(), dst.obegin());
    std::cout<< dst << std::endl;
    //... you can test other function of STL-algorithm
}


void testobegin()
{
    MyString testss = "Hi, I am Rem.Nice to meet you.";
    MyString::OuputIterator oitr = testss.obegin();
    oitr.print();
}
int main()
{
    test_STL_algor();
    return 0;
}

TODO

  • 3个遗留问题
  • 附2优化
  • 实现一个模拟STL的tag-dispatching.

你可能感兴趣的:([STL deep dive]迭代器的实现探究1)