Boost.Interprocess User Manual, Part 10: Direct iostream formatting: vectorstream and bufferstream

10. Direct iostream formatting: vectorstream and bufferstream

Formatting directly in your character vector: vectorstream

Formatting directly in your character buffer: bufferstream

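Shared memory, memory-mapped files and all other Boost.Interprocess mechanisms are focused on efficiency. Shared memory is used because it is the fastest IPC (inter-process communication) mechanism available. When text-oriented messages are passed through shared memory, the message has to be formatted, and C++ offers the iostream framework for exactly that job.

Some programmers appreciate the safety of iostreams and their design for in-memory formatting, but find the stringstream family inefficient. The inefficiency is not in the formatting itself; it appears when the formatted data is obtained as a string, or when the string the stream will extract data from is set. Consider an example: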

//Some formatting elements
std::string my_text = "...";
int number;
 
//Data reader
std::istringstream input_processor;
 
//This makes a copy of the string. If not using a
//reference counted string, this is a serious overhead.
input_processor.str(my_text);
 
//Extract data
while(/*...*/){
   input_processor >> number;
}
 
//Data writer
std::ostringstream output_processor;
 
//Write data
while(/*...*/){
   output_processor << number;
}
 
//This returns a temporary string. Even with return-value
//optimization this is expensive.
my_text = output_processor.str();

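The problem is even worse if the string is a shared memory string. To extract data, we must first copy it from shared memory into a std::string and then into a std::stringstream. To encode data into a shared memory string, we must copy it from a std::stringstream into a std::string and then into the shared memory string.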
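The copies described above can be observed with the standard library alone. The following is a minimal sketch, not taken from the Boost documentation; the two function names are hypothetical and exist only for illustration. Each function modifies one side after the hand-over and checks that the other side is unaffected, which is only possible if a copy was made.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true when the input stream works on its
// own copy of the text rather than on the caller's string.
bool istringstream_copies_input()
{
    std::string text = "10 20";
    std::istringstream in;
    in.str(text);        // the stream copies 'text' into its own buffer
    text[0] = '9';       // change the original afterwards
    int first = 0;
    in >> first;
    return first == 10;  // the stream still sees the old contents
}

// Hypothetical helper: returns true when str() hands back a copy of the
// formatted data rather than a view of the stream's buffer.
bool ostringstream_returns_copy()
{
    std::ostringstream out;
    out << 42;
    std::string snapshot = out.str();  // a separate string is created here
    snapshot[0] = 'X';                 // change the snapshot only
    return out.str() == "42";          // the stream's contents are untouched
}
```

Both functions return true, so every round trip through a stringstream costs at least one full copy of the text in each direction.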

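Because of this overhead, Boost.Interprocess offers a way to format in-memory strings (in shared memory, memory-mapped files or any other memory segment) that avoids all unnecessary string copies and memory allocations/deallocations while keeping every iostream facility. Boost.Interprocess vectorstream and bufferstream provide vector-based and fixed-size-buffer-based storage for iostreams, and all the formatting and locale work is still done by the standard std::basic_streambuf<> and std::basic_iostream<> class families.

Formatting directly in your character vector: vectorstream

The vectorstream class family (basic_vectorbuf, basic_ivectorstream, basic_ovectorstream and basic_vectorstream) is an efficient way to get formatted reading and writing directly in a character vector. If a shared memory vector is used, data is extracted from or written to that shared memory vector without any additional copy or allocation. Here is the declaration of basic_vectorstream: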

//!A basic_iostream class that holds a character vector specified by CharVector
//!template parameter as its formatting buffer. The vector must have
//!contiguous storage, like std::vector, boost::interprocess::vector or
//!boost::interprocess::basic_string
template <class CharVector, class CharTraits =
         std::char_traits<typename CharVector::value_type> >
class basic_vectorstream
: public std::basic_iostream<typename CharVector::value_type, CharTraits>
 
{
   public:
   typedef CharVector                                                   vector_type;
   typedef typename std::basic_ios
      <typename CharVector::value_type, CharTraits>::char_type          char_type;
   typedef typename std::basic_ios<char_type, CharTraits>::int_type     int_type;
   typedef typename std::basic_ios<char_type, CharTraits>::pos_type     pos_type;
   typedef typename std::basic_ios<char_type, CharTraits>::off_type     off_type;
   typedef typename std::basic_ios<char_type, CharTraits>::traits_type  traits_type;
 
   //!Constructor. Throws if vector_type default constructor throws.
   basic_vectorstream(std::ios_base::openmode mode
                     = std::ios_base::in | std::ios_base::out);
 
   //!Constructor. Throws if vector_type(const Parameter &param) throws.
   template<class Parameter>
   basic_vectorstream(const Parameter &param, std::ios_base::openmode mode
                     = std::ios_base::in | std::ios_base::out);
 
   ~basic_vectorstream(){}
 
   //!Returns the address of the stored stream buffer.
   basic_vectorbuf<CharVector, CharTraits>* rdbuf() const;
 
   //!Swaps the underlying vector with the passed vector. 
   //!This function resets the position in the stream.
   //!Does not throw.
   void swap_vector(vector_type &vect);
 
   //!Returns a const reference to the internal vector.
   //!Does not throw.
   const vector_type &vector() const;
 
   //!Preallocates memory from the internal vector.
   //!Resets the stream to the first position.
   //!Throws if the internal vector's memory allocation throws.
   void reserve(typename vector_type::size_type size);
};

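The vector type is a template parameter, so any kind of vector can be used: std::vector, boost::interprocess::vector and so on. The storage must be contiguous, however, so a deque cannot be used. We can even use boost::interprocess::basic_string, since it has a vector interface and contiguous storage. We cannot use std::string, because although some std::string implementations are vector-based, others use optimizations and reference-counted implementations.

The user can obtain a const reference to the internal vector with the function const vector_type &vector() const, and can also swap the internal vector with an external one by calling void swap_vector(vector_type &vect). The swap function resets the stream position. These functions give an efficient way to retrieve the formatted data, avoiding all allocations and data copies.

Let's look at an example of how to use vectorstream: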

#include <boost/interprocess/containers/vector.hpp>
#include <boost/interprocess/containers/string.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/streams/vectorstream.hpp>
#include <algorithm>
#include <cassert>
#include <iterator>
 
using namespace boost::interprocess;
 
typedef allocator<int, managed_shared_memory::segment_manager>
   IntAllocator;
typedef allocator<char, managed_shared_memory::segment_manager>
   CharAllocator;
typedef vector<int, IntAllocator>   MyVector;
typedef basic_string
   <char, std::char_traits<char>, CharAllocator>   MyString;
typedef basic_vectorstream<MyString>               MyVectorStream;
 
int main ()
{
   //Remove shared memory on construction and destruction
   struct shm_remove
   {
      shm_remove() { shared_memory_object::remove("MySharedMemory"); }
      ~shm_remove(){ shared_memory_object::remove("MySharedMemory"); }
   } remover;
 
   managed_shared_memory segment(
      create_only,
      "MySharedMemory", //segment name
      65536);           //segment size in bytes
 
   //Construct shared memory vector
   MyVector *myvector =
      segment.construct<MyVector>("MyVector")
      (IntAllocator(segment.get_segment_manager()));
 
   //Fill vector
   myvector->reserve(100);
   for(int i = 0; i < 100; ++i){
      myvector->push_back(i);
   }
 
   //Create the vectorstream. To create the internal shared memory
   //basic_string we need to pass the shared memory allocator as
   //a constructor argument
   MyVectorStream myvectorstream(CharAllocator(segment.get_segment_manager()));
 
   //Reserve the internal string
   myvectorstream.reserve(100*5);
 
   //Write all vector elements as text in the internal string
   //Data will be directly written in shared memory, because
   //internal string's allocator is a shared memory allocator
   for(std::size_t i = 0, max = myvector->size(); i < max; ++i){
      myvectorstream << (*myvector)[i] << std::endl;
   }
 
   //Auxiliary vector to compare original data
   MyVector *myvector2 =
      segment.construct<MyVector>("MyVector2")
      (IntAllocator(segment.get_segment_manager()));
 
   //Avoid reallocations
   myvector2->reserve(100);
 
   //Extract all values from the internal 
   //string directly to a shared memory vector.
   std::istream_iterator<int> it(myvectorstream), itend;
   std::copy(it, itend, std::back_inserter(*myvector2));
 
   //Compare vectors
   assert(std::equal(myvector->begin(), myvector->end(), myvector2->begin()));
 
   //Create a copy of the internal string
   MyString stringcopy (myvectorstream.vector());
 
   //Now we create a new empty shared memory string...
   MyString *mystring =
      segment.construct<MyString>("MyString")
      (CharAllocator(segment.get_segment_manager()));
 
   //...and we swap vectorstream's internal string
   //with the new one: after this statement mystring
   //will be the owner of the formatted data.
   //No reallocations, no data copies
   myvectorstream.swap_vector(*mystring);
 
   //Let's compare both strings
   assert(stringcopy == *mystring);
 
   //Done, destroy and delete vectors and string from the segment
   segment.destroy_ptr(myvector2);
   segment.destroy_ptr(myvector);
   segment.destroy_ptr(mystring);
   return 0;
}

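Formatting directly in your character buffer: bufferstream

As seen above, vectorstream offers an easy and safe way to do efficient iostream formatting, but many times we have to read or write formatted data from or to a fixed-size character buffer (a static buffer, a C string, or any other). Because of the overhead of stringstream, many developers (especially in embedded systems) choose the sprintf family instead. The bufferstream classes offer an iostream interface that formats directly in a fixed-size memory buffer, with protection against buffer overflows. This is the interface: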

//!A basic_iostream class that uses a fixed size character buffer
//!as its formatting buffer.
template <class CharT, class CharTraits = std::char_traits<CharT> >
class basic_bufferstream
   : public std::basic_iostream<CharT, CharTraits>
 
{
   public:                         // Typedefs
   typedef typename std::basic_ios
      <CharT, CharTraits>::char_type          char_type;
   typedef typename std::basic_ios<char_type, CharTraits>::int_type     int_type;
   typedef typename std::basic_ios<char_type, CharTraits>::pos_type     pos_type;
   typedef typename std::basic_ios<char_type, CharTraits>::off_type     off_type;
   typedef typename std::basic_ios<char_type, CharTraits>::traits_type  traits_type;
 
   //!Constructor. Does not throw.
   basic_bufferstream(std::ios_base::openmode mode
                     = std::ios_base::in | std::ios_base::out);
 
   //!Constructor. Assigns formatting buffer. Does not throw.
   basic_bufferstream(CharT *buffer, std::size_t length,
                     std::ios_base::openmode mode
                        = std::ios_base::in | std::ios_base::out);
 
   //!Returns the address of the stored stream buffer.
   basic_bufferbuf<CharT, CharTraits>* rdbuf() const;
 
   //!Returns the pointer and size of the internal buffer. 
   //!Does not throw.
   std::pair<CharT *, std::size_t> buffer() const;
 
   //!Sets the underlying buffer to a new value. Resets 
   //!stream position. Does not throw.
   void buffer(CharT *buffer, std::size_t length);
};
 
//Some typedefs to simplify usage
typedef basic_bufferstream<char>     bufferstream;
typedef basic_bufferstream<wchar_t>  wbufferstream;
// ...

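When reading from a fixed-size buffer, bufferstream sets the eofbit flag if we try to read past the end of the buffer. When writing to a fixed-size buffer, bufferstream sets the badbit flag if a buffer overflow is about to happen and refuses the write. Formatting in a fixed-size buffer through bufferstream is therefore both safe and efficient, and it is a good replacement for the sprintf/sscanf functions. Let's see an example: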

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/streams/bufferstream.hpp>
#include <vector>
#include <algorithm>
#include <cassert>
#include <iterator>
#include <cstddef>
 
using namespace boost::interprocess;
 
int main ()
{
   //Remove shared memory on construction and destruction
   struct shm_remove
   {
      shm_remove() { shared_memory_object::remove("MySharedMemory"); }
      ~shm_remove(){ shared_memory_object::remove("MySharedMemory"); }
   } remover;
 
   //Create shared memory
   managed_shared_memory segment(create_only,
                                 "MySharedMemory",  //segment name
                                 65536);
 
   //Fill data
   std::vector<int> data;
   data.reserve(100);
   for(int i = 0; i < 100; ++i){
      data.push_back(i);
   }
   const std::size_t BufferSize = 100*5;
 
   //Allocate a buffer in shared memory to write data
   char *my_cstring =
      segment.construct<char>("MyCString")[BufferSize](0);
   bufferstream mybufstream(my_cstring, BufferSize);
 
   //Now write data to the buffer
   for(int i = 0; i < 100; ++i){
      mybufstream << data[i] << std::endl;
   }
 
   //Check there was no overflow attempt
   assert(mybufstream.good());
 
   //Extract all values from the shared memory string
   //directly to a vector.
   std::vector<int> data2;
   std::istream_iterator<int> it(mybufstream), itend;
   std::copy(it, itend, std::back_inserter(data2));
 
   //This extraction should have ended with a fail error, since
   //the numbers formatted in the buffer end before the end
   //of the buffer. (Otherwise it would also trigger eofbit.)
   assert(mybufstream.fail());
 
   //Compare data
   assert(std::equal(data.begin(), data.end(), data2.begin()));
 
   //Clear errors and rewind
   mybufstream.clear();
   mybufstream.seekp(0, std::ios::beg);
 
   //Now write again the data trying to do a buffer overflow
   for(int i = 0, m = data.size()*5; i < m; ++i){
      mybufstream << data[i%5] << std::endl;
   }
 
   //Now make sure badbit is active
   //which means overflow attempt.
   assert(!mybufstream.good());
   assert(mybufstream.bad());
   segment.destroy_ptr(my_cstring);
   return 0;
}
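The flag behaviour that the example checks can be reproduced with the standard library alone, which may help to see why an overflow attempt ends in badbit. The following is a minimal illustrative sketch and not Boost's implementation; fixed_outbuf, format_result and format_ints are hypothetical names. The stream buffer exposes the caller's storage as its put area and deliberately leaves overflow() at its default, which reports failure once the area is full.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical class: a write-only stream buffer over caller-owned
// storage that never grows.
class fixed_outbuf : public std::streambuf {
public:
    fixed_outbuf(char* data, std::size_t size)
    {
        setp(data, data + size);  // the put area is exactly the caller's buffer
    }

    // Number of characters written so far.
    std::size_t written() const
    {
        return static_cast<std::size_t>(pptr() - pbase());
    }
    // overflow() is not overridden: the default returns traits_type::eof(),
    // so a write past the end fails instead of touching other memory.
};

struct format_result {
    bool ok;           // false once the stream reported badbit
    std::size_t used;  // characters actually stored in the buffer
};

// Hypothetical helper: writes 0, 1, 2, ... one per line until 'count'
// numbers are written or the buffer is full.
format_result format_ints(char* buffer, std::size_t capacity, int count)
{
    fixed_outbuf sb(buffer, capacity);
    std::ostream os(&sb);
    for (int i = 0; i < count && os; ++i) {
        os << i << '\n';
    }
    format_result r = { !os.bad(), sb.written() };
    return r;
}
```

With an 8-character buffer, three numbers fit and the result is flagged ok; asking for a hundred stops as soon as the buffer is full and the result is flagged as failed, without writing outside the buffer.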

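As seen, bufferstream offers an efficient way to format data without any allocation or extra copies. This is very helpful in embedded systems, or when formatting inside time-critical loops, where the extra copies made by stringstream would be too expensive. Unlike sprintf/sscanf, it is protected against buffer overflows. According to the Technical Report on C++ Performance, it is possible to design efficient iostreams for embedded platforms, so the bufferstream class comes in handy for formatting data into stack, static or shared memory buffers.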
