


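This bug was introduced by "[Bash-autocompletion] Add support for static analyzer flags". The breakage it caused is described in "Revert r311552: [Bash-autocompletion] Add support for static analyzer flags", and it was finally fixed by "Keep an instance of COFFOptTable alive as long as InputArgList is alive". The root cause: the code took the address of an element of a std::vector member of a local object. When the local object was destroyed, the destructor of that vector member ran too, so the saved address became invalid, and the next access through it caused a memory error.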

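That member was originally an llvm::ArrayRef, and "[Bash-autocompletion] Add support for static analyzer flags" changed it to a std::vector<>. So why is taking the address of an element through llvm::ArrayRef not affected by this problem? Before getting to the reason, let me first spend a few words recording the exact scenario of the bug.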

template<typename T>
class ArrayRef {
    /// The start of the array, in an external buffer.
    const T *Data = nullptr;

    /// The number of elements.
    size_type Length = 0;

public:
    // Converting to std::vector copies the elements into a new buffer.
    operator std::vector<T>() const {
        return std::vector<T>(Data, Data + Length);
    }
};
// --------------------------------------------
static const OptTable::Info InfoTable[] = {
    // Option List
};

class OptTable {
public:
    struct Info {
        // details
    };

    // Implicit conversion `ArrayRef => std::vector` occurs here,
    // so the member OptionInfos holds a copy of InfoTable.
    OptTable(ArrayRef<Info> OptionInfos) : OptionInfos(OptionInfos) {}

    const Info& getInfo(unsigned id) const {
        return OptionInfos[id - 1];
    }

private:
    /// \brief The option information table.
    std::vector<Info> OptionInfos;
    /// ...
};

const OptTable::Info* CompilerInvocation::CreateFromArgs() {
    // Local object
    auto Opts = std::make_unique<OptTable>(InfoTable);

    // Reference the address of Opts->OptionInfos[0]
    const OptTable::Info *Ptr = &Opts->getInfo(1);

    // ...

    return Ptr;
}  // <---- calling `~OptTable()` on `*Opts` and `~vector` on `Opts->OptionInfos`
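The scenario can be reduced to a self-contained sketch. The names below (Info, InfoTable, ViewTable, OwningTable and the helper functions) are hypothetical stand-ins, not the real LLVM classes. The sketch only exercises the well-defined part of the story: a table that stores a pointer to the static array hands out addresses inside that array, while a table that copies into a std::vector hands out addresses inside its own buffer, which are valid only while the table is alive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for OptTable::Info.
struct Info {
    int ID;
};

// Static storage: lives for the whole program.
static const Info InfoTable[] = {{1}, {2}, {3}};
constexpr std::size_t InfoCount = sizeof(InfoTable) / sizeof(InfoTable[0]);

// Non-owning table: only remembers where the static array is.
class ViewTable {
    const Info *Data;
    std::size_t Length;

public:
    ViewTable(const Info *D, std::size_t N) : Data(D), Length(N) {}
    const Info &getInfo(unsigned id) const {
        assert(id >= 1 && id <= Length);
        return Data[id - 1];
    }
};

// Owning table: copies the static array into its own buffer.
class OwningTable {
    std::vector<Info> OptionInfos;

public:
    OwningTable(const Info *D, std::size_t N) : OptionInfos(D, D + N) {}
    const Info &getInfo(unsigned id) const {
        assert(id >= 1 && id <= OptionInfos.size());
        return OptionInfos[id - 1];
    }
};

// Address handed out by a local non-owning table. It points into InfoTable,
// so it is still valid after the local table is gone.
const Info *firstFromView() {
    ViewTable Opts(InfoTable, InfoCount);
    return &Opts.getInfo(1);
}

// Checks, while the owning table is still alive, that its first element
// does not live in InfoTable. The pointer must not escape this function.
bool owningTableUsesOwnBuffer() {
    OwningTable Opts(InfoTable, InfoCount);
    const Info *Ptr = &Opts.getInfo(1);
    return Ptr != &InfoTable[0];
}
```

Returning the pointer obtained from OwningTable out of the function would reproduce the bug, which is why the sketch deliberately does not do so.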


template< class InputIt >
vector( InputIt first, InputIt last, 
        const Allocator& alloc = Allocator() );


4) Constructs the container with the contents of the range [first, last).

This constructor has the same effect as

  • vector(static_cast<size_type>(first), static_cast<value_type>(last), a) if InputIt is an integral type. (until C++11)

This overload only participates in overload resolution if InputIt satisfies InputIterator, to avoid ambiguity with the overload (2). (since C++11)


      /**
       *  @brief  Builds a %vector from a range.
       *  @param  __first  An input iterator.
       *  @param  __last  An input iterator.
       *  @param  __a  An allocator.
       *
       *  Create a %vector consisting of copies of the elements from
       *  [first,last).
       *
       *  If the iterators are forward, bidirectional, or
       *  random-access, then this will call the elements' copy
       *  constructor N times (where N is distance(first,last)) and do
       *  no memory reallocation.  But if only input iterators are
       *  used, then this will do at most 2N calls to the copy
       *  constructor, and logN memory reallocations.
       */
#if __cplusplus >= 201103L
      template<typename _InputIterator,
               typename = std::_RequireInputIter<_InputIterator>>
    vector(_InputIterator __first, _InputIterator __last,
           const allocator_type& __a = allocator_type())
    : _Base(__a)
    { _M_initialize_dispatch(__first, __last, __false_type()); }
#else
      template<typename _InputIterator>
    vector(_InputIterator __first, _InputIterator __last,
           const allocator_type& __a = allocator_type())
    : _Base(__a)
    {
      // Check whether it's an integral type.  If so, it's not an iterator.
      typedef typename std::__is_integer<_InputIterator>::__type _Integral;
      _M_initialize_dispatch(__first, __last, _Integral());
    }
#endif
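A small sketch of what this constructor means for the bug: building a std::vector from a pointer range copies every element into storage owned by the vector, so element addresses no longer refer to the source array. The names Source, copyOfSource and storageIsSeparate are made up for illustration.

```cpp
#include <cassert>
#include <vector>

static const int Source[] = {10, 20, 30};

// Builds a vector from the pointer range [Source, Source + 3).
std::vector<int> copyOfSource() {
    std::vector<int> V(Source, Source + 3);
    assert(V.size() == 3);  // one copy per element in the range
    return V;
}

// True if the vector's storage is distinct from the source array.
bool storageIsSeparate(const std::vector<int> &V) {
    return V.data() != Source;
}
```

Because the storage is separate, writing to the vector leaves Source untouched, and any pointer into the vector dies with the vector.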





/// ArrayRef - Represent a constant reference to an array (0 or more elements
/// consecutively in memory), i.e. a start pointer and a length. It allows
/// various APIs to take consecutive elements easily and conveniently.
/// This class does not own the underlying data, it is expected to be used in
/// situations where the data resides in some other buffer, whose lifetime
/// extends past that of the ArrayRef. For this reason, it is not in general
/// safe to store an ArrayRef.
/// This is intended to be trivially copyable, so it should be passed by 
/// value.
template<typename T>
class ArrayRef {
    /// The start of the array, in an external buffer.
    const T *Data = nullptr;

    /// The number of elements.
    size_type Length = 0;

    /// Construct an ArrayRef from a single element.
    /*implicit*/ ArrayRef(const T &OneElt)
    : Data(&OneElt), Length(1) {}

    /// Construct an ArrayRef from a pointer and length.
    /*implicit*/ ArrayRef(const T *data, size_t length)
    : Data(data), Length(length) {}

    /// Construct an ArrayRef from a range.
    ArrayRef(const T *begin, const T *end)
    : Data(begin), Length(end - begin) {}

    /// Construct an ArrayRef from a std::vector.
    template<typename A>
    /*implicit*/ ArrayRef(const std::vector<T, A> &Vec)
    : Data(Vec.data()), Length(Vec.size()) {}

    /// Construct an ArrayRef from a std::array.
    template <size_t N>
    /*implicit*/ constexpr ArrayRef(const std::array<T, N> &Arr)
    : Data(Arr.data()), Length(N) {}

    /// Construct an ArrayRef from a C array.
    template <size_t N>
    /*implicit*/ constexpr ArrayRef(const T (&Arr)[N]) : Data(Arr), Length(N) {}

    /// Construct an ArrayRef from a std::initializer_list.
    /*implicit*/ ArrayRef(const std::initializer_list<T> &Vec)
    : Data(Vec.begin() == Vec.end() ? (T*)nullptr : Vec.begin()),
    Length(Vec.size()) {}

    // ...
};


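  • llvm::ArrayRef represents a contiguous region of memory; at its core it is a start pointer and a length
  • llvm::ArrayRef provides many simple, convenient APIs
  • llvm::ArrayRef does not own the data; the data lives in some other buffer, whose lifetime must extend past that of the llvm::ArrayRef
  • In general it is not safe to store an llvm::ArrayRef object
  • ArrayRef provides a large number of constructors, accepting std::vector, std::array, std::initializer_list, and C arrays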
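The points above can be sketched with a much smaller class. ArrayView and sum below are hypothetical names, loosely modelled on the excerpt and not the real llvm::ArrayRef; the sketch only shows how a pointer plus a length, together with a few implicit constructors, lets one function accept several kinds of contiguous storage without copying.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal view: a start pointer and a length, nothing owned.
template <typename T>
class ArrayView {
    const T *Data = nullptr;
    std::size_t Length = 0;

public:
    ArrayView() = default;
    ArrayView(const T *data, std::size_t length) : Data(data), Length(length) {}

    template <typename A>
    ArrayView(const std::vector<T, A> &Vec)
        : Data(Vec.data()), Length(Vec.size()) {}

    template <std::size_t N>
    ArrayView(const std::array<T, N> &Arr) : Data(Arr.data()), Length(N) {}

    template <std::size_t N>
    ArrayView(const T (&Arr)[N]) : Data(Arr), Length(N) {}

    const T *data() const { return Data; }
    std::size_t size() const { return Length; }
    const T &operator[](std::size_t Index) const {
        assert(Index < Length && "index out of range");
        return Data[Index];
    }
};

// Accepts any of the supported sources through the view, passed by value.
int sum(ArrayView<int> Values) {
    int Total = 0;
    for (std::size_t I = 0; I < Values.size(); ++I)
        Total += Values[I];
    return Total;
}
```

A call such as sum(SomeVector) or sum(SomeCArray) builds the view on the fly. The view is only safe to use while the source it was built from is still alive.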

Here I need to digress a little and introduce two C++11 features that appear in the code above: constexpr constructors and std::vector::data.

constexpr constructor

The constexpr specifier declares that it is possible to evaluate the value of the function or variable at compile time. Such variables and functions can then be used where only compile time constant expressions are allowed (provided that appropriate function arguments are given).

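Marking an ordinary function constexpr is easy to understand: it allows the function to be evaluated at compile time to obtain its return value. But what is the point of marking a constructor constexpr?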


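  • It can reduce runtime overhead: when a constexpr constructor is used to initialize a constexpr variable, the initialization is done entirely at compile time and no constructor code needs to run at run time
  • It allows the type to be a literal type, so objects of the type can be constexpr variables and can therefore be used in places such as non-type template arguments and array sizes. From another angle, objects of a user-defined type with a constexpr constructor can be used to form constant expressions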
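Both points can be illustrated with a small sketch. Dimensions, Grid, Cells, Buffer, GridBuffer and runtimeArea are hypothetical names chosen for this example only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical literal type: it has a constexpr constructor.
struct Dimensions {
    std::size_t Rows;
    std::size_t Cols;
    constexpr Dimensions(std::size_t R, std::size_t C) : Rows(R), Cols(C) {}
    constexpr std::size_t area() const { return Rows * Cols; }
};

// A constexpr variable: initialized at compile time.
constexpr Dimensions Grid(3, 4);

// Used where only constant expressions are allowed.
static_assert(Grid.area() == 12, "evaluated at compile time");
int Cells[Grid.area()];  // array size

template <std::size_t N>
struct Buffer {
    static constexpr std::size_t Size = N;
};
using GridBuffer = Buffer<Grid.area()>;  // non-type template argument

// The same constructor still works with values known only at run time.
std::size_t runtimeArea(std::size_t R, std::size_t C) {
    Dimensions D(R, C);  // D is an ordinary object, not a constexpr one
    return D.area();
}
```

As runtimeArea shows, a constexpr constructor does not make every object of the type constexpr. Only objects declared constexpr and initialized with constant expressions are.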

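There are many concepts involved here, such as constant expression and literal type; given my limited C++ knowledge, I will not speculate further. Of course, a constructor has to meet certain requirements before it can be constexpr; see "Constexpr constructors (C++11)".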

The category of types that can be used for constexpr variables is called literal type. Most notably, literal types include classes that have constexpr constructors, so that values of the type can be initialized calling constexpr functions.

1. Why would you use a constexpr on a constructor?
2. Does specifying constexpr on constructor automatically makes all objects created from it to be constexpr?






llvm::ArrayRef<int> Array({1, 2, 3, 4});
Array[0] = 10;  // error: ArrayRef only gives const access to its elements

llvm::StringRef plays the same role for strings that llvm::ArrayRef plays for arrays. "Purpose of ArrayRef" contains a fairly precise description:

It’s the same idea behind std::string_view: to provide a general view to something, without managing it’s lifetime.

In the case of ArrayRef(which is a terrible name, ArrayView is much better IMHO), it can view other arrays type, including the non-object builtin array(C array).




/// StringRef - Represent a constant reference to a string, i.e. a character
/// array and a length, which need not be null terminated.
/// This class does not own the string data, it is expected to be used in
/// situations where the character data resides in some other buffer, whose
/// lifetime extends past that of the StringRef. For this reason, it is not in
/// general safe to store a StringRef.
class StringRef {
    /// The start of the string, in an external buffer.
    const char *Data = nullptr;

    /// The length of the string.
    size_t Length = 0;
    /// Construct an empty string ref.
    /*implicit*/ StringRef() = default;

    /// Disable conversion from nullptr. This prevents things like
    /// if (S == nullptr)
    StringRef(std::nullptr_t) = delete;

    /// Construct a string ref from a cstring.
    /*implicit*/ StringRef(const char *Str)
    : Data(Str), Length(Str ? ::strlen(Str) : 0) {}

    /// Construct a string ref from a pointer and length.
    /*implicit*/ constexpr StringRef(const char *data, size_t length)
    : Data(data), Length(length) {}

    /// Construct a string ref from an std::string
    /*implicit*/ StringRef(const std::string &Str)
    : Data(Str.data()), Length(Str.length()) {}

    // ...
};

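llvm::StringRef can be initialized directly from a C string or a std::string, and it offers a set of APIs that fill gaps in std::string. For example, to check whether a string ends with a given substring, I can use endswith() or endswith_lower(). With a C string I would have to implement this myself, and std::string had no ready-made API for it either (ends_with was only added in C++20).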
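To show how little such helpers need once a pointer-and-length view is available, here is a rough sketch built on C++17 std::string_view. endsWith and endsWithInsensitive are hypothetical helpers written for this example; they are not the LLVM implementations of endswith() and endswith_lower().

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: does Str end with Suffix (case-sensitive)?
bool endsWith(std::string_view Str, std::string_view Suffix) {
    return Str.size() >= Suffix.size() &&
           Str.compare(Str.size() - Suffix.size(), Suffix.size(), Suffix) == 0;
}

// Hypothetical helper: the same check, ignoring ASCII case.
bool endsWithInsensitive(std::string_view Str, std::string_view Suffix) {
    if (Str.size() < Suffix.size())
        return false;
    const std::size_t Offset = Str.size() - Suffix.size();
    for (std::size_t I = 0; I < Suffix.size(); ++I) {
        // Cast to unsigned char first, as std::tolower requires.
        const auto A = static_cast<unsigned char>(Str[Offset + I]);
        const auto B = static_cast<unsigned char>(Suffix[I]);
        if (std::tolower(A) != std::tolower(B))
            return false;
    }
    return true;
}
```

Because both parameters are views, the same two functions accept string literals, C strings and std::string objects without copying any characters.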




The class template basic_string_view describes an object that can refer to a constant contiguous sequence of char-like objects with the first element of the sequence at position zero.


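  • The elements referred to must be of a char-like type. The description in [strings.general] is rather vague; if anyone can give a precise definition, please let me know. I tried it on godbolt, and types such as int work fine.
  • The content referred to is constant, because the pointer that std::basic_string_view stores to that content is a pointer to const
  • The sequence referred to must be contiguous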


Type                                Definition
std::experimental::string_view      std::experimental::basic_string_view<char>
std::experimental::wstring_view     std::experimental::basic_string_view<wchar_t>
std::experimental::u16string_view   std::experimental::basic_string_view<char16_t>
std::experimental::u32string_view   std::experimental::basic_string_view<char32_t>


template<class charT, class traits = char_traits<charT>>
class basic_string_view {
    // ...
    const_pointer data_;
    size_type size_;
};


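As shown above, a std::basic_string_view object is very simple and has a small memory footprint, yet it offers a very large interface. Besides covering most of the read-only std::string interface, it adds further APIs such as the three below; see the C++ Standard for more.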

constexpr void remove_prefix(size_type n);
1. Requires: n <= size()
2. Effects: Equivalent to data_ += n; size_ -= n;

constexpr void remove_suffix(size_type n);
1. Requires: n <= size()
2. Effects: Equivalent to size_ -= n;

constexpr void swap(basic_string_view& s) noexcept;
1. Effects: Exchanges the values of *this and s.
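As a quick illustration of remove_prefix and remove_suffix, the sketch below trims spaces from both ends of a view without copying any characters. trimSpaces is a hypothetical helper, and the code assumes C++17 std::string_view rather than the std::experimental version.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: strips leading and trailing spaces without copying.
std::string_view trimSpaces(std::string_view S) {
    while (!S.empty() && S.front() == ' ')
        S.remove_prefix(1);  // advances the start pointer, shrinks the size
    while (!S.empty() && S.back() == ' ')
        S.remove_suffix(1);  // only shrinks the size
    return S;
}
```

The returned view still points into the caller's buffer, so it is only valid for as long as that buffer is.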


Good introductory material on std::string_view: CppCon 2015: Marshall Clow “string_view”, and string_view: a non-owning reference to a string, revision 4.


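The advantage of std::string_view is that it offers a more efficient way to process strings: in the vast majority of cases, working with a std::string_view is cheaper than working with a std::string directly. Borrowing the example from CppCon 2015: Marshall Clow “string_view”: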

string extract_part(const string &bar) {
    return bar.substr(2, 3);
}
if (extract_part("ABCDEFG").front() == 'C') {
    /* do something */
}


string_view extract_part(string_view bar) {
    return bar.substr(2, 3);
}
if (extract_part("ABCDEFG").front() == 'C') {
    /* do something */
}

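substr on a std::string_view is just pointer arithmetic, with no allocation or copying, and the C++ Standard states that the complexity of std::string_view member functions is O(1) unless otherwise specified.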
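The difference between the two versions can be made visible by looking at where the result lives. In this sketch (C++17; extractCopy, extractView and startsAtOffset are hypothetical names), the view returned by substr points into the original buffer, while the std::string version produces a separate copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Copying version: substr builds a new std::string.
std::string extractCopy(const std::string &Bar) {
    return Bar.substr(2, 3);
}

// View version: substr only adjusts a pointer and a length.
std::string_view extractView(std::string_view Bar) {
    return Bar.substr(2, 3);
}

// True if View starts exactly Offset characters into Buffer's storage.
bool startsAtOffset(std::string_view View, const std::string &Buffer,
                    std::size_t Offset) {
    return View.data() == Buffer.data() + Offset;
}
```

The flip side of sharing the buffer is that the view must not be used after the std::string it was taken from has been destroyed or modified.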

The advantages of std::string_view are also touched on in "What is string_view?"; I quote part of it here.

The purpose of any and all kinds of “string reference” and “array reference” proposals is to avoid copying data which is already owned somewhere else and of which only a non-mutating view is required.

Such a view-handle class could be passed around cheaply by value and would offer cheap substringing operations (which can be implemented as simple pointer increments and size adjustments).

There is also plenty of discussion of std::string_view vs const std::string&, for example "How exactly std::string_view is faster than const std::string&?"



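  • Lifetime: a std::string_view can outlive the data it refers to. For example, treating a std::string_view like an ordinary std::string and returning it from a function or copying it around can be dangerous
  • Bounds checking: std::string_view has the same out-of-bounds access problems as std::string
  • Null terminator (a guess): a std::string_view does not need to be null-terminated, which may lead to security problems, though I have not yet found a good example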
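One possible illustration of the null-terminator point, offered as a sketch and not as a known real-world exploit: handing data() of a view to a C API that expects a terminated string makes the API read past the end of the view. The helper names lengthSeenByCApi and lengthAfterCopy are hypothetical, and the example is only well-defined because the underlying buffer here happens to be a null-terminated string literal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Length a C API would see when given View.data() directly. Only call this
// when the underlying buffer is known to be null-terminated somewhere.
std::size_t lengthSeenByCApi(std::string_view View) {
    return std::strlen(View.data());
}

// Safer alternative: copy into a std::string, whose c_str() is terminated
// exactly at the end of the view.
std::size_t lengthAfterCopy(std::string_view View) {
    const std::string Owned(View);
    return std::strlen(Owned.c_str());
}
```

For a five-character view taken from the front of "hello world", the first function reports the length of the whole literal, while the second reports five. If the underlying buffer had no terminator at all, the first call would read out of bounds.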
