Brief Introduction
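This is an optimization that the Swift developers recently made to Swift String. The PR is "SIL optimizer: Add a new string optimization #33128". According to its description, it covers the following optimizations:
- Replace x.append(y) with x = y when x is empty.
- Remove x.append("").
- Replace x.append(y) with x = x + y when x and y are both constant strings.
- Replace _typeName(T.self) with a constant string when T is statically known.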
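SIL Analysis
This optimization is implemented by adding a SIL pass, so it works at the SIL level. We therefore need a basic understanding of the SIL instructions involved in String append.
Let's look at a simple example. Create a file String.swift and add the following code:
func stringTest() -> String {
    var string = "Hello"
    string.append("Roy")
    return string
}
The code is simple: it creates a string and then uses string.append to append another string to its end.
swiftc -emit-sil String.swift > String.sil
The command above converts the Swift source into SIL. There is not much of it.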
// stringTest()
sil hidden @$s6String10stringTestSSyF : $@convention(thin) () -> @owned String {
bb0:
%0 = alloc_stack $String, var, name "string" // users: %7, %24, %23, %14, %19
%1 = string_literal utf8 "Hello" // user: %6
%2 = integer_literal $Builtin.Word, 5 // user: %6
%3 = integer_literal $Builtin.Int1, -1 // user: %6
%4 = metatype $@thin String.Type // user: %6
// function_ref String.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:)
%5 = function_ref @$sSS21_builtinStringLiteral17utf8CodeUnitCount7isASCIISSBp_BwBi1_tcfC : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String // user: %6
%6 = apply %5(%1, %2, %3, %4) : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String // user: %7
store %6 to %0 : $*String // id: %7
%8 = string_literal utf8 "Roy" // user: %13
%9 = integer_literal $Builtin.Word, 3 // user: %13
%10 = integer_literal $Builtin.Int1, -1 // user: %13
%11 = metatype $@thin String.Type // user: %13
// function_ref String.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:)
%12 = function_ref @$sSS21_builtinStringLiteral17utf8CodeUnitCount7isASCIISSBp_BwBi1_tcfC : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String // user: %13
%13 = apply %12(%8, %9, %10, %11) : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String // users: %18, %16
%14 = begin_access [modify] [static] %0 : $*String // users: %17, %16
// function_ref String.append(_:)
%15 = function_ref @$sSS6appendyySSF : $@convention(method) (@guaranteed String, @inout String) -> () // user: %16
%16 = apply %15(%13, %14) : $@convention(method) (@guaranteed String, @inout String) -> ()
end_access %14 : $*String // id: %17
release_value %13 : $String // id: %18
%19 = begin_access [read] [static] %0 : $*String // users: %20, %22
%20 = load %19 : $*String // users: %25, %21
retain_value %20 : $String // id: %21
end_access %19 : $*String // id: %22
destroy_addr %0 : $*String // id: %23
dealloc_stack %0 : $*String // id: %24
return %20 : $String // id: %25
} // end sil function '$s6String10stringTestSSyF'
// String.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:)
sil [serialized] [always_inline] [readonly] [_semantics "string.makeUTF8"] @$sSS21_builtinStringLiteral17utf8CodeUnitCount7isASCIISSBp_BwBi1_tcfC : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String
// String.append(_:)
sil [_semantics "string.append"] @$sSS6appendyySSF : $@convention(method) (@guaranteed String, @inout String) -> ()
Let's walk through it briefly.
%0 = alloc_stack $String, var, name "string" // users: %7, %24, %23, %14, %19
%1 = string_literal utf8 "Hello" // user: %6
%2 = integer_literal $Builtin.Word, 5 // user: %6
%3 = integer_literal $Builtin.Int1, -1 // user: %6
%4 = metatype $@thin String.Type // user: %6
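- alloc_stack T allocates (uninitialized) memory on the stack to hold a T and returns the address of the allocated memory.
- $String is the type we allocate memory for. Types in SIL start with $.
- %1 = string_literal utf8 "Hello" creates a reference to a string in the global string table. The result is a pointer to the data. The referenced string is always null-terminated. The literal value is specified using Swift's string literal syntax, and the encoding here is utf8.
- integer_literal $Builtin.Word, 5 creates an integer literal of type Builtin.Word with value 5. It is passed as the utf8CodeUnitCount argument, because "Hello" is 5 UTF-8 code units long.
- integer_literal $Builtin.Int1, -1 creates an integer literal of type Builtin.Int1 with value -1. Bool is also represented as Builtin.Int1 in SIL; a 1-bit -1 has its only bit set, so it means true. Here it is the isASCII argument.
- metatype $T.Type creates a reference to the metatype object for type T. Here we get a reference to the type String. Note that this is the actual type, because it contains no placeholder types.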
// function_ref String.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:)
%5 = function_ref @$sSS21_builtinStringLiteral17utf8CodeUnitCount7isASCIISSBp_BwBi1_tcfC : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String // user: %6
%6 = apply %5(%1, %2, %3, %4) : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String // user: %7
store %6 to %0 : $*String // id: %7
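- Register %5 is a reference to the function String.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:), the String initializer. At the SIL level it takes four parameters, of types Builtin.RawPointer, Builtin.Word, Builtin.Int1 and @thin String.Type, and it returns a String.
- apply is a function call. It calls %5 with the arguments %1, %2, %3, %4, and the result is held in register %6.
- store is a memory access instruction that stores the value %6 to the memory at address %0. %0 has type $*String, the address of a String, and %6 has type String. The store overwrites the memory at %0.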
%8 = string_literal utf8 "Roy" // user: %13
....
%13 = apply %12(%8, %9, %10, %11) : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String // users: %18, %16
These instructions work the same way as the ones above; they initialize the string "Roy".
%14 = begin_access [modify] [static] %0 : $*String // users: %17, %16
// function_ref String.append(_:)
%15 = function_ref @$sSS6appendyySSF : $@convention(method) (@guaranteed String, @inout String) -> () // user: %16
%16 = apply %15(%13, %14) : $@convention(method) (@guaranteed String, @inout String) -> ()
end_access %14 : $*String // id: %17
release_value %13 : $String // id: %18
%19 = begin_access [read] [static] %0 : $*String // users: %20, %22
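- begin_access obtains access to the memory at %0. [modify] requests modifying access.
- %15 is a function reference to String.append(_:), the append method of String. It takes two parameters, @guaranteed String and @inout String, and has no return value.
- apply is a function call. It calls %15 with the arguments %13 and %14, where %13 is the "Roy" string and %14 is the accessed address of %0, which holds "Hello".
The two parameter conventions, @guaranteed and @inout, and the begin_access instruction need some explanation.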
@guaranteed
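The SIL ownership model allows static lifetime invariants to be expressed and enforced by the SIL IR along SSA edges. Ownership SSA (OSSA) is a derived form of SSA that expresses ownership invariants along def-use edges.
Guaranteed is one of the ValueOwnershipKind values. A value with @guaranteed ownership is an immutable value with a scoped lifetime that is derived from an @owned value. The lifetime of the @guaranteed value constrains the lifetime of the @owned value: it statically prevents the @owned value from being destroyed until the lifetime of the @guaranteed value has ended.
In the SIL, the function called to produce %13 returns an @owned String, which matches the description above.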
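@inout arguments
The following explanation is from master/docs/SIL.rst#inout-arguments:
@inout arguments are passed into the function entry point by address. The callee does not take ownership of the referenced memory. The referenced memory must be initialized upon function entry and exit. If the @inout argument refers to a fragile physical variable, the argument is the address of that variable. If the @inout argument refers to a logical property, the argument is the address of a caller-owned writeback buffer. It is the caller's responsibility to initialize the buffer by storing the result of the property getter before calling the function, and to write back to the property on return by loading from the buffer and invoking the setter with the final value.
func setToOne(_ x: inout Int) {
    x = 1
}
A function like this one, for example, can modify the value of the argument passed to it from inside its body.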
begin_access
begin_access and end_access are memory access instructions. begin_access begins an access to memory and end_access ends it. Every access must be ended exactly once on every control-flow path.
StringOptimizationPass
The StringOptimizationPass class inherits from SILFunctionTransform and is a SIL pass.
/// The StringOptimization function pass.
class StringOptimizationPass : public SILFunctionTransform {
public:
void run() override {
SILFunction *F = getFunction();
if (!F->shouldOptimize())
return;
LLVM_DEBUG(llvm::dbgs() << "*** StringOptimization on function: "
<< F->getName() << " ***\n");
StringOptimization stringOptimization;
bool changed = stringOptimization.run(F);
if (changed) {
invalidateAnalysis(SILAnalysis::InvalidationKind::CallsAndInstructions);
}
}
};
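This class is simple. void run() is the entry point, and there is not much to it: it gets the SILFunction object via getFunction(), returns early if the function should not be optimized, and otherwise calls the run method of the StringOptimization class with the SILFunction object.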
StringOptimization
run
/// Main entry point of the optimization.
bool StringOptimization::run(SILFunction *F) {
  /// Look up the declaration of the standard library's String type. The pass
  /// needs it to recognize values of type String.
  NominalTypeDecl *stringDecl = F->getModule().getASTContext().getStringDecl();
  /// Without the declaration there is nothing to optimize.
  if (!stringDecl)
    return false;
  stringType = SILType::getPrimitiveObjectType(
      CanType(stringDecl->getDeclaredType()));
  /// Tracks whether the SIL was modified, i.e. whether any optimization was
  /// applied. Starts out false.
  bool changed = false;
  /// Visit every SILBasicBlock of the SILFunction and optimize it.
  for (SILBasicBlock &block : *F) {
    changed |= optimizeBlock(block);
  }
  return changed;
}
optimizeBlock
/// Optimizes a single basic block.
bool StringOptimization::optimizeBlock(SILBasicBlock &block) {
  bool changed = false;
  /// Maps identifiable objects (alloc_stack, inout parameters) to the string
  /// values which are stored in those objects.
  llvm::DenseMap<SILValue, SILValue> storedStrings;
  /// Visit every SILInstruction in the SILBasicBlock.
  for (auto iter = block.begin(); iter != block.end();) {
    SILInstruction *inst = &*iter++;
    /// A store of a String to an identifiable object.
    if (StoreInst *store = isStringStoreToIdentifyableObject(inst)) {
      /// Record the mapping from the store's destination address to the
      /// value being stored.
      storedStrings[store->getDest()] = store->getSrc();
      continue;
    }
    /// An apply of string.append with 2 arguments.
    if (ApplyInst *append = isSemanticCall(inst, semantics::STRING_APPEND, 2)) {
      /// Optimize String.append.
      if (optimizeStringAppend(append, storedStrings)) {
        changed = true;
        continue;
      }
    }
    /// An apply of typeName with 2 arguments.
    if (ApplyInst *typeName = isSemanticCall(inst, semantics::TYPENAME, 2)) {
      if (optimizeTypeName(typeName)) {
        changed = true;
        continue;
      }
    }
    // Remove items from storedStrings if inst overwrites (or potentially
    // overwrites) a stored String in an identifiable object.
    invalidateModifiedObjects(inst, storedStrings);
  }
  return changed;
}
StoreInst *store = isStringStoreToIdentifyableObject(inst)
Here store is the instruction store %6 to %0 : $*String // id: %7.
storedStrings[store->getDest()] = store->getSrc();
enum {
/// the value being stored
Src,
/// the lvalue being stored to
Dest
};
SILValue getSrc() const { return Operands[Src].get(); }
SILValue getDest() const { return Operands[Dest].get(); }
store->getDest() returns the destination of the store instruction, which is %0. store->getSrc() returns the value being stored, which is %6.
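The bookkeeping done with storedStrings can be pictured with a small stand-alone model. The following is only a sketch under simplifying assumptions: Kind, Inst and trackStoredStrings are hypothetical names, addresses are plain integers, and strings are tracked by content rather than as SIL values. It is not the compiler's code.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical toy instruction set; these are not Swift compiler types.
enum class Kind { Store, Append, UnknownWrite };

struct Inst {
  Kind kind;
  int addr;           // address being written, like %0
  std::string value;  // stored or appended string; unused for UnknownWrite
};

// Walks one "basic block" and returns the string known to be stored at each
// address at the end of the block. Addresses with unknown content are absent.
std::unordered_map<int, std::string>
trackStoredStrings(const std::vector<Inst> &block) {
  std::unordered_map<int, std::string> stored;
  for (const Inst &inst : block) {
    switch (inst.kind) {
    case Kind::Store:
      // Plays the role of storedStrings[store->getDest()] = store->getSrc().
      stored[inst.addr] = inst.value;
      break;
    case Kind::Append: {
      // If the current content is known, the content after the append is
      // known as well. Otherwise the address stays untracked.
      auto it = stored.find(inst.addr);
      if (it != stored.end())
        it->second += inst.value;
      break;
    }
    case Kind::UnknownWrite:
      // Plays the role of invalidateModifiedObjects: forget what we knew.
      stored.erase(inst.addr);
      break;
    }
  }
  return stored;
}
```

Feeding it a store of "Hello" to address 0 followed by an append of "Roy" leaves address 0 mapped to the combined string, while an address that is stored to and then hit by an unknown write is dropped from the map. The map is local to one block, which mirrors storedStrings being declared inside optimizeBlock.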
ApplyInst *append = isSemanticCall(inst, semantics::STRING_APPEND, 2)
semantics::STRING_APPEND is SEMANTICS_ATTR(STRING_APPEND, "string.append"), defined in include/swift/AST/SemanticAttrs.def. The corresponding declaration in SIL is:
// String.append(_:)
sil [_semantics "string.append"] @$sSS6appendyySSF : $@convention(method) (@guaranteed String, @inout String) -> ()
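So this append is actually %16.
semantics::TYPENAME is newly added by this PR as SEMANTICS_ATTR(TYPENAME, "typeName"), also in include/swift/AST/SemanticAttrs.def. It serves one of the optimizations listed at the beginning, which is not covered in detail here.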
isStringStoreToIdentifyableObject
StoreInst *StringOptimization::
isStringStoreToIdentifyableObject(SILInstruction *inst) {
  auto *store = dyn_cast<StoreInst>(inst);
  /// It must be a StoreInst.
  if (!store)
    return nullptr;
  /// The stored value must be of type String.
  if (store->getSrc()->getType() != stringType)
    return nullptr;
  SILValue destAddr = store->getDest();
  /// We only handle alloc_stack and indirect function arguments. For those we
  /// can be sure that they are not aliased, just by checking all users.
  if (!isa<AllocStackInst>(destAddr) && !isExclusiveArgument(destAddr))
    return nullptr;
  /// Return the cached answer if there is one; otherwise compute it below
  /// and cache it.
  if (identifyableObjectsCache.count(destAddr) != 0) {
    return identifyableObjectsCache[destAddr] ? store : nullptr;
  }
  /// Check if it is an "identifyable" object. This is the case if it only has
  /// users which we are able to track in a simple way: stores and applies.
  for (Operand *use : destAddr->getUses()) {
    SILInstruction *user = use->getUser();
    switch (user->getKind()) {
      case SILInstructionKind::DebugValueAddrInst:
      case SILInstructionKind::DeallocStackInst:
      case SILInstructionKind::LoadInst:
        break;
      default:
        if (!mayWriteToIdentifyableObject(inst)) {
          // We don't handle user. It is some instruction which may write to
          // destAddr or let destAddr "escape" (like an address projection).
          identifyableObjectsCache[destAddr] = false;
          return nullptr;
        }
        break;
    }
  }
  identifyableObjectsCache[destAddr] = true;
  return store;
}
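store->getSrc()->getType() != stringType
The stored value must be of type String. Here that is %6, the string created by String.init.
isa<AllocStackInst>(destAddr)
The destination address must be an AllocStackInst, here %0, which is defined by %0 = alloc_stack $String, var, name "string".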
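The compute-once-then-cache pattern used for identifyableObjectsCache can be sketched in isolation. In this toy model, which is an illustration and not the compiler's data structure, UserKind and IdentifiableObjectCache are hypothetical names, an address is an integer, and its users are handed in as a list of pre-classified kinds.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical classification of the users of an address.
enum class UserKind {
  DebugValue,  // harmless: only describes the variable
  Dealloc,     // harmless: ends the allocation
  Load,        // harmless: reads only
  KnownWrite,  // a store or call that the pass knows how to track
  Escape       // anything else, e.g. an address projection
};

class IdentifiableObjectCache {
public:
  // Returns true if every user of `addr` can be tracked. The verdict is
  // computed once per address and afterwards served from the cache.
  bool isIdentifiable(int addr, const std::vector<UserKind> &users) {
    auto cached = cache.find(addr);
    if (cached != cache.end())
      return cached->second;

    ++scans;
    bool ok = true;
    for (UserKind user : users) {
      if (user == UserKind::Escape) {
        ok = false;
        break;
      }
    }
    cache[addr] = ok;
    return ok;
  }

  int scans = 0;  // number of full user scans, kept for illustration only

private:
  std::unordered_map<int, bool> cache;
};
```

Asking twice about the same address performs a single scan, and a negative verdict is cached as well, so an address that escaped once is rejected on later queries without being rescanned.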
isSemanticCall
Returns the apply instruction if inst is a call of a function that has the semantics attribute attr and exactly numArgs arguments.
ApplyInst *StringOptimization::isSemanticCall(SILInstruction *inst,
                                     StringRef attr, unsigned numArgs) {
  auto apply = dyn_cast<ApplyInst>(inst);
  if (!apply || apply->getNumArguments() != numArgs)
    return nullptr;
  SILFunction *callee = apply->getReferencedFunctionOrNull();
  if (callee && callee->hasSemanticsAttr(attr))
    return apply;
  return nullptr;
}
optimizeStringAppend
Handles the String.append cases listed at the beginning.
bool StringOptimization::optimizeStringAppend(ApplyInst *appendCall,
                        llvm::DenseMap<SILValue, SILValue> &storedStrings) {
  /// Argument 0 of appendCall.
  SILValue rhs = appendCall->getArgument(0);
  /// String information about argument 0.
  StringInfo rhsString = getStringInfo(rhs);
  // Remove lhs.append(rhs) if rhs is empty. This is the second case from the
  // list.
  if (rhsString.isEmpty()) {
    appendCall->eraseFromParent();
    return true;
  }
  /// Argument 1 of appendCall.
  SILValue lhsAddr = appendCall->getArgument(1);
  /// String information about storedStrings[lhsAddr].
  StringInfo lhsString = getStringInfo(storedStrings[lhsAddr]);
  // The following two optimizations are a trade-off: Performance-wise it may be
  // beneficial to initialize an empty string with reserved capacity and then
  // append multiple other string components.
  // Removing the empty string (with the reserved capacity) might result in more
  // allocations.
  // So we just do this optimization up to a certain capacity limit (found by
  // experiment).
  if (lhsString.reservedCapacity > 50)
    return false;
  // Replace lhs.append(rhs) with 'lhs = rhs' if lhs is empty. This is the
  // first case from the list.
  if (lhsString.isEmpty()) {
    // Replace the String.append instruction with a store of rhs.
    replaceAppendWith(appendCall, rhs, /*copyNewValue*/ true);
    storedStrings[lhsAddr] = rhs;
    return true;
  }
  // Replace lhs.append(rhs) with "lhs = lhs + rhs" if both lhs and rhs are
  // constant strings. This is the third case from the list.
  if (lhsString.isConstant() && rhsString.isConstant()) {
    std::string concat = lhsString.str;
    /// Concatenate the two strings.
    concat += rhsString.str;
    // Create the call to the string initializer.
    if (ApplyInst *stringInit = createStringInit(concat, appendCall)) {
      // Replace the String.append instruction with a store of the new
      // string and return true.
      replaceAppendWith(appendCall, stringInit, /*copyNewValue*/ false);
      storedStrings[lhsAddr] = stringInit;
      return true;
    }
  }
  return false;
}
ApplyInst *appendCall
is the %16 from the SIL section.
// function_ref String.append(_:)
%15 = function_ref @$sSS6appendyySSF : $@convention(method) (@guaranteed String, @inout String) -> () // user: %16
%16 = apply %15(%13, %14) : $@convention(method) (@guaranteed String, @inout String) -> ()
It has two arguments, %13 and %14. %13 is the "Roy" string value, and %14 is the accessed address of %0, which holds "Hello".
SILValue rhs = appendCall->getArgument(0);
StringInfo rhsString = getStringInfo(rhs);
if (rhsString.isEmpty()) {
appendCall->eraseFromParent();
return true;
}
rhs is %13, the "Roy" value, and rhsString is the information gathered about that string. If the string is empty, appendCall is removed, which is the "remove x.append("")" case.
SILValue lhsAddr = appendCall->getArgument(1);
StringInfo lhsString = getStringInfo(storedStrings[lhsAddr]);
lhsAddr is %14, the begin_access of %0, so it refers to the same memory as %0.
store %6 to %0 : $*String
Combined with this store instruction, storedStrings[lhsAddr] is %6, the "Hello" value, and lhsString is the information about the "Hello" string.
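The order of the checks in optimizeStringAppend matters, and it can be restated as a small pure function. This is a simplified sketch and not the pass itself: Info, Action, Decision, decideAppend and the three helper constructors are hypothetical names, and strings are represented by their content instead of SIL values. Only the threshold of 50 is taken from the code above.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified summary of what is known about a string value.
struct Info {
  bool isConstant;       // true if the content is a known literal
  std::string str;       // the literal content when isConstant is true
  bool knownEmpty;       // true if created by an empty-string initializer
  int reservedCapacity;  // capacity requested at creation, otherwise 0

  bool isEmpty() const { return knownEmpty || (isConstant && str.empty()); }
};

// Convenience constructors for the cases used in this sketch.
Info unknownString() { return Info{false, "", false, 0}; }
Info literal(const std::string &s) { return Info{true, s, false, 0}; }
Info emptyWithCapacity(int capacity) { return Info{false, "", true, capacity}; }

enum class Action { Keep, RemoveAppend, AssignRhs, AssignConcat };

struct Decision {
  Action action;
  std::string folded;  // only meaningful for AssignConcat
};

const int kCapacityLimit = 50;  // same threshold as in the pass

// Decides what to do with lhs.append(rhs), in the same order as the pass.
Decision decideAppend(const Info &lhs, const Info &rhs) {
  if (rhs.isEmpty())
    return {Action::RemoveAppend, ""};  // x.append("") is a no-op
  if (lhs.reservedCapacity > kCapacityLimit)
    return {Action::Keep, ""};          // keep large reserved buffers
  if (lhs.isEmpty())
    return {Action::AssignRhs, ""};     // x = y
  if (lhs.isConstant && rhs.isConstant)
    return {Action::AssignConcat, lhs.str + rhs.str};  // fold at compile time
  return {Action::Keep, ""};
}
```

For the article's example, decideAppend(literal("Hello"), literal("Roy")) selects AssignConcat with the folded text "HelloRoy". An empty string created with a capacity of 100 is left alone, because replacing it would throw away the reserved buffer.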
replaceAppendWith
/// Replace a String.append() with a store of \p newValue to the destination.
void StringOptimization::replaceAppendWith(ApplyInst *appendCall,
                                 SILValue newValue, bool copyNewValue) {
  SILBuilder builder(appendCall);
  /// The SILLocation of appendCall.
  SILLocation loc = appendCall->getLoc();
  /// Argument 1 of appendCall: the address of the string being appended to.
  SILValue destAddr = appendCall->getArgument(1);
  if (appendCall->getFunction()->hasOwnership()) {
    if (copyNewValue)
      newValue = builder.createCopyValue(loc, newValue);
    builder.createStore(loc, newValue, destAddr,
                        StoreOwnershipQualifier::Assign);
  } else {
    if (copyNewValue)
      builder.createRetainValue(loc, newValue, builder.getDefaultAtomicity());
    builder.createDestroyAddr(loc, destAddr);
    builder.createStore(loc, newValue, destAddr,
                        StoreOwnershipQualifier::Unqualified);
  }
  appendCall->eraseFromParent();
}
The replacement instructions are built with SILBuilder.
appendCall->getFunction()->hasOwnership()
/// Returns true if this function has qualified ownership instructions in it.
bool hasOwnership() const { return HasOwnership; }
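hasOwnership() returns true if the function that contains appendCall is in Ownership SSA (OSSA) form, that is, if it uses ownership-qualified instructions. It describes the enclosing function, not the @guaranteed convention of the first parameter of append. The SIL listed earlier uses an unqualified store together with retain_value and release_value, so it corresponds to the else branch.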
builder.createDestroyAddr(loc, destAddr);
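This creates a destroy_addr instruction. Without ownership, an unqualified store does not release the value it overwrites, so the old string at destAddr has to be destroyed explicitly before the new value is stored. This is separate from the destroy_addr that already exists in the original SIL and ends the lifetime of the variable at the end of the function: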
destroy_addr %0 : $*String // id: %23
builder.createStore(loc, newValue, destAddr,
StoreOwnershipQualifier::Assign);
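This creates the store instruction that stores newValue to destAddr, with StoreOwnershipQualifier::Assign. In OSSA a store [assign] destroys the old value at the destination itself, which is why that branch needs no destroy_addr.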
createStringInit
Creates the call instruction for a string initializer.
/// Creates a call to a string initializer.
ApplyInst *StringOptimization::createStringInit(StringRef str,
SILInstruction *beforeInst) {
SILBuilder builder(beforeInst);
SILLocation loc = beforeInst->getLoc();
SILModule &module = beforeInst->getFunction()->getModule();
ASTContext &ctxt = module.getASTContext();
if (!makeUTF8Func) {
// Find the String initializer which takes a string_literal as argument.
ConstructorDecl *makeUTF8Decl = ctxt.getMakeUTF8StringDecl();
if (!makeUTF8Decl)
return nullptr;
auto Mangled = SILDeclRef(makeUTF8Decl, SILDeclRef::Kind::Allocator).mangle();
makeUTF8Func = module.findFunction(Mangled, SILLinkage::PublicExternal);
if (!makeUTF8Func)
return nullptr;
}
auto *literal = builder.createStringLiteral(loc, str,
StringLiteralInst::Encoding::UTF8);
auto *length = builder.createIntegerLiteral(loc,
SILType::getBuiltinWordType(ctxt),
literal->getCodeUnitCount());
auto *isAscii = builder.createIntegerLiteral(loc,
SILType::getBuiltinIntegerType(1, ctxt),
intmax_t(ctxt.isASCIIString(str)));
SILType stringMetaType = SILType::getPrimitiveObjectType(
CanType(MetatypeType::get(stringType.getASTType(),
MetatypeRepresentation::Thin)));
auto *metaTypeInst = builder.createMetatype(loc, stringMetaType);
auto *functionRef = builder.createFunctionRefFor(loc, makeUTF8Func);
return builder.createApply(loc, functionRef, SubstitutionMap(),
{ literal, length, isAscii, metaTypeInst });
}
This one is not covered in detail. It mirrors the two string initializer call sequences in the SIL above, %1 to %6 and %8 to %13.
getStringInfo
Returns information about the value if it is a constant string.
/// Returns information about value if it's a constant string.
StringOptimization::StringInfo StringOptimization::getStringInfo(SILValue value) {
  // Start with a non-constant result.
  StringInfo result;
  auto *apply = dyn_cast_or_null<ApplyInst>(value);
  if (!apply)
    return result;
  SILFunction *callee = apply->getReferencedFunctionOrNull();
  if (!callee)
    return result;
  // An empty-string initializer: set result.numCodeUnits = 0.
  if (callee->hasSemanticsAttr(semantics::STRING_INIT_EMPTY)) {
    result.numCodeUnits = 0;
    return result;
  }
  // An empty-string initializer with a reserved capacity: set
  // result.numCodeUnits = 0 and record the capacity.
  if (callee->hasSemanticsAttr(semantics::STRING_INIT_EMPTY_WITH_CAPACITY)) {
    result.numCodeUnits = 0;
    result.reservedCapacity = std::numeric_limits<int>::max();
    if (apply->getNumArguments() > 0) {
      if (Optional<int> capacity = getIntConstant(apply->getArgument(0)))
        // result.reservedCapacity is the capacity requested at creation.
        result.reservedCapacity = capacity.getValue();
    }
    return result;
  }
  // The string literal initializer.
  if (callee->hasSemanticsAttr(semantics::STRING_MAKE_UTF8)) {
    SILValue stringVal = apply->getArgument(0);
    auto *stringLiteral = dyn_cast<StringLiteralInst>(stringVal);
    SILValue lengthVal = apply->getArgument(1);
    auto *intLiteral = dyn_cast<IntegerLiteralInst>(lengthVal);
    if (intLiteral && stringLiteral &&
        // For simplicity, we only support UTF8 string literals.
        stringLiteral->getEncoding() == StringLiteralInst::Encoding::UTF8) {
      result.str = stringLiteral->getValue();
      result.numCodeUnits = intLiteral->getValue().getSExtValue();
      return result;
    }
  }
  return result;
}
semantics::STRING_INIT_EMPTY
is SEMANTICS_ATTR(STRING_INIT_EMPTY, "string.init_empty"), defined in include/swift/AST/SemanticAttrs.def.
semantics::STRING_INIT_EMPTY_WITH_CAPACITY
is SEMANTICS_ATTR(STRING_INIT_EMPTY_WITH_CAPACITY, "string.init_empty_with_capacity"), also in include/swift/AST/SemanticAttrs.def. It marks the initializer that creates an empty string with a reserved capacity.
semantics::STRING_MAKE_UTF8
marks the string literal initializer. Its SIL declaration is:
// String.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:)
sil [serialized] [always_inline] [readonly] [_semantics "string.makeUTF8"] @$sSS21_builtinStringLiteral17utf8CodeUnitCount7isASCIISSBp_BwBi1_tcfC : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String
The apply instruction is:
%6 = apply %5(%1, %2, %3, %4) : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String // user: %7
SILValue stringVal = apply->getArgument(0);
Argument 0 is %1, which is a StringLiteralInst.
SILValue lengthVal = apply->getArgument(1);
Argument 1 is %2, the string length 5.
result.str = stringLiteral->getValue();
This reads the string itself, "Hello".
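The classification performed by getStringInfo can be tried out on a toy representation. The sketch below makes several assumptions that the article does not confirm: ToyCall, ToyStringInfo and getToyStringInfo are hypothetical names, and the convention that numCodeUnits starts at -1 with isConstant() meaning numCodeUnits >= 0 is inferred from the comment "Start with a non-constant result", since the definition of StringInfo is not shown here.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Hypothetical stand-in for an apply of a string initializer.
struct ToyCall {
  std::string semantics;  // semantics attribute of the callee
  std::string literal;    // literal operand, used for string.makeUTF8
  bool hasConstCapacity;  // whether the capacity operand is a known constant
  int capacity;           // the constant capacity, if hasConstCapacity
};

// Assumed shape of the result; -1 code units means "not a known constant".
struct ToyStringInfo {
  std::string str;
  int numCodeUnits;
  int reservedCapacity;

  bool isConstant() const { return numCodeUnits >= 0; }
  bool isEmpty() const { return isConstant() && str.empty(); }
};

ToyStringInfo getToyStringInfo(const ToyCall &call) {
  ToyStringInfo result{"", -1, 0};  // start with a non-constant result
  if (call.semantics == "string.init_empty") {
    result.numCodeUnits = 0;
  } else if (call.semantics == "string.init_empty_with_capacity") {
    result.numCodeUnits = 0;
    // An unknown capacity is treated as "as large as possible".
    result.reservedCapacity = call.hasConstCapacity
                                  ? call.capacity
                                  : std::numeric_limits<int>::max();
  } else if (call.semantics == "string.makeUTF8") {
    result.str = call.literal;
    result.numCodeUnits = static_cast<int>(call.literal.size());
  }
  return result;
}
```

Treating an unknown capacity as the maximum is the conservative choice: such a string always exceeds the capacity limit checked in optimizeStringAppend, so the empty-string replacement is skipped for it.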
ASTContext.cpp
ASTContext.cpp gains a getMakeUTF8StringDecl() method that returns the MakeUTF8StringDecl. createStringInit uses it to build a string initializer call by hand.
ConstructorDecl *ASTContext::getMakeUTF8StringDecl() const {
  if (getImpl().MakeUTF8StringDecl)
    return getImpl().MakeUTF8StringDecl;
  // Look up all initializers of String.
  auto initializers =
    getStringDecl()->lookupDirect(DeclBaseName::createConstructor());
  for (Decl *initializer : initializers) {
    auto *constructor = cast<ConstructorDecl>(initializer);
    auto Attrs = constructor->getAttrs();
    for (auto *A : Attrs.getAttributes<SemanticsAttr>()) {
      if (A->Value != semantics::STRING_MAKE_UTF8)
        continue;
      auto ParamList = constructor->getParameters();
      if (ParamList->size() != 3)
        continue;
      ParamDecl *param = constructor->getParameters()->get(0);
      if (param->getArgumentName().str() != "_builtinStringLiteral")
        continue;
      getImpl().MakeUTF8StringDecl = constructor;
      return constructor;
    }
  }
  return nullptr;
}
semantics::STRING_MAKE_UTF8
is SEMANTICS_ATTR(STRING_MAKE_UTF8, "string.makeUTF8"), defined in include/swift/AST/SemanticAttrs.def. The SIL is:
// String.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:)
sil [serialized] [always_inline] [readonly] [_semantics "string.makeUTF8"] @$sSS21_builtinStringLiteral17utf8CodeUnitCount7isASCIISSBp_BwBi1_tcfC : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String
ParamList->size() != 3
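checks that the constructor declares exactly three parameters: _builtinStringLiteral, utf8CodeUnitCount and isASCII. The SIL function shows four, because the @thin String.Type metatype is passed as an additional argument.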
ParamDecl *param = constructor->getParameters()->get(0);
if (param->getArgumentName().str() != "_builtinStringLiteral")
This reads the name of parameter 0 from the parameter list and checks that it is "_builtinStringLiteral", which matches the SIL as well.
Adding the Pass to the PassManager
Finally, the pass is registered in include/swift/SILOptimizer/PassManager/Passes.def:
PASS(StringOptimization, "string-optimization",
"Optimization for String operations")