Recently there was a discussion on one of our internal email aliases in which this problem came up. I though it was an interesting problem so I posed this Quiz to assorted people I work with to see what kinds of things they would say.
Considering these three options:
Option 1:
sw.WriteLine(subject + ": " + message);
Option 2:
sw.WriteLine("{0}: {1}", subject, message);
Option 3:
sw.Write(subject);
sw.Write(": ");
sw.WriteLine(message);
Answer these questions:
Q1. Which of these choices offers the best performance?
Q2: Can you characterize the memory usage of each of these?
Q3: Which would the performance team generally recommend, and why?
Q4: What special factors might alter this recommendation?
Q5: What did you have do assume about "sw" to answer the question?
I encourage you to think about this for a few minutes (more is good too) before reading past the line. SPOILERS/ANSWERS follow.
For everyone that took the time to think about this at all, thank you very much.
In the course of my job I'm often asked to comment of the probable performance of an assortment of solutions to give guidance, or at least suggest what measurements should be made. So I approached this quiz the same way and made my best guesses/recomendations as I would if I could not do the measurements. Then I went back and did the actual measurements.
Here are my own answers:
Q1. Which of these choices offers the best performance?
Q2: Can you characterize the memory usage of each of these?
(These answers all proved to be "close but no cigar" due to the unusual WriteLine behavior discussed in the real analysis, see below)
All three probably have allocations associated with buffering the stream, ignoring those as invariant, the allocations unique to each choice are:
#1 single concat operation, one temporary string
#2 assorted allocations, including string builder, underlying string buffer, vararg array (I was close)
#3 no allocations
Q3: Which would the performance team generally recommend, and why?
Even though it's the worst performing, and we knew that much in advance, both of your CLR Performance Architects concur that #2 should be the default choice. In the highly unlikely event that it becomes a perf problem the issue is readily addressable with only modest local changes. Normally you're just cashing-in on some nice maintainability. We do not alter our choice given the data below.
Q4: What special factors might alter this recommendation?
Specific measurements indicating that the code path had become a hotspot.
Q5: What did you have do assume about "sw" to answer the question?
Only that the stream did not have exotic behavior (such as weird cryptographic features that make the cost model very complex) and that it was buffered. In the event of an unbuffered stream of one type or another there are signficant semantic differences between (1 or 2) and (3) and potentially huge perf differences too.
OK, time to see how we did.
To do the analysis below I used the following benchmark program and CLR Profiler, which is one way to look at this data.
namespace Test |
{ |
using System; |
using System.IO; |
class Test |
{ |
static private String s1 = "Hello"; |
static private String s2 = "Good bye"; |
static private int iterations = 5; |
static private volatile int foo = 0; |
static private MemoryStream ms = new MemoryStream(100000); |
static private StreamWriter sw = new StreamWriter(ms); |
public static void Main(string[] args) |
{ |
int i; |
for (i=0;i<iterations;i++) Test1(); |
for (i=0;i<iterations;i++) Test2(); |
for (i=0;i<iterations;i++) Test3(); |
} |
public static void Test1() |
{ |
sw.WriteLine(s1+": "+s2); |
foo++; |
} |
public static void Test2() |
{ |
sw.WriteLine("{0}: {1}", s1,s2); |
foo++; |
} |
public static void Test3() |
{ |
sw.Write(s1); |
sw.Write(": "); |
sw.WriteLine(s2); |
foo++; |
} |
} |
} |
The results below are in the form of an execution trace showing functions and allocations for each of the three options as reported by CLRProfiler.
Results for Option 1 | |||||||||
Calls | Bytes | Allocs | |||||||
static void Test.Test::Test1() | 4 | 94 | 2 | ||||||
static String System.String::Concat(String,String,String) | 48 | 1 | |||||||
System.String(48 bytes) | 48 | ||||||||
void System.IO.TextWrite::WriteLine(String) | 2 | 46 | 1 | ||||||
System.Char[] (46 bytes) | 46 | ||||||||
void System.String.CopyTo(int32,wchar[], int32, int32) | |||||||||
void System.IO.StreamWriter::Write(wchar[], int32, int32) | |||||||||
So option 1 gives 4 calls, 2 allocations, for 94 bytes. Things to note:
Results for Option 2 |
|||||||||
Calls |
Bytes |
Allocs |
|||||||
static void Test.Test::Test2() | 30 | 184 | 5 | ||||||
void System.IO.TextWriter::WriteLine(String, Object, Object) | 29 | 184 | 5 | ||||||
System.Object[] (24 bytes) | 24 | ||||||||
static String System.String::Format(System.IFormatProvider, String, Object[]) | 25 | 114 | 3 | ||||||
System.Text.StringBuilder (20 bytes) | 20 | ||||||||
void System.Text.StringBuilder::.ctor(String, int32) | 3 | 66 | 1 | ||||||
void System.Text.StringBuilder::.ctor(String, int32, int32) | 2 | 66 | 1 | ||||||
static String System.String::GetStringForStringBuilder(String, int32, int32, int32) | 1 | 66 | 1 | ||||||
System.String (66 bytes) | 66 | ||||||||
void System.String::NullTerminate() | |||||||||
System.Text.StringBuilder System.Text.StringBuilder::AppendFormat(System.IFormatProvider, String, Object[]) | 16 | 28 | 1 | ||||||
wchar[] System.String::ToCharArray(int32, int32) | 28 | 1 | |||||||
System.Char[] (28 bytes) | 28 | ||||||||
String System.String::ToString() | |||||||||
System.Text.StringBuilder System.Text.StringBuilder::Append(String) | 3 | ||||||||
static bool System.IntPtr::op_Inequality(int,int) | |||||||||
void System.String::AppendInPlace(String,int32) | 1 | ||||||||
void System.String::NullTerminate() | |||||||||
System.Text.StringBuilder System.Text.StringBuilder::Append(wchar[], int32, int32) | 4 | ||||||||
String System.Text.StringBuilder::GetThreadSafeString(int&) | 1 | ||||||||
static bool System.IntPtr::op_Equality(int,int) | |||||||||
void System.String::AppendInPlace(wchar[], int32, int32, int32) | 1 | ||||||||
void System.String::NullTerminate() | |||||||||
String System.Text.StringBuilder::ToString() | 3 | ||||||||
static bool System.IntPtr::op_InEquality(int,int) | |||||||||
static bool System.IntPtr::op_InEquality(int,int) | |||||||||
void System.String::ClearPostNullChar() | |||||||||
void System.IO.TextWriter::WriterLine(String) | 2 | 46 | 1 | ||||||
System.Char[] (46 bytes) | 46 | ||||||||
void System.String.CopyTo(int32,wchar[], int32, int32) | |||||||||
void System.IO.StreamWriter::Write(wchar[], int32, int32) | |||||||||
Many people expected Option 2 to be the best for perf. Ahem. It isn't :) 30 function calls. 5 allocations. 184 bytes allocated. Things to Note:
Results for Option 3 |
|||||||||
Calls |
Bytes |
Allocs |
|||||||
static void Test.Test::Test3() | 7 | 32 | 1 | ||||||
void System.IO.StreamWriter::Write(String) | 1 | ||||||||
void System.String::CopyTo(int32, wchar[], int32, int32) | |||||||||
void System.IO.StreamWriter::Write(String) | 1 | ||||||||
void System.String::CopyTo(int32, wchar[], int32, int32) | |||||||||
void System.IO.StreamWriter::WriteLine(String) | 2 | 32 | 1 | ||||||
System.Char[] (32 bytes) | 32 | ||||||||
void System.String::CopyTo(int32, wchar[], int32, int32) | |||||||||
void System.IO.StreamWriter::Write(wchar[], int32, int32) | |||||||||
7 function calls. 1 allocation. 32 bytes allocated. Things to Note:
|
This is already too long but I thought I'd end by sharing a quick summary of the nature of the responses I got internally.