Table of Contents
- Introduction
- Screenshot
- The problem
- Background about Roslyn
- The solution
- Why would you use it?
- Why don't you use reflection for this?
- How fast is it
- How to use it
- Syntax Highlighting with Fast Colored TextBox
- Multiple Searches using KRBTabControl
- The implementation
- About the Source Code
- Future of this project
- History
Introduction
This article is about a tool using Roslyn that can search through a large codebase in 5 ways:
- Search text in methods
- Search calls to certain methods
- Search for methods with certain names
- Search for properties with certain names
- Search for classes with certain names
Screenshot
Here's a screenshot of C# and VB.NET Code Searcher in action.
The problem
Recently I had an assignment that required a lot of searching through the source code of a large legacy codebase (61 solutions, C# code). A field had to be moved from one table to another, a change that would impact several parts of the codebase. To find out which parts, I first had to find the methods in the data layer that called stored procedures. Then I had to work bottom-up through the codebase to see where these methods were called, and what the impact on the code was.
At first I used the freeware tool "TextCrawler 2" for that (http://www.digitalvolcano.co.uk/content/index.php). This is quite a fast text search utility, but it doesn't "know" anything about the C# language. For example, if you search for calls to a certain method, TextCrawler will happily find files in which the method calls are commented out. Another problem was that it wasn't fast enough (searching through 61 solutions can take some time). I also used the Microsoft Desktop Search tool. This was fast, but also not "intelligent" about source code.
Since I had read about Roslyn, I started thinking about ways to make it useful for this purpose.
Background about Roslyn
Roslyn is Microsoft’s project to open up the VB and C# compilers through APIs, and provide easy access to the information it gathers during the different stages of the compilation process.
To get started on what Roslyn is about, you can read about it here:
- Introducing the Microsoft “Roslyn” CTP
- Microsoft's Roslyn: Reinventing the compiler as we know it
- Download Roslyn CTP
- MSDN Roslyn CTP Page and Related Resources
Or if you'd prefer to take a deeper dive into Roslyn, here's a whitepaper from Microsoft:
- The Roslyn Project - Exposing the C# and VB compiler’s code analysis
This article is not meant to give you an introduction to Roslyn; there are a couple of good CodeProject articles that do that:
- Roslyn CTP: Three Introductory Projects
- C# as a Scripting Language in Your .NET Applications Using Roslyn
I also found out that after installing the Microsoft Roslyn CTP - June 2012 there were lots of sample projects installed in my Documents folder.
The solution
So I thought I'd give Roslyn a try to see if I could create a tool that could search through code faster. I think I succeeded in this. I use it all the time now! I created a Windows Forms application that has 5 ways of searching through C# and VB.NET code:
- Search text in methods
- Search calls to certain methods
- Search for methods with certain names
- Search for properties with certain names
- Search for classes with certain names
I decided to share this with the world so everyone can enjoy it. By posting this article, I hope that:
- People will find this useful too.
- I get valuable feedback so the tool can be improved.
- People will extend / adapt the tool or parts of it in ways I haven't thought of yet.
Why would you use it?
For example:
"Go To Definition" for other solutions
Let's say you're in a debugging session. You're debugging in solution X which calls a service that's in another solution Y. Now you see a method being called on a class in solution Y. In Visual Studio you can go to the definition of a method with right mouse click - "Go To Definition" or F12. But not when the method is in the other solution! So if you want to look up the definition of the method, the only ways to do that are:
- Step inside the method during the debug session
- Open solution Y and find the method you want to see.
With RoslynCodeSearcher, it's very easy to look up a method that's in another solution: type its name in the search field, select "Search methods" and click [Search].
As a help during refactoring
Sometimes you want to know "What will happen if I remove this method, and where is it called in the jungle of solutions?". You can do a text search, or you can start a build to see where it breaks, but for projects with lots of solutions a full build takes a long time. With RoslynCodeSearcher, you type the name of the method in the search field, select "Search calls" and click [Search]. Wait a second, et voilà!
Why don't you use reflection for this?
The reason I don't use reflection is that I want access to the actual source code of the solutions I search. I want to return the method body, for example. Reflection can't do that; it only works on metadata (types of classes, names and signatures of methods, etc.). Also, for a text search inside methods across a large number of solutions, Roslyn is faster than a plain text search, because the solutions are already compiled in memory.
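To illustrate the limitation, here is a minimal sketch (not part of the tool) of what reflection can report about a method. The class ReflectionLimits and its Add method are hypothetical names used only for this example. Reflection gives the signature and the compiled IL bytes, but never the C# text of the method body.

```csharp
using System;
using System.Reflection;

public static class ReflectionLimits
{
    // Hypothetical sample method, used only as something to inspect.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Describes a public method of this class using reflection only.
    // The result contains metadata (return type, name, parameter count)
    // and the size of the compiled IL, but no source text.
    public static string Describe(string methodName)
    {
        MethodInfo method = typeof(ReflectionLimits).GetMethod(methodName);
        MethodBody body = method.GetMethodBody();
        byte[] il = body.GetILAsByteArray(); // compiled IL bytes, not C# source

        return method.ReturnType.Name + " " + method.Name +
               " (" + method.GetParameters().Length + " parameters, " +
               il.Length + " IL bytes)";
    }
}
```

Calling ReflectionLimits.Describe("Add") returns a description that starts with "Int32 Add (2 parameters"; the number of IL bytes depends on the compiler settings.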
How fast is it
The first time you use it the tool will be slower, because it has to compile the solutions in memory (604 MB of memory in my situation). The compiled solutions are kept in IWorkspace objects. This happens every time the tool starts up, and a progress indicator shows the progress of the compilation. With a couple of solutions the compilation finishes in a second or so; with a whole lot of solutions it takes longer. To give you an indication: on my computer it took about half a minute to compile 61 solutions in memory the first time. After the initial compilation, searching is very fast: one to a few seconds for 61 solutions, depending on how much is found. This is because the list of IWorkspace objects is already in memory. After I started using the .NET 4 Parallel.ForEach method, performance increased significantly (by a factor that depends on the number of cores in your processor: dual core, quad core, etc.).
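The idea behind the parallel search can be sketched as follows. This is a simplified stand-in, not the tool's code: ParallelSearchSketch and FindMatches are hypothetical names, and plain strings take the place of Roslyn documents. It assumes the results are collected in a thread-safe collection, because the loop body runs on several threads at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelSearchSketch
{
    // Returns the items that contain 'text' (case-insensitive), sorted.
    // Each item stands in for the source text of one document.
    public static List<string> FindMatches(IEnumerable<string> items, string text)
    {
        // ConcurrentBag can safely be added to from several threads at once.
        var matches = new ConcurrentBag<string>();

        // The items are distributed over the available cores.
        Parallel.ForEach(items, item =>
        {
            if (item.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                matches.Add(item);
            }
        });

        // Parallel execution gives no ordering guarantee, so sort for stable output.
        return matches.OrderBy(m => m, StringComparer.Ordinal).ToList();
    }
}
```

The speed-up from a loop like this depends on how evenly the work is spread and on the number of cores, so it will not always match the core count exactly.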
How to use it
Prerequisites
Make sure you have the following software installed in this order, otherwise the solution will not build:
- .NET Framework 4.0
- Visual Studio 2010 (not Express edition)
  - The C# language feature if you want to be able to search in C# source code
  - The VB.NET language feature if you want to be able to search in VB.NET source code
- Visual Studio 2010 SP1
- Visual Studio 2010 SP1 SDK
- Microsoft "Roslyn" CTP
Next: solutions.txt file
You have to provide the tool with a list of solutions to search through.
There are 2 ways you can do this:
- With a text file "solutions.txt" placed in the directory of the executable (or \bin\debug after you build the solution). The tool will read this on startup if it exists. This text file should contain the full paths to the solutions, each on its own line.
- If the solutions.txt file doesn't exist yet, click on [Browse ...] and in the File dialog select a directory. Next click on [Update solution List]. The tool will then walk recursively down the directory structure, starting at the selected directory, looking for solution (.sln) files.
The result will be stored in the "solutions.txt" file in the directory of the executable. The existing "solutions.txt" file will be overwritten.
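The [Update solution List] step could be sketched like this. It is an illustration under assumptions, not the tool's actual implementation: SolutionListSketch, UpdateSolutionList and ReadSolutionList are hypothetical names, and error handling (for example for directories that cannot be read) is left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SolutionListSketch
{
    // Walks the directory tree below rootDirectory, collects all .sln files
    // and overwrites the list file with one full path per line.
    public static List<string> UpdateSolutionList(string rootDirectory, string listPath)
    {
        List<string> solutions = Directory
            .EnumerateFiles(rootDirectory, "*.sln", SearchOption.AllDirectories)
            // Guard against patterns matching longer extensions on some runtimes.
            .Where(p => string.Equals(Path.GetExtension(p), ".sln", StringComparison.OrdinalIgnoreCase))
            .OrderBy(p => p, StringComparer.OrdinalIgnoreCase)
            .ToList();

        File.WriteAllLines(listPath, solutions); // replaces an existing file
        return solutions;
    }

    // Reads the list file back, skipping empty lines.
    public static List<string> ReadSolutionList(string listPath)
    {
        return File.ReadAllLines(listPath)
            .Select(line => line.Trim())
            .Where(line => line.Length > 0)
            .ToList();
    }
}
```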
Next: search
- Type the text you want to search for in the textbox.
- Select one of the ways to search with by clicking one of the radio buttons.
- Click [Search].
The solutions from solutions.txt, all underlying projects, and all underlying source files will be searched through.
The result of the search consists of:
- The path to the source files containing the found methods.
- The body of the methods.
Including / excluding files
You can also specify words in the textboxes on the right that say:
"Do not include files containing words in filename. Separate by comma."
or
"Only include files containing word in filename. Separate by comma."
- "Do not include" means, the tool will not search in code files that have any of the words in the path.
- "Only include" means, the tool will only search in code files that have any of the words in the path.
These text boxes are meant to be mutually exclusive. If both are filled in, "Do not include" takes precedence over "Only include".
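The filter rules above can be sketched as a small helper. PathFilterSketch, ParseFilters and ShouldSearch are hypothetical names chosen for this example; the sketch assumes the words are compared case-insensitively against the full file path, with exclusion checked first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PathFilterSketch
{
    // Splits a comma-separated text box value into upper-cased words.
    public static List<string> ParseFilters(string commaSeparated)
    {
        if (string.IsNullOrWhiteSpace(commaSeparated))
        {
            return new List<string>();
        }
        return commaSeparated.Split(',')
            .Select(word => word.Trim().ToUpperInvariant())
            .Where(word => word.Length > 0)
            .ToList();
    }

    // Decides whether a file should be searched.
    // An exclude word always wins; an empty include list means "include everything".
    public static bool ShouldSearch(string filePath, List<string> excludes, List<string> includes)
    {
        string path = filePath.ToUpperInvariant();
        if (excludes.Any(word => path.Contains(word)))
        {
            return false;
        }
        return includes.Count == 0 || includes.Any(word => path.Contains(word));
    }
}
```

For example, with the exclude words "test, designer" the file C:\src\OrderTests.cs is skipped, while C:\src\Order.cs is searched.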
Searching part of text
It is possible to type only part of the text you want to search. For example, if you want to search for all methods that contain the word "Save", like "SaveCustomer", "SaveOrder", then check the option checkbox "Search part of text". If you select the search option "Search text in method" the option will be set by default.
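The difference between a whole-name search and a "part of text" search can be sketched as below. NameMatchSketch and Matches are hypothetical names; that both modes ignore case is an assumption of this sketch.

```csharp
using System;

public static class NameMatchSketch
{
    // Compares a method, property or class name with the search text.
    // searchPartOfText = true:  the name only has to contain the search text.
    // searchPartOfText = false: the whole name has to match.
    // Case is ignored in both modes (an assumption of this sketch).
    public static bool Matches(string name, string searchText, bool searchPartOfText)
    {
        if (searchPartOfText)
        {
            return name.IndexOf(searchText, StringComparison.OrdinalIgnoreCase) >= 0;
        }
        return string.Equals(name, searchText, StringComparison.OrdinalIgnoreCase);
    }
}
```

With searchPartOfText set, "Save" matches both "SaveCustomer" and "SaveOrder"; without it, only a method that is actually named "Save" matches.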
Syntax Highlighting with Fast Colored TextBox
To present the results of the code search I needed a text editor that could do syntax highlighting. I researched a couple of those, and decided to use the great "Fast Colored TextBox" from Pavel Torgashov in my project (also on CodeProject): Fast Colored TextBox for Syntax Highlighting. It is fast indeed! It also supports searching in the textbox with Ctrl-F.
Multiple Searches using KRBTabControl
To be able to start multiple searches using a tabbed interface, I gladly used the excellent "KRBTabControl" from Burak299 in my project (also on CodeProject): KRBTabControl. This gave me the possibility to provide tabs that can be closed just like browser tabs. When there are too many tabs to display, you will see two tiny arrows on the right so you can switch between tabs with the mouse.
The implementation
The code below is not entirely the same as the actual source code; it is meant to show you the basics of how the tool works.
When the Search button is clicked, a search is started using the selected Search method (the radio buttons).
public enum SearchType
{
    SearchTextInMethod,
    SearchCallers,
    SearchMethods,
    SearchProperties,
    SearchClasses
}

private SearchType _searchType = new SearchType();

/// <summary>
/// - Do some checks to see if the input is correct and the solutions.txt file exists
/// - Update the text of the tab to the text that is being searched
/// - Show an hourglass icon on the tab during the search
/// - Start a new worker that will do the search
/// </summary>
private void btnSearch_Click(object sender, EventArgs e)
{
    //Remove leading and trailing spaces
    string searchText = txtTextToSearch.Text.Trim();

    if (searchText.Contains("(") || searchText.Contains(")"))
    {
        MessageBox.Show("Please specify searchtext without parentheses or parameters.");
        return;
    }

    if (!File.Exists(Constants.BaseDirectorySolutionsTxtPath))
    {
        MessageBox.Show("There is no solutions.txt file in the directory where the .exe resides. Please click the [Browse] button to select a starting directory. Then click [Update solution List]");
    }
    else
    {
        SearchType searchType = new SearchType();
        if (!String.IsNullOrEmpty(searchText))
        {
            TabController.UpdateSearchTextOnTab(searchText);
            TabController.ShowHourGlass();

            //Translate the selected radio button to a SearchType
            if (rbSearchTextInMethod.Checked)
            {
                searchType = SearchType.SearchTextInMethod;
            }
            else if (rbSearchCallers.Checked)
            {
                searchType = SearchType.SearchCallers;
            }
            else if (rbSearchMethods.Checked)
            {
                searchType = SearchType.SearchMethods;
            }
            else if (rbSearchProperties.Checked)
            {
                searchType = SearchType.SearchProperties;
            }
            else if (rbSearchClasses.Checked)
            {
                searchType = SearchType.SearchClasses;
            }

            //Create and start a new worker that will do the searching for us.
            WorkerFactory.Start(searchType, searchText, txtExclude.Text, txtInclude.Text, TabController.SelectedTab.Guid);
        }
        else
        {
            MessageBox.Show("Please enter text to search");
        }
    }
}
The WorkerFactory.Start method creates a new Worker object every time you do a search.
public static class WorkerFactory
{
    private static List<Worker> _workerList = new List<Worker>();

    public static void Start(SearchType searchType, string searchText,
        string filter, string include, Guid guid)
    {
        Worker worker = new Worker(searchType, searchText, filter, include, guid);
        _workerList.Add(worker);
        worker.Start();
    }

    /// <summary>
    /// Select a worker from the workerlist with a certain Guid.
    /// </summary>
    private static Worker SelectWorker(Guid guid)
    {
        var selectWorker = from worker in _workerList
                           where worker.Guid == guid
                           select worker;
        if (selectWorker != null && selectWorker.Count() == 1)
        {
            return selectWorker.First();
        }
        return null;
    }

    /// <summary>
    /// If a tab is deleted the accompanying worker must be cancelled.
    /// It won't be killed, but the results will not be written to a tab anymore.
    /// If it's not needed anymore, that doesn't matter, because workers are cleaned up once the program quits.
    /// </summary>
    /// <param name="guid">The unique identifier of the worker</param>
    public static void Delete(Guid guid)
    {
        Worker selectWorker = SelectWorker(guid);
        //Does the worker exist in the workerlist?
        //Because, if a tab is deleted, but a worker was not started for that tab,
        //there is no worker to delete.
        if (selectWorker != null)
        {
            selectWorker.Cancel();
            _workerList.Remove(selectWorker);
        }
    }
}
This Worker uses a BackgroundWorker to start a thread that starts a code search using Roslyn.
public class Worker
{
    private CodeSearcher _searcher;
    private BackgroundWorker _worker;
    private string _result;
    private Guid _guid;
    private bool _cancel;

    public Worker(SearchType searchType, string searchText, string filter, string include, Guid guid)
    {
        _guid = guid;
        _searcher = new CodeSearcher(searchType, searchText, filter, include);
        _worker = new BackgroundWorker();
        _worker.DoWork += new DoWorkEventHandler(worker_DoWork);
        _worker.RunWorkerCompleted += new RunWorkerCompletedEventHandler(worker_RunWorkerCompleted);
    }

    public Guid Guid
    {
        get { return _guid; }
        set { _guid = value; }
    }

    public void Start()
    {
        _worker.RunWorkerAsync();
    }

    /// <summary>
    /// Cancel means the backgroundworker will finish its job,
    /// but won't write the results to the tabcontroller anymore.
    /// </summary>
    public void Cancel()
    {
        _cancel = true;
    }

    private void worker_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
    {
        if (!_cancel)
        {
            TabController.WriteResults(_guid, _result);
        }
    }

    private void worker_DoWork(object sender, DoWorkEventArgs e)
    {
        _result = _searcher.Search();
    }
}
When the worker is started with the Start method, worker_DoWork runs asynchronously. It calls the CodeSearcher.Search method, which searches using 1 of the 5 search methods, depending on the selected SearchType.
public class CodeSearcher
{
    //The fields _searchType, _searchText, _exclude and _include and the constructor are not shown here.

    /// <summary>
    /// Search for the provided searchtext in the sourcecode files of the solutions.
    /// Use the provided SearchType (method, callers, text in method).
    /// Return the result in a string.
    /// </summary>
    public string Search()
    {
        string result = "";
        List<string> excludes = CodeSearcher.GetFilters(_exclude);
        List<string> includes = CodeSearcher.GetFilters(_include);

        if (CodeRepository.Workspaces.Count() == 0)
        {
            //Get the solutions from the solutions.txt file and load them into Workspaces
            //If it doesn't exist, this will be checked at the moment user presses the [Search] button.
            CodeRepository.Solutions = CodeRepository.GetSolutions(Constants.BaseDirectorySolutionsTxtPath);
            CodeRepository.Workspaces = CodeRepository.GetWorkspaces(CodeRepository.Solutions);
        }

        if (_searchType == SearchType.SearchTextInMethod)
        {
            result = SearchMethodsForTextParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
        }
        else if (_searchType == SearchType.SearchCallers)
        {
            result = SearchCallersParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
        }
        else if (_searchType == SearchType.SearchMethods)
        {
            result = SearchMethodsParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
        }
        else if (_searchType == SearchType.SearchProperties)
        {
            result = SearchPropertiesParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
        }
        else if (_searchType == SearchType.SearchClasses)
        {
            result = SearchClassesParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
        }
        return result;
    }

    //The search methods themselves are not shown here.
}
If a solutions.txt file exists in the directory where RoslynCodeSearcher.exe resides, the paths to the solutions will be put in a List and the Workspaces with the solutions will be loaded. A workspace is an active representation of your solution as a collection of projects, each with a collection of documents. The workspace provides access to the current model of the solution. You can read more about it here.
In the CodeSearcher class I have 5 search methods. This is where the searching happens. The searching makes use of the .NET 4 Parallel.ForEach method to speed things up, depending on the number of cores in the processor of your computer. I will show one of the search methods here; the other 4 you can see in the source code.
/// <summary>
/// Search through the code for methods that contain the text textToSearch.
/// Return the resulting method bodies as a string.
/// excludes are used to exclude files that have paths that contain certain words.
/// includes are used to include files that have paths that contain certain words.
/// </summary>
/// <param name="excludes">Projects / documents to exclude by name</param>
/// <param name="includes">Projects / documents to include by name</param>
public string SearchMethodsForTextParallel(List<IWorkspace> workspaces, string textToSearch, List<string> excludes, List<string> includes)
{
    StringBuilder result = new StringBuilder();
    foreach (IWorkspace w in workspaces)
    {
        ISolution solution = w.CurrentSolution;
        foreach (IProject project in solution.Projects)
        {
            string language = project.LanguageServices.Language;
            Parallel.ForEach(project.Documents, document =>
            {
                //Filter and include document names containing certain words
                string path = document.FilePath.ToUpper();
                if (!excludes.Any(s => path.Contains(s)) &&
                    (includes.Count == 0 || includes.Any(s => path.Contains(s))))
                {
                    if (language == LANG_CS)
                    {
                        string found = SearchMethodsForTextCSharp(document, textToSearch);
                        //StringBuilder is not thread-safe, so append under a lock
                        lock (result)
                        {
                            result.Append(found);
                        }
                    }
                }
            });
        }
    }
    return result.ToString();
}

private string SearchMethodsForTextCSharp(IDocument document, string textToSearch)
{
    StringBuilder result = new StringBuilder();
    CommonSyntaxTree syntax = document.GetSyntaxTree();
    var root = (Roslyn.Compilers.CSharp.CompilationUnitSyntax)syntax.GetRoot();

    //Collect all method and property declarations in the document
    var syntaxNodes = from methodDeclaration in root.DescendantNodes()
                          .Where(x => x is MethodDeclarationSyntax || x is PropertyDeclarationSyntax)
                      select methodDeclaration;

    string upperTextToSearch = textToSearch.ToUpper();
    foreach (MemberDeclarationSyntax method in syntaxNodes)
    {
        if (method != null)
        {
            //Case-insensitive check on the full text of the method or property
            string methodText = method.GetFullText();
            if (methodText.ToUpper().Contains(upperTextToSearch))
            {
                result.Append(GetMethodOrPropertyTextCSharp(method, document));
            }
        }
    }
    return result.ToString();
}
When the text, call, method or property is found, the method GetMethodOrPropertyTextCSharp is called to get the body of the method / property in which the searched item was found. The full text of the method / property is returned, including the path to the .cs file.
/// <summary>
/// Get the full text of the method or property body.
/// </summary>
private string GetMethodOrPropertyTextCSharp(Roslyn.Compilers.CSharp.SyntaxNode node, IDocument document)
{
    StringBuilder resultStringBuilder = new StringBuilder();
    string methodText = node.GetFullText();
    bool isMethod = node is Roslyn.Compilers.CSharp.MethodDeclarationSyntax;
    string methodOrPropertyDefinition = isMethod ? "Method: " : "Property: ";
    object methodName = isMethod
        ? ((Roslyn.Compilers.CSharp.MethodDeclarationSyntax)node).Identifier.Value
        : ((Roslyn.Compilers.CSharp.PropertyDeclarationSyntax)node).Identifier.Value;

    resultStringBuilder.AppendLine("//=====================================================================================");
    resultStringBuilder.AppendLine(document.FilePath);
    resultStringBuilder.AppendLine(methodOrPropertyDefinition + (string)methodName);
    resultStringBuilder.AppendLine(methodText);
    return resultStringBuilder.ToString();
}
Jumping back to the Worker object above: when the worker is finished and has the results, the TabController.WriteResults method is called to update the FastColoredTextBox with the results.
public static class TabController
{
    private static List<FastColoredTextBox> _fastColoredTextBoxes = new List<FastColoredTextBox>();
    private static readonly object _lockobj = new object();

    /// <summary>
    /// The results of the search will be written to the tab specified with the guid
    /// </summary>
    public static void WriteResults(Guid guid, string text)
    {
        //If another thread comes here, block it temporarily until this thread is finished.
        lock (_lockobj)
        {
            var selectFastColoredTextBox = from fctb in _fastColoredTextBoxes
                                           where fctb.Guid == guid
                                           select fctb;
            if (selectFastColoredTextBox != null && selectFastColoredTextBox.Count() == 1)
            {
                FastColoredTextBox currentTextBox = selectFastColoredTextBox.First();
                currentTextBox.Text = text;
                if (text == "") currentTextBox.Text = "Nothing found.";
                //move caret to start text
                currentTextBox.Selection.Start = Place.Empty;
                currentTextBox.DoCaretVisible();
            }
        }
    }
}
As you will see in the source code, there is much more to it than I have shown in this article. For example, it is possible to start multiple searches independently at the same time from different tabs. This uses some threading and proper handling / locking. Also, the tool itself can search in both C# and VB.NET source code.
About the Source Code
The attached projects will open and build in Visual Studio 2010 SP1. In the section "How to use it" I explain the prerequisites that are necessary to use the tool.
Future of this project
Some thoughts about the direction this project might go in the future:
Regular Expression Support
I want to be able to search using regular expressions, for example:
- Give me all the methods that are named "SaveCustomer" or "InsertCustomer". You would have to type a regular expression like this: (Save|Insert)Customer
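Since this feature does not exist yet, the following is only a sketch of how such a name search might look. RegexNameSearchSketch and FindNames are hypothetical names; the sketch assumes the pattern has to match the whole method name and that case is ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexNameSearchSketch
{
    // Returns the names that match the pattern as a whole, ignoring case.
    public static List<string> FindNames(IEnumerable<string> methodNames, string pattern)
    {
        // Wrap the user's pattern so it must match the complete name,
        // otherwise "SaveCustomerOrder" would also match "(Save|Insert)Customer".
        var regex = new Regex("^(?:" + pattern + ")$",
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

        return methodNames.Where(name => regex.IsMatch(name)).ToList();
    }
}
```

With the pattern (Save|Insert)Customer, the names "SaveCustomer" and "InsertCustomer" are returned, while "DeleteCustomer" and "SaveCustomerOrder" are not.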
Visual Studio Extension
This could be reworked as a Visual Studio extension. That way it could make use of the C# code editor and other parts of Visual Studio. That could make it even more powerful and accessible to more people.
Advanced stuff
To make refactoring source code through multiple solutions friendlier, it would be nice if you could do some type of "queries" on your source code, just like LINQ. Something like http://www.ndepend.com/Doc_CQLinq_Syntax.aspx or http://www.codeproject.com/Articles/408663/Using-NRefactory-for-analyzing-Csharp-code.
To make these queries strongly typed and not dynamic, that would need IntelliSense in a kind of interactive window. Maybe the new Roslyn "C# Interactive window" could be of use for this. But probably this would be easier to realize as a Visual Studio Extension.
Output
Let the user define what the output should contain, for example:
- The whole code file
- A graphical view of connections between methods / classes / solutions etc.
History
21-08-2012
- Fixed a bug in searching callers; some callers were not found.
05-08-2012
- Added "precompile" option to compile the solutions in memory at program startup to speed things up
04-08-2012
- Added classname search ability and "part of text" search
01-08-2012
- Used Parallel.ForEach for searching. Roughly 2x as fast with 2 cores; I was not able to test with 4 cores, but it is probably about 4x as fast.
28-07-2012
- More unittests (TabController)
- Use .Any() instead of Count() > 0
- Unittest to test performance of @"A".ToUpper().Contains(@"B".ToUpper()) versus @"A".IndexOf(@"B", StringComparison.OrdinalIgnoreCase)
24-07-2012
- Added unittests
- Able to search in VB.NET code also
18-07-2012
- Added property search ability
- Input check on search textbox
- Remove leading / trailing spaces on text from search textbox when click [Search]
- Show "Method:" or "Property:" depending on which searchtype is selected
16-07-2012
- Fixed ability to Copy (Ctrl-C) from the FastColoredTextBox
- Show hourglass icon on tabs when threads are running
- If you click button [New tab] the program automatically jumps to the next tab
- Separator lines between tabs
- Changed text of include / exclude text fields to better describe what they mean
15-07-2012
- Fixed issue causing error with parentheses in search text
- Added extra comments to source
- Tested if different types of method definitions can be searched
- Some refactoring: regions etc.
- Added Messagebox for button "Update solution List".