There are a lot of tools out there to help you resolve your problems. Get to know them: learn what they can do and when they can help. This is a list of tools I use on a regular basis...
There are more tools available; what you use depends on the problem you are trying to solve. In particular, tools like U2U's CAML Builder can help you to replicate problems.
Sometimes, no matter how hard you try, you cannot reproduce an error within a test environment...it only happens on the live server. If your Trace statements are not giving you the details you require then you are going to have to debug the problem on the server causing the problem. Fortunately there are a number of solutions to this problem.
If you have a connection for which the firewall allows remote debugging you can copy some files across to the live server and attach the debugger from your development machine...this will obviously work best with a debug build deployed. Debugging remotely actually works very well, but it will prevent the server from processing requests whenever the debugger has the process stopped. This means the server will be unavailable whilst you perform your debugging...which might be early Sunday morning...the only time you can make the server unavailable!
If you can't get through the firewall then you can use WinDbg to debug on the server. This will allow you to attach to processes and step through code. It is actually more powerful than VS.Net, but is harder to use as, even though it has a UI, it relies on a cryptic set of commands to get it to do what you want. Even so, it is well worth using as it can give you access to valuable information.
To give you a comparison, the following is the same debugging process in WinDbg as the previous example in VS.Net.
Firstly attach to the w3wp process...
Once attached you will need to load the SOS.DLL to give you access to the debugging functions you'll need. You can use the following command to achieve this...
.load C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos.dll
You now need to tell the debugger to stop on .Net exceptions. Do this with the following command...
sxe CLR
You can now continue the debugging by entering g (Go), or by clicking on the icon, and wait for the exception. When the exception is thrown the debugger will break and you will see the following in the command window...
As you can see we have hit an exception on the same line of code as we did in VS.Net. We can now look at the call stack using the command !clrstack, which produces the following results...
This again shows us the WSS class and method in which the error occurred. Here you could also enter !clrstack -p, which would show you the parameters and their memory addresses. If you want, you can look at the method parameters using the command !dumpobj...
0:015> !clrstack -p
OS Thread Id: 0x50d4 (15)
ESP EIP
01c7ebe4 0bd81a4b Microsoft.SharePoint.Publishing.WebControls.ConsoleXmlUtilities.ConfigurationXml(System.String, Boolean)
PARAMETERS:
configProvider = 0x0e843368
isBuiltInConfigFile = 0x00000000
0:015> !dumpobj 0x0e843368
Name: System.String
MethodTable: 790fa3e0
EEClass: 790fa340
Size: 52(0x34) bytes
(C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: CustomEditingMenu
Fields:
MT Field Offset Type VT Attr Value Name
790fed1c 4000096 4 System.Int32 0 instance 18 m_arrayLength
790fed1c 4000097 8 System.Int32 0 instance 17 m_stringLength
790fbefc 4000098 c System.Char 0 instance 43 m_firstChar
790fa3e0 4000099 10 System.String 0 shared static Empty
>> Domain:Value 000db050:790d6584 000fe470:790d6584 <<
79124670 400009a 14 System.Char[] 0 shared static WhitespaceChars
>> Domain:Value 000db050:01d61438 000fe470:01d65140 <<
By using !dumpobj (or !do for short) you can look at any object in memory...providing you can find its memory address. Other useful commands include...
!dumpstackobjects (shows all objects currently on the stack)
!printexception (!pe)
!dumpallexceptions (!dae)
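When a method has several parameters, copying each address from the !clrstack -p output into !dumpobj by hand gets tedious. The following is a minimal sketch, not part of SOS or WinDbg, that reads pasted !clrstack -p output and prints the matching !dumpobj commands. The function name dumpobj_commands and the regular expression are my own assumptions, based only on the output format shown above. It skips zero values, but it cannot tell a reference from a non-zero value type (a Boolean true, for example), so check the method signature before trusting every command it produces.

```python
import re

# Matches lines such as "configProvider = 0x0e843368" from !clrstack -p output.
PARAM_LINE = re.compile(r"^\s*(\w+)\s*=\s*(0x[0-9a-fA-F]+)\s*$")


def dumpobj_commands(clrstack_output):
    """Return (parameter name, '!dumpobj <address>') pairs for non-zero values."""
    commands = []
    for line in clrstack_output.splitlines():
        match = PARAM_LINE.match(line)
        if not match:
            continue  # headers, stack frames and blank lines
        name, address = match.groups()
        if int(address, 16) == 0:
            # Zero is a null reference or a value such as false: nothing to dump.
            continue
        commands.append((name, "!dumpobj " + address))
    return commands


# Sample taken from the PARAMETERS section shown earlier.
sample = """\
PARAMETERS:
configProvider = 0x0e843368
isBuiltInConfigFile = 0x00000000
"""

for name, command in dumpobj_commands(sample):
    print(name, "->", command)
```

For the sample above this prints a single command for configProvider, which is the same !dumpobj call used earlier.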
Now we know the class & method we can look at the code using the command !u...
!u 0bd81a4b
0bd81a0e e9b2000000 jmp 0bd81ac5
0bd81a13 8b8d54ffffff mov ecx,dword ptr [ebp-0ACh]
0bd81a19 ba02000000 mov edx,2
0bd81a1e ff150c7ee20b call dword ptr ds:[0BE27E0Ch]
0bd81a24 8bf0 mov esi,eax
0bd81a26 89b540ffffff mov dword ptr [ebp-0C0h],esi
0bd81a2c 8b8d54ffffff mov ecx,dword ptr [ebp-0ACh]
0bd81a32 ba01000000 mov edx,1
0bd81a37 ff150c7ee20b call dword ptr ds:[0BE27E0Ch]
0bd81a3d 8bf0 mov esi,eax
0bd81a3f 89b53cffffff mov dword ptr [ebp-0C4h],esi
0bd81a45 8b8d3cffffff mov ecx,dword ptr [ebp-0C4h]
>>> 0bd81a4b 3909 cmp dword ptr [ecx],ecx
0bd81a4d ff15a891b00a call dword ptr ds:[0AB091A8h]
0bd81a53 8bf0 mov esi,eax
0bd81a55 8bce mov ecx,esi
0bd81a57 3909 cmp dword ptr [ecx],ecx
0bd81a59 e8a24e2900 call 0c016900 (Microsoft.SharePoint.SPListItem.get_ListItems(), mdToken: 060035ef)
This is an abbreviated version, but you get the same disassembly as you do in VS.Net; actually it's better, as you get some method names. We can now use Reflector as before to solve the problem.
Most of the time you're not really going to be able to attach a debugger to a live server, but that still doesn't mean you can't debug exceptions in SharePoint. Microsoft provide a utility called ADPlus, which will create mini dumps of the exceptions within your SharePoint application. These dumps can then be opened in WinDbg and examined in exactly the same way as you would in a live WinDbg session.
ADPlus is a console application which attaches to the process and waits for exceptions to occur, taking a dump when they do. Once captured, the dumps can be transferred back to your development machine and diagnosed for as long as you want without tying up the live server. This is particularly useful when your SharePoint site is being managed by a hosting company and you do not have RDP access to the live server. You can easily script the commands so a support engineer at the hosting provider can create the dumps and email them to you.
Useful commands include
adplus -hang -pn w3wp.exe
and
adplus -crash -pn w3wp.exe
When debugging SharePoint I have only ever got -crash to give me anything useful, but I am sure -hang will be useful one day.
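As a rough illustration of scripting this for a support engineer, the sketch below builds the ADPlus command line from the two modes shown above. The function name adplus_command and the C:\dumps folder are hypothetical, and I am assuming the -o switch for choosing the output directory is available in your version of ADPlus, so check adplus -? before relying on it. The script only prints the command; running it on the server is left to whoever owns that box.

```python
def adplus_command(mode, process_name="w3wp.exe", output_dir=None):
    """Build an ADPlus command line as a list of arguments.

    mode is "crash" or "hang", matching the two commands shown above.
    output_dir is optional and maps to the assumed -o switch.
    """
    if mode not in ("crash", "hang"):
        raise ValueError("mode must be 'crash' or 'hang', got %r" % (mode,))
    args = ["adplus", "-" + mode, "-pn", process_name]
    if output_dir:
        args += ["-o", output_dir]  # assumption: -o sets the dump folder
    return args


# C:\dumps is a hypothetical folder the support engineer can zip and send back.
command = adplus_command("crash", output_dir=r"C:\dumps")
print(" ".join(command))
```

Keeping the arguments as a list means the same value can be passed straight to subprocess.call on the server without worrying about quoting.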
Note: By default you do not get a memory dump for first chance exceptions (because they can occur frequently); however, ADPlus can be configured to do this (see http://support.microsoft.com/kb/q286350/). Not having a memory dump only means you can't see the contents of parameters & objects...you may not always need them.
More on ADPlus can be found within these articles...