Using WinDbg to Troubleshoot WCF Performance Issues

Yesterday I was working on a very interesting problem.  I am involved in developing a Silverlight application that is using WCF Services to communicate with a server to perform CRUD and business logic related operations.  All of a sudden QA started experiencing very slow performance testing application at random times.  Once we took a look, we noticed that CPU utilization kept hovering around 100 % at those times.  I was trying to figure out what is causing high CPU utilization.  Typically, those issues would be linked to tight loops of sorts, that tend to peg CPU at near maximum.  In our case, once CPU got to 100 %, it stayed there until app pool was recycled.  Give the tight loop hunch, I was guessing that somehow a service process was getting stuck in an infinite loop or some heavy processing loop. 

I blogged previously on using WinDbg to debug memory leaks in Silverlight applications.  It was time to find out if WinDbg would help me in this case as well.  The following describes the steps I took to troubleshoot the problem.

First of all, use steps in the previous post to install debugging tools for Windows.

Then, you need to catch the IIS process when it is in high SPU utilization mode.  You should let the system sit there for a little while to ensure you will capture the procedures that cause the trouble.  At that time open command window and switch to debugging tools directory.  In 64 bit Windows it is C:Program FilesDebugging Tools for Windows (x64).  At the command prompt type “adplus -hang -o c:dump -IIS”.  This will create process dump in c:dump folder for all IIS related processes.

This you can reset IIS.

Next step is to run WinDbg and open a dump file using File->Open Crash Dump menu.  You will have multiple files with “dmp” extension in C:Dump<Time-Stamp> folder that was just created.  I’d pick the largest one, since this is likely to belong to your process.

Now, load debug symbols by typing the following into command window (one line high windows in the bottom of the WinDbg windows.).  Your cursor should be blinking there.  .load C:WindowsMicrosoft.NETFramework64v4.0.30319sos.dll.  If you use 32 bit framework, adjust location accordingly.

image

Then, type “!runaway” command to get the list of running threads sorted by CPU time, starting the longest running thread.  If you suspect a hang, it is likely that your problem causing threads will be listed first.

image

Then, we will “select” a thread by using thread command – “~XXs”, where XX is the thread number from !runaway command.  In my case those are threads 18 and 30, sot the command would be ~18s

Now, we can look at the thread stack by using the following command – “!clrstack”.

 

image

What we are interested in is the very first line that points to PerfDemoWcfService.Service1.GetData method.  This is the method that causes the issue.  In my sample case methods…

  public class Service1 : IService1

  {

    public string GetData(int value)

    {

      Int64 variable = 0;

      while (true)

      {

        if (variable > 1000000000)

        {

          variable = 0;

        }

        else

        {

          variable++;

        }

      }

      return string.Format("You entered: {0}", value);

    }

 

 

… looks ridiculous.  In my real case it was a lot more complicated, but I did know what method is causing troubles for me.

 

Thanks.

Leave a Reply

Your email address will not be published. Required fields are marked *