Articles

Showing posts from November 2018
The mysterious case of the 1 minute pauses
I spend a lot of time debugging issues on our service. We have a big team, and periodically we hit some really weird issues that get escalated to me. In this particular case, we deployed our new sprint, full of exciting features, to our dogfood server, and after a day or two people started complaining of sluggishness. A quick check of the dashboard showed our availability was slowly getting worse. Our DRIs had been tracking this issue but couldn't get a usable PerfView log, so I got called in to help out, and I saw the same behavior: the ETL traces that we auto-collect were not helpful, mostly because they either didn't show high CPU or because the symbols were messed up (the dreaded ?!? in PerfView, which sometimes happens with dynamically generated code such as XML serializers or compiled regexes), but this was worse than usual. So I hopped onto the machines and captured a more detailed log of requests/sec...
How to diagnose SqlConnection Leaks
This is a fairly common problem: you test your app and everything works great. Throw it into production and it stops working, and you see a lot of these: System.InvalidOperationException: Timeout expired.  The timeout period elapsed prior to obtaining a connection from the pool. You then check the number of connections in Perfmon and see that you have 100 connections used up... oops, something is not closing the connection. You can follow the instructions on this blog to figure out which pool you're leaking from, but sometimes that's not quite enough, and you have to figure out where exactly in the code the connection is getting leaked. Here's how: First, dump all the SqlConnection objects in the heap. To do that more efficiently, I use !DumpHeap -MT <MethodTable>, but first I have to resolve the method table for SqlConnection like so: 0:108> !Name2EE System.Data.dll System.Data.SqlClient.SqlConnection Modu...