troubleshooting-sql-server

FlushCache messages might not be an actual IO stall

At a client site recently a SQL Server 2012 (Enterprise, SP1 + CU#9) server with 512GB RAM and 80 cores (40 cores assigned to SQL Server) that had been patched and restarted displayed the following symptoms after an index reorg operation was restarted: CPU saturation, with all cores available to SQL Server (but not all cores on the box) almost totally maxed out and the perfmon data showing an average of 96% CPU usage for those CPUs.

Capturing login failures via SQL Server Audit

If you want to trace a SQL Server login failure (18456 error) you’re going to have to get to grips with SQL Server Audit from now on. This deprecation of SQL Profiler provided a suitable opportunity to update my original post on isolating login failures in order to show how to trace login failures using SQL Server Audit, but as it would have bloated that post way too much I’ve created it as a separate post.

The SQL Server default trace

The SQL Server default trace is useful! The SQL Server default trace (introduced from SQL Server 2005 onwards) is a background sql server trace that runs continuously and records event information that, despite the adverse comments on the web about its value, can be useful in troubleshooting problems. Not only that, but it’s there out-of-the box so you don’t have to do put any effort into enabling or managing it.

Troubleshooting SQL Server listener connectivity

SQL Server listeners Verifying and troubleshooting SQL Server listener connectivity via tcp, named pipes or shared memory connections is a key component in a DBA’s armoury but often overlooked until it’s too late and application teams call up complaining their apps can’t connect. SQL Server can listen for connection requests via multiple listeners, such as TCP/IP, shared memory or named pipes, but sometimes it’s necessary to check connectivity by a specific listener.

SQL Server error 18456 (Login failures)

With a little help from SQL Server’s tracing tools you can easily get to the bottom of almost all login failures (18456 errors). If the login failure is against SQL Server 2005 or above and the default trace is running just go to my SQL Server default trace post. If you’d prefer to use extended events (as SQL Profiler is now deprecated), go to Capturing login failures via SQL Server Audit.

Orphaned MSDTC transactions (-2 spids)

SPIDS with a value of -2 Any dba looking after a server that takes part in a distributed transaction will probably have come across situations where a SPID that does not show up in master..sysprocesses or sys.dm_exec_sessions (depending on your version of SQL Server) ends up blocking other users. Looking at sp_lock or the syslockinfo table or sys.dm_tran_locks would show resources being locked by a SPID with a value of -2.