Capturing login failures via SQL Server Audit

Page content

If you want to trace a SQL Server login failure (18456 error) you’re going to have to get to grips with SQL Server Audit from now on.

This deprecation of SQL Profiler provided a suitable opportunity to update my original post on isolating login failures in order to show how to trace login failures using SQL Server Audit, but as it would have bloated that post way too much I’ve created it as a separate post.

I had been hoping to use Extended Events (XE) to do this, as this is the direct replacement for SQL Profiler, but unfortunately, login failures have been hived off into the SQL Server Audit component.

Whilst this still uses the XE engine, the processing of audit events is slightly different, and there is what I consider a major gotcha (which gives less functionality than SQL Profiler offered) to watch out for, which I will point out later.

Creating an audit trace for SQL Server login failures

The example I am about to show is based on the minimal steps needed to set up an audit trace using just T-SQL (no endless clicking on a multitude of SSMS dialogs) and works on SQL Server 2008R2 and SQL Server 2012.

This is a process that can be split up into four surprisingly simple steps:

  • Create an audit (basically a container which stipulates where the audit information is going to be saved).
  • Create an audit specification to capture failed logins
  • Start the capture (which can be done as part of the creation of the audit specification above)
  • Interrogate the capture output for the login failure details.

Some of the above steps can be combined, but I think they’re easier to understand this way if you’re new to SQL Server Audit, as I imagine most of us still are.

Create the audit

Login to the relevant server via SSMS (any version of SSMS and SQL Server equal to or above SQL Server 2008R2) and create the audit. I’ve chosen to write to a file in D:\TRACES\AUDITS\XE\login_failures:

CREATE SERVER AUDIT audit_login_failure TO FILE ( filepath=‘D:\TRACES\AUDITS\XE\login_failures’ , maxsize = 50MB) WITH (ON_FAILURE=CONTINUE)

as this is an audit trace you cannot specify the filename as SQL Server will create a file prefixed with the name of the audit (audit_logon_failure in our case) and then a long GUID and finally ending with a .sqlaudit extension. Alternative options to a file are the Application or Security event logs. The ON_FAILURE option controls what to do if there is a failure writing to the chosen output.

With the audit created we need to enable it:

ALTER SERVER AUDIT audit_login_failure WITH (STATE=ON)

[Thanks to Daniel for pointing out that this step was missing from the original post].

Create the audit specification

Now that the container has been created, the audit specification which specifies what the audit is going to capture needs to be specified as follows:

CREATE SERVER AUDIT SPECIFICATION audit_login_failure_spec FOR SERVER AUDIT audit_login_failure ADD (FAILED_LOGIN_GROUP); GO

The audit actions that can be traced are listed in the sys.dm_audit_actions DMV, and multiple audit actions can be added to an audit specification, but I’m just going with the FAILED_LOGIN_GROUP because that’s all that is needed to trap a login failure.

That’s basically it, the plumbing has been set up, and now we just have to turn the tap on and off when we want to start the tracing of our login failures.

In order to verify what we’ve set up we can run the following query against sys.server_audits:

SELECT name, type_desc, on_failure_desc, is_state_enabled, queue_delay, predicate FROM SYS.SERVER_AUDITS WHERE NAME=‘audit_login_failure’

This will confirm the relevant details about the audit specification including the output details, whether it is enabled or not, the queue delay and the predicate (neither of which we defined for this audit specification so are at their default settings):

Now that we’re all set up you might be wondering where this audit information can be seen in SQL Server Management Studio (SSMS), and it can be found under the Security folder:

SQL Server security audit details in SSMS

Start the capture

Now we’ve confirmed that the trace is set up we can start it

ALTER SERVER AUDIT SPECIFICATION audit_login_failure_spec WITH (STATE=ON) GO

and the following query which calls on the sys.dm_server_audit_status DMV will confirm it has started and where it is writing to

SELECT status_desc, audit_file_path FROM SYS.DM_SERVER_AUDIT_STATUS WHERE NAME=‘audit_login_failure’

It will output something like the following:

Leave the trace running for a long enough period to capture a few failed logins; if it’s going to be while before the next login failure is likely to be triggered, there’s no harm in generating a few deliberate login failures yourself so you can check the trace is working correctly.

Interrogate the capture file

Once you have your login failures and have confirmed the name and location of the with the above query the details can be extracted via the following query:

Leave the trace running for a long enough period to capture a few failed logins; if it’s going to be while before the next login failure is likely to be triggered, there’s no harm in generating a few deliberate login failures yourself so you can check the trace is working correctly.

select [event_time],[target_database_principal_name],[database_name], [server_principal_name], [statement], [additional_information] from sys.fn_get_audit_file (‘D:\TRACES\AUDITS\XE\login_failures\audit_login_failure_A8DD195D-26E6-4B7B-BF0E-BFEBA9B4153A_0_130194192428290000.sqlaudit’, default, default) where action_id = ‘LGIF’

Below is the output for a couple of test login failures that were manually triggered:

We can see the time of the login failure and the full login error (which is what is dumped to the SQL erorr log), and that is pretty much it.

Troubleshooting SQL Server Audit

In case of any issues use the DMVs to check the status of the audits.

The issue is usually going to be down the audit specification not being enabled or an issue with where the audit information is being written to and the above queries against sys.dm_server_audit_status and sys.server_audits should highlight the issue.

Don’t forget also that if SQL Server is restarted the audit filename (if you’re writing to a file) will change and you’ll to get the latest filename by using one of the the previously listed queries against sys.server_audits to get that filename. That’s one of the reasons it is probably better to write to the event log for these errors.

However

There is, unfortunately, a major downside to tracking login failures via SQL Server Audit, as opposed to using SQL Profiler; the latter allowed the capture of the process id, or PID of the connecting process, and armed with the hostname and PID of the process generating the login failure it was fairly straightforward to pin down the offending process (as long as it wasn’t using JDBC to connect).
SQL Server Audit currently offers no way of capturing the client PID. To me, this is a major oversight and I raised a connect item for this which Microsoft have reviewed and deemed as something they don’t plan to fix. To me, it’s a deprecation as IMHO new versions of SQL Server should build on the functionality provided by existing versions of SQL Server, rather than offer less functionality.

More than that, I think it’s a bit a security lapse not being able to identify the PID so I’ll be sticking to using SQL Profiler for troubleshooting persistent login failures from unknown processes for as long as it ships with SQL Server (or until they fix the connect item). Feel free to upvote the item if you share the same opinion.

Happy hunting!

Login failures (18456 errors) in SQL Server
Troubleshooting SQL Server listener connectivity
The SQL Server default trace