365DBA Blog

Database Stuff

Transactions With Log Records

Here is another fairly simple SQL Server script that shows information on open transactions which have generated log records. You can use it to get detailed information about processes that are currently running, or schedule it to catch any processes that might be bloating your transaction log (which is hard to catch after the fact if you don't see it rolling back right away).

Select GetDate() As recCreatedDt,
DateDiff(ss, DBT.database_transaction_begin_time, GetDate()) As AgeInSeconds,
DB_Name(DBT.database_id) As 'DatabaseName',
DBT.transaction_id,
DBT.database_transaction_begin_time,
Case DBT.database_transaction_type
When 1 Then 'Read/write transaction'
When 2 Then 'Read-only transaction'
When 3 Then 'System transaction'
End As 'TypeOfTransaction',
Case DBT.database_transaction_state
When 1 Then 'The transaction has not been initialized.'
When 3 Then 'The transaction has been initialized but has not generated any log records.'
When 4 Then 'The transaction has generated log records.'
When 5 Then 'The transaction has been prepared.'
When 10 Then 'The transaction has been committed.'
When 11 Then 'The transaction has been rolled back.'
When 12 Then 'The transaction is being committed. In this state the log record is being generated, but it has not been materialized or persisted.'
End As 'StateOfTransaction',
DBT.database_transaction_log_record_count As record_count,
DBT.database_transaction_replicate_record_count As replicated_records,
DBT.database_transaction_log_bytes_used As bytes_used,
DBT.database_transaction_log_bytes_reserved As bytes_reserved,
DBT.database_transaction_log_bytes_used_system As system_bytes,
DBT.database_transaction_log_bytes_reserved_system As system_reserved,
--DBT.database_transaction_begin_lsn,
--DBT.database_transaction_last_lsn,
--DBT.database_transaction_most_recent_savepoint_lsn,
--DBT.database_transaction_commit_lsn,
--DBT.database_transaction_last_rollback_lsn,
--DBT.database_transaction_next_undo_lsn, 
ES.[host_name], 
ES.[program_name], 
Case ER.transaction_isolation_level
When 0 Then 'Unspecified'
When 1 Then 'ReadUncommitted'
When 2 Then 'ReadCommitted'
When 3 Then 'Repeatable'
When 4 Then 'Serializable'
When 5 Then 'Snapshot'
End As IsolationLevel, 
ER.command,
S.text, 
SubString(S.text, (ER.statement_start_offset/2)+1, 
((Case ER.statement_end_offset
When -1 Then DataLength(S.text)
Else ER.statement_end_offset
End - ER.statement_start_offset)/2) + 1) As statement_text
From sys.dm_tran_database_transactions DBT
Left Join sys.dm_tran_session_transactions ST On DBT.transaction_id = ST.transaction_id
Left Join sys.dm_exec_sessions ES On ST.session_id = ES.session_id
Left Join sys.dm_exec_requests ER On DBT.transaction_id = ER.transaction_id
Outer Apply sys.dm_exec_sql_text(ER.sql_handle) S
Where DBT.database_transaction_log_record_count > 0

Database Free Space

Here is a very simple SQL Server script that has come in handy throughout the years; it displays the total space, used space, and free space (bloat) for each data file within every database on the instance. Uncomment the Where clause near the bottom to limit the results to files on a specific drive (handy when you are running out of room on one drive and are only interested in the files that reside there).

*Disclaimer: Keep in mind that shrinking files goes against best practices and causes fragmentation, but in real life you sometimes need to do it regardless of how ugly it is.

If Object_ID('tempdb.dbo.#UsedSpace') Is Not Null
    Drop Table #UsedSpace
Create Table #UsedSpace (DBName VarChar(150), FileID SmallInt, FileGroup SmallInt, TotalExtents Int, UsedExtents Int, 
Name NVarChar(256), FileName NVarChar(512))
exec sp_msforeachdb 'use [?]
Declare @SQL VarChar(8000)
Set @SQL = ''DBCC SHOWFILESTATS WITH TABLERESULTS''
Insert Into #UsedSpace (FileID, FileGroup, TotalExtents, UsedExtents, Name, FileName)
Exec(@SQL);
Update #UsedSpace Set DBName = ''?'' Where DBName Is Null;
'
Select DBName, Name, (TotalExtents * 64) / 1024 As TotalSpace,
(UsedExtents * 64) / 1024 As UsedSpace, 
((TotalExtents * 64) / 1024) - ((UsedExtents * 64) / 1024) As FreeSpace
From #UsedSpace
--Where Left(FileName, 1) = 'F'
Order By ((TotalExtents * 64) / 1024) - ((UsedExtents * 64) / 1024) Desc
Drop Table #UsedSpace

Get Excel Spreadsheet Names by Mapping a DataSet to System.Object Variable in SSIS

I was creating an import task for a product where the source Excel workbook had a date-stamped name along with date-stamped worksheets within the book, and the date-stamp was not always consistent, so I couldn't use any date conversion/modification logic to determine what the names of the spreadsheets were going to be. In order to pull the data from the individual sheets without knowing exactly what the names would be at any given point, I needed to get the names of the spreadsheets and populate a variable for each one. In my situation, I knew how many spreadsheets were in the workbook, which helps, but with some additional preparation I believe you could do this with a varying number of spreadsheets.

I did some research and found some ways of doing this with a ForEach Loop using an ADO.NET Schema Rowset Enumerator, but that relies heavily on the OLE DB provider and the registered version of the providers, which makes it much more difficult to configure initially and to maintain if you move it to different servers and environments. To get around some of the issues associated with the ADO.NET Enumerator method, I started looking at using C# code within a script task to loop through the spreadsheet names, and I realized that an SSIS System.Object variable can hold a .NET DataSet, which is exactly what my script task was already using for the list of sheets. The process is quite simple, as shown below.

The spreadsheet names were similar to this:

and the relevant tasks in the package are here:


And the variables I am using:

The code for getting the DataSet of all the spreadsheet names (in the "Populate SheetNames Variable" task) is quite simple. First, make sure that you have a variable with a data type of Object and that the variable is set to ReadWrite.

In your script, you need to import the OleDb namespace:
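
That is the standard System.Data.OleDb directive:

```csharp
using System.Data.OleDb;
```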


Then you pull in the list of sheets as a DataTable, add it to a DataSet, and set the variable from the DataSet:
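
A minimal sketch of that step follows. The variable names (User::FullSourcePath, User::SheetNames) and the ACE provider version are assumptions based on the description above; adjust them to your package.

```csharp
// Sketch of the "Populate SheetNames Variable" script task body.
// Assumes the FullSourcePath variable holds the workbook's full path
// and SheetNames is the Object (ReadWrite) variable described above.
public void Main()
{
    string fullPath = Dts.Variables["User::FullSourcePath"].Value.ToString();
    string connStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fullPath
        + ";Extended Properties=\"Excel 12.0;HDR=YES\";";

    DataSet ds = new DataSet();
    using (OleDbConnection conn = new OleDbConnection(connStr))
    {
        conn.Open();
        // One row per sheet; the TABLE_NAME column (ordinal 2) holds the sheet name.
        DataTable sheets = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
        ds.Tables.Add(sheets);
    }

    // Hand the DataSet to the rest of the package via the Object variable.
    Dts.Variables["User::SheetNames"].Value = ds;
    Dts.TaskResult = (int)ScriptResults.Success;
}
```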

As mentioned, I also had a variable I was using to store the location and Excel file name (FullSourcePath), and if your situation is similar, you will need to make sure that the variable is listed in the ReadOnlyVariables section of the script task.
Also, you could use code in the same script task to set the name of the sheet variables, but I wanted the option of using the list in other places within the package, so I went with pulling the entire list out into the context of the whole package.

Once the Object variable is populated with the DataSet of sheet names, you can use that in a Foreach Loop container by using a Foreach ADO Enumerator and specifying your Object variable as the "ADO object source variable" like so:


In your loop, you will be using a variable to hold the current sheet name, so in the "Variable Mappings" section you will want to specify that variable and make sure you use an index of 2, since that is the index that holds the actual sheet name:


Once you have the current name in the holding variable, you can use it to set your specific sheet-name variables via several different methods. I used a script task, but you could also do it differently, for example with a switch statement within your code; whatever works for you is fine. In this script, I determine whether the current sheet name in the SheetName variable contains my known values for identifying which sheet it is, and notice that I remove the quotes that surround the name within the DataSet.
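
My script looked roughly like this; the fragments "Summary" and "Detail" and the variable names are placeholders for whatever known values identify the sheets in your workbook.

```csharp
// Current sheet name from the Foreach loop, with the surrounding quotes stripped.
string sheetName = Dts.Variables["User::SheetName"].Value.ToString().Replace("'", "");

// Route the name to the right package variable based on a known fragment.
if (sheetName.Contains("Summary"))
    Dts.Variables["User::SummarySheetName"].Value = sheetName;
else if (sheetName.Contains("Detail"))
    Dts.Variables["User::DetailSheetName"].Value = sheetName;

Dts.TaskResult = (int)ScriptResults.Success;
```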


Once your specific sheet names are known and set, they can be used in expressions for queries later on in the package.

Encrypting Columns With SQL Server

It's no surprise that SQL Server offers a variety of ways to encrypt the values in a column, but there are some things to keep in mind when deciding whether column encryption is the right path to choose. I did some testing a while ago, but the findings still hold true, and I find myself referring back to the results more often than I would have thought, so I decided to put them out there for friends and colleagues, as well as making them easier for myself to find next time.

For some fields (a password field is a good example) you might not need to ever decrypt the information since getting the plain-text value isn't necessary as you can simply encrypt the input value and compare it to the stored value to determine a match. For times when the user forgets the password, the system should encrypt a new, randomly generated value and send that to them rather than decrypting the stored value and sending that to them in plain text (I hate web sites that do that).
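
As an aside, in SQL Server this compare-without-decrypting pattern is normally done with a one-way hash rather than EncryptByKey/EncryptByPassPhrase, because those functions salt their output and re-encrypting the same input produces different ciphertext. A sketch, with hypothetical table and column names:

```sql
--Hypothetical names; store the hash at insert time, then compare on login.
--SHA2_256 requires SQL Server 2012 or later; add a per-user salt in practice.
Declare @Input NVarChar(128) = N'password the user just typed'
Select Case When Exists
    (Select 1 From dbo.Users
     Where UserName = N'SomeUser'
     And PasswordHash = HashBytes('SHA2_256', @Input))
    Then 'Match' Else 'No match' End
```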

My testing involved encrypting a field by passphrase, 64-bit DES, and 256-bit AES on a table with close to 1 billion records, and I have outlined my findings below:

Initial encryption of plain-text field:
  • An index was required on the column in order to optimize exclusion of null values and empty strings; this took approximately 45 minutes to complete (based on 3 tests).
  • Once the index was created, setting an encrypted value only took around 15-20 minutes depending on the type and encryption algorithm used (looping through 500,000 records at a time).
  • CPU spiked periodically, but usually stayed around 40-60% throughout the process (I don't recall the hardware, but as I said, this was several years ago so YMMV).
Some things to keep in mind when deciding between passphrase and key (if you go with SQL encryption at all vs. hashing and comparing at the code level):

  • With any encryption or decryption function (passphrase or key), if the passphrase is incorrect or the key is not available or open, the return result will simply be null; no error will be thrown.
  • If using replication or any HA solution with data available for access on a separate server, symmetric keys need to be synchronized in all locations for encryption/decryption to work; if these are not in sync, no error will be thrown, but encryption and decryption will fail, so it may not be readily apparent.
  • If the symmetric key is not open, the EncryptByKey() and DecryptByKey() functions will complete successfully, but the inserted/encrypted value will be null for EncryptByKey() and the selected values will be null for DecryptByKey(), which again may not be readily apparent. This can be safeguarded against with a query to sys.openkeys ensuring that the key is open, but this adds more overhead and room for errors.
  • Null values in the field will be ignored, but an empty string will still be encrypted, adding some overhead.
  • VarBinary columns for encrypted passwords must be created with "Ansi_Padding = On" or any trailing zeros will be removed, causing invalid data when converted back to VarChar.
  • Published testing results from other companies have shown an average increase of 3-5 (but up to 7) times the resource hit (CPU and I/O) using symmetric key encryption compared to no encryption; anecdotally, I found that when implemented correctly there was hardly any noticeable increase in resource utilization.
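
The sys.openkeys safeguard mentioned above can be sketched like this; the key, certificate, table, and column names are all placeholders:

```sql
--Placeholders: MySymKey, MyCert, dbo.SomeTable, EncryptedCol
Open Symmetric Key MySymKey Decryption By Certificate MyCert;

If Exists (Select 1 From sys.openkeys Where key_name = 'MySymKey')
    Select Cast(DecryptByKey(EncryptedCol) As NVarChar(256)) From dbo.SomeTable;
Else
    RaisError('Symmetric key is not open; decryption would silently return null.', 16, 1);

Close Symmetric Key MySymKey;
```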

The biggest drawback was the silent failures that occur if the wrong key is open, or if no key is open at all, and there is no particularly good check available to make that determination without adding overhead to every call.

SQL Server Monitoring Scripts - 9 - Putting It All Together

So now that you have some scripts to perform basic monitoring and status checking, you want to run them against your collection of servers as an automated process, and this is where PowerShell comes in. The following script reads a list of servers that you specify in a separate text file, loops through all files with a .sql extension in a folder that you specify, runs them against your servers, and outputs the results into your specified output folder. All of it can be easily customized by changing a few lines here and there. The commented sqlcmd line is for using SQL authentication, passing the user name and password in as parameters when calling the PS script; note that with this approach, the same user name and password must be valid on every server.

#######################################
#PS Script (change the variables as necessary):

#The list of servers to run the scripts against:
$ServerListFile = "C:\Users\davidzahner\Documents\PSScripts\ServerList1.txt"
$Servers = Get-Content $ServerListFile

#Set the folder location and file type filter:
$path = "C:\Users\davidzahner\Documents\PSScripts"
$filter = "*.sql"

#Set the output folder:
$OutputFolder = "C:\Users\davidzahner\Documents\CheckResults"

#Set U and P values from parameters:
$User = $Args[0]
$Pass = $Args[1]

$Date = Get-Date -UFormat "%Y%m%d"
$Date = $Date -replace "/", "_"

ForEach ($Server in $Servers)
{
    $ServerString = $Server -replace "\\", "-"
    $OutputFile = $OutputFolder + "\" + $Date + "__" + $ServerString + ".txt"
    Get-ChildItem -Recurse -Path $path -Filter $filter | % {
        $file = $_
        #Sqlcmd -S $Server -U $User -P $Pass -i $file.FullName | Out-File -Append -FilePath "$OutputFile"
        Sqlcmd -S $Server -E -i $file.FullName | Out-File -Append -FilePath "$OutputFile"
    }
}

#######################################
#Server List Example:
SERVER05
SERVER05\INSTANCE2
SERVER04
SERVER04\INSTANCE2
SERVER03
SERVER03\INSTANCE2
SERVER02
SERVER02\INSTANCE2
SERVER01
SERVER01\INSTANCE2

SQL Server Monitoring Scripts - 8 - General SQL Errors Check

To wrap up the SQL scripts for this segment, here is a script to comb the logs and look for some common errors. You can add and remove checks by modifying the calls to xp_readerrorlog so that you catch the errors you are most concerned about in your environment.

Set NoCount On
Set ANSI_Warnings Off
Select '---------------General SQL Errors Check---------------'
Declare @LastCheck DateTime
Select @LastCheck = DateAdd(dd, -365, GetDate())
If (@LastCheck Is Null)
Begin
Set @LastCheck = '1-1-2000'
End
If Object_ID('tempdb.dbo.#ErrLog1') Is Not Null
Drop Table #ErrLog1;
Create Table #ErrLog1 (LogDate DateTime, ProcessInfo VarChar(50), LogText VarChar(2000))
Insert Into #ErrLog1 (LogDate, ProcessInfo, LogText) 
Exec xp_readerrorlog 0, 1, N'STACK DUMP'
Insert Into #ErrLog1 (LogDate, ProcessInfo, LogText)
Exec xp_readerrorlog 0, 1, N'Severity';
Insert Into #ErrLog1 (LogDate, ProcessInfo, LogText)
Exec xp_readerrorlog 0, 1, N'Login', N'failed';
Insert Into #ErrLog1 (LogDate, ProcessInfo, LogText)
Exec xp_readerrorlog 0, 1, N'I/O requests';
Delete From #ErrLog1 Where LogDate <= @LastCheck
Delete From #ErrLog1 Where LogText Like 'Error: 18456%' --Login Failed (removing since we are capturing it explicitly above)
Delete From #ErrLog1 Where LogText Like 'Error: 18210%' --VDI Backup Failures (http://support.microsoft.com/kb/2497575)
Delete From #ErrLog1 Where LogText Like 'Error: 3041%' --Log Backup Failures (removing since we are capturing it explicitly in other scripts)

Select Cast(LogDate As VarChar(25)), Left(LogText, 500) From #ErrLog1
If Object_ID('tempdb.dbo.#ErrLog1') Is Not Null
Drop Table #ErrLog1;

SQL Server Monitoring Scripts - 7 - SQL Restart Check

This is a simple script to let you know if the SQL instance has been restarted in the past 24 hours (configurable with the @Threshold variable).

Set NoCount On
Set ANSI_Warnings Off
Select '---------------SQL Restart Check---------------'
Declare @RestartTime DateTime, @Threshold DateTime
Select @RestartTime = crdate From sys.sysdatabases Where [name] = 'tempdb'
Select @Threshold = DateAdd(hh, -24, GetDate())
If DateDiff(mi, @Threshold, @RestartTime) > 5
Begin
Select 'Server: ' + Cast(ServerProperty('MachineName') As sysname) + IsNull('\' + Cast(ServerProperty('InstanceName') As sysname), '') + ' restarted at: ' + Cast(@RestartTime As VarChar(25))
End

SQL Server Monitoring Scripts - 6 - Missing Differential Backups

And to round out the missing backups, we need to add the check for missing differentials (if your backup strategy includes differential backups). As with the other related scripts, the threshold is in hours so you can adjust as necessary. 

Set NoCount On
Set ANSI_Warnings Off
Select '---------------Differential Backup Check---------------'
Declare @MissingBackups Table(ID Int Identity, DatabaseName VarChar(100))
Declare @Threshold Int
Select @Threshold = 28 --hours
Insert Into @MissingBackups (DatabaseName)
Select d.[name] From sys.databases d 
Left Join msdb.dbo.backupset b On d.[name] = b.database_name
And b.[type] = 'I' 
Inner Join sys.database_mirroring m On d.[database_id] = m.[database_id]
Where DatabasePropertyEx(d.[name], 'Updateability') = 'READ_WRITE'  
And DatabasePropertyEx(d.[name], 'Status') = 'ONLINE'
And d.[name] Not In ('tempdb')
And ((m.mirroring_guid Is Null)
Or (m.mirroring_guid Is Not Null And mirroring_role = 1))
Group By d.[name]
Having IsNull(Max(b.backup_finish_date), '1/1/2000') < DateAdd(hh, (-1 * @Threshold), GetDate())
If (Select Count(*) From @MissingBackups) > 0
Begin
Declare @Count Int, @Counter Int
Select @Count = Max(ID) From @MissingBackups
Select @Counter = Min(ID) From @MissingBackups
While @Counter <= @Count
Begin
Select 'Server: ' + Cast(ServerProperty('MachineName') As sysname) + IsNull('\' + Cast(ServerProperty('InstanceName') As sysname), '') + 
'. Missing differential backups for DB: ' + DatabaseName From @MissingBackups Where ID = @Counter
Set @Counter = @Counter + 1
End
End

SQL Server Monitoring Scripts - 5 - Missing Full Backups

We also want to know about any databases that do not have a full backup; maybe they're new, or maybe the backup failed for some reason and we weren't notified. Whatever the case, it definitely deserves some investigation.

Set NoCount On
Set ANSI_Warnings Off
Select '---------------Full Backup Check---------------'
Declare @MissingBackups Table(ID Int Identity, DatabaseName VarChar(100))
Declare @Threshold Int
 --This threshold is in hours so you can specify the total hours or 24 * number of days in your full backup cycle
Select @Threshold = (24 * 7)
Insert Into @MissingBackups (DatabaseName)
Select d.[name] From sys.databases d 
Left Join msdb.dbo.backupset b On d.[name] = b.database_name
And b.[type] = 'D' 
Inner Join sys.database_mirroring m On d.[database_id] = m.[database_id]
Where DatabasePropertyEx(d.[name], 'Updateability') = 'READ_WRITE' 
And DatabasePropertyEx(d.[name], 'Status') = 'ONLINE' 
And d.[name] Not In ('tempdb')
And ((m.mirroring_guid Is Null)
Or (m.mirroring_guid Is Not Null And mirroring_role = 1))
Group By d.[name]
Having IsNull(Max(b.backup_finish_date), '1/1/2000') < DateAdd(hh, (-1 * @Threshold), GetDate())
If (Select Count(*) From @MissingBackups) > 0
Begin
Declare @Count Int, @Counter Int
Select @Count = Max(ID) From @MissingBackups
Select @Counter = Min(ID) From @MissingBackups
While @Counter <= @Count
Begin
Select 'Server: ' + Cast(ServerProperty('MachineName') As sysname) + IsNull('\' + Cast(ServerProperty('InstanceName') As sysname), '') + 
'. Missing full backups for DB: ' + DatabaseName From @MissingBackups Where ID = @Counter
Set @Counter = @Counter + 1
End
End

SQL Server Monitoring Scripts - 4 - Missing Transaction Log Backups

We don't want any databases in full recovery going without their log backups, so here is a script to let us know when we have one. You will need to adjust the @Threshold variable to your specific needs (which is true for all of the scripts in this series) and then you are set. This script takes into account whether the database is mirrored, its status, and whether it's a system database.

Set NoCount On
Set ANSI_Warnings Off
Select '---------------Log Backup Check---------------'
Declare @MissingLogBackups Table(ID Int Identity, DatabaseName VarChar(100))
Declare @Threshold Int
Select @Threshold = 4
Insert Into @MissingLogBackups (DatabaseName)
Select d.[name] From sys.databases d 
Left Join msdb.dbo.backupset b On d.[name] = b.database_name
And b.[type] = 'L' 
Inner Join sys.database_mirroring m On d.[database_id] = m.[database_id]
Where DatabasePropertyEx(d.[name], 'Recovery') <> 'SIMPLE'
and DatabasePropertyEx(d.[name], 'Status') = 'ONLINE'
And d.[name] Not In ('master', 'msdb', 'tempdb', 'model')
And ((m.mirroring_guid Is Null)
Or (m.mirroring_guid Is Not Null And mirroring_role = 1))
Group By d.[name]
Having IsNull(Max(b.backup_finish_date), '1/1/2000') < DateAdd(hh, (-1 * @Threshold), GetDate())
If (Select Count(*) From @MissingLogBackups) > 0
Begin
Declare @Count Int, @Counter Int
Select @Count = Max(ID) From @MissingLogBackups
Select @Counter = Min(ID) From @MissingLogBackups
While @Counter <= @Count
Begin
Select 'Server: ' + Cast(ServerProperty('MachineName') As sysname) + IsNull('\' + Cast(ServerProperty('InstanceName') As sysname), '') + 
'. Missing log backups for DB: ' + DatabaseName From @MissingLogBackups Where ID = @Counter
Set @Counter = @Counter + 1
End
End