Automating Stored Procedures When Refreshing Virtual SQL Server Databases

So… you’re using a Delphix virtualization engine to provision SQL Server virtual databases (VDBs), but you’re finding that your SQL account passwords are reset to the production values after every refresh?  Likewise, that all of the settings in the application supported by the database are overwritten?

Provisioning a new virtual database is easy.  Go into the Delphix administrative console, find the dSource you want, find the snapshot for the point-in-time you want on that dSource, click the Provision button, specify the database server where the new database should reside, and hey presto!  In 10 minutes or less, a new virtual database is ready for use.

Except not quite…

That new VDB still has all of the data from the source database, and if that source database is production, then that means that it still has production passwords.  And it still has confidential data.  All of that has to be fixed before the database can be used by developers or testers.

So, even though you can now clone from production in minutes instead of hours or days, you still have to do the same post-clone and post-refresh processing tasks you’ve always had to do prior to Delphix.  Now, you just do them immediately rather than later.

If you’re willing to automate those tasks, whether in T-SQL stored procedures or in Windows Powershell scripts, then Delphix can help by embedding them as part of the operations of provision or refresh.

Delphix offers hooks, which are programmatic callouts which fire before and after certain actions by the Delphix virtualization engine, such as a refresh action…

Hooks provide the ability to execute Powershell code on the target Windows server as the Windows domain account registered with Delphix, either before or after the successful completion of an action.  Here is what the form for adding hooks within a VDB looks like…

Here are the actions for which hooks can be entered, also seen listed in the screenshot above…

  • Provision
    • Configure Clone hook fires after the action completes
  • Refresh
    • Pre-Refresh hook fires before the action begins
    • Post-Refresh hook fires after the action completes successfully
    • Configure Clone hook fires after the Post-Refresh hook completes successfully
  • Rewind
    • Pre-Rewind hook fires before the action begins
    • Post-Rewind hook fires after the action completes successfully
  • Snapshot
    • Pre-Snapshot hook fires before the action begins
    • Post-Snapshot hook fires after the action completes successfully
  • Start
    • Pre-Start hook fires before the action begins
    • Post-Start hook fires after the action completes successfully
  • Stop
    • Pre-Stop hook fires before the action begins
    • Post-Stop hook fires after the action completes successfully

So back to the problem at hand…

We want some actions to take place automatically each time we refresh our virtual database (VDB).  As it turns out, we have two Transact-SQL stored procedures already coded to do the job…

  1. stored procedure MSDB.DBO.CAPTURE_ACCTS_PASSWDS
    • Saves all of the current accounts and account passwords to a set of tables in the MSDB system database
  2. stored procedure MSDB.DBO.REAPPLY_ACCTS_PASSWDS
    • Re-applies all of the current accounts and account passwords from the information previously stored in the MSDB system database

“@DatabaseName”, which is the name of the VDB, is the only parameter for these stored procedures.
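The real procedures are necessarily environment-specific (and, as noted below, not posted here), but a hypothetical T-SQL sketch of the capture side might look something like the following.  The staging table MSDB.DBO.SAVED_SQL_LOGINS is my own invention for illustration, not part of the original procedures…

```sql
-- Hypothetical sketch only: the real procedure is environment-specific.
-- Assumes a staging table MSDB.DBO.SAVED_SQL_LOGINS exists with columns
-- (database_name, login_name, password_hash, sid)...
CREATE PROCEDURE DBO.CAPTURE_ACCTS_PASSWDS
    @DatabaseName SYSNAME
AS
BEGIN
    SET NOCOUNT ON;

    -- Discard any rows captured by a previous refresh of this VDB...
    DELETE FROM MSDB.DBO.SAVED_SQL_LOGINS
     WHERE database_name = @DatabaseName;

    -- Save the SQL logins, password hashes, and SIDs for re-application
    -- by a corresponding REAPPLY_ACCTS_PASSWDS procedure after refresh...
    INSERT INTO MSDB.DBO.SAVED_SQL_LOGINS
           (database_name, login_name, password_hash, sid)
    SELECT @DatabaseName, name, password_hash, sid
      FROM sys.sql_logins;
END;
```

The re-apply side would read those saved rows back and issue ALTER LOGIN (or CREATE LOGIN … WITH PASSWORD = … HASHED) statements; the details depend entirely on how accounts are managed in your environment.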

I haven’t posted the T-SQL code for these stored procedures, partly because it will always be heavily customized to its environment, but mostly because I am not proficient with T-SQL myself, and I would just be copying someone else’s code.

So looking at our list of Delphix hooks, it should be clear that we need to call the CAPTURE_ACCTS_PASSWDS stored procedure during the Pre-Refresh hook, and call the REAPPLY_ACCTS_PASSWDS stored procedure during the Post-Refresh hook.  Since hooks only call Powershell code and not T-SQL, here is some Powershell code we can use…

#======================================================================
# File:      callsp.ps1
# Type:      powershell script
# Author:    Delphix Professional Services
# Date:      02-Nov 2015
#
# Copyright and license:
#
#       Licensed under the Apache License, Version 2.0 (the "License");
#       you may not use this file except in compliance with the
#       License.
#
#       You may obtain a copy of the License at
#     
#               http://www.apache.org/licenses/LICENSE-2.0
#
#       Unless required by applicable law or agreed to in writing,
#       software distributed under the License is distributed on an
#       "AS IS" basis, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
#       either express or implied.
#
#       See the License for the specific language governing permissions
#       and limitations under the License.
#     
#       Copyright (c) 2015 by Delphix.  All rights reserved.
#
# Description:
#
#    Call the appropriate stored procedure within the DBO schema in
#    the MSDB database on behalf of the VDB.  The stored procedure
#    takes the name of the database as a parameter called
#    "@DatabaseName".
#
# Command-line parameters:
#
#    $fqSpName        fully-qualified stored procedure name
#
# Environment inputs expected:
#
#    VDB_DATABASE_NAME    SQL Server database name for the VDB
#    VDB_INSTANCE_NAME    SQL Server instance name for the VDB
#    VDB_INSTANCE_PORT    SQL Server instance port number for the VDB
#    VDB_INSTANCE_HOST    SQL Server instance hostname for the VDB
#
# Note:
#
# Modifications:
#    TGorman    02nov15    first version
#======================================================================
param([string]$fqSpName = "~~~")
#
#----------------------------------------------------------------------
# Verify the "$fqSpName" command-line parameter value...
#----------------------------------------------------------------------
 if ( $fqSpName -eq "~~~" ) {
     throw "Command-line parameter 'fqSpName' not found"
 } 
#
#----------------------------------------------------------------------
# Clean up a log file to capture future output from this script...
#----------------------------------------------------------------------
 $dirPath = [Environment]::GetFolderPath("Desktop")
 $timeStamp = Get-Date -UFormat "%Y%m%d_%H%M%S"
 $logFile = $dirPath + "\" + $env:VDB_DATABASE_NAME + "_" + $timeStamp + "_SP.LOG"
 "logFile is " + $logFile
#
#----------------------------------------------------------------------
# Output the variable names and values to the log file...
#----------------------------------------------------------------------
 "INFO: dirPath = '" + $dirPath + "'" | Out-File $logFile
 "INFO: fqSpName = '" + $fqSpName + "'" | Out-File $logFile -Append
 "INFO: env:VDB_INSTANCE_HOST = '" + $env:VDB_INSTANCE_HOST + "'" | Out-File $logFile -Append
 "INFO: env:VDB_INSTANCE_NAME = '" + $env:VDB_INSTANCE_NAME + "'" | Out-File $logFile -Append
 "INFO: env:VDB_INSTANCE_PORT = '" + $env:VDB_INSTANCE_PORT + "'" | Out-File $logFile -Append
 "INFO: env:VDB_DATABASE_NAME = '" + $env:VDB_DATABASE_NAME + "'" | Out-File $logFile -Append
#
#----------------------------------------------------------------------
# Housekeeping: remove any existing log files older than 15 days...
#----------------------------------------------------------------------
 "INFO: removing log files older than 15 days..." | Out-File $logFile -Append
 $ageLimit = (Get-Date).AddDays(-15)
 $logFilePattern = $env:VDB_DATABASE_NAME + "_*_SP.LOG"
 "INFO: logFilePattern = '" + $logFilePattern + "'" | Out-File $logFile -Append
 Get-ChildItem -Path $dirPath -recurse -include $logFilePattern |
     Where-Object { !$_.PSIsContainer -and $_.CreationTime -lt $ageLimit } |
     Remove-Item
#
#----------------------------------------------------------------------
# Run the stored procedure...
#----------------------------------------------------------------------
 "INFO: Running stored procedure '" + $fqSpName + "' within database '" +
     $env:VDB_DATABASE_NAME + "'..." | Out-File $logFile -Append
 try {
     "INFO: open SQL Server connection..." | Out-File $logFile -Append
     $sqlServer = $env:VDB_INSTANCE_HOST + "\" + $env:VDB_INSTANCE_NAME + ", " + $env:VDB_INSTANCE_PORT
     "INFO: sqlServer = '" + $sqlServer + "'" | Out-File $logFile -Append
#     [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo") | Out-Null;
     $conn = New-Object System.Data.SqlClient.SqlConnection
     $conn.ConnectionString = "Server=$sqlServer; Database=MSDB; Integrated Security=SSPI;"
     "INFO: conn.ConnectionString = '" + $conn.ConnectionString + "'" | Out-File $logFile -Append
     $conn.Open()
     $cmd1 = New-Object System.Data.SqlClient.SqlCommand($fqSpName, $conn)
     $cmd1.CommandType = [System.Data.CommandType]::StoredProcedure
     $cmd1.Parameters.AddWithValue('@DatabaseName', $env:VDB_DATABASE_NAME) | Out-Null
     "INFO: calling " + $fqSpName + ", @DatabaseName = " + $env:VDB_DATABASE_NAME | Out-File $logFile -Append
     $exec1 = $cmd1.ExecuteReader()
     $exec1.Close()
     $conn.Close()
 } catch {
     $Error[0].Exception.Message | Out-File $logFile -Append
     throw $Error[0].Exception.Message
 }
 #
 "INFO: completed stored procedure '" + $fqSpName + "' within database '" +
     $env:VDB_DATABASE_NAME + "' successfully" | Out-File $logFile -Append
#
#----------------------------------------------------------------------
# Exit with success status...
#----------------------------------------------------------------------
exit 0

The formatting of the code is a little constrained by the narrow viewing area, so if you want the actual code correctly formatted, it can be downloaded from HERE.

Of course, this code does more than just calling the stored procedure with the database name as the sole parameter;  it is also qualifying command-line parameters, creating and updating a log file, and handling possible error conditions.

Once this script is saved on the target Windows host server where the VDB resides, then we can call it from the Pre-Refresh and Post-Refresh hooks.

Here we see the Pre-Refresh hook being called with a call to the CALLSP.PS1 powershell script located within the C:\DELPHIX\SCRIPTS directory on the target Windows server, to call the CAPTURE_ACCTS_PASSWDS stored procedure…

Likewise, we see here how the Post-Refresh hook is constructed to call the REAPPLY_ACCTS_PASSWDS stored procedure…
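Each hook body itself is only a line of Powershell.  Assuming the script was saved as CALLSP.PS1 within C:\DELPHIX\SCRIPTS as described above, the two hook bodies might look something like this (the call-operator invocation shown is an assumption; any valid way of invoking the script with the stored procedure name as its argument will do)…

```powershell
# Pre-Refresh hook body: save the accounts and passwords before the
# refresh discards them...
& "C:\DELPHIX\SCRIPTS\CALLSP.PS1" "MSDB.DBO.CAPTURE_ACCTS_PASSWDS"

# Post-Refresh hook body: re-apply the saved accounts and passwords
# once the refresh completes successfully...
& "C:\DELPHIX\SCRIPTS\CALLSP.PS1" "MSDB.DBO.REAPPLY_ACCTS_PASSWDS"
```

Remember that the hooks run on the target Windows server as the Windows domain account registered with Delphix, so that account needs permission to execute the stored procedures.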

And finally, with both hooks created, we can see them in the Configuration panel together like this…

So now, whenever the refresh operation is performed on the VDB named VV11, these hooks will execute the Powershell script, which will in turn execute the specified stored procedures.  When the refresh operation completes, the account and password settings which were present prior to the refresh will still be in place.

I hope this is useful!


This post can also be found on the Delphix website HERE


If you have any questions about this topic, or any questions about data virtualization or data masking using Delphix, please feel free to contact me at “tim.gorman@delphix.com”.

SQL Saturday 588 – May 20 in NYC

I’m really excited about speaking at SQL Saturday #588 at the Microsoft office in Times Square, where I’ll be discussing Accelerating TDM Using Data Virtualization.  This is my favorite topic, because for the 20+ years that I worked as a database administrator (DBA), I ran into the same challenges time and again.

The challenges of data volume.

More than 20 years ago, a friend named Cary Millsap was asked to define the term very large database, because its acronym (VLDB) had become popular.  Knowing that numbers are relative and that they obsolesce rapidly, he refused to cite a number.  Instead, he replied, “A very large database is one that is too large for you to manage comfortably”, which of course states the real problem, as well as the solution.

The problem of data volume has, if anything, become more severe, and there is no sign that it will abate and allow storage and networking technology to catch up.

So it becomes necessary to cease doing things the way we’ve done them for the past several decades, and find a more clever way.

And that is why I work for the solution.  Delphix data virtualization addresses the root of the problem, providing a far faster, less expensive way to provision and refresh huge volumes of data.  The result is revolutionary for the beleaguered software testing industry.

Every mishap or delay in the software development life-cycle means time pressure on the last step:  testing.  Creating full-size and fully-functional copies of the environment to be tested is also time-consuming, putting more time pressure on testing to prevent the slippage of delivery dates.  The end result is untested or poorly tested systems in production, despite heroic efforts by under-appreciated testers.

Constraining the creation of test data is data volume, which is growing beyond the capability of most enterprises.  “Storage is cheap” is not merely a cheap lie told to dismiss the problem without solving it; it is also irrelevant and beside the point.

The real issue is time, because it takes a lot of time to push terabytes around from place to place.  It is just physics.  It is time which is most expensive and, by the way, storage really isn’t cheap, especially not in the quality that is demanded, and certainly not in the quantities in which it is being demanded.

Provisioning a full environment for each tester, one for each task, of each project, seems unrealistic when each environment might require terabytes of storage.  As a result, testers are limited to working in shared DEV or TEST environments which are refreshed only every few months, if ever.  Code quality suffers because testing is incomplete, and the pace of delivery fails to live up to business needs.

Server virtualization unleashed a surge of productivity in IT infrastructure over the past ten years.  But each virtual server still requires actual storage, making storage and data the constraint which holds everything back.

Data virtualization is the solution.

Come learn how to accelerate testing and how to accelerate the pace of innovation.

Come learn how to inspire more A-ha! moments by liberating QA and testing, and eliminating the constraint of data under which the software development life-cycle — from waterfall to agile — has labored over the past decades.

Liberate testing now!

SQL Saturday – 25-March 2017 – Colorado Springs CO

I’m really excited about speaking at SQL Saturday on my favorite topic!

SQLSaturday is a free training event for Microsoft Data Platform professionals and those wanting to learn about SQL Server, Business Intelligence and Analytics.  I’m all for that!

At this event, I’ll be discussing Accelerating DevOps and TDM Using Data Virtualization.  This is my favorite topic, because for the 20 years that I worked as a DBA, I ran into the same roadblock time and again.  The roadblock of data volume.

“DevOps” is a conflation of parts of the words “development” and “operations”, and it represents the grass-roots movement to merge application development, application testing, and IT operations into one continuous stream.  All tasks from coding to testing to operations must be automated so that new features and fixes can be delivered on a continual flow.

“TDM” is Test Data Management, the optimization of software quality testing to ensure that applications are built and maintained to operate according to their specifications.

Constraining both these goals is data volume, which is growing beyond the capability of most enterprises.  “Storage is cheap” is not merely a cheap lie told to dismiss the problem without solving it; it is also irrelevant and beside the point.

The real issue is time, because it takes a lot of time to push terabytes around from place to place; it is just physics.  It is time which is most expensive and, by the way, storage really isn’t cheap, especially not in the quality that is demanded, and certainly not in the quantities in which it is being demanded.

Provisioning a full environment for each developer or tester, one for each task, of each project, seems unrealistic when each environment might require terabytes of storage.  As a result, developers and testers are limited to working in shared DEV or TEST environments which are refreshed only every few months, if ever.  Code quality suffers, testing suffers, and the pace of delivery fails to live up to business needs.

Server virtualization unleashed a surge of productivity in IT infrastructure over the past ten years.  But each virtual server still requires actual storage, making storage and data the constraint which holds everything back.

Data virtualization is the solution.

Come learn how to accelerate development, how to accelerate testing, and how to accelerate the pace of innovation.

Come learn how to inspire more “a-ha!” moments by eliminating the constraint of data under which the software development lifecycle has labored over the past decades.