Thursday, November 17, 2011

Data-tier Applications, DAC Framework and SQL Azure Integration with a Continuous Integration (CI) Process

:: Introduction

Leveraging the efficiency and cost savings of the "cloud" within an enterprise requires tools specific to the PaaS provider in use. When it comes to working with SQL Azure (Microsoft's cloud-based relational database engine), I have found that data-tier applications, along with some supporting deployment infrastructure (PowerShell scripts), work particularly well both to synchronize database changes across a development team and to deploy those changes to other testing environments as they make their way to a production system.

:: Data-tier Applications

Data-tier applications are a project type within Visual Studio 2010 that allows development teams to manage database changes across ALL of their environments, whether in the cloud or on-premises. Developers work with individual SQL scripts within the project, which define everything from table schemas to indexes to stored procedures. Once a change has been made to a data-tier application's scripts, the entire project can be compiled into a DAC package (called a DACPAC from here on out) that can be deployed to a SQL Server 2008 R2 or SQL Azure database instance.

For details on how to create and set up a data-tier application for the first time, see this link.

:: DACPAC Files

A DACPAC is simply a file with a ".dacpac" extension that contains all the schema, stored procedures, users and policies required to set up a new database or upgrade an existing one. Note that DACPACs should not be confused with BACPACs: the BACPAC is a newer format from Microsoft that can contain everything a DACPAC can, but can also include data.

There is also a version number (with the format x.x.x.x) embedded within the DACPAC file which can be used to version it. Unfortunately this version number is not updated automatically by the project when the data-tier application is modified, so a developer will need to increment it manually before deployment. To access and modify this version number, right-click the data-tier application project and find it under the "Project Settings" tab.
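Since the version number is easy to forget, it can be worth checking it programmatically before a deployment. Below is a minimal PowerShell sketch that reads the embedded version out of a built DACPAC via the DAC Framework; it assumes the DacFx 1.1 assemblies that ship with SQL Server 2008 R2 are installed, and the file path is hypothetical:

```powershell
# Load the DacFx assembly that ships with SQL Server 2008 R2
Add-Type -AssemblyName "Microsoft.SqlServer.Management.Dac, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"

$stream = [System.IO.File]::OpenRead("C:\build\MyDatabase.dacpac")
try {
    # DacType.Load parses the package metadata, including the x.x.x.x version
    $dacType = [Microsoft.SqlServer.Management.Dac.DacType]::Load($stream)
    Write-Host "DACPAC version: $($dacType.Version)"
}
finally {
    $stream.Close()
}
```

A CI build could compare this value against the version of the currently deployed data-tier application and fail fast if a developer forgot to bump it.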

:: Deploying Data-tier Applications

Before a data-tier application can be continuously upgraded with newer versions (the typical workflow of any development team), it must first be deployed. The deployment essentially creates the initial database, registers it as a data-tier application within the database engine (either SQL Server 2008 R2 or SQL Azure) and then deploys all the schema, stored procedures, users, etc. contained within the DACPAC.

The DACPAC can be easily deployed from within Visual Studio by right-clicking the data-tier application project and selecting "Deploy" (note that the connection string within the project will need to be set up before doing this). Alternatively you can use SQL Server Management Studio (SSMS) or some PowerShell commands: How to: Deploy a Data-tier Application

Note that the above deployment is really just a one-time operation. Once the initial setup has taken place and the data-tier application is deployed, you'll only need to upgrade it from then on (unless you delete the database and start again, which is handy for certain developers who might break things and need a fresh start).
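As a rough illustration of what the initial PowerShell deployment looks like, here is a sketch built around the DacStore API. It is not a production script: the server, database name and credentials are hypothetical, and the exact Install overload may differ slightly depending on your DacFx version, so check the MSDN documentation against your installed assemblies:

```powershell
Add-Type -AssemblyName "Microsoft.SqlServer.Management.Dac, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"
Add-Type -AssemblyName "Microsoft.SqlServer.ConnectionInfo, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"

# Hypothetical SQL Azure connection -- substitute your own server and credentials
$connection = New-Object Microsoft.SqlServer.Management.Common.ServerConnection
$connection.ConnectionString = "Server=tcp:myserver.database.windows.net;User ID=admin@myserver;Password=<password>;Encrypt=True"

# DacStore represents the collection of data-tier applications on the target instance
$dacStore = New-Object Microsoft.SqlServer.Management.Dac.DacStore($connection)

$stream = [System.IO.File]::OpenRead("C:\build\MyDatabase.dacpac")
try {
    $dacType = [Microsoft.SqlServer.Management.Dac.DacType]::Load($stream)
    $deployProperties = New-Object Microsoft.SqlServer.Management.Dac.DatabaseDeploymentProperties($connection, "MyDatabase")
    # Creates the database and registers it as a data-tier application
    $dacStore.Install($dacType, $deployProperties, $false)
}
finally {
    $stream.Close()
}
```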

:: Upgrading Data-tier Applications

There are a variety of mechanisms available to deploy a DACPAC file to a SQL Server 2008 R2 or SQL Azure database instance. Unfortunately the documentation available for these mechanisms is fairly minimal, which makes it difficult to assess which mechanism to use for each deployment scenario within your enterprise. Here are the ones I have come across, how to invoke them and the scenario(s) I think they are applicable to:
  • Within Visual Studio
    • Invoke: Within Visual Studio right-click on the data-tier application project and select "Deploy". Note that Visual Studio figures out which connection string to use based on the active build configuration.
    • Details: Depending on the current configuration (e.g. Debug/Release) the appropriate connection string will be used when the project is deployed. The connection string can therefore point at a local SQL Server 2008 R2 database instance or a SQL Azure database instance.
    • Scenario: This deployment mechanism is most applicable to developers working locally during a typical development workflow of repeatedly modifying code, compiling and deploying it, and then debugging the result. I highly recommend that developers use a local instance of SQL Server 2008 R2 for two reasons:
      • No data transfer costs are accrued as ALL data transfer occurs on a developer's local machine and not back and forth between SQL Azure and their local machine.
      • DACPAC deployments are much faster and more reliable when performed locally, since deploying a DACPAC to SQL Azure can take minutes or even hours depending on the connection, and database connections can be dropped mid-deployment (this won't cause an error but the deployment will need to be restarted).
    • This 20min video, by a program manager on the Microsoft SQL Server Developer Manageability team, covers a range of information about data-tier applications and their use within Visual Studio.
  • Within SQL Server Management Studio (SSMS)
    • Invoke: Within SSMS open the "Management" folder and then the "Data-tier Applications" node. Right-click the data-tier application you would like to upgrade and select "Upgrade Data-tier Application..."
    • Details: Just follow the on-screen instructions, making sure to save the SQL script that the DACPAC uses to apply the changes to the database instance. On some occasions the upgrade wizard has failed, but because I had saved the SQL script I could execute it myself within SSMS.
    • Scenario: This is the easiest way to get started when deploying a DACPAC to SQL Azure since SSMS provides an easy-to-follow wizard. Note however that I have seen this entire process take 1-2 hours to upgrade a database (40 tables + 50 stored procedures). Being able to automate this step is very helpful, which the next section on PowerShell scripting explains.
  • Using a DAC Framework enabled PowerShell Script
    • Invoke: You'll need to create your own PowerShell script to manage the upgrade process of one or more of your data-tier applications residing on SQL Server 2008 R2 or within SQL Azure. Note that if you are unfamiliar with PowerShell and the DAC Framework creating a robust and reliable script will take significant development effort.
    • Details: I created my own parameterized PowerShell script that accepts a DACPAC file and credentials to a SQL Azure database, connects to it, and then upgrades the existing data-tier application. Two sources of information that I used when creating my PowerShell script were the following:
    • Scenario: This is the most appropriate method for building an automated process for managing data-tier applications that can be used by a continuous integration (CI) system.  
    • Note that side-by-side upgrades are the older way to upgrade a data-tier application and were used with the DAC Framework v1.0. The newer process, called an in-place upgrade, is supported by the DAC Framework v1.1. I decided to use the in-place upgrade since it doesn't create copies of the "pre-upgrade" database. To perform an in-place upgrade using PowerShell, make sure to use the DacStore.IncrementalUpgrade method and not the now-obsolete DacStore.Upgrade method. Also note that the DacUpgradeOptions passed into DacStore.IncrementalUpgrade control things like whether or not to ignore data loss, ignore drift, skip policy validation and/or roll back on failure. Make sure to set these options appropriately for each environment you're upgrading (i.e. you may not care about drift on your local or CI databases, but in your production database drift should be a cause for concern).
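To make the in-place upgrade concrete, here is a sketch of the DacStore.IncrementalUpgrade call with DacUpgradeOptions set the way I would for a production environment. As with the deployment sketch earlier, the connection details and data-tier application name are hypothetical, and you should verify the IncrementalUpgrade overload against the DacFx 1.1 documentation:

```powershell
Add-Type -AssemblyName "Microsoft.SqlServer.Management.Dac, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"
Add-Type -AssemblyName "Microsoft.SqlServer.ConnectionInfo, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"

$connection = New-Object Microsoft.SqlServer.Management.Common.ServerConnection
$connection.ConnectionString = "Server=tcp:myserver.database.windows.net;User ID=admin@myserver;Password=<password>;Encrypt=True"
$dacStore = New-Object Microsoft.SqlServer.Management.Dac.DacStore($connection)

# Upgrade options discussed above; tune these per environment
$options = New-Object Microsoft.SqlServer.Management.Dac.DacUpgradeOptions
$options.IgnoreDataLoss = $false       # block the upgrade if data loss would occur
$options.IgnoreDrift = $false          # $true may be acceptable on local/CI, not on production
$options.SkipPolicyValidation = $false
$options.RollbackOnFailure = $true

$stream = [System.IO.File]::OpenRead("C:\build\MyDatabase.dacpac")
try {
    $dacType = [Microsoft.SqlServer.Management.Dac.DacType]::Load($stream)
    # In-place upgrade of the deployed data-tier application named "MyDatabase"
    $dacStore.IncrementalUpgrade("MyDatabase", $dacType, $options)
}
finally {
    $stream.Close()
}
```

Wrapping this in a script parameterized by DACPAC path, server and credentials is what allows the same artifact to be pushed through CI, Staging, UAT and Production.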
:: Data-tier Application Upgrade Architecture across Environments

The following diagram shows how I used a data-tier application's DACPAC output to upgrade various SQL Server 2008 R2 and SQL Azure database instances across various environments (from local development through to production).


Note that only the changes to the SQL scripts within the data-tier application in Visual Studio are committed (not the DACPAC output that is built locally). The DACPAC is rebuilt on the CI server using NAnt and MSBuild and is used to immediately upgrade the CI's SQL Azure database instance. The same DACPAC is then archived and used in subsequent deployments for Staging, UAT and Production. The reasoning behind this "build once and deploy to all subsequent environments" architecture is outlined in the book Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation.
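For reference, the CI rebuild of the DACPAC boils down to invoking MSBuild against the data-tier application project (the project name and configuration here are hypothetical placeholders for your own):

```
msbuild MyDatabase.dbproj /t:Build /p:Configuration=Release
```

The resulting .dacpac lands in the project's output folder (e.g. bin\Release), which is the artifact the CI server archives and hands to the PowerShell upgrade script.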

Also note that the "Upgrade" step in both the CI and Staging/UAT/Production environments is performed via the parameterized PowerShell script mentioned in a section above.

10 comments:

Andrew Thompson said...

Additional details about why some upgrade operations take so long to complete when performing them locally can be found here: http://stackoverflow.com/questions/7213316/why-does-a-sql-azure-dacpac-upgrade-via-a-powershell-script-consistently-take

Omri Gazitt said...

Andrew, this is very cool. Any chance you could post your PS scripts for those of us that need to do the same thing? :-)

Thanks!

Omri.

Andrew Thompson said...

Hi Omri,

Slide #42 on this post http://andrewmatthewthompson.blogspot.ca/2012/01/presenting-at-vancouver-windows-azure.html will have all the PowerShell commands you require to setup an initial script to deploy an upgrade to a Data-tier Application using a DACPAC (locally on SQL Server or in the cloud using SQL Azure).

Omri said...

Awesome - thanks Andrew! I have a very similar setup to yours (although I wrote the DAC package "by hand" in VS as opposed to extracting a DACPAC).

What version of DACfx do you recommend using? I applied the SMO patch (10.51.2500.0) as the MS SQL team recommends, but I still can't do basic things with DAC, like upgrade a SQL Azure DAC from the management node in SSMS.

Andrew Thompson said...

Hi Omri,

I'm using SQL Server 2008 R2 SP1 on the box that uses the PowerShell scripts to deploy the DACPAC to my SQL Azure database. That version of SQL Server translates into the version number 10.50.2500.0, which is slightly different from what you quoted above. Here is the link for that version of SQL Server: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=26727

Also what .NET method are you using to upgrade your DACPAC, is it Upgrade(...) or IncrementalUpgrade(...)? DacFx 1.0 only had the Upgrade(...) method which performed a side-by-side copy when upgrading. DacFx 1.1 now has the IncrementalUpgrade(...) method which performs an in-place upgrade and should be used instead. More details here: http://msdn.microsoft.com/en-us/library/microsoft.sqlserver.management.dac.dacstore.incrementalupgrade.aspx

Omri said...

Thanks again for the info, Andrew! I also have SQL2008 R2 sp1 on the dev box. The higher version number for DACfx that I have is because I installed a Microsoft-recommended SMO download (http://stackoverflow.com/questions/8918964/error-deploying-a-dacpac-to-a-sql-azure-server/9089294#9089294).

I've only tried to use SSMS to deploy/upgrade DAC's - never tried programmatically.

On advice of the SQL Azure team I recently installed the new SQL Server Data Tools (just shipped with SQL2012) - http://www.microsoft.com/web/gallery/install.aspx?appid=SSDT

The package is a drop-in for VS2010 sp1, has far better tools for SQL Server than the previous (database-agnostic) toolset, and comes with a new DACFx that seems to work better. SSDT will auto-upgrade your DAC projects, and allow you to create publish profiles (just like ASP.NET) and publish straight from VS (I have a publish profile for each of my dev environments and for production). Under the covers it computes deltas and uses DACFx to push any non-destructive schema changes to the database. Right now this appears good enough that I may not need to write the PS scripts after all...

Brian Reynolds said...

Great article bringing together databases, Azure, and CI/CD!

Andrew Thompson said...

No problem Brian, I'm glad it helped you think through whatever database migration/Azure/CI task you're currently exploring or working on.

One thing that my team and I have discovered over the last few months is the importance of setting the IgnoreDataLoss property on the Microsoft.SqlServer.Management.Dac.DacUpgradeOptions object (if you're automatically deploying using PowerShell or your own custom C# program). Things work great 99.9% of the time (for us it worked on thousands of deployments until it didn't), but if for some reason you modify a table's schema in a way that the DACPAC doesn't fully understand you could incur data loss. To mitigate this problem, simply set that property to false, which will block the DACPAC upgrade process if data loss is going to occur (a huge safety net here which I would highly recommend).

Andrew Thompson said...

Note that more details about the use of PowerShell and Upgrade Options can be found on one of my other blog posts: http://andrewmatthewthompson.blogspot.ca/2012/01/presenting-at-vancouver-windows-azure.html