Versioning static data

Versioning static data is not a common feature supported by various database version control systems due to the additional complexities involved. Therefore, developers avoid storing static data in SQL Server databases or do not version data in the tables in the same way as application's code or database schema being versioned.

No more! Static data is fully supported in DB Version Control and we will show how to version control static data in your SQL Server database.

We will continue using a previously created project within DB Version Control with the AdventureWorks sample database as well as Github Desktop for the file versioning.

Version control static data tables

Which data should be version controlled? Usually, it is data that does not change often and cannot be changed by the users. For example, lookup tables, application configuration data, etc.

To get started, go to the Static data tab for the project in DB Version Control, tick Person.AddressType table to version control its data.

Next, go to Changes tab and click Get changes - a new change will be found - clicking on the row will actually show a grid with data (with source and target columns side by side). Checkboxes on the first column allow to include only some of the rows to be version controlled!

Finally, click Apply changes and (once done) switch to your preferred version control system. In our case, 3 files changed:

  • Database.sqlproj (modified) - the settings file that indicates that Person.AddressType is now being version controlled
  • Migration script (new) - a SQL script to put data in Person.AddressType table
  • Static Data\Person.AddressType (new) - a scripted file for this table that includes SQL statements with INSERT INTO

Similarly as with schema changes, a migration script is used to move from one database version to another, while the scripted file is used to easily see the changes in the table's data.

Even more, the automatically generated migration script can include schema changes and all these changes will be applied in a single SQL transaction.

That's it! Commit changes to the version control - DB Version Control now tracks changes in this table automatically and Migration runner deploys them.

Settings

There are a couple of settings available in the Static data tab:

  • Table scripting format - default, XML or JSON. The default scripts table's data as INSERT INTO statements. This option works well when the static data tables are simple (a few columns) and static data tables are not related to each other. XML/JSON option scripts static data as XML/JSON and allows seeing the differences in the static data easier - we will cover these options in this tutorial below
  • Include referenced tables automatically - this option indicates that all related tables should be included automatically according to foreign key relations (otherwise it may be impossible to re-create data from migration scripts)
  • Ignored columns - some columns could be excluded from being versioned. For example, LastModified, ModifiedDate or similar columns that may change even thought such changes should not be tracked

XML/JSON scripting format

What is the difference between default and XML/JSON formats? Instead of scripting data as INSERT INTO statements, the scripted files contain data in XML/JSON format. Remember, that these scripted files are only used to better understand the change, so there is no requirement for them to be SQL statements.

The only difference between XML and JSON scripting formats is that they script the table's data into XML or JSON file formats respectively. 

Let's try switching Person.AddressType to XML format.

Leave all other options for now as is and switch to Changes tab and click Get changes:

Similarly as before, a new change for static data table is found - clicking on the row opens the table in a side by side viewer, where data is displayed in a hierarchical format.

Clicking Apply changes will make a couple of changes to the scripted files:

  • Database.sqlproj (modified) - the settings file now contains information that Person.AddressType should be scripted into XML
  • Migration (new) - this is a SQL script with statements how to fill in Person.AddressType table (INSERT INTO statements)
  • Static Data\Person.AddressType.xml (new) - a XML file with table's data converted into XML

 XML/JSON settings

 There are additional settings available for XML/JSON data scripting format:

  • Split into files - instead of having a single large XML file for table's data, the file could be split (partitioned) into multiple files based on the column's value (for an example, ID, CategoryID, etc.)
  • Column formatting - the columns that contain XML/JSON data could be included into the scripted data files "as is". For example, if a varchar column contains JSON documents, these documents would be encoded and embedded into XML/JSON files when scripting the table's data. This setting allows the document to not be encoded allowing easier change tracking

Table relations for XML/JSON data scripting format

Having a normalized database design is great for SQL Server but it may get difficult to understand changes in those tables for us, humans. DB Version Controls allows defining parent/child relations between tables for easier change tracking.

Let's take an example - include Production.Product table and it's required related tables (those tables are included automatically once Production.Product checkbox is ticked).

Next, define the relations between these tables ([R] - referenced tables, [C] - child tables) at the bottom of Static data tab as below - only Production.ProductSubcategory, in this case, is a child table for the Production.ProductCategory table.

Next, click Get changes:

Clicking on Production.Product row shows how Production.Product has related tables' data included in a single file.

A similar view can be seen in your preferred version control system.

Split into multiple files

Next, let's split products data file into multiple files - the column which is used for the partitioning should be a non-null column and should be of a simple type (e.g. integer or small length string) as the column's value will be included in the scripted file name.

For example, the ProductID column was chosen, we would get a single file per product.

Include XML/JSON documents in the scripted files

The Production.ProductModel.xml scripted data file contains Instructions column which contains XML fragments. However, because we scripted table's data into XML file, the XML fragments are encoded in order to keep the correct XML file format:

This makes it difficult to see the changes in the Instructions column.

If we go back to Static data tab, find Production.ProductModel table and under Column formatting column choose XML for the Instructions column - after the changes have been applied and scripted into the files, data would look much easier to understand:

Understand the change

Let's try an example that illustrates how XML/JSON data scripting format helps understand the change for us, humans.

In Production.ProductSubcategory table, we changed the value of ProductCategoryID column from 1 to 4 for a row with ProductSubcategoryID = 1 - basically, we changed Mountain Bikes subcategory from Bikes to Accessories.

Here is how the change looks like if the default scripting format is used:

A similar cryptic change appears in the version control system's window (it may be easy to understand the change once it is done but it may be much harder to remember after a couple of months):

What if we used XML format with relations defined?

And how it looks like in the version control system:

We think it is much easier to understand the change - now we can clearly see that the category of the Mountain Bikes changed to the Accessories product category.

Summary

Versioning static data should be as easy as versioning database schema objects. Static data supports transactions and even supports easier visualization using XML/JSON formats:

  • Nesting multiple data tables
  • Splitting large data tables into small files
  • Ignoring changes in specific columns 

Similarly as with schema changes - a migration file is used to move from one version to another, while the scripted data file is used to easily see what changed and does not have to be a SQL file.

Leave a comment