Adam Gundry

The approach to data schema migration relies on two DSLs: one that describes the data model, and one that describes the changelog. Like most schema languages, the data model DSL is typed. Here’s an example.

User
  = record
      name :: Username
      admin :: boolean

Username
  = basic string
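
As a rough intuition for what this model describes, here is a hand-written Haskell analogue. It is only a sketch: the type and field names mirror the schema above, but the deriving clauses and the example value are assumptions, and nothing here is output from the tool described in the talk.

```haskell
-- Hand-written analogue of the schema above; not generated by any tool.

-- "Username = basic string" reads naturally as a newtype over String.
newtype Username = Username String
  deriving (Eq, Show)

-- "User = record" with two typed fields.
data User = User
  { name  :: Username
  , admin :: Bool
  } deriving (Eq, Show)

-- A hypothetical value of that type, for illustration only.
exampleUser :: User
exampleUser = User { name = Username "alice", admin = True }
```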

If you were to add a new field, logins, together with a new LoginCount type:

User
  = record
      name :: Username
      admin :: boolean
      logins :: LoginCount

Username
  = basic string

LoginCount
  = basic integer

You could express the change in the following way:

version "0.2"
changed record User
  field added
    logins :: LoginCount
    default 0
  added LoginCount
    basic integer
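
One way to picture what this changelog entry means is as a function from the old schema to the new one. The sketch below is a minimal model under stated assumptions: the types Schema, TypeDecl, and Change and the function applyChange are hypothetical names invented for illustration, not the API of the tool from the talk. It does not check that referenced types exist, which is why the field can be added before LoginCount is declared.

```haskell
import Control.Monad (foldM)

-- Hypothetical types for illustration; not the presenter's actual API.
data BasicType = BString | BInteger | BBoolean
  deriving (Eq, Show)

data FieldType = Basic BasicType | Named String
  deriving (Eq, Show)

data TypeDecl
  = Record [(String, FieldType)]  -- field name and field type
  | NewType BasicType             -- e.g. Username = basic string
  deriving (Eq, Show)

-- A schema maps type names to declarations, in declaration order.
type Schema = [(String, TypeDecl)]

data Change
  = AddField String String FieldType String  -- record, field, type, default (as text)
  | AddType String TypeDecl
  deriving (Eq, Show)

-- Apply one change, rejecting it if it does not fit the schema.
applyChange :: Change -> Schema -> Either String Schema
applyChange (AddType tyName decl) schema
  | tyName `elem` map fst schema = Left ("type already exists: " ++ tyName)
  | otherwise = Right (schema ++ [(tyName, decl)])
applyChange (AddField recName field ty _) schema =
  case lookup recName schema of
    Just (Record fields)
      | field `elem` map fst fields -> Left ("field already exists: " ++ field)
      | otherwise ->
          Right [ (n, if n == recName then Record (fields ++ [(field, ty)]) else d)
                | (n, d) <- schema ]
    Just _  -> Left (recName ++ " is not a record")
    Nothing -> Left ("no such type: " ++ recName)

-- Apply a whole list of changes, stopping at the first failure.
applyChanges :: [Change] -> Schema -> Either String Schema
applyChanges changes schema = foldM (flip applyChange) schema changes

schemaV1 :: Schema
schemaV1 =
  [ ("User", Record [("name", Named "Username"), ("admin", Basic BBoolean)])
  , ("Username", NewType BString)
  ]

-- The version "0.2" entry from the changelog above.
changesV2 :: [Change]
changesV2 =
  [ AddField "User" "logins" (Named "LoginCount") "0"
  , AddType "LoginCount" (NewType BInteger)
  ]
```

Evaluating applyChanges changesV2 schemaV1 in GHCi yields a schema with the extra logins field and the LoginCount type, matching the second model above.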

Question: Is this autogenerated or written by a programmer?

Answer: In many cases it’s autogenerated, but you also have the option to author changelogs manually.

With these in place you can do some analysis, for example checking whether you can get from an older version to the latest version. Furthermore, you can interpret the changelog and produce a migration that you can run on your data. If the changelog language does not support a particular operation you need, there’s a built-in escape hatch that allows you to run arbitrary Haskell code within an upgrade.
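
To make the interpretation step concrete, the following sketch runs changes over a single record value and supports a custom hook for operations the changelog language cannot express. Everything here is an assumption made for illustration: the Value type stands in for JSON, the Change constructors and the lookup table of custom migrations are hypothetical, and lowercaseName is an invented example of hand-written upgrade code.

```haskell
import Control.Monad (foldM)
import Data.Char (toLower)

-- A tiny JSON-like value type, to avoid depending on a JSON library.
data Value
  = VString String
  | VInt Integer
  | VBool Bool
  | VObject [(String, Value)]
  deriving (Eq, Show)

-- Hypothetical change constructors; the names are assumptions.
data Change
  = AddField String Value  -- field name and default value
  | Custom String          -- tag naming a hand-written migration
  deriving (Eq, Show)

-- Hand-written migrations, looked up by the tag used in the changelog.
type CustomMigrations = [(String, Value -> Either String Value)]

-- Interpret one change as a transformation of a record value.
migrateRecord :: CustomMigrations -> Change -> Value -> Either String Value
migrateRecord _ (AddField field def) (VObject fs)
  | field `elem` map fst fs = Left ("field already present: " ++ field)
  | otherwise = Right (VObject (fs ++ [(field, def)]))
migrateRecord customs (Custom tag) v =
  case lookup tag customs of
    Just f  -> f v
    Nothing -> Left ("no custom migration registered for: " ++ tag)
migrateRecord _ change _ =
  Left ("cannot apply " ++ show change ++ " to a non-record value")

-- Run all changes in order, stopping at the first failure.
migrate :: CustomMigrations -> [Change] -> Value -> Either String Value
migrate customs changes v = foldM (flip (migrateRecord customs)) v changes

-- Invented example of arbitrary code run within an upgrade.
lowercaseName :: Value -> Either String Value
lowercaseName (VObject fs) = Right (VObject (map fix fs))
  where
    fix ("name", VString s) = ("name", VString (map toLower s))
    fix other = other
lowercaseName _ = Left "expected a record"

exampleUser :: Value
exampleUser = VObject [("name", VString "Alice"), ("admin", VBool True)]

exampleChanges :: [Change]
exampleChanges = [AddField "logins" (VInt 0), Custom "lowercaseName"]
```

Calling migrate [("lowercaseName", lowercaseName)] exampleChanges exampleUser adds the logins field with its default and then runs the custom step. A Custom tag with no registered function fails with an error instead of silently skipping the step.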

All records in the system are stored with a schema version. So the upgrade process (which the presenter notes is a “Stop the World” upgrade) proceeds as follows:

export old application’s dataset as JSON
install new version of application
find the version number in the changelog
apply changes to schema and data in parallel
check resulting schema is as expected
import JSON dataset into new application
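
The middle three steps (find the version, apply changes to schema and data together, check the result) can be sketched as follows. This is a simplified model under loud assumptions: there is only one record type, the changelog is a list ordered oldest first, and the names pendingChanges, step, and upgrade are hypothetical. Exporting, installing, and importing are left out because they depend on the application.

```haskell
import Control.Monad (foldM)

type Version = String

data Value = VString String | VInt Integer | VBool Bool
  deriving (Eq, Show)

-- Simplification: one record type only. A schema maps field names to type
-- names, and a dataset is a list of records keyed by field name.
type Schema = [(String, String)]
type Record = [(String, Value)]

data Change = AddField String String Value  -- field, type name, default
  deriving (Eq, Show)

-- Changelog entries, oldest first: each version with the changes it introduced.
type Changelog = [(Version, [Change])]

-- Find the stored version in the changelog and collect every later change.
pendingChanges :: Version -> Changelog -> Either String [Change]
pendingChanges from changelog =
  case break ((== from) . fst) changelog of
    (_, [])        -> Left ("version not found in changelog: " ++ from)
    (_, _ : newer) -> Right (concatMap snd newer)

-- Apply one change to the schema and to every record in lockstep.
step :: (Schema, [Record]) -> Change -> Either String (Schema, [Record])
step (schema, records) (AddField field ty def)
  | field `elem` map fst schema = Left ("field already exists: " ++ field)
  | otherwise =
      Right (schema ++ [(field, ty)], [ r ++ [(field, def)] | r <- records ])

-- Migrate the data, then check the migrated schema against the expected one.
upgrade :: Changelog -> Schema -> Version -> Schema -> [Record] -> Either String [Record]
upgrade changelog expected from oldSchema records = do
  changes <- pendingChanges from changelog
  (newSchema, newRecords) <- foldM step (oldSchema, records) changes
  if newSchema == expected
    then Right newRecords
    else Left "migrated schema does not match the new application's schema"

exampleChangelog :: Changelog
exampleChangelog =
  [ ("0.1", [])
  , ("0.2", [AddField "logins" "LoginCount" (VInt 0)])
  ]

schemaV1, schemaV2 :: Schema
schemaV1 = [("name", "Username"), ("admin", "boolean")]
schemaV2 = schemaV1 ++ [("logins", "LoginCount")]

exampleRecords :: [Record]
exampleRecords = [[("name", VString "alice"), ("admin", VBool True)]]
```

Here upgrade exampleChangelog schemaV2 "0.1" schemaV1 exampleRecords returns the records with a logins field added, while an unknown starting version or a schema mismatch produces a Left before any data is imported.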

The result is that easy changes are simple and straightforward, with validation identifying errors before the changes are executed.