Building app infrastructure in Elixir: Data/state store

Infrastructure engineering is something of a passion of mine. However, I strongly dislike how golang has taken over the world of infra. To work around this in my own way, I’ve been writing my own infrastructure stack – mahou (魔法). mahou is, of course, built on top of the “amyware stack” – namely, singyeong (신경) and crush. This is the start of a series of blog posts about actually building it out.

Current state #

Currently, mahou is the bare minimum – you can deploy a container, but there’s no deployment state saved anywhere, nothing “smart” to go with it.

Building out a data store #

Wait, what? Why build out a new data store?

  1. I want to :D
  2. crush already has some desirable properties:
    1. key-value store
    2. distributed / some level of fault-tolerance
    3. ability to store previous values of a key and provide them all on request.

It turns out that the last point is the most important one! In the land of Kube, it seems – and I could be mistaken, it’s been a bit since I was in Kube-land – that external tooling such as Helm is relied on to make this work. So mahou is being built on top of crush, so that this sort of functionality is built in!

To make this as useful as possible for mahou, there are a few things we need:

  1. Unlimited revisions of objects, so that we can roll back deployments to any point in time, if necessary.
  2. Some form of on-disk storage. This isn’t strictly necessary, especially since you can sorta-kinda-a-little-bit cheat around it by just distributing crush nodes across multiple zones and hoping that the speed of light + eventual consistency of delta CRDTs isn’t an issue for you, but it’d be nice to have. Let’s call it a stretch goal.
  3. The ability to fork deployment configs. While it may not be immediately obvious, this is really important. By adding this in, so much extra bespoke tooling that people build on top of Kube/etc. becomes mostly unnecessary! This would make it quite easy to add blue-green or maybe even rainbow deploys, stuff like forking a new staging environment off of some root one becomes super easy, …
  4. The ability to merge deployment configs. Since the previous point gives us easy forks, we should be able to merge forks into other forks or even the root. However, this is potentially problematic – patch theory is hard! While I can just look into something like Pijul’s sound patch theory, the problem of merge conflicts never goes away, thus requiring manual intervention. There’s not really an easy solution to this, sadly. I imagine that if I could solve all cases of this automatically, that’d be kinda big. I’m not quite that smart tho ^^; A “last-write-wins” model could work, but isn’t ideal.
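To make the last point concrete, here’s roughly what a last-write-wins merge could look like. Everything here is a sketch of the strategy, not anything crush actually implements – the `Lww` module and the assumption that each fork is a map of key to {value, timestamp} are purely illustrative:

```elixir
defmodule Lww do
  # Last-write-wins merge of two forked configs.
  # Each fork: %{key => {value, written_at}} – a shape assumed for this
  # sketch, not crush's actual data model.
  def merge(left, right) do
    Map.merge(left, right, fn _key, {_, t1} = l, {_, t2} = r ->
      # On a conflict, the newer write wins.
      if t2 >= t1, do: r, else: l
    end)
  end
end

left  = %{"image" => {"app:v1", 100}, "replicas" => {2, 120}}
right = %{"image" => {"app:v2", 150}}

merged = Lww.merge(left, right)
merged["image"]     # => {"app:v2", 150} – right's newer write wins
merged["replicas"]  # => {2, 120} – only present in left, carried over
```

Dead simple, which is exactly why it’s tempting – but as noted above, it silently discards the losing write, which is why it isn’t ideal.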

So now that we know all of this, let’s start working!

Unlimited object revisions #

This is probably the easiest one. crush sets a maximum revisions constant:

@max_revisions 8

and checks against it when setting a key:
linked code
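In spirit, the capped write looks something like this – a self-contained paraphrase with made-up names, not crush’s actual code:

```elixir
defmodule RevStore do
  @max_revisions 8

  # Illustrative sketch: each key maps to {current_value, patches},
  # and `store` is a plain map standing in for the real store.
  def set(store, key, value) do
    {old, patches} = Map.get(store, key, {nil, []})
    # crush stores a real diff of old vs. new; {:replace, old} is a stand-in.
    next_patch = {:replace, old}
    # The revision cap: trim the patch list down to @max_revisions.
    trimmed = Enum.take([next_patch | patches], @max_revisions)
    Map.put(store, key, {value, trimmed})
  end
end

store =
  Enum.reduce(1..10, %{}, fn i, acc ->
    RevStore.set(acc, "k", "v#{i}")
  end)

{_value, patches} = store["k"]
length(patches)  # => 8 – the two oldest patches were dropped
```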

While this does work, it’s not optimal. At some point, I’ll want to be able to get the number of revisions of a certain key, and calling length/1 is said to not scale well. Although neither the Erlang nor Elixir docs go into detail about it, you can do some really unscientific testing with :timer.tc/1 to figure out why:

iex(1)> l = []
iex(2)> length l
iex(3)> :timer.tc(fn -> length l end)
{5, 0}
iex(4)> l = for i <- 0..100_000_000,  do: i
# This will take a REALLY long time...
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
 42, 43, 44, 45, 46, 47, 48, 49, ...]
iex(5)> :timer.tc(fn -> length l end)
{221635, 100000001}

According to the docs, :timer.tc/1 returns the time elapsed in microseconds, along with the function’s result. That means it took nearly 222ms(!!!) just to get the length of a list. But why? Erlang lists are singly-linked lists, and the length isn’t stored alongside them, so getting the length requires a full traversal of the list.

Of course, at the scales crush would actually be used at, this is highly unlikely to be an issue:

iex(6)> l = for i <- 0..20_000, do: i 
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
 42, 43, 44, 45, 46, 47, 48, 49, ...]
iex(7)> :timer.tc(fn -> length l end)
{46, 20001}

I think 46 microseconds to traverse a list to get the length is a price I’m willing to pay. It can be revisited in the future if it turns out to be problematic.

Now let’s actually write some code!

This can be done in a few steps:

  1. Remove @max_revisions
  2. Remove patch count calculation + Enum.take
  3. Write {value, [next_patch | patches]} to the store
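The steps above amount to something like this – the same illustrative sketch as before, just with the cap removed; again, not crush’s exact code:

```elixir
defmodule UncappedStore do
  # Unlimited revisions: each key maps to {current_value, patches}.
  def set(store, key, value) do
    {old, patches} = Map.get(store, key, {nil, []})
    next_patch = {:replace, old}   # stand-in for the real diff
    # No @max_revisions, no Enum.take – every patch is kept forever.
    Map.put(store, key, {value, [next_patch | patches]})
  end
end

store =
  Enum.reduce(1..10, %{}, fn i, acc ->
    UncappedStore.set(acc, "k", "v#{i}")
  end)

{_value, patches} = store["k"]
length(patches)  # => 10 – nothing was dropped
```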

And by inspecting the output when fetching a key, I can see it worked!

{"test one",
 [
   [[:eq, "test "], [:ins, "tw"], [:eq, "o"], [:del, "ne"]],
   [[:eq, "test "], [:del, "tw"], [:eq, "o"], [:ins, "ne"]],
   [[:eq, "test "], [:ins, "tw"], [:eq, "o"], [:del, "ne"]],
   [[:eq, "test "], [:del, "tw"], [:eq, "o"], [:ins, "ne"]],
   [[:eq, "test "], [:ins, "tw"], [:eq, "o"], [:del, "ne"]],
   [[:eq, "test "], [:del, "tw"], [:eq, "o"], [:ins, "ne"]],
   [[:eq, "test "], [:ins, "tw"], [:eq, "o"], [:del, "ne"]],
   [[:eq, "test "], [:del, "tw"], [:eq, "o"], [:ins, "ne"]],
   [[:eq, "test "], [:ins, "tw"], [:eq, "o"], [:del, "ne"]],
   [[:eq, "test "], [:del, "tw"], [:eq, "o"], [:ins, "ne"]],
   [[:eq, "test "], [:ins, "tw"], [:eq, "o"], [:del, "ne"]]
 ]}

Well that was easy enough.

Adding a new route to fetch info about a key is also pretty easy:

def key_info(conn, %{"key" => key}) do
  revision_count =
    case Store.get(key, :all, false) do
      {_value, revisions} -> length(revisions)
      nil -> 0
    end

  info = %{
    key: key,
    revision_count: revision_count
  }

  json conn, info
end

And it works pretty well:

git:(master) X | ▶  curl localhost:7654/test/info
{"key":"test","revision_count":11}

Forking keys #

This is one of those things that sounds way harder than it actually is. However, there are a few things to take into consideration – things that are essential for a proper fork-merge solution.

“But wait!” I hear you cry, “You’re basically just inventing DVCS but like 10x worse!” And you’re right! This is slowly turning into a DVCS. So why not just embed Git or Pijul or …? Well, the issue with that is that I don’t really need the full DVCS feature set here, just basic forking and merging. Merging is a very complicated thing to work out, and the odds are that no auto-merge solution will work. That’s a problem for end-user tooling, not the crush server.

Beyond that, BEAM libraries tend to be… out of date. Take gitex, for example: very low activity, with long stretches between commits. That doesn’t inspire confidence for building on top of it. Worse, it’d tie me to a specific on-disk format, which I’m not quite ready to commit (hah) to yet.
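Sketching it out, forking a key could be as little as copying its current state under a new name and remembering the parent. Everything below – the ForkSketch module and the %{value:, patches:, parent:} entry shape – is hypothetical, not crush’s API:

```elixir
defmodule ForkSketch do
  # Fork `from_key` to `to_key`, keeping a pointer back to the parent
  # so a later merge knows where the fork came from. `store` is a plain
  # map standing in for the real store.
  def fork(store, from_key, to_key) do
    case Map.get(store, from_key) do
      nil ->
        {:error, :not_found}

      %{value: value, patches: patches} ->
        entry = %{value: value, patches: patches, parent: from_key}
        {:ok, Map.put(store, to_key, entry)}
    end
  end
end

store = %{"prod" => %{value: "cfg-v3", patches: [], parent: nil}}
{:ok, store} = ForkSketch.fork(store, "prod", "staging")
store["staging"].parent  # => "prod"
```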

I’ll dig into this in the next post ^^

