ZFS (originally the Zettabyte File System) is probably one of the best filesystems ever made. You can easily create, destroy, and move data across different servers and datastores. You can also easily check whether your pool or your data is corrupted. Some awesome new features will appear in the coming months and years, like native encryption… Okay, nice, we have the big picture: an amazing, well-designed filesystem. Why Erlang?
One key feature of ZFS is replication. When you are using ZFS, you can snapshot your data and send it somewhere else. You can then take more snapshots and send them incrementally. Pretty nice, isn’t it? I just love this feature, and I use it every day. I use it for all my personal backups, but also to replicate my data across different servers.
A long time ago, I was thinking about this piece of technology and how great it would be to have something like a ZFS stream proxy: a tool that takes snapshots, checks that everything is good, and stores them automatically on one or more servers.
In December 2017, while speaking with one of my colleagues, we came back to this same idea… Wait. I can do this project now! I use a good high-level language (Erlang) and some low-level languages (ASM/C), so why not try to create my own ZFS library in Erlang?
# Proof of Concept and Resources
Before starting a project from scratch, for the love of features and glory, we need to take a look at ZFS and the tools around it. I’m not the only one who wants to parse, compose, and decompose data from a ZFS stream. First things first: where can I find the ZFS source code? ZFS was designed by Sun Microsystems, now owned by Oracle. There are two implementations of this filesystem: a closed-source one from Oracle, and an open-source one, OpenZFS, which comes from OpenSolaris.
- OpenZFS
- FreeBSD
- ZFS on Linux (ZoL)
We have the sources, but we need a little more: some documentation and good references about how all those bricks work together:
- OpenZFS wiki
- ZFS documentation at Oracle
- ZFS on YouTube
- FreeBSD Design
Good! We have a good view of this ecosystem. Now I will list the important tools directly linked to ZFS streams, in short, the zfs send/receive functions.
- zfs send/receive source code
- zstreamdump
We have the C source code of the official OpenZFS tools, and we also have zstreamdump. This tool is pretty awesome and will help us disassemble a dump from scratch without knowing anything about the specification.
The goal of this PoC is to have just an Erlang snippet that does the same work as zstreamdump: extract information from a ZFS stream and print it (or, in this particular case, build a high-level abstraction based on Erlang data structures).
Here is my code.
My code works as expected: I now have the power to parse and print ZFS streams in the Erlang VM!
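To give an idea of the technique, here is a simplified sketch (not the PoC itself) of how Erlang binary pattern matching can pick apart the first record of a stream. The field layout is an assumption based on my reading of the `dmu_replay_record` structure in the OpenZFS `zfs_ioctl.h` header, so check it against the sources before relying on it. The module name, the function names, and the fake record built by `sample/0` are made up for illustration.

```erlang
-module(zfs_stream_sketch).
-export([parse_begin/1, sample/0]).

%% Magic number expected in the first record (DMU_BACKUP_MAGIC in OpenZFS).
-define(DMU_BACKUP_MAGIC, 16#2F5BACBAC).

%% Parse the first (BEGIN) record of a stream.
%% Assumption: a stream is written in the sender's byte order, so the
%% magic number is used to guess which endianness to decode with.
parse_begin(Bin) ->
    case Bin of
        <<0:32, _Len:32, ?DMU_BACKUP_MAGIC:64/little, _/binary>> ->
            decode(little, Bin);
        <<0:32, _Len:32, ?DMU_BACKUP_MAGIC:64/big, _/binary>> ->
            decode(big, Bin);
        _ ->
            {error, not_a_zfs_stream}
    end.

%% Assumed layout: record type (4 bytes), payload length (4), magic (8),
%% version info (8), creation time (8), objset type (4), flags (4),
%% to-guid (8), from-guid (8), then a 256-byte NUL-padded name.
decode(Endian, <<_Type:4/binary, _Len:4/binary, _Magic:8/binary,
                 Version:8/binary, Time:8/binary,
                 Objset:4/binary, Flags:4/binary,
                 ToGuid:8/binary, FromGuid:8/binary,
                 ToName:256/binary, Rest/binary>>) ->
    U = fun(B) -> binary:decode_unsigned(B, Endian) end,
    %% Keep only the part of the name before the first NUL byte.
    [Name | _] = binary:split(ToName, <<0>>),
    Info = #{endian        => Endian,
             version_info  => U(Version),
             creation_time => U(Time),
             objset_type   => U(Objset),
             flags         => U(Flags),
             to_guid       => U(ToGuid),
             from_guid     => U(FromGuid),
             to_name       => Name},
    {ok, Info, Rest};
decode(_Endian, _Short) ->
    {error, truncated}.

%% Build a fake little-endian BEGIN record with arbitrary values,
%% only meant for trying the parser without a real pool.
sample() ->
    Name = <<"tank/data@snap1">>,
    Pad = binary:copy(<<0>>, 256 - byte_size(Name)),
    <<0:32/little, 0:32/little, ?DMU_BACKUP_MAGIC:64/little,
      17:64/little, 1514764800:64/little, 2:32/little, 0:32/little,
      16#1234:64/little, 0:64/little, Name/binary, Pad/binary>>.
```

Calling `zfs_stream_sketch:parse_begin(zfs_stream_sketch:sample())` returns `{ok, Info, Rest}`, where `Info` is a map of the decoded fields and `Rest` holds the bytes that follow the first record. On a real system, the input would be the first bytes of a file produced by `zfs send`. Returning the remaining bytes makes it easy to chain the parsing of the following records, which this sketch does not attempt.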