|The more-sophisticated device-mapper targets require complex metadata
|that is managed in kernel. In late 2010 we were seeing that various
|different targets were rolling their own data structures, for example:
|- Mikulas Patocka's multisnap implementation
|- Heinz Mauelshagen's thin provisioning target
|- Another btree-based caching target posted to dm-devel
|- Another multi-snapshot target based on a design of Daniel Phillips
|Maintaining these data structures takes a lot of work, so if possible
|we'd like to reduce the number.
|The persistent-data library is an attempt to provide a re-usable
|framework for people who want to store metadata in device-mapper
|targets. It's currently used by the thin-provisioning target and an
|upcoming hierarchical storage target.
|The main documentation is in the header files which can all be found
|The block manager
|This provides access to the data on disk in fixed sized-blocks. There
|is a read/write locking interface to prevent concurrent accesses, and
|keep data that is being used in the cache.
|Clients of persistent-data are unlikely to use this directly.
|The transaction manager
|This restricts access to blocks and enforces copy-on-write semantics.
|The only way you can get hold of a writable block through the
|transaction manager is by shadowing an existing block (ie. doing
|copy-on-write) or allocating a fresh one. Shadowing is elided within
|the same transaction so performance is reasonable. The commit method
|ensures that all data is flushed before it writes the superblock.
|On power failure your metadata will be as it was when last committed.
|The Space Maps
|On-disk data structures that keep track of reference counts of blocks.
|Also acts as the allocator of new blocks. Currently two
|implementations: a simpler one for managing blocks on a different
|device (eg. thinly-provisioned data blocks); and one for managing
|the metadata space. The latter is complicated by the need to store
|its own data within the space it's managing.
|The data structures
|Currently there is only one data structure, a hierarchical btree.
|There are plans to add more. For example, something with an
|array-like interface would see a lot of use.
|The btree is 'hierarchical' in that you can define it to be composed
|of nested btrees, and take multiple keys. For example, the
|thin-provisioning target uses a btree with two levels of nesting.
|The first maps a device id to a mapping tree, and that in turn maps a
|virtual block to a physical block.
|Values stored in the btrees can have arbitrary size. Keys are always
|64bits, although nesting allows you to use multiple keys.