I like Mercurial (hg). The obvious advantage of a distributed version control system (DVCS) is that every repository is equal: each clone carries the full history. Among many other things, this is useful if you ever need to restore a repository (and all of its history) after a hard drive fails.

Way back in the day (wow, nearly five years ago!), I moved my personal documents into Subversion. Around the time my central Subversion repository became inaccessible to me, I decided to investigate Mercurial as a new option. One thing that I like is that the repos are accessible over HTTP, so it's pretty easy to set up.

Last night I finally moved the bulk (95%) of my personal documents over to hg repos from a collection of duplicated directories of various historical states on two different computers. Whew, that's a load off my mind.

The next thing I want to think about is putting my digital media (photos and the like) into repositories. This may not be as crazy as it sounds. Yes, it's true that there are tools out there built for this (rsync), and yes, it's true that a DVCS makes some general assumptions:

  • Most files will be edited over time
  • Most files are text-based (source files)
  • Most edits will result in relatively small diffs
  • Having full history back to the beginning of time is useful

In general, for a Distributed Media Control System (DMCS), these four assumptions are totally incorrect:

  • The majority of files will not be edited/changed over time
  • Most files are binary
  • Even the smallest edit will result in a relatively huge diff
  • Having full history back to the beginning of time is not really useful

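This means, in general, that files will take up a minimum of 2N disk space, where N is the size of the files themselves (one copy in the working directory and a second, barely compressible copy inside the repository), and that each edit (adding metadata, cropping a photo) will generally cost roughly the full size of the edited file again. I briefly tested hg, git and bzr and they all do the same thing.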
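To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The function name and the example sizes are made up for illustration, and it assumes the worst case where nothing compresses and every edit is stored at full size, so treat the result as a ballpark rather than a measurement of what hg actually does.

```python
def estimated_disk_usage(media_bytes, edited_bytes=0):
    """Rough worst-case estimate of the space a media repo needs.

    media_bytes:  total size N of the files in the working directory.
    edited_bytes: combined size of every file revision committed
                  after the first one (re-saved photos, new metadata).
    """
    working_copy = media_bytes   # the files you actually open
    repo_store = media_bytes     # the same binaries stored again in the repo
    history = edited_bytes       # assumed: each binary edit costs its full size
    return working_copy + repo_store + history


if __name__ == "__main__":
    gib = 1024 ** 3
    n = 50 * gib        # hypothetical 50 GiB photo collection
    edits = 5 * gib     # hypothetical 5 GiB worth of re-saved photos
    print(estimated_disk_usage(n, edits) / gib, "GiB")
```

If you rarely touch your photos after importing them, the history term stays small and the cost is essentially the flat 2N.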

So if you consider disk space as being cheap (and it is), we can live with these things until a better system comes along.

For instance, every once in a while (say, once a year) you can recreate the hg repo from the current files to throw away unneeded history.
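One way that yearly reset might look, as a minimal Python sketch rather than a recommended procedure. The function name `recreate_repo` and the backup naming are my own invention; the only Mercurial commands used are `hg init`, `hg addremove` and `hg commit -m`. It assumes hg is on your PATH and has a username configured, and it moves the old `.hg` directory aside instead of deleting it, so nothing is lost until you remove the backup yourself.

```python
import shutil
import subprocess
import time
from pathlib import Path


def recreate_repo(root, message="Fresh snapshot", run=subprocess.run):
    """Replace a repo's history with a single commit of the current files.

    `run` is injectable so the steps can be dry-run without hg installed.
    """
    root = Path(root).resolve()
    old_store = root / ".hg"
    if old_store.exists():
        # Park the old history next to the working directory (outside it,
        # so `hg addremove` does not pick it up) until the new repo checks out.
        backup = root.with_name(f"{root.name}-hg-backup-{int(time.time())}")
        shutil.move(str(old_store), str(backup))
    for args in (["hg", "init"],
                 ["hg", "addremove"],
                 ["hg", "commit", "-m", message]):
        run(args, cwd=str(root), check=True)


if __name__ == "__main__":
    recreate_repo("photos")  # hypothetical directory name
```

Keep in mind that the new repository is unrelated to the old one, so any existing clones have to be cloned again rather than pulled.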

About the only thing that I'm miffed about is that Mercurial has a rather serious design flaw: Renaming/moving a file duplicates the file in the internal repo. This is totally wrong since the repo should just be noting that the path of the file has changed and nothing else. At least they're working on it.

NOTE: I'm not advocating using Mercurial to back up your videos just yet, but depending on the size of your digital photos I think it could start to make some sense.

§689 · February 23, 2010 · Uncategorized