As originally discussed here and reported on here, I keep most of my work in Subversion repositories. Since I started doing this, it's been much easier to migrate from one machine to another and continue working. However, there are downsides to using Subversion. I thought I would lay out what I would consider to be the best method of storing and accessing my documents.

First, a quick description of what Subversion is: Subversion is revision control software that lets multiple people check out documents from a central repository and commit their changes back to it. For convenient use, it requires the central server to be always running and users to have network access.

The problems I face deal mostly with the centralized aspect. First, I'm running a Linux server at home that is not always available (sometimes I boot it into Windows to get some video conversion/media editing done). Second, I am collaborating on several projects at once, which means I have to access several repositories, some of which are not under my control. Everything in Subversion hinges on the single server that contains the repository.

Requirements

So ideally, I'd like the revision control software to be:

  • Free And Open Source - 'nuff said
  • Cross-Platform - where Platforms = {Windows, Linux, MacOS }
  • Fast & Efficient - for all operations (renames, branching, commits, etc)
  • Accessible - Means the ability to easily browse the repository (including older versions) and do diffs. It would be nicer if one could do this without having to download the whole repository too (i.e. via a web client like WebSVN).
  • Secure - Means that the repository should be password-protected and transactions must be encrypted
  • Ability to Lock Down - Means the ability to set permissions per directory (or even per file) and per user: Read-Write, Read-Only, or None (not even visible to that user). Some projects I'd like to collaborate on, some I'd like to share, but some are strictly private (like financial information).
  • Decentralized - If one machine goes down all other users/machines can still continue working and checking in. This inherently means that all users/machines get a full copy of the repository, but I'm ok with that.
  • Available Offline - I ride the train a few times a week now, so I need to be able to work offline. This rules out things like Google Docs or similar simple facilities.
  • Ability to Backup - The problem with SVN is all those .svn directories, one per folder. If I want to do a backup, I have to strip those out, which is a pain and time-consuming. I'd like the ability to just copy the whole repository to some media for archiving.
  • Ability to Purge - All repositories eventually become large in size because they maintain the history of every single file since the beginning of the repository. I'm going on three years now and I really haven't ever had a need to go back more than a couple months to look at a version of a file. I think a purging mechanism would be nice to keep the repositories relatively low in space. Of course the first time I need to go back to an old version that's no longer available, I'll probably end up cursing this idea and yelling "Khaaaan!" like William Shatner.
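On the backup point above, here is a minimal sketch of one workaround, assuming the `svn` command-line client is installed: `svn export` writes out a clean tree without the .svn directories, and that tree can then be archived. The function names (`export_command`, `make_tarball`, `backup`) and the paths are made up for illustration, not part of any existing tool.

```python
import subprocess
import tarfile
from pathlib import Path


def export_command(source: str, dest: Path) -> list:
    """Build the `svn export` call.

    `source` is a working copy path or a repository URL (hypothetical here);
    `dest` must not exist yet, since `svn export` creates it.
    """
    return ["svn", "export", "--quiet", source, str(dest)]


def make_tarball(tree: Path, archive: Path) -> Path:
    """Pack a directory tree into a gzip-compressed tarball."""
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the tree's own name, not the full temp path.
        tar.add(tree, arcname=tree.name)
    return archive


def backup(source: str, workdir: Path, name: str = "backup") -> Path:
    """Export a clean copy of `source` into `workdir` and tarball it."""
    dest = workdir / name
    subprocess.run(export_command(source, dest), check=True)
    return make_tarball(dest, workdir / f"{name}.tar.gz")
```

Calling something like `backup("/path/to/working/copy", Path("/tmp/svn-backup"))` would leave a `backup.tar.gz` in that directory. Note that this archives only the current files, not the revision history, so it addresses the "strip out .svn" pain rather than backing up the repository itself.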

So Subversion actually comes close to my requirements, but fails on the decentralized requirement. Git is sounding better. I may have to give it a shot one day as its tools mature.

Does anybody have any experience with organizing things like code, documents and photos with a version control system? Any recommended software?

§405 · November 8, 2007 · Questions, Software, Technology

Leave a Comment to “Ideal Life Sort”

  1. D. Moonfire says:

    I haven’t used Git personally, but I’ve pretty much fallen into the Subversion camp because of features SVN does have. I will admit, I don’t have the backup problem with SVN that you have, mainly I just use “svn export” in a temp directory and then tarball that.

    I don’t know if Git supports it, but I use the properties of SVN pretty heavily. For CuteGod, I set properties on each file for the creator, the URL I got it from, and the license I can use it under. Then I have a little program that goes through and creates the credits and background music files from those properties. Yeah, it’s a bit overkill, but I like having it there. I also use properties for photographs, mainly so I can pull them out and group them as appropriate; I use taxonomies rather heavily in my work.

    Speaking of locking things down, I’ve never had a problem with that. Being able to use Apache lets you use the full username/password and allow/deny system to enable or disable access. We used it here with a couple of our code-on-demand programmers. I also like being able to run scripts on commits; we have a few processes that IM people when certain files are checked in, send out emails to anyone listening, or add an entry into Bugzilla for updates.

    The distributed bit is probably the hard one. I used ‘darcs’ but I ended up not liking it. Bazaar looked interesting, but I never really got into it either. Fortunately, for me (and my company), distributed never really came into play.

    It will be interesting to see what you end up using, though. I fully admit my system of choice (Subversion) isn’t for everyone, but I’m also interested in seeing what others pick.

  2. tabrez says:

    There is no versioning, tagging, etc. with CVS or Subversion without a valid connection to the repository/server. Git and Bazaar score heavily over them if you want team members to work independently for long periods of time and still be able to do versioning. There are many other strengths of Git and Bazaar, but I find this one especially attractive.