Introduction to Subversion

What is Subversion

SVN, or "Subversion", is a source code management system that aims to be a compelling replacement for CVS, which we have used in prior years. It is a tool that allows a group of programmers to work together on the same collection of source files without getting in each other's way any more than necessary. It also keeps track of the change history of the collection and makes it possible to examine the way things looked at arbitrary times in the past. And finally, by keeping track of what is and is not part of a project, it helps programmers maintain the organization of large projects. If you have used CVS in the past, you will find that Subversion is quite similar, although somewhat more sane. If you are familiar with CVS and concerned about the differences in migrating to SVN, there is a special appendix of the Subversion Book specifically for you!

Why source management?

In most coursework in computer science, each assignment is a distinct unit: you sit down and code something up, you hand it in, it gets graded, and you immediately forget about it or even throw it away. In this environment, a source management system is not really necessary and may, if imposed by course staff, seem like a waste of time.

In the real world, however, programs are large and expensive to develop, and so their life-cycles are measured in years or sometimes decades. Over these time scales, and with such large amounts of code, just keeping track of everything becomes a major problem.

Worse, in the real world, programs have users, who are not part of the development team and not (generally) interested in internal details of the program. Usually, someone insists that now and then a new version be made available to the users. The development team has to be able to issue these releases, and then also has to be able, for years afterwards, to handle bug reports and sometimes issue fixes. In this environment it is imperative to be able to go to some central place and get a copy of the precise release you need.

And finally, when you have a number of programmers working on the same program at once, it's essential that some mechanism be put in place to allow them to coordinate their work. Otherwise, each programmer's version slowly diverges from the others, and eventually everyone has a private version different from everyone else's... and none of them work. Once this happens, it takes an immense amount of effort to straighten out the mess.

Source management (or version control) systems are designed to help programming teams handle these issues.

In cs372, the probable lifetime of your project is a few months, not a few years, and most likely no more than two people will be working on it at once. Furthermore, the OS/161 source you will be working with is several orders of magnitude smaller than a large real-world project. Nonetheless, it is large enough, the time is long enough, and there are enough people involved that failure to use some kind of source management system would be an act of reckless folly.

Why Subversion?

We require the use of SVN in cs372 because it is reasonably powerful, freely (and widely) available, and commonly used. Many large open-source and proprietary projects are managed using Subversion.

If you are familiar with another source management system (such as CVS), Subversion should be easy to understand and use.

Remainder of This Document

The rest of this handout is divided into two main sections. The first explains the philosophy of Subversion, its operating model, and the assumptions behind the way it works. The second explains a number of basic and not-so-basic Subversion operations, organized by what you are trying to do rather than by command. A small additional section lists the main Subversion commands.

You do not need to remember everything in this handout. In fact, this handout was written mostly so you do not need to remember all this.

The World According to Subversion

Working Model

The Subversion model assumes that there is one central official master copy of everything. It is called "the Subversion repository." This copy is managed by Subversion and is not meant to be touched directly.

Instead, when you wish to work on a Subversion-managed project, you "check out" your own private copy. Since work on a program is an ongoing process, and other people may be working on the same program at the same time, you generally want changes in the official master copy to appear in your private copy. To facilitate this, the private copy is connected back to the master copy. Some people think of this as a "subscription": the private copy is signed up to receive copies of updates that are made to the master copy.

Note, however, that these updates do not happen automatically. You must explicitly update your private copy when the master copy changes.

When you transfer ("commit") changes in your private copy to the master copy, the master copy is updated and other developers will then see those changes when they update. Changes you make in your private copy that you do not commit do not officially exist and will not be seen by anyone else until you commit them.

You can have as many working trees (private copies connected to the master copy) as you want. Often you will have only one, but circumstances may arise in which it is more convenient to have two, or to make temporary ones, or whatever. Just remember that they are all independent: while they're all connected back to the master copy, they are not connected to each other except through the master copy.
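
The model above can be tried out safely with a short session. This is only a sketch: it assumes svn and svnadmin are installed, and it uses a throwaway local repository (created with svnadmin create and reached through a file:// URL) instead of your course repository. The directory names alice and bob and the file hello.txt are made up for the example.

```shell
set -eu
WORK=$(mktemp -d)                      # scratch area; delete it when done
svnadmin create "$WORK/repo"           # throwaway repository (assumption: not the course one)
REPO="file://$WORK/repo"
svn mkdir -q -m "Create trunk" "$REPO/trunk"

# Two independent working trees connected to the same master copy.
svn checkout -q "$REPO/trunk" "$WORK/alice"
svn checkout -q "$REPO/trunk" "$WORK/bob"

# A change in one tree does not officially exist until it is committed...
echo "hello" > "$WORK/alice/hello.txt"
svn add -q "$WORK/alice/hello.txt"
svn commit -q -m "Add hello.txt" "$WORK/alice"

# ...and the other tree still does not see it until it is explicitly updated.
BEFORE=$(ls "$WORK/bob")               # empty: hello.txt has not arrived yet
svn update -q "$WORK/bob"
ls "$WORK/bob"                         # hello.txt is listed now
```

Note that the two working trees never talk to each other directly; the file only travels from one to the other through the repository.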

Repository

The official master copy is known as the "Subversion repository". Subversion provides a variety of ways of accessing the repository. In this course, we are providing repository access using Subversion over HTTP, meaning that your Subversion development repository URL will look something like https://tashi/svn/<username>, where <username> is your department username. You should not have to muck with the repository by hand, and in fact you cannot for this class. If you really bork something, please let us know and we can try to help you.

Merging

Unlike older source control systems, Subversion does not lock files for modification. Instead, Subversion works from a model where everybody edits freely and changes to the same file are merged. (Subversion does offer optional locking through svn lock, but its normal mode of operation does not use it.)

In the merge-based model, anyone can edit any file at any time. This is both an advantage and a disadvantage: if two people have small unrelated changes to make in the same file, they can do so without any difficulty. On the other hand, if two people make sweeping changes to the same file at once, the resulting merge becomes a nightmare.

(Another major advantage of merging is that when there are no locks, nobody can hold up development by leaving on vacation while holding a lock on a critical file.)

The way Subversion merging works is as follows: when you check out a working tree, Subversion remembers a global version number describing the tree you got. When you update your working tree, it updates the global version number. When you go to commit, if any file you are committing has changed in the repository since the version your changes are based on, Subversion aborts and tells you to update first.

Then, when you update, Subversion notices that you have changed your copy and the master copy has changed as well. It then tries to merge the two sets of changes. If the changes are to unrelated areas of the file, this usually succeeds. If the changes overlap, or the merge program gets confused, Subversion will ask you what to do. Unless you want to accept the repository's changes wholesale ('tf', or "theirs-full") or keep your own changes wholesale ('mf', or "mine-full"), the safest thing to do is to choose 'p', or "postpone". This marks the affected files as "conflicted" and inserts conflict markers into them. Viewed with svn diff, the result looks like this:

@@ -1 +1,5 @@
  int foo(void) {
+<<<<<<< .mine
+   bar();
+=======
+   baz();
+>>>>>>> .r32

This means that your copy of foo.c changed function foo to call bar, but that the official master copy, in revision 32, changed foo in the same place so it calls baz.

When you get merge conflicts like this, you need to resolve them before committing your new versions. You might pick your version, or the latest version from the repository, or some combination of the two, or whatever. When doing this, it's up to you to make sure you do the right thing. Once you are finished you need to run the svn resolved command to indicate to Subversion that you have successfully dealt with the conflicts and whatever is left is ready to be checked in to the repository.

Some notes:

Even when there are conflicts, the conflict blocks do not necessarily reflect all the changes associated with the merge. Some may have merged successfully. If in doubt, look at diffs.
It is also not always the case that everything that may be involved in resolving a conflict correctly is contained within the conflict block delimiters. The merge program is only a program, not an omnipotent human being.
While the merge system is reasonably robust, once in a while it makes a mistake, particularly if some but not all of the changes merged. It's prudent to look at diffs after an automatic merge, just in case.
Merging is painful. Merging a big change is a lot more painful than merging the same amount of change a bit at a time. Update early and often. Commit early and often.

If you are planning to make huge changes to a file, like reordering all the functions or moving large blocks of code into if clauses (which changes the indent, making Subversion think everything changed), it's a good idea to coordinate manually with anyone else who might have pending changes to the file.
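
The conflict workflow described in this section can be rehearsed end to end in a scratch area. The following is a sketch under stated assumptions: svn and svnadmin are installed, the repository is a throwaway local one rather than your course repository, and the working tree names mine and theirs are invented for the example. It uses --accept postpone, the non-interactive equivalent of answering 'p', and svn resolve --accept working, which is the newer spelling of svn resolved.

```shell
set -eu
WORK=$(mktemp -d)
svnadmin create "$WORK/repo"           # throwaway repository (assumption)
REPO="file://$WORK/repo"
svn mkdir -q -m "Create trunk" "$REPO/trunk"
svn checkout -q "$REPO/trunk" "$WORK/mine"
printf 'int foo(void) {\n   return 0;\n}\n' > "$WORK/mine/foo.c"
svn add -q "$WORK/mine/foo.c"
svn commit -q -m "Add foo.c" "$WORK/mine"
svn checkout -q "$REPO/trunk" "$WORK/theirs"

# Both trees change the same line; "theirs" gets its commit in first.
printf 'int foo(void) {\n   baz();\n}\n' > "$WORK/theirs/foo.c"
svn commit -q -m "foo calls baz" "$WORK/theirs"
printf 'int foo(void) {\n   bar();\n}\n' > "$WORK/mine/foo.c"

# The commit is refused because foo.c is out of date.
if ! svn commit -q -m "foo calls bar" "$WORK/mine" 2>/dev/null; then
    echo "commit refused; updating first"
fi

# Update, postponing the conflict instead of answering a prompt.
svn update --non-interactive --accept postpone "$WORK/mine"
svn status "$WORK/mine" | grep '^C'    # foo.c is flagged as conflicted
cat "$WORK/mine/foo.c"                 # shows the <<<<<<< ======= >>>>>>> markers

# Resolve by hand (here: keep both calls), tell Subversion, then commit.
printf 'int foo(void) {\n   bar();\n   baz();\n}\n' > "$WORK/mine/foo.c"
svn resolve -q --accept working "$WORK/mine/foo.c"
svn commit -q -m "foo calls bar and baz (merged by hand)" "$WORK/mine"
```

Keeping both calls is just one possible resolution; the point is that the decision is yours, and Subversion only needs to be told when you have made it.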

Log Messages

When you commit changes to the Subversion repository, Subversion gives you the opportunity to provide a message explaining the change. These messages get saved in the repository and can be reviewed later using svn log. This can be quite useful when trying to reconstruct the thinking that led to some piece of code you wrote months previously.

These messages can also be logged centrally or mailed out to the people working on the project. It is possible to set up your Subversion repository to mail commit messages to you and your partner. (See below.) While the volume of mail thus generated can be irritating, there's no better way to stay in touch with what's going on.

The commit message should thus describe (briefly) what you did and why. There's no need to report the exact changes, as they can be retrieved using svn diff.

When to commit

The general rule for commits is that any change should be committed as soon as you're reasonably certain that it's correct and appropriate in the long term, subject to the proviso that committing many small changes in quick succession will probably annoy everyone working with you.

Remember that your partner won't see anything you don't commit, so get bug fixes in quickly and hold back a little on new features that might still have problems.

Ideally you and your partner should keep track of which tests you expect to work at any particular time, and before committing check to make sure that they all still do work.

In most cases, one should try to avoid committing changes that cause the program to stop working properly (or even stop compiling at all). This rule can sometimes be profitably bent when you know your partner will not be affected by the errors introduced.

Tags

While Subversion uses global version numbers, unlike the per-file version numbers used by tools like CVS, retrieving your codebase via a global version number is rarely useful. Instead, Subversion supports a concept known as a "tag". A tag is a symbolic name (like asst4-debugged) that you attach to a particular version of some set of files. You can then refer to that version of those files with the name.

Indeed, tags fit seamlessly into Subversion's view of the world. Unlike other version control systems where a tag is something different, in Subversion tagging is accomplished simply by creating a copy of your current development directory in another part of your versioned tree.

To introduce some degree of consistency, most Subversion users divide their repositories at the top level into three areas: trunk, which contains the canonical copy of the code; tags, which contains copies of trees saved for the purposes of tagging; and branches, which contains copies of trees used by individual developers as branches off of the main development line. The last is an advanced feature that we will not fully exploit in this course. Given the canonical Subversion repository structure, creating a tag or branch means simply copying trunk to another part of the repository.

A tag is normally used to identify a single consistent version of an entire project. For instance, the directions for each assignment (after assignment 0) tell you to create one tag before starting and another tag after you're done. These tags then identify the versions of all your files that were current before and after you did the assignment. This lets you, for example, ask Subversion to show you all the changes in the entire system between those two points.

See below for specific directions for manipulating tags with Subversion.

Branches

Sometimes you might have more than one "line of development" in your program. For instance, when you ship release 1.0 to customers, you might have one team working on release 2.0, and another team making minor bug fixes to the release 1.0 code for release 1.01.

In this case, most changes made for release 2.0 should not be incorporated into release 1.01, and while many fixes made for release 1.01 should be incorporated into release 2.0, some probably shouldn't be.

This sort of situation is handled using "branches". Each branch is a (mostly) separate line of development, diverging from some common ancestor version. (This divergence is where the term "branch" arose.)

While Subversion has good branch support, using branches is beyond the scope of this course.

Use Subversion Effectively

Subversion is a tool, not a panacea. It helps you organize and maintain a project, but it doesn't do it by itself. It requires that you use it in a manner that makes it useful.

In order for the repository to be a useful tool for keeping track of what is really part of the project and what is not, you have to actively maintain the set of files Subversion knows about. Don't add or commit temporary files, editor backups, object files, and the like to the Subversion repository. If you have files that are complicating your development process that you do not want to commit, explore the svn:ignore property that can be set on any versioned Subversion directory. Do remove files you're not using any more. (Even after telling Subversion to remove them, you can still get them back later, because removing them is just a change that Subversion tracks.)
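
As a sketch of how svn:ignore is set, the session below ignores a scratch file pattern and a build directory. It assumes svn and svnadmin are installed and uses a throwaway local repository; the names notes.tmp and build are hypothetical stand-ins for whatever clutter your own development process produces.

```shell
set -eu
WORK=$(mktemp -d)
svnadmin create "$WORK/repo"           # throwaway repository (assumption)
svn checkout -q "file://$WORK/repo" "$WORK/wc"
cd "$WORK/wc"

touch notes.tmp                        # hypothetical scratch file
mkdir build                            # hypothetical build output directory
svn status                             # both are reported with '?'

# svn:ignore holds one pattern per line and applies to this directory only.
svn propset -q svn:ignore "$(printf '*.tmp\nbuild')" .
svn status                             # only the property change on '.' is left
svn commit -q -m "Ignore scratch files and the build directory"
svn status                             # no output
```

Because the property is itself versioned, committing it means your partner's working tree picks up the same ignore list on the next update.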

In order for the version history to be useful, you have to add tags at important points in development, like releases. You also have to write at least minimally useful commit messages so you can look at them later and be reminded of the circumstances.

In order for the merging features to be useful, you have to avoid making sweeping changes without warning your partner, you have to update and commit regularly but not insanely often, and you have to take the trouble to merge correctly by hand when conflicts occur.

If you don't do these things, you will eventually end up in a hole, and Subversion will not save you from yourself.

How do I...

The previous section explained Subversion concepts in general terms. In this section we explain how to do various useful things.

How do I make a new repository?

Don't bother with this. We will create repositories for you.

How do I make a new project in a repository?

There are two ways to add code to a Subversion repository. One is to use svn add in a working directory to add files and directories one at a time. The other is to use svn import to do a bulk import of a whole unversioned existing source tree into the repository. The next section describes using svn import; svn add is described further below.

How do I import existing code into a repository?

Unpack the source tree you're going to import in a temporary directory.

  % mkdir ~/tmp
  % cd ~/tmp
  % tar -xvzf ~/somewhere/os161-1.11-du.tar.gz

Now run svn import. You need to provide three things: the code to import, the place to import into, and a commit message.

To specify the place to import into, you provide a Subversion URL specifying the repository and the location within it where you want the contents of the imported directory to appear. For cs372, you want the OS/161 distribution to appear in the trunk at the top level of the repository, so you would specify https://tashi/svn/<username>/trunk.

The log message is entered along with your import. You can specify it on the command line with the -m "<message>" argument; if you do not, an editor will be started where you can enter it interactively.

  % svn import os161-1.11 https://tashi/svn/<username>/trunk/ -m "Initial import of os161"
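
If you would like to see what an import does before touching your real repository, the following sketch performs one against a throwaway local repository. The assumptions are that svn and svnadmin are installed and that a tiny placeholder tree (one made-up file, kern/main.c) stands in for the unpacked OS/161 distribution.

```shell
set -eu
WORK=$(mktemp -d)
svnadmin create "$WORK/repo"           # throwaway repository (assumption)
REPO="file://$WORK/repo"

# Placeholder for the unpacked source tree.
mkdir -p "$WORK/tmp/os161-1.11/kern"
echo "/* placeholder */" > "$WORK/tmp/os161-1.11/kern/main.c"

cd "$WORK/tmp"
svn import -q os161-1.11 "$REPO/trunk" -m "Initial import of os161"

# The contents of os161-1.11 (not the directory itself) now sit under trunk.
svn list -R "$REPO/trunk"
```

The directory you imported from is not turned into a working tree by svn import. To start working on the code, check out a fresh copy as described under "How do I check out a working tree?".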

Checking file status

With Subversion, unlike CVS, you can examine the status of your complete working tree or a portion of it using the svn status command. This is useful to do before committing, and at any point during the development cycle. It can also help you spot files that are missing from the repository, either because they need to be added or because you need to tell Subversion to ignore them using the svn:ignore property.

Adding files and directories to a repository

You can add files or directories to a repository using svn add. To add a directory into an existing repository, create the directory in the appropriate place in a checked out tree and then ask Subversion to add the directory:

  % svn add dir

To add files to an existing directory, create the new files in the appropriate directory in a checked out Subversion tree, and then ask Subversion to add them to the repository:

  % svn add newfile

As with directories, these files will not be added to the repository until you use Subversion to commit your changes.
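
The add, status, commit cycle can be sketched as follows. This assumes svn and svnadmin are installed and uses a throwaway local repository; dir and newfile.c are just example names.

```shell
set -eu
WORK=$(mktemp -d)
svnadmin create "$WORK/repo"           # throwaway repository (assumption)
svn checkout -q "file://$WORK/repo" "$WORK/wc"
cd "$WORK/wc"

mkdir dir
echo "int x;" > dir/newfile.c
svn status            # '?' next to dir: Subversion does not know about it yet
svn add -q dir        # schedules dir and everything inside it for addition
svn status            # 'A' next to dir and dir/newfile.c
svn commit -q -m "Add dir and newfile.c"
svn status            # no output: nothing is left to commit
```

Running svn status before each commit, as here, is a cheap way to catch files you forgot to add.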

How do I check out a working tree?

Use the svn checkout command with the URL of the tree you want to check out (normally the trunk directory at the top level of your repository), followed by the name of the directory to create:

  % svn checkout <repository path> src

For instance:

  % svn checkout https://tashi/svn/<username>/trunk src

This will create a directory called src that holds a working tree.

To check out a particular tag or branch, you need to change the URL that you check out from. It's that simple. For example, if for some reason you wanted to check out (or export, see below) the tag that you checked in for the beginning of ASST1, you would simply run:

  % svn co https://tashi/svn/<username>/tags/asst1-begin <target directory>

If you want to check out the trunk, a branch, or even a tag at a particular global revision number, explore the -r option to svn co. This can be useful if you are trying to roll back to a known good version. However, if you roll back your working copy to an earlier version, Subversion will remember that you did this, and to return to the latest version you will need to update again, for example with svn update -r HEAD.

How do I update my working tree?

Use svn update. You can update whole directory trees or individual files. It's your responsibility, if you don't update everything at once, to make sure the resulting working tree you have is self-consistent.

If you don't specify what to update, Subversion updates the current directory (and any subdirectories). As with svn checkout, you can specify particular versions or dates with -r.

When you update, SVN prints one line for each file it processes, with a letter in front of it reflecting the file's status. These letters are:

A Added
B Broken lock (third column only)
D Deleted
U Updated
C Conflicted
G Merged
E Existed

Subversion can be made to shut up about unknown files whose presence is routine by adding them to the svn:ignore property, which can be attached to any versioned directory. See the online Subversion documentation for more details.

If you want to retrieve status information, use svn status, not svn update, since the latter will update things below you that you might not want to update. Yet.

How do I commit my changes?

Use svn commit. You can commit directories or individual files. You can use the -m option to supply a commit message on the command line; if you don't, Subversion will start an editor in which you can write one. As with most commands, if you do not specify anything to commit explicitly, Subversion commits all changes in the current directory and all subdirectories.

  % svn commit foo.c

  % svn commit src/kern

Remember that changes that have not been committed, including added and removed files, will not be seen by other developers.

How do I add files?

Use svn add and supply the filenames. The files will then show up with the "A" code on subsequent calls to svn status until they are committed. Remember, they will not be seen by other developers until they are committed.

How do I remove files?

When you wish to remove a file or directory from the tree, run svn rm. Do not delete it first. Subversion will do this for you. Commit the file (or its directory) after removing it.

It's a good idea to compile the project after removing but before committing, just to make sure you aren't breaking things.

Files that have been deleted are still kept around by SVN; while they'll be removed from people's working trees by default, you can still look at them, and you can bring them back again later if needed.

How do I rename files?

Run svn mv, then commit. This is another (beyond the ability to remove directories) Subversion improvement over CVS. Subversion implements the move as a copy of the old name followed by a delete, so the file's revision history follows it to the new name.

How do I add a directory?

Just like a file.

How do I remove a directory?

Again, just like a file.

How do I create a tag?

Quite simple, using svn copy. For example, assuming you have the trunk, tags, branches structure described above and you want to create the tag FOO:

  % svn copy https://tashi/svn/<username>/trunk https://tashi/svn/<username>/tags/FOO

Easy, right? Because both arguments are repository URLs, the copy happens entirely in the repository and is committed immediately, so the tag is the latest committed version of trunk. Supply a log message with -m, or Subversion will start an editor for one. (As always, uncommitted changes in your working tree will not be included.)
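
To make the before-and-after use of tags concrete, here is a sketch that tags trunk twice and then asks for every change between the two tags. It assumes svn and svnadmin are installed and uses a throwaway local repository; the tag name asst1-end and the file file.txt are invented for the example.

```shell
set -eu
WORK=$(mktemp -d)
svnadmin create "$WORK/repo"           # throwaway repository (assumption)
REPO="file://$WORK/repo"
svn mkdir -q -m "Create layout" "$REPO/trunk" "$REPO/tags"
svn checkout -q "$REPO/trunk" "$WORK/src"
cd "$WORK/src"

echo "version 1" > file.txt
svn add -q file.txt
svn commit -q -m "Starting point"
svn copy -q -m "Tag start of assignment" "$REPO/trunk" "$REPO/tags/asst1-begin"

echo "version 2, finished" > file.txt
svn commit -q -m "Finish assignment"
svn copy -q -m "Tag end of assignment" "$REPO/trunk" "$REPO/tags/asst1-end"

# Everything that changed between the two tagged points.
svn diff "$REPO/tags/asst1-begin" "$REPO/tags/asst1-end"
```

Each svn copy here is a commit in its own right, which is why it takes a log message and why nothing in the working tree needs to be committed afterwards.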

How do I export a release?

Use svn export. This is almost identical to svn checkout, except instead of creating a "subscription" to the Subversion repository, it extracts a snapshot, with no Subversion control/management files. svn export takes the same -r options as svn checkout. Always do svn export in a temporary directory. Doing it over your working directory makes a mess.
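
The difference between a checkout and an export is easy to see side by side. This sketch assumes svn and svnadmin are installed and uses a throwaway local repository seeded with one placeholder file; the directory names src and release are examples only.

```shell
set -eu
WORK=$(mktemp -d)
svnadmin create "$WORK/repo"           # throwaway repository (assumption)
REPO="file://$WORK/repo"
mkdir -p "$WORK/seed/kern"
echo "/* placeholder */" > "$WORK/seed/kern/main.c"
svn import -q "$WORK/seed" "$REPO/trunk" -m "Seed trunk"

svn checkout -q "$REPO/trunk" "$WORK/src"       # working tree
svn export -q "$REPO/trunk" "$WORK/release"     # plain snapshot

ls -A "$WORK/src"         # includes the .svn control directory
ls -A "$WORK/release"     # source files only
```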

How do I set up commit messages to be mailed out?

We may do this, if people are interested.

How do I make diffs?

Use svn diff. Specify the files or directory trees you wish to compare; if you do not specify anything, by default the current directory and all subdirectories are diffed.

By default your working tree is diffed against the version in the repository to which it was last updated. You can diff against a specific repository version by providing an -r option (as described above under checkout), or against a tag by providing the repository URL. You can diff two specific versions by providing two such options.

If you want to see the latest commits that you haven't updated yet in your working directory, use -rHEAD as one of the arguments.

You can also provide most of the normal diff format options after an -x option. The most commonly used format is the "-u" format. The "-w" option causes diff to ignore changes in spacing and is thus also useful. The "-N" option includes the contents of new files in the diffs, instead of just a note that such files are new. See the diff man page for more information. When preparing diffs for cs372, please use the "-uw" options.
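
The effect of "-uw" is easiest to see on a file where one line changed only in spacing and another changed for real. The sketch below assumes svn and svnadmin are installed and uses a throwaway local repository; a.c is a made-up file.

```shell
set -eu
WORK=$(mktemp -d)
svnadmin create "$WORK/repo"           # throwaway repository (assumption)
svn checkout -q "file://$WORK/repo" "$WORK/wc"
cd "$WORK/wc"
printf 'int x;\nint y;\n' > a.c
svn add -q a.c
svn commit -q -m "Add a.c"

# One spacing-only change (line 1) and one real change (line 2).
printf 'int   x;\nint z;\n' > a.c

svn diff              # reports both lines as changed
svn diff -x -uw       # reports only the change from y to z
```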

How do I find out where a particular line of code appeared?

The svn blame command prints each line of the file with a prefix containing the version number in which the line appeared, the date of that version, and the username of the person who committed it. You can use the -r options, or a tag or branch URL to retrieve the file as it existed in any previous version.

This can be used in conjunction with svn diff to track down the history of individual lines of code, as long as they haven't moved around very much.

How do I back out a bad commit?

It's late at night and you foolishly/accidentally commit some immensely stupid change that breaks everything. (We've all been there; if you haven't yet, you will eventually.)

All is not lost. Part of the role of Subversion is to keep track of old versions; you can extract the old version and re-commit it, or you can tell Subversion to unmerge the change. Suppose you determine that revision 342 is the commit that broke things. You can back that change out of your working tree using the following command:

  % svn merge -c -342 https://tashi/svn/<username>/trunk/

The negative revision number tells Subversion to merge the change made by revision 342 into your working tree in the reverse direction, which undoes it. If this causes merge conflicts, you should resolve them in the usual way. Nothing changes in the repository until you commit the result, which creates a new revision. You can also do this to "back up" to a tag:

  % svn merge https://tashi/svn/<username>/tags/FOO
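
Backing out a single bad revision with a reverse merge can be rehearsed as follows. This is a sketch that assumes svn and svnadmin are installed and uses a throwaway local repository, where the revision numbers are predictable: revision 1 creates trunk, revision 2 adds the made-up file f.txt, and revision 3 is the bad commit.

```shell
set -eu
WORK=$(mktemp -d)
svnadmin create "$WORK/repo"           # throwaway repository (assumption)
REPO="file://$WORK/repo"
svn mkdir -q -m "Create trunk" "$REPO/trunk"        # revision 1
svn checkout -q "$REPO/trunk" "$WORK/src"
cd "$WORK/src"

echo "good" > f.txt
svn add -q f.txt
svn commit -q -m "Add f.txt"                        # revision 2
echo "bad change" >> f.txt
svn commit -q -m "Late-night mistake"               # revision 3

# Bring the whole tree to one revision, then apply revision 3 in reverse.
svn update -q
svn merge -c -3 "$REPO/trunk" .
svn diff                                            # shows the bad line being removed
svn commit -q -m "Back out revision 3"              # revision 4
```

Revision 3 stays in the history; the fix is a new revision that undoes it, so the mistake and its repair can both be examined later.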

How do I move my working directory?

In general, you can just rename the entire tree with mv. The Subversion control files in the working directory are position-independent.

For more information

To get a list of Subversion commands and options, use the help page for Subversion (type svn help), for a particular command (type svn help <command>) or look at the online Subversion documentation.

Computer Science 372 Operating Systems
