Bazaar

Bazaar vs Git

This page is obsolete!

See Ten reasons to switch to Bazaar instead.

This is not intended to be an unbiased comparison, but a list of some of the reasons for selecting Bazaar over Git, as seen through the eyes of Bazaar developers and users. YMMV. Git is a great tool, but its design differs from Bazaar's in many ways.

Overview

Distributed version control systems have compelling advantages, but selecting the best one for you can be far from easy. Many communities and teams have Bazaar and Git on their shortlist, and are investigating both in detail. This document aims to provide information to assist those groups in evaluating the tools and making an informed decision.

Git is an innovation in its category, and is steadily gaining in popularity. It was designed for the needs of the Linux kernel community, to match the processes applicable there and the brilliance of the people executing them. Bazaar, on the other hand, was designed to be suitable for a wide range of people, workflows and environments. In Git the features were the primary focus, whereas in Bazaar the clean user interface and command set were given careful attention. For this reason Bazaar will work out of the box for a much wider audience than Git, and may well be the right choice for your community or team. In addition, in contrast to Git, Bazaar is platform independent and will work without problems in mixed development environments, anywhere Python is available.

This document compares Bazaar with Git 1.5.x. Be aware that generalizations (like "Bazaar is too slow" and "Git is unusable") simply aren't true these days, as both tools are evolving rapidly. That said, here are some common reasons for selecting Bazaar over Git:

  • Windows Support - reaches 85% of computer users
  • Less attitude - direct support for more workflows
  • It Just Works - the UI is much simpler
  • Better storage model - shared repositories
  • Robust renaming - collaborate without fear
  • Better asynchronous sharing - intelligent bundles instead of patch sets
  • Plug-in architecture - solutions matter, not products
  • Launchpad integration - free code hosting, branch registries and more
  • Integration with related tools
  • Commercial training and support

Git had some early advantages including:

  • Speed
  • Storage efficiency

Cryptographic validation of content is also sometimes cited as a feature of Git over Bazaar when that's not really accurate.

Each of these topics is discussed in further detail below.

Reasons for selecting Bazaar over Git

Windows Support

Windows occupies 85% of the operating system market. While it is possible to run Git on top of Windows using the Cygwin layer, that would bring a mixed operating system environment into the picture for regular Windows users. Cygwin is a POSIX environment where handling of case insensitivity and filename globbing are different from what Windows users expect — not to mention the commands.

Git currently has beta-level native Windows support, code-named MSysGit. However the complete command set is not yet supported, many open problems remain, and the installer package is available only as a third-party project.

Bazaar, on the other hand, includes a native Windows port and installer. The port feels like a regular Windows program, and the installer includes the graphical front end for Windows, TortoiseBzr. Olive is available as a graphical front end for Linux.

Both Bazaar and Git are slower on Windows than on Linux, due to underlying file system differences.

Less attitude — direct support for more workflows

There's more than one way to collaborate. The best way depends on a whole range of factors. Your team or community will have one primary workflow model (and the one encouraged by Git is a very good one), but different groups within your community will undoubtedly mix and match as circumstances dictate. Bazaar's UI directly supports a larger range of workflows than Git's.

Take for example a centralized SVN-style workflow, which is well supported in Bazaar's command set. In Bazaar, it is possible to commit directly to the central server. In Git this takes two actions: a local commit, followed by a push to the remote host. Similarly, Bazaar supports SVN-style checkouts, whereas in Git you may have to download the whole, possibly large, repository (see the restrictions of the --depth option in git-clone). In principle any workflow is possible in Git, but the actions are usually more demanding and less intuitive than in Bazaar.

It Just Works

What's the primary reason given by ex-Git users who now use Bazaar instead? Bazaar Just Works and makes it much harder than Git to shoot yourself in the foot.

As Karl Fogel explains in Subversion, distributed version control and the future, users want tools that stay out of their way. Bazaar generally achieves this better than Git. Git's strength is a simple core data storage model, but it has weaknesses on the UI level. Git quickly gets complicated with concepts like staging area, dangling objects, detached heads, plumbing vs porcelain, and reflogs — concepts that you don’t need to know at all in Bazaar.

The key differences between Bazaar's and Git's UI are:

  • Directories are branches. In Git they are branch containers where you switch to different views.

  • Empty directories cannot be versioned in Git.
  • Within a branch, changes directly made are emphasized over changes from a merge.
  • Revision numbers are simple: r1, r2 etc. In Git they are SHA-1 hashes, written as 40-character hexadecimal strings.

  • Git’s automatic merge and commit may create problems.
  • Git has over 150 different commands. The UI is not consistent across these commands, and the GNU --long option convention is not uniformly supported.

  • Bazaar uses familiar commands known to Subversion and CVS users. Git introduces a whole new vocabulary: for example, committing to the repository works very differently in Git.

Taken individually, each of these may not be a big deal, particularly if your team is full of really smart people who enjoy squeezing every bit of power out of their tool set. Just remember though that 95% of programmers are still using central VCS tools and get by with 5-10 SCM commands/actions: checkout, update, commit, status, diff, log, add, delete. People usually don't care about the tool - they care about getting their code written and integrated. (In fact, with modern IDEs, they rarely ever see the VCS beyond the configuration setting used to configure the branch associated with a “project”.) Bazaar has a clean UI model resulting in a lower learning curve and more potential contributors to your community. As the open source world continues to grow in its appeal to non-programmers, usability is not something to dismiss lightly. From a Git perspective, DVCS looks much more complex than it needs to be.

Better storage model

Bazaar can efficiently share revisions between branches through shared repositories. These are completely optional — a standalone tree has its own repository by default. But if a parent directory has been configured as a shared repository, the revisions are stored and shared there.

For efficiency, this is actually far more flexible and powerful than Git's default model. For example, the one repository can be used on a developer workstation for storing revisions from:

  • release branches
  • a tracking branch of the main development trunk
  • topic branches for each fix or feature currently being worked on
  • temporary QA or integration branches for reviewing other changes.

Some topic branches might be pushed to a publicly accessible location (like ~/public_html), while others might be experimental only. With Bazaar, it doesn't matter — they can all efficiently use the one repository without worrying about others accidentally pulling junk they don't need. For users on laptops in particular, this can be a big deal.
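
As a rough illustration of why this saves space, the following Python sketch models a shared repository as a single revision store that several branches point into. The class and method names (SharedRepository, Branch, branch_to) are hypothetical and are not Bazaar's API; the sketch only assumes that a branch is a list of revision ids and that revision data is stored once per repository.

```python
"""Toy model of a shared repository: many branches, one revision store."""


class SharedRepository:
    """Holds each revision's data once, keyed by revision id."""

    def __init__(self):
        self.revisions = {}

    def add(self, rev_id, data):
        # Adding an id that is already stored is a no-op, so branches
        # with common history never duplicate it.
        self.revisions.setdefault(rev_id, data)


class Branch:
    """A branch is an ordered list of revision ids plus a repository."""

    def __init__(self, name, repository):
        self.name = name
        self.repository = repository
        self.history = []

    def commit(self, rev_id, data):
        self.repository.add(rev_id, data)
        self.history.append(rev_id)

    def branch_to(self, name, repository=None):
        # Revision data is physically copied only when the new branch
        # lives in a different repository.
        if repository is None:
            repository = self.repository
        target = Branch(name, repository)
        for rev_id in self.history:
            repository.add(rev_id, self.repository.revisions[rev_id])
            target.history.append(rev_id)
        return target


repo = SharedRepository()
trunk = Branch("trunk", repo)
trunk.commit("rev-1", "initial import")
trunk.commit("rev-2", "add parser")

fix = trunk.branch_to("fix-123")            # same shared repository
fix.commit("rev-3", "fix bug 123")

standalone = trunk.branch_to("standalone", SharedRepository())

shared_count = len(repo.revisions)                       # trunk + fix together
standalone_count = len(standalone.repository.revisions)  # a full private copy
print(shared_count, standalone_count)
```

In this model the trunk and the topic branch together cost three stored revisions, while the standalone branch needs its own copy of everything it inherits.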

Bazaar’s approach has value well beyond the single user case. In commercial environments, different teams might have their access controlled to various branches (e.g. development vs QA vs maintenance) on a server but, once again, these branches can be using a shared repository for efficient storage. It's also clearly beneficial for hosting sites like Launchpad.

Git can achieve the same functionality with the semi-official git-new-workdir extension, but this tool is not part of the standard command set and has many usability and portability problems. Similar efficiency gains can be achieved using the "alternates" mechanism native to Git, but it has no workflow impact (branches still stay local to a repository, only the object database is shared in one direction).

Easier administration

In a central VCS deployment, it's not uncommon to have staff dedicated to administering the SCM servers and repositories. Bazaar and Git make creating repositories trivial, but the ongoing cost of ownership shouldn't be ignored. Here are some examples.

As well as the UI advantages, Bazaar's directory-is-a-branch model has administration advantages. Security can be applied to different branches by using existing operating system access control facilities. Bazaar's plug-in architecture also helps with cost of ownership: plugins are typically easier to upgrade and share than enhancements made outside such a framework.

In a commercial environment, one of the arguments against adopting DVCS tools is that central VCS tools encourage daily check-ins which fits in well with the daily backup cycle. That's the wrong trade-off — it leads to fragile code being checked in, trunk quality dropping, other developers grabbing those changes before they are ready, and lost time all around. That's bad enough for a centrally located in-house team. For distributed teams and open source communities, it’s even worse. The right solution for backing up the work of developers using a DVCS is a central backup server that developers can push changes to daily. Bazaar’s shared repositories can be useful on that backup server. In particular, while Bazaar users with laptops are in the office, they can alternatively bind branches to ones on that server if they want backups to happen implicitly.

Both Bazaar and Git have tools for detecting junk and inconsistencies in repositories. Bazaar also has an upgrade tool for switching between file formats and a reconfigure tool for changing how a branch is configured, e.g. lightweight vs heavyweight, standalone vs shared repository.

Robust renaming

Git prides itself on being a “content manager” and deriving what got renamed using heuristics. This mostly works, but breaks under certain merge conditions. If you want your team or community to collaborate without fear of breaking merges, Bazaar's robust renaming is essential as explained by Mark Shuttleworth’s article on this topic.

See http://thread.gmane.org/gmane.comp.version-control.git/94861 for an example of renames breaking in git.

Better asynchronous sharing

When changes in a branch are ready for sharing and you wish to share asynchronously (e.g. via email instead of advertising a public branch), Bazaar handles this better than Git. The recommended way to do this in Git is via the format-patch command which generates a set of normal patches which can be applied with its am command. Bazaar implements this via the send command, which generates an intelligent patch known as a "merge directive". In addition to a preview of the overall change, a merge directive includes metadata like renames, the base revision (common ancestor) of a submit branch, and digital signatures. Consistent with the way branches are used, Bazaar's merge and pull commands are used to apply a merge directive to another branch. If the changes need to be applied to code managed outside Bazaar, simply feed the merge directive to GNU patch (and the merge preview will be processed).

Plugin architecture

A VCS tool is only part of the broader Collaborative Development Environment required by communities and teams. It needs to be integrated with heaps of other tools and the overall solution needs to be manageable over time. A plugin architecture has many advantages including reduced Total Cost of Ownership.

Bazaar has a good architecture internally and a rich public API available for integrating other tools. In contrast, Git takes the "toolkit" approach - great for prototyping but not so good over time. Git is currently working towards a public API so this advantage of Bazaar's may diminish over time.

Launchpad integration

Launchpad is a free service provided by Canonical, designed to make it easier for open source communities to collaborate both internally and with other open source communities. Features include free hosting of Bazaar branches, a registry of branches for projects, and linkage of branches to the bugs and blueprints they address. A huge amount of open source software is available as Bazaar branches within Launchpad. For a detailed look at how Bazaar and Launchpad can be used together and why good integration matters, see our tutorial on Using Bazaar with Launchpad.

Hosting is also available for git, including gitorious and github.

Integration with related tools

Bazaar has been integrated with some useful related tools in ways that might help your team. These tools include Patch Queue Manager (PQM), Bundle Buggy (BB) and various bug trackers.

Continuous Integration (CI) is a best practice adopted by many open source projects and commercial teams using Agile/Lean development methodologies like XP and Scrum. The objective is to always have a shippable code base by detecting and correcting breakage early. Here are the core CI practices as given by chapter 2 of Paul Duvall's book:

  • commit code frequently - commit centrally at least once per day
  • don't commit broken code
  • fix broken builds immediately
  • write automated developer tests
  • all tests and inspections must pass
  • run private builds
  • avoid getting broken code.

A better workflow for achieving this is called Decentralized with automated gatekeeper as explained in http://bazaar-vcs.org/Workflows. PQM is a software gatekeeper which ensures that the mainline of the development branch never breaks. It is maintained by a member of the Bazaar team and well integrated with Bazaar.

Bundle Buggy (BB) is a tool for tracking peer reviews. It is maintained by a member of the Bazaar team and well integrated with Bazaar. See http://bundlebuggy.aaronbentley.com/help.

Version control repositories become an important source of history over time. To maximize the data mining potential of this "code data warehouse", linking changes to external data sources like bug trackers is very important. Bazaar has out of the box integration with Bugzilla, Trac and Launchpad's bug tracker. Integrating other bug trackers is easy to do via either configuration settings and/or plug-ins.

Commercial training and support

In addition to support from an active open source community, Bazaar professional services for commercial organizations are available from Canonical. If you require conversion, training, support or custom engineering for Bazaar, please contact Canonical to discuss your needs.

Git's Advantages

Speed

We want Bazaar to be the most usable tool around and performance is an important part, but certainly not the only part, of achieving that.

It's true that Git is really fast at many operations and that Bazaar was once quite slow. We have made significant progress, and our comprehensive performance benchmarks show that bzr is fast enough for a large spectrum of projects when working on the working tree. It is still a bit slow for big histories (more than 10,000 changesets), but we are currently working on it (and have already made good progress for commands like log, blame and diff for a given set of revisions since bzr 1.3). Git is also likely to be faster than Bazaar for network operations, but we expect that difference to disappear soon.

Storage efficiency

Git has a strong reputation for efficiently storing data and claims on its home page that it tops every other open source VCS in this area. Since Bazaar changed its default format to packs in 1.0, our benchmarks indicate that Bazaar is around 15% better on average for the initial commit. Across 33 open source projects, Bazaar was more efficient on every one.

However, more thorough testing on repositories with long histories is needed to confirm storage efficiency. There are some indications that Git is more efficient in repositories with long histories (see example). Git repositories can be packed very efficiently with clever tweaking. Subsequent revisions added to a Git repository will cause it to grow, but recent Git releases repack the repository regularly and automatically.

Cryptographic content validation

Linus made cryptographic-strength integrity checking a core part of Git’s design. As revisions are named using their SHA, it's next to impossible to tamper with a Git repository undetected. Bazaar explicitly chose to make its revision identifiers UUIDs instead of SHAs. This doesn't mean that Bazaar is less secure: the integrity of each revision is still validated using SHAs in Bazaar. If that isn’t enough security, Bazaar can be configured to digitally sign every commit.
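
The distinction between naming a revision by its hash and merely validating it with a hash can be sketched in a few lines of Python. This is an illustration only, not Bazaar's storage format: the store layout, the function names and the use of uuid4 for identifiers are all assumptions made for the example.

```python
import hashlib
import uuid


def store_revision(store, content):
    """Save content under an id unrelated to its hash, plus a checksum."""
    rev_id = uuid.uuid4().hex                     # id carries no content hash
    digest = hashlib.sha1(content).hexdigest()    # checksum kept alongside
    store[rev_id] = {"content": content, "sha1": digest}
    return rev_id


def verify_revision(store, rev_id):
    """Return True if the stored content still matches its recorded SHA-1."""
    entry = store[rev_id]
    return hashlib.sha1(entry["content"]).hexdigest() == entry["sha1"]


store = {}
rev = store_revision(store, b"print('hello')\n")
ok_before = verify_revision(store, rev)

store[rev]["content"] = b"print('tampered')\n"    # simulate corruption
ok_after = verify_revision(store, rev)
print(ok_before, ok_after)
```

A checksum stored next to the data detects corruption, but someone able to rewrite both would go unnoticed, which is where digitally signing commits comes in.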

A detailed look at the UI of Bazaar vs Git

Directories are branches, not branch containers

A key difference between Bazaar and Git is that branches are directories in Bazaar while they are internal names within a repository in Git. In general, this simplifies the UI a fair bit. This design choice is not unique to Bazaar as illustrated in Perforce's Software Life-cycle Modeling paper:

An important feature of the Perforce model is that the relevance of a file is encoded in its name: //depot/release/3.5/01/db/dbhdr.h is a file that is part of a release, and clearly not the place for miscellaneous enhancements. Similarly, //depot/main/db/dbhdr.h is clearly not where shippable products are built. Further, the whole of release 3.5/01 can be found under //depot/release/3.5/01 and not anywhere else in the depot. This plain-as-the-nose-on-your-face approach is the hallmark of Perforce's branching model.

On the other hand, it's true that many Git users like its approach and it does have benefits worth noting:

  • the one working tree can be used for multiple branches saving disk space
  • certain tasks like pulling and pushing a set of branches at the one time are easier.

Bazaar has solutions to both of these issues, although they aren't necessarily well advertised. To use the one working tree for multiple branches, set up a shared repository with the --no-trees option together with a lightweight checkout (see GitStyleBranches). The switch command (within the bzrtools plugin before bzr v1.0, part of bzr itself thereafter) can be used to change which branch the working tree is tied to. To pull a set of branches at once, use the multi-pull command within bzrtools. There is currently no easy way to push a set of related branches with Bazaar's built-in commands; use the repo-push plug-in instead.

Within a branch, changes directly made are emphasized over changes from a merge

Technically, a branch is a directed acyclic graph (DAG) in Bazaar just like it is in Git. Practically though, a particular series of revisions is most important from the perspective of the owner of each branch. Here are some examples:

  • If a developer is working on a topic branch, makes some changes, does a merge and commit and asks for a log of the last 5 changes, then the merge logically counts as (only) one change, and the 4 changes before it ought to be listed as well.
  • If a team is running a Continuous Integration server that triggers a full build+test after each commit, then the CI server should only be triggered after each commit to the main development line, not commits to merged branches.

Treating the left-hand side of the DAG (mainline) as special doesn't make Bazaar less distributed as some people have suggested. Instead, it recognizes that each branch has its own viewpoint onto the DAG. Bazaar's UI has been designed to reflect this.

Revision IDs are not SHAs

The most obvious reflection of Bazaar's "it's my branch" philosophy is revision numbering. Revisions on the mainline are numbered using simple numbers starting at 1. Revisions within merged branches are given dotted revision numbers, e.g. 1.2, 63.3.5.9. Like all DVCS tools, Bazaar has globally unique revision identifiers as well, but these are very rarely required in daily UI usage. Note that, like Git, Bazaar could use the content SHA for the internal revision identifier. It explicitly doesn't do this because that approach makes it difficult to use external revision stores (as bzr-svn does) and to evolve storage formats over time.
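
To make the mainline idea concrete, here is a minimal Python sketch that numbers mainline revisions by following only the first (left-hand) parent from the branch tip. The graph representation and function names are assumptions for illustration, not Bazaar internals, and the sketch deliberately does not compute dotted numbers for merged revisions.

```python
def mainline(parents, tip):
    """Return the revisions reached from tip via first parents, oldest first."""
    line = []
    rev = tip
    while rev is not None:
        line.append(rev)
        rev_parents = parents[rev]
        # The first parent is the branch's own previous revision; any
        # further parents are revisions that were merged in.
        rev = rev_parents[0] if rev_parents else None
    line.reverse()
    return line


def mainline_revnos(parents, tip):
    """Map each mainline revision to a simple number starting at 1."""
    return {rev: n for n, rev in enumerate(mainline(parents, tip), start=1)}


# History: a, then b on the mainline; c was developed on another branch
# from a and merged in revision d; e follows d.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
    "e": ["d"],
}

revnos = mainline_revnos(parents, "e")
print(revnos)
```

The merged revision c gets no simple number from this branch's point of view. A CI server that builds only on mainline commits could use the same walk to decide which revisions trigger a build.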

Consistency

In Bazaar, push and pull both enforce that the target is an ancestor of the source making the operations symmetric (as expected given their command names). In Git, this is enforced for push but not for pull. Pull is more magical and may:

  • work as it does in Bazaar
  • do a merge and implicitly commit if there are no conflicts
  • do a merge and make the user resolve conflicts and do the commit.

There are 3 design problems with this:

  • it's inconsistent with how push works
  • it ought to fail (by default) when using pull for updating a tracking branch (a.k.a. mirror branch)
  • implicitly committing a merge is the wrong thing to do (see below).
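
The symmetric rule described above (the target must be an ancestor of the source) can be sketched as a single check used by both operations. This is a toy model with hypothetical names (Diverged, fast_forward), not the implementation in either tool.

```python
class Diverged(Exception):
    """Raised when the target tip is not an ancestor of the source tip."""


def ancestors(parents, tip):
    """All revisions reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return seen


def fast_forward(parents, target_tip, source_tip):
    """Return the new target tip, or raise if the branches have diverged."""
    if target_tip not in ancestors(parents, source_tip):
        raise Diverged(f"{target_tip} is not an ancestor of {source_tip}")
    return source_tip


def pull(parents, local_tip, remote_tip):
    # Local branch is the target, remote branch is the source.
    return fast_forward(parents, target_tip=local_tip, source_tip=remote_tip)


def push(parents, local_tip, remote_tip):
    # Same rule with the roles swapped.
    return fast_forward(parents, target_tip=remote_tip, source_tip=local_tip)


# a <- b <- c is the published line; x is a local commit made on top of b.
parents = {"a": [], "b": ["a"], "c": ["b"], "x": ["b"]}

pulled = pull(parents, local_tip="b", remote_tip="c")    # simple update
try:
    pull(parents, local_tip="x", remote_tip="c")          # diverged
    diverged_pull_refused = False
except Diverged:
    diverged_pull_refused = True
print(pulled, diverged_pull_refused)
```

Under this rule a diverged pull stops and leaves the decision to merge, and to commit that merge, with the user.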

Another common example is the user experience of using bzr-svn: users just use the normal bzr commands and things pretty much work as expected. In contrast, git-svn introduces a different set of commands because Git's architecture doesn't permit transparent support of foreign repositories.

Automatic merge & commit may be harmful

Bazaar will never implicitly commit a merge just because it merges cleanly. No VCS can guarantee that a successful textual merge doesn't introduce bugs. For example, one branch may delete a function while another branch adds a client for that function.
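
The deleted-function example can be reproduced with a deliberately simplified merge. The Python sketch below assumes a file split into fixed chunks and merges chunk by chunk, which is far cruder than a real three-way text merge, but it shows the same effect: no conflict is reported, yet the result is broken.

```python
def merge3(base, ours, theirs):
    """Chunk-wise three-way merge; raises if both sides changed a chunk."""
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t or t == b:
            merged.append(o)        # only our side changed (or nobody did)
        elif o == b:
            merged.append(t)        # only their side changed
        else:
            raise ValueError("conflict")
    return merged


base = [
    "def helper():\n    return 42\n",
    "def main():\n    return 0\n",
]
# Branch A deletes helper() because nothing calls it.
branch_a = ["", base[1]]
# Branch B adds a client of helper().
branch_b = [base[0], "def main():\n    return helper()\n"]

# The merge succeeds textually: each chunk was changed on one side only.
merged_source = "".join(merge3(base, branch_a, branch_b))

namespace = {}
exec(merged_source, namespace)      # the merged file still parses
try:
    namespace["main"]()
    merge_is_safe = True
except NameError:
    merge_is_safe = False           # main() calls the deleted helper()
print(merge_is_safe)
```

Only running the code, for example through a test suite before the commit, reveals the problem.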

Developers and teams that are serious about making sure trunk quality is sacred ought to be running tests before every commit using a tool like PQM.

Familiar commands for Subversion and CVS users

While some differences exist at the option level, the core Subversion and CVS commands generally work the same way in Bazaar. Git is different in ways svn/cvs users wouldn't intuitively expect. Some examples:

  • commit (without -a) doesn't take the latest copy of a file but the copy as of the last add
  • no update command
  • revision numbers are SHA strings, not numbers.
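
The first point is the one that most often surprises newcomers. The toy Python model below, with hypothetical names and no relation to Git's real data structures, shows how a commit that reads from a staging area differs from one that reads the working tree.

```python
class StagingModel:
    """Toy model of a working tree, a staging area (index) and commits."""

    def __init__(self):
        self.working = {}    # path -> current file content
        self.index = {}      # path -> content as of the last add
        self.commits = []    # committed snapshots, oldest first

    def add(self, path):
        # Snapshot the file as it is right now.
        self.index[path] = self.working[path]

    def commit(self, all_tracked=False):
        if all_tracked:
            # Mimics the effect of "commit -a" for modified tracked files.
            for path in self.index:
                if path in self.working:
                    self.index[path] = self.working[path]
        self.commits.append(dict(self.index))
        return self.commits[-1]


repo = StagingModel()
repo.working["hello.txt"] = "version 1"
repo.add("hello.txt")
repo.working["hello.txt"] = "version 2"       # edited after the add

plain = repo.commit()                         # records the staged copy
with_all = repo.commit(all_tracked=True)      # records the latest copy
print(plain, with_all)
```

A Subversion or CVS user would expect the first commit to contain "version 2"; in this model it contains the copy captured by the earlier add.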

Related Links

The following links may also be of interest: