Bazaar

About Using Repositories

This is a continuation of the discussion documented in ../AboutPushPullMerge.

This expanded into a discussion of using shared repositories, and ways of working with them. It also contains some discussion of branches vs. checkouts, and some other stuff. Definitely needs editing to separate the topics.

This should be cleaned up into an actual document some time, but for now, it's here for reference.

Log

USER
so one last area that's confusing me, init-repository, branch and pull?
fullermd
init-repo is used to create a shared repo, that can be shared across branches.
USER
i've seen people say you should init-repository on, for example, my windows machine, before branching.
fullermd
If you 'init' a branch without one (or create a branch another way, like with 'branch'), it builds an internal repository.
fullermd
The downside there is that if you have 2 or 3 or 30 branches of the same project, every branch has a full copy of the history, and most of it will naturally be duplicated.
fullermd
If the branches share a repository, though, all that duplicated history only gets stored once.
fullermd
So if you expect multiple branches of one project, it's better to init-repo a shared repo for them before you start.
USER
hmm, trying to think how you'd end up with multiple branches in a project..
fullermd
But it's not a big deal. You can create a repo later and change the branch[es] you already have to use it.
fullermd
Heck, I've got dozens of branches of projects around :p
USER
oh, so you could branch for a specific feature,
fullermd
It's not like SVN where that repository has some semantic meaning, or acts as a boundary.
USER
and then have 2 or 3 branches of the same code base
fullermd
The only real difference it makes is saving your disk space and I/O bandwidth.
fullermd
Which is well worth it, to be sure. But running commands between branches works the same whether they share a repo or not.
USER
ok,, so you can create a branch by bzr branch [remote branch] or bzr pull [remote branch] ??
fullermd
(this is actually a _slight_ lie, because there are some edge cases where the repo will change behavior. But they don't really matter until you get deep into stuff)
fullermd
pull won't create a branch; it only updates one you already made.
fullermd
That's one of the asymmetries between 'push' and 'pull'; push will create the target if it doesn't exist, pull won't.
USER
ahh, so you don't start a fresh branch with pull ,,
fullermd
Right. You basically create a branch one of two ways.
USER
init-repo or branch?
fullermd
With 'branch' (make a new copy of an existing branch to work with), and 'init' (create a new empty branch)
USER
cool..
fullermd
init, not init-repo. init-repo just makes a shared repo for branches to use.
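A minimal sketch of the three commands being distinguished here. The function name, the `trunk` and `scratch` directory names, and any paths or URLs passed in are illustrative assumptions, not part of the discussion.

```shell
# Sketch only: the function name and directory names are hypothetical.
make_branches() (
    repo=$1        # where the shared repository will live
    source_url=$2  # an existing branch to copy from

    # init-repo creates only the shared storage; it is not a branch itself.
    bzr init-repo "$repo" || exit 1

    # 'branch' makes a new copy of an existing branch; because the target
    # sits under the shared repo, its history is stored there.
    bzr branch "$source_url" "$repo/trunk" || exit 1

    # 'init' creates a new, empty branch (here also inside the shared repo).
    bzr init "$repo/scratch"
)
```

It could be called as `make_branches /repos/project1 http://some/where`, where both arguments are placeholders.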
USER
and from an init'd branch you could pull code from a remote branch somewhere
USER
ahhh
fullermd
With a bare Repository, you can't _do_ anything. It just exists for Branches to use behind the scenes.
USER
hmm
USER
that's a new concept.
fullermd

Sure. You could "bzr init x ; cd x ; bzr pull http://some/where" instead of "bzr branch http://some/where x", and they'd do pretty much the same thing.

fullermd

(it would be silly, but you could do it)

USER
cool, ok that much makes sense.
USER
yeh, i wouldn't bother doing it, just to make sense of some of it that helps.
fullermd
Yeah. It can be a bit of a change from SVN, because in SVN the bzr concepts of Branch and Repository are sorta both crammed into the Repository.
fullermd
(well, half of the Branch concept anyway. The other half doesn't really exist, and just gets faked up with other primitives. But that's details.)
USER
you couldn't do bzr init ./blah; cd blah; bzr branch [remoterepo] ./v1; bzr branch [differentrepo] ./v2; would that be wrong?
fullermd
Well... it would be weird.
fullermd
(and you branch a Branch, not a Repo)
USER
ok sorry, branches then.
fullermd
What you'd end up with is three branches, one with nothing in it, and two others that happened to be below it in the filesystem.
fullermd
Maybe you meant init-repo there?
USER
hmm, ok so that'd be why you use init-repo for shared repo.
USER
not really, just getting the difference between the 2
fullermd
Basically, everything you directly interact with is a Branch.
fullermd
You commit onto a Branch, look at the log for a Branch, etc.
fullermd
Repositories are basically giant buckets that Revisions get dumped into.
fullermd
Branches point at a Repository and say "look in there for [these revisions], those are the ones that are part of me"
USER
if you worked in a group and had a central master branch you all checked out of, but wanted to be able to push specific changes over to a co-worker but not the central branch, you could set up pointers to each other's local branches?
fullermd
Mostly, the only time you directly look at or interact with a Repo is when you run init-repo.
USER
ok
fullermd
Usually, you'd go the other way; tell them to pull (or merge) from your branch.
USER
checked out? there goes my svn centric concepts again.. a master branch you all branched out of...
fullermd
No, you could all checkout the master branch too.
USER
eh?
fullermd
'checkout' creates a Working Tree for a given branch. And usually it's used for making multiple WT's on the Branch.
USER
by checkout are you making a distinction from branching from the master branch?
fullermd
Which is the same flow you have in SVN.
USER
hmm
fullermd

No, by checking it out. "bzr checkout http://some/where"

fullermd

(well, probably not http, since you presumably want to be able to _write_ there, so bzr+ssh:// or something)

USER
well, web dav will allow you to write.
USER
but anyway.
fullermd
Then you basically have one shared branch that everybody has a WT on. When you commit, everybody else needs to 'update', just like in SVN.
fullermd
This is a workflow that DVCS's in general kinda shun, and nobody but bzr (TTBOMK) really first-class supports.
fullermd
Me, *I* use it all the dang time, because it's really useful.
USER
oh?
USER

TTBOMK>

USER
?
fullermd
To The Best Of My Knowledge
USER
ahh, just urban dictionary'd it.
fullermd
Ogod, I'm speaking urban now?
USER
:P
fullermd
By default, checkout creates what's called a "heavy" checkout, which means that it also copies down and stores the full history, just like 'branch' does.
fullermd
That means that things like "bzr log" don't have to go across the network to the server.
USER
hmm, why would it be useful? It just would prevent you from being able to ever use the WT to branch from again if you wanted to push work up from another machine, like in my case, I have 3 pc's, linux windows and a netbook, which i code on sometimes when traveling..
fullermd
With a heavy checkout, you _can_ use that checkout as a source for 'branch'.
USER
ohhh,,, yargg..
USER
not sure i get the difference between that and a regular branch then.
fullermd
The difference is that the Branch for that WT is the one off on the server you checked out from.
fullermd
So when you 'commit', the rev goes there. When somebody else has commit'd, you get told you're out of date and to 'update' before you can commit.
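A minimal sketch of that checkout cycle, assuming a writable branch URL. The function name, the working directory and the commit message are hypothetical.

```shell
# Sketch only: function name, paths and message are illustrative assumptions.
commit_via_checkout() (
    branch_url=$1   # e.g. a bzr+ssh:// URL of the shared branch
    workdir=$2      # local directory holding the checkout
    message=$3

    # Create the (heavy) checkout once: a working tree plus a local copy of
    # the history, while the Branch itself stays at branch_url.
    [ -d "$workdir" ] || bzr checkout "$branch_url" "$workdir" || exit 1

    cd "$workdir" || exit 1

    # Pick up revisions other people committed since the last update.
    bzr update || exit 1

    # ... edit files here ...

    # The new revision is recorded on the branch we checked out from.
    bzr commit -m "$message"
)
```

If somebody else commits between the `update` and the `commit`, the commit is refused as out of date and the cycle has to be repeated.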
fullermd
It's something you might commonly use for 'trunk' branches, for instance.
USER
so it's more like a pure svn approach for that instance.
fullermd
Right.
fullermd
USER: Let's invent a setup much like what I use.
USER
haha, ok ..
fullermd
USER: We have a project, project1. There's a central server with a 'trunk' branch on it, that me and the other 6 guys all work on to move this thing forward.
USER
yeh, sure.
fullermd
I'll create a shared repo at /project1 (yeah, I base everything off the root dir for examples)
fullermd
So "bzr init-repo /project1"
USER
cool, we've init'd a shared repo. great.
fullermd

Now, I make a checkout of the trunk: "cd /project1 ; bzr co bzr+ssh://server/project1/trunk"

fullermd
Being a heavy checkout, it copies down all that history, and stores it in the shared repo instead of internally.
USER
right, we've got our heavy checkout of the trunk branch..
fullermd
So now I can work SVN-style by working and commit'ing and update'ing and all in /project1/trunk
fullermd
And in fact, if the other 6 guys are all SVN-heads, I can basically move them to bzr with this setup, and the only difference they'll see is they run "bzr whatever" instead of "svn whatever".
fullermd
They don't need to know anything about local branches or distributed whosafudge.
fullermd
But I do, see. 'cuz I'm smart and stuff.
USER
haha, cool, cause i have 5 guys in the office here all on svn, and i've broached the bzr scenario with them and they're... meh.. about it.
fullermd
So I want to work on adding nobbles to it. But that's gonna take a while and break stuff while I do it. People get pissed when I break trunk, for some reason.
fullermd
So, I make a new branch: "cd /project1 ; bzr branch trunk add-nobbles"
USER
yeh, i get that too.. but my commits usually break buildbot.
fullermd
Now I have trunk, still a checkout of the central trunk. Anything I do there goes straight to the central server.
fullermd
But I also have add-nobbles, as an independent branch, existing nowhere but on my drive.
USER
cool, so we've branched from our heavy checkout to another local branch but both in a shared repo. one history.. gotcha.
fullermd
(and that happened fast, because 'trunk' and 'nobbles' are both using the shared repo at /project1, so it didn't have to actually _copy_ any history)
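The setup so far, collected into one sketch. The commands are the ones quoted in the discussion; wrapping them in a function, and the function and parameter names, are assumptions made for illustration.

```shell
# Sketch only: the function and its parameter names are hypothetical.
setup_project() (
    repo=$1        # local shared repo, e.g. /project1
    trunk_url=$2   # central trunk, e.g. bzr+ssh://server/project1/trunk
    feature=$3     # name of the local feature branch, e.g. add-nobbles

    bzr init-repo "$repo" || exit 1
    cd "$repo" || exit 1

    # Heavy checkout of trunk: commits made in it go to the central server,
    # while the history it downloads is stored in the shared repo.
    bzr checkout "$trunk_url" trunk || exit 1

    # Independent local branch. No history is copied, because the revisions
    # are already present in the shared repo.
    bzr branch trunk "$feature"
)
```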
fullermd
So now I can go ahead and work on add-nobbles for the 9 weeks it takes me to finish it, without bothering trunk.
fullermd
And without having 9 weeks of changes sitting uncommitted in a checkout, just begging for disaster. I can keep on making a few commits an hour there.
USER
yeh, this is a scenario i really like.
fullermd
Of course, trunk keeps moving ahead while this is happening too. I can still be working on fixes there and committing them into trunk.
fullermd
So every couple days, I "cd /project1/add-nobbles ; bzr merge ../trunk ; (check them over) ; bzr commit", to bring in the trunk changes.
fullermd
This way I don't diverge too far, and when I get conflicts, they're small and easy to fix, rather than trying to do one giant merge at the end.
fullermd
Of course, this doesn't affect trunk at all. Just add-nobbles. But it means that I'm keeping up with those changes all the time.
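A minimal sketch of that periodic catch-up merge. The function name is hypothetical, and the sketch assumes `bzr merge` exits nonzero when it fails or leaves conflicts; the commit is deliberately left as a manual step so the result can be checked first.

```shell
# Sketch only: the function name is an illustrative assumption.
merge_trunk_into_feature() (
    feature_dir=$1   # e.g. /project1/add-nobbles
    trunk_path=$2    # e.g. ../trunk (relative to feature_dir) or an absolute path

    cd "$feature_dir" || exit 1

    # Bring trunk's new revisions into the feature branch's working tree.
    # Assumed: a failed or conflicted merge returns a nonzero exit status.
    bzr merge "$trunk_path" || exit 1

    # Show what the merge changed so it can be checked over.
    bzr status
)

# After checking the result by hand:
#   bzr commit -m "Merge trunk into add-nobbles"
```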
fullermd
Eventually, I finish up, and it's ready to go. So I "cd /project1/trunk ; bzr update (just to be sure) ; bzr merge ../add-nobbles ; (check) ; bzr commit"
fullermd
My network churns for a while, uploading all those new revs to the server, and presto; the branch is landed into trunk.
fullermd
Next time everyone else 'update's, they get all those changes.
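The landing step can be sketched the same way. The function name and arguments are hypothetical, and unlike the hand-run sequence quoted above, this version commits straight after the merge, so the manual check is skipped for brevity.

```shell
# Sketch only: the function name is an illustrative assumption, and the
# "(check)" step from the discussion is omitted here.
land_feature() (
    trunk_dir=$1     # the checkout of the central trunk, e.g. /project1/trunk
    feature_path=$2  # e.g. ../add-nobbles
    message=$3

    cd "$trunk_dir" || exit 1

    # Make sure the checkout matches the central branch before merging.
    bzr update || exit 1

    # Merge the finished feature branch into trunk's working tree.
    bzr merge "$feature_path" || exit 1

    # Because trunk is a checkout, this commit (and the feature's revisions)
    # is sent to the central server.
    bzr commit -m "$message"
)
```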
fullermd
Now, a few things here; first, that last merge is _EASY_, because I've already pretty much caught all the conflicts and fixed them with the periodic merges of trunk into add-nobbles.
fullermd
USER: Second, it's not just the equivalent of doing a big "diff | patch"; every single one of those hundreds of revs I put into add-nobbles is now in trunk, so if we need to track back into them for something later, they're ready and accessible.
fullermd
USER: In fact, I could "rm -rf /project1/add-nobbles" now if I wanted to; everything that was in it is in trunk now.
fullermd
USER: And third, all of that work was local. Nobody else even had to KNOW I was working on nobbles, and nobody needs to ever know or care that I did it in a separate branch.
fullermd
USER: So 95% of your team can totally ignore branching, and just work lockstep in trunk like SVN. And the 5% willing to shoulder the extra mental load of dealing with multiple branches can reap the benefits without getting in anyone else's way.
fullermd
USER: And that 95% can slowly start using branches if they want too. Maybe when you're working on something big in a branch, and want to share it with other people.
fullermd
USER: The two (or 3, 4, etc) of you can share the branch among yourselves, merge'ing each other. Or put it on a 'central' server, whether the same central server as trunk, or a different central server, and all create your own 'checkout's of it.
fullermd
USER: (and then your own local branches of THAT checkout, and... *headsplode*)
fullermd
So you can start working just like SVN, and scale up (individually or as a team), when you're ready to handle the extra complexity and have a situation where it's worth it.
fullermd
In my work, a lot of stuff still just happens on trunk. Small bugfixes, tiny features... roughly "anything that fits in one revision", I tend to just do straight on trunk. Simpler.
fullermd
The longer or more involved or more breakage-inducing it is, the bigger the advantages of making a branch for it.
fullermd
I can choose on a case-by-case basis.
USER
fullermd, hmm, that's fascinating,, i'm going to have to digest it before i use it more broadly, but i can imagine myself doing so in time..
fullermd
USER: Sure. Pick up new bits when they're helpful.
fullermd
For a diversion, imagine I checkout'd /project1/trunk without making a shared repo at /project1, because I'm just gonna use the checkout, not branches anyway.
fullermd
But later, I suddenly decide I want to use branches. But I don't want to keep copying all the history.
USER
yeh, that was a question that was on my mind throughout this discussion.
fullermd
I can "bzr init-repo /project1" to create a repo there (empty; trunk still has its stuff internally)
fullermd
Then I can "cd /project1/trunk ; bzr reconfigure --use-shared" to move the history into the shared repo, and switch trunk to using it (and throw away its internal copy)
USER
hah, that's neat
fullermd
(you can also go the other way around and "bzr reconfigure --standalone" to create an internal repo for the branch and copy the history out of the shared into it. But you don't need to do that too often.)
fullermd
(it's also not QUITE exactly the mirror operation, for reasons we don't need to go into 'cuz you'll probably never use it)
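Both directions as a minimal sketch. The `reconfigure` options are the ones named in the discussion; the function names and arguments are hypothetical.

```shell
# Sketch only: both function names are illustrative assumptions.

# Create a shared repo above an existing standalone branch, then move the
# branch's history into it.
adopt_shared_repo() (
    repo=$1         # directory that contains the branch, e.g. /project1
    branch_dir=$2   # the existing branch or checkout, e.g. /project1/trunk

    bzr init-repo "$repo" || exit 1
    cd "$branch_dir" || exit 1
    bzr reconfigure --use-shared
)

# Give a branch its own internal repository again, for example before
# moving it out from under the shared repo.
detach_from_shared_repo() (
    cd "$1" || exit 1
    bzr reconfigure --standalone
)
```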
USER
i was thinking bzr co /projectco; then bzr init-repo /project1; cd project1; bzr co /projectco ./trunk;
fullermd
That would probably not work so well. Having a checkout of a checkout is icky.
USER
hmm, yeh, maybe not.
fullermd
It may not work at all. It may sorta work, sometimes. I don't actually know.
fullermd
With straight branches it's easier, since they stand by themselves.
USER
you'd have to redirect the parent / target repo of the second checkout back to the main repo.
USER
or main trunk.
fullermd
(before reconfigure grew those options, using 'branch' into and out of a repo to move things in/out was SOP)
USER
aha, ok...
USER
wow, brain melt..... in a good way though.
fullermd
Now, if you already had the [heavy] checkout at /projectco, and didn't want to re-transfer all the data across the network, I'd do something like this:
fullermd

"bzr init-repo /project1 ; bzr branch /projectco /project1/tmp (just to 'prime' the repo) ; rm -rf /project1/tmp (cleanup) ; bzr co bzr+ssh://server/project1/trunk /project1/trunk"

fullermd
Since that temporary 'branch' primed the repo with all the revisions, that last checkout didn't have to transfer much of anything; just check that we already had everything.
USER
well you could do bzr init-repo project1; mv projectco ./project1/; cd project1; bzr reconfigure --use-shared; no?
USER
arrg, cd project1/projectco; sorry
fullermd
That would also work, yep.
fullermd
And if you decide "hey, that should be called 'trunk', not 'projectco'", you could also "cd /project1 ; mv projectco trunk"
fullermd
(mv, not bzr mv, note)
USER
ah, handy
USER
yeh, of course.
fullermd
As long as the branch stays _inside_ the repo, you can move it around to anything you want.
USER
same as svn, using mv subcommand is only for moving items inside WT
fullermd
If you mv it outside the repo, though, it'll start weeping the first time you do something that makes it look for a rev.
USER
oh really.
USER
that's interesting.
fullermd
Yah. 'cuz it'll start looking for its repo, and... umm.... where'd it go?
fullermd
(that's one of the main cases you use reconfigure --standalone; to prepare for mv'ing it out of the repo)
USER
so you can move it into the shared repo, reconfigure with --use-shared, and after that just mv ./projectco ./trunk
USER
etc etc.. as much as you like, and it's still tied to the shared repo.
fullermd
Right. A branch finds its shared repo implicitly, by looking at .., then ../../, then ../../../, etc.
USER
ahh, that's very simplistic, but good i suppose.
fullermd
So you could mv it to /project1/foo/bar/baz/quux/trunk if you wanted.
fullermd
So you can mv the branch around inside the repo, or even mv the repo around as a whole. As long as the branches stay 'under' the repo, everything's cool.
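The upward search can be imitated in plain shell. This is only a sketch of the idea, not how bzr implements it: the function name is hypothetical, and treating a `.bzr/repository` directory in a parent as "the shared repo" is an assumption about the on-disk layout. bzr does this lookup internally, and `bzr info` reports which repository a branch is using.

```shell
# Sketch only: mimics the "look in .., then ../.., ..." search.
# Assumption: a parent directory containing .bzr/repository is the shared repo.
find_shared_repo() (
    # Resolve the branch directory to an absolute path.
    dir=$(cd "$1" && pwd) || exit 1

    while [ "$dir" != "/" ]; do
        dir=$(dirname "$dir")              # step up one level
        if [ -d "$dir/.bzr/repository" ]; then
            printf '%s\n' "$dir"           # found the enclosing repo
            exit 0
        fi
    done

    exit 1                                 # reached / without finding one
)
```

With a repo at `/project1`, this would print `/project1` for a branch at `/project1/foo/bar/baz/quux/trunk`, and fail for a branch that has been moved outside `/project1`, which is the situation where bzr can no longer find the revisions.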
USER
that's nice .. i like this shared repo idea..
fullermd
A thing to note here, too, is that shared repos are entirely local.
USER
really handy for tying branches of related work together.
fullermd
Imagine you have the trunk and 2 feature branches in a shared repo on the central server.
fullermd
/project1/trunk, /project1/feat1, /project1/feat2 (/project1 being a shared repo)
fullermd
You could have co's of each of them in a shared repo at /myproject1 on your box.
fullermd
Or you could have co's of them in /myproject1 _without_ a shared repo, each having its own copy of the history.
USER
they are totally ignorant of each other's shared repos
fullermd
Or have trunk at /myproject1, and feat1 and feat2 in a repo at /myproject1features.
USER
gotcha,, yeh, that makes sense.
fullermd
Right. This is an aspect of "Repositories aren't semantic". Everybody's setup can be different.
USER
not bad, not bad.
fullermd
The only thing you ever directly look at is a Branch.
USER
yeh, i get that concept now..
fullermd
(now, a downside of this is that we don't currently have something like "mirror-repo", or "pull-repo", to mirror/update the whole set of branches)
fullermd
Doesn't necessarily mean we _can't_, but it's more involved. Some future work is looking in that direction.
fullermd
But one reason we've gotten by this far without it is that you often don't need it, so...
USER
fullermd: couldn't you just cp the shared repo,
fullermd
I of course use 'we' to mean 'somebody competent, not me' :p
USER
you don't need a specific mirror command,
fullermd
Sure. You could tar it up and move it around, or rsync it, or whatever.
USER
so that's a simple way to mirror the repo.
fullermd
Gets more expensive than it has to be for updates though.
fullermd
(and of course you can't mirror just a subset that way)
USER
ok, but in situation where you only have http access to the repo, you may not have permissions to edit, so you may want to mirror the code base to fork the project or something.
fullermd
Right. You'd generally use 'branch' rather than 'checkout' for that.
USER
especially if you still want to pull the downstream updates back from the master project.
USER
yeh,
USER
upstream, downstream... umm which ever way that indicates....
fullermd

Then you could just "bzr branch http://where/ever upstream", and never do anything in that upstream branch except 'pull' updates, and use it as a source for 'merge'ing them into your codebase.

USER
so still reaching for a real reason you'd need pull-repo
fullermd
(never doing any merge's or push's into it, or committing, or etc)
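A minimal sketch of that mirror-branch routine, assuming the `upstream` branch was created earlier with `bzr branch` so that a bare `bzr pull` uses its remembered source. The function name and both directory arguments are hypothetical.

```shell
# Sketch only: the function name and directory layout are illustrative.
refresh_from_upstream() (
    # Absolute path, so it stays valid after changing directory below.
    upstream_dir=$(cd "$1" && pwd) || exit 1
    work_dir=$2      # your own branch, where the real work happens

    # The mirror only ever receives pulls, so it stays identical to upstream.
    (cd "$upstream_dir" && bzr pull) || exit 1

    # Merge the mirrored changes into your own branch.
    cd "$work_dir" || exit 1
    bzr merge "$upstream_dir"
)

# After checking the merge by hand, in work_dir:
#   bzr commit -m "Merge upstream"
```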
fullermd
It would be a lot more efficient for updating it than rsync and friends, for one.
USER
well, you may want to pull-repo if say you have multiple branches for an app that's across say, different hardware architectures or something?
fullermd
For another, you might want all the branches in that repo in a repo locally, but ALSO have your own local branches in that same repo (sharing the storage)
fullermd
Yeah, there are a number of good reasons for it. Just takes some planning in a VCS that's Branch-oriented like bzr.
USER
ok, cool.
USER
bleugh... my brain is full i think
fullermd

My work here is done

MatthewFuller