About Using Repositories
This is a continuation of the discussion documented in ../AboutPushPullMerge.
This expanded into a discussion of using shared repositories, and ways of working with them. It also contains some discussion of branches vs. checkouts, and some other stuff. Definitely needs editing to separate the topics.
This should be cleaned up into an actual document some time, but for now, it's here for reference.
Log
- USER
- so one last area that's confusing me, init-repository, branch and pull?
- fullermd
- init-repo is used to create a shared repo, that can be shared across branches.
- USER
- i've seen people say you should init-repository on, for example, my windows machine, before branching.
- fullermd
- If you 'init' a branch without one (or create a branch another way, like with 'branch'), it builds an internal repository.
- fullermd
- The downside there is that if you have 2 or 3 or 30 branches of the same project, every branch has a full copy of the history, and most of it will naturally be duplicated.
- fullermd
- If the branches share a repository, though, all that duplicated history only gets stored once.
- fullermd
- So if you expect multiple branches of one project, it's better to init-repo a shared repo for them before you start.
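The shared-repo setup described above can be sketched as a small shell function. This is a minimal sketch, not a prescribed layout: the function name and the directory names (`trunk`, `feature-x`) are hypothetical, and it assumes `bzr` is on the PATH.

```shell
#!/bin/sh
# Sketch: one shared repository holding several branches of one project.
# make_shared_project and the branch names are hypothetical.
make_shared_project() {
    root=$1
    bzr init-repo "$root" &&                      # shared repository: revision storage only
    bzr init "$root/trunk" &&                     # new empty branch; finds and uses the repo above it
    bzr branch "$root/trunk" "$root/feature-x"    # second branch; shared history is stored once
}

# Usage (requires bzr): make_shared_project ./project1
```

Both branches behave like any other branch; the repository only changes where their revisions are stored.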
- USER
- hmm, trying to think how you'd end up with multiple branches in a project..
- fullermd
- But it's not a big deal. You can create a repo later and change the branch[es] you already have to use it.
- fullermd
- Heck, I've got dozens of branches of projects around :p
- USER
- oh, so you could branch for a specific feature,
- fullermd
- It's not like SVN where that repository has some semantic meaning, or acts as a boundary.
- USER
- and then have 2 or 3 branches of the same code base
- fullermd
- The only real difference it makes is saving your disk space and I/O bandwidth.
- fullermd
- Which is well worth it, to be sure. But running commands between branches works the same whether they share a repo or not.
- USER
- ok,, so you can create a branch by bzr branch [remote branch] or bzr pull [remote branch] ??
- fullermd
- (this is actually a _slight_ lie, because there are some edge cases where the repo will change behavior. But they don't really matter until you get deep into stuff)
- fullermd
- pull won't create a branch; it only updates one you already made.
- fullermd
- That's one of the asymmetries between 'push' and 'pull'; push will create the target if it doesn't exist, pull won't.
- USER
- ahh, so you don't start a fresh branch with pull ,,
- fullermd
- Right. You basically create a branch one of two ways.
- USER
- init-repo or branch?
- fullermd
- With 'branch' (make a new copy of an existing branch to work with), and 'init' (create a new empty branch)
- USER
- cool..
- fullermd
- init, not init-repo. init-repo just makes a shared repo for branches to use.
- USER
- and from an init'd branch you could pull code from a remote branch somewhere
- USER
- ahhh
- fullermd
- With a bare Repository, you can't _do_ anything. It just exists for Branches to use behind the scenes.
- USER
- hmm
- USER
- that's a new concept.
- fullermd
- Sure. You could "bzr init x ; cd x ; bzr pull http://some/where" instead of "bzr branch http://some/where x", and they'd do pretty much the same thing.
- fullermd
- (it would be silly, but you could do it)
- USER
- cool, ok that much makes sense.
- USER
- yeh, i wouldn't bother doing it, just to make sense of some of it that helps.
- fullermd
- Yeah. It can be a bit of a change from SVN, because in SVN the bzr concepts of Branch and Repository are sorta both crammed into the Repository.
- fullermd
- (well, half of the Branch concept anyway. The other half doesn't really exist, and just gets faked up with other primitives. But that's details.)
- USER
- you couldn't do bzr init ./blah; cd blah; bzr branch [remoterepo] ./v1; bzr branch [differentrepo] ./v2; would that be wrong?
- fullermd
- Well... it would be weird.
- fullermd
- (and you branch a Branch, not a Repo)
- USER
- ok sorry, branches then.
- fullermd
- What you'd end up with is three branches, one with nothing in it, and two others that happened to be below it in the filesystem.
- fullermd
- Maybe you meant init-repo there?
- USER
- hmm, ok so that'd be why you use init-repo for shared repo.
- USER
- not really, just getting the difference between the 2
- fullermd
- Basically, everything you directly interact with is a Branch.
- fullermd
- You commit onto a Branch, look at the log for a Branch, etc.
- fullermd
- Repositories are basically giant buckets that Revisions get dumped into.
- fullermd
- Branches point at a Repository and say "look in there for [these revisions], those are the ones that are part of me"
- USER
- if you worked in a group and had a central master branch you all checked out of, but wanted to be able to push specific changes over to a co-worker but not the central branch, you could set up pointers to each other's local branches?
- fullermd
- Mostly, the only time you directly look at or interact with a Repo is when you run init-repo.
- USER
- ok
- fullermd
- Usually, you'd go the other way; tell them to pull (or merge) from your branch.
- USER
- checked out? there goes my svn centric concepts again.. a master branch you all branched out of...
- fullermd
- No, you could all checkout the master branch too.
- USER
- eh?
- fullermd
- 'checkout' creates a Working Tree for a given branch. And usually it's used for making multiple WT's on the Branch.
- USER
- by checkout are you making a distinction from branching from the master branch?
- fullermd
- Which is the same flow you have in SVN.
- USER
- hmm
- fullermd
- No, by checking it out. "bzr checkout http://some/where"
- fullermd
- (well, probably not http, since you presumably want to be able to _write_ there, so bzr+ssh:// or something)
- USER
- well, web dav will allow you to write.
- USER
- but anyway.
- fullermd
- Then you basically have one shared branch that everybody has a WT on. When you commit, everybody else needs to 'update', just like in SVN.
- fullermd
- This is a workflow that DVCS's in general kinda shun, and nobody but bzr (TTBOMK) really first-class supports.
- fullermd
- Me, *I* use it all the dang time, because it's really useful.
- USER
- oh?
- USER
- TTBOMK>
- USER
- ?
- fullermd
- To The Best Of My Knowledge
- USER
- ahh, just urban dictionary'd it.
- fullermd
- Ogod, I'm speaking urban now?
- USER
- :P
- fullermd
- By default, checkout creates what's called a "heavy" checkout, which means that it also copies down and stores the full history, just like 'branch' does.
- fullermd
- That means that things like "bzr log" don't have to go across the network to the server.
- USER
- hmm, why would it be useful? It just would prevent you from being able to ever use the WT to branch from again if you wanted to push work up from another machine, like in my case, I have 3 pc's, linux windows and a netbook, which i code on sometimes when traveling..
- fullermd
- With a heavy checkout, you _can_ use that checkout as a source for 'branch'.
- USER
- ohhh,,, yargg..
- USER
- not sure i get the difference between that and a regular branch then.
- fullermd
- The difference is that the Branch for that WT is the one off on the server you checked out from.
- fullermd
- So when you 'commit', the rev goes there. When somebody else has commit'd, you get told you're out of date and to 'update' before you can commit.
- fullermd
- It's something you might commonly use for 'trunk' branches, for instance.
- USER
- so it's more like a pure svn approach for that instance.
- fullermd
- Right.
- fullermd
- USER: Let's invent a setup much like what I use.
- USER
- haha, ok ..
- fullermd
- USER: We have a project, project1. There's a central server with a 'trunk' branch on it, that me and the other 6 guys all work on to move this thing forward.
- USER
- yeh, sure.
- fullermd
- I'll create a shared repo at /project1 (yeah, I base everything off the root dir for examples)
- fullermd
- So "bzr init-repo /project1"
- USER
- cool, we've init'd a shared repo. great.
- fullermd
- Now, I make a checkout of the trunk: "cd /project1 ; bzr co bzr+ssh://server/project1/trunk"
- fullermd
- Being a heavy checkout, it copies down all that history, and stores it in the shared repo instead of internally.
- USER
- right, we've got our heavy checkout of the trunk branch..
- fullermd
- So now I can work SVN-style by working and commit'ing and update'ing and all in /project1/trunk
- fullermd
- And in fact, if the other 6 guys are all SVN-heads, I can basically move them to bzr with this setup, and the only difference they'll see is they run "bzr whatever" instead of "svn whatever".
- fullermd
- They don't need to know anything about local branches or distributed whosafudge.
- fullermd
- But I do, see. 'cuz I'm smart and stuff.
- USER
- haha, cool, cause i have 5 guys in the office here all on svn, and i've broached the bzr scenario with them and they're... meh.. about it.
- fullermd
- So I want to work on adding nobbles to it. But that's gonna take a while and break stuff while I do it. People get pissed when I break trunk, for some reason.
- fullermd
- So, I make a new branch: "cd /project1 ; bzr branch trunk add-nobbles"
- USER
- yeh, i get that too.. but my commits usually break buildbot.
- fullermd
- Now I have trunk, still a checkout of the central trunk. Anything I do there goes straight to the central server.
- fullermd
- But I also have add-nobbles, as an independent branch, existing nowhere but on my drive.
- USER
- cool, so we've branched from our heavy checkout to another local branch but both in a shared repo. one history.. gotcha.
- fullermd
- (and that happened fast, because 'trunk' and 'nobbles' are both using the shared repo at /project1, so it didn't have to actually _copy_ any history)
- fullermd
- So now I can go ahead and work on add-nobbles for the 9 weeks it takes me to finish it, without bothering trunk.
- fullermd
- And without having 9 weeks of changes sitting uncommitted in a checkout, just begging for disaster. I can keep on making a few commits an hour there.
- USER
- yeh, this is a scenario i really like.
- fullermd
- Of course, trunk keeps moving ahead while this is happening too. I can still be working on fixes there and committing them into trunk.
- fullermd
- So every couple days, I "cd /project1/add-nobbles ; bzr merge ../trunk ; (check them over) ; bzr commit", to bring in the trunk changes.
- fullermd
- This way I don't diverge too far, and when I get conflicts, they're small and easy to fix, rather than trying to do one giant merge at the end.
- fullermd
- Of course, this doesn't affect trunk at all. Just add-nobbles. But it means that I'm keeping up with those changes all the time.
- fullermd
- Eventually, I finish up, and it's ready to go. So I "cd /project1/trunk ; bzr update (just to be sure) ; bzr merge ../add-nobbles ; (check) ; bzr commit"
- fullermd
- My network churns for a while, uploading all those new revs to the server, and presto; the branch is landed into trunk.
- fullermd
- Next time everyone else 'update's, they get all those changes.
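The whole trunk-checkout plus feature-branch workflow above can be collected into one script. This is a sketch under stated assumptions: the paths and server URL are the ones used in the log, while the function names and commit messages are hypothetical. The "(check them over)" steps from the log are left as comments, since they are manual.

```shell
#!/bin/sh
# Sketch of the workflow from the log. Function names and commit
# messages are hypothetical; paths and URL follow the log's example.
REPO=/project1
TRUNK_URL=bzr+ssh://server/project1/trunk

setup() {
    bzr init-repo "$REPO" &&
    bzr checkout "$TRUNK_URL" "$REPO/trunk" &&     # heavy checkout; history lands in the shared repo
    bzr branch "$REPO/trunk" "$REPO/add-nobbles"   # independent local branch
}

sync_from_trunk() {                                # run every couple of days
    ( cd "$REPO/add-nobbles" &&
      bzr merge ../trunk &&                        # bring trunk's changes into the feature branch
      # ...check the merge over and fix conflicts here...
      bzr commit -m "Merge trunk" )
}

land() {                                           # when the feature is finished
    bzr update "$REPO/trunk" &&                    # just to be sure the checkout is current
    ( cd "$REPO/trunk" &&
      bzr merge ../add-nobbles &&
      # ...check the merge over here...
      bzr commit -m "Add nobbles" )                # commit in a checkout goes to the server
}
```

Only `setup` and `land` touch the network; `sync_from_trunk` works entirely against local branches.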
- fullermd
- Now, a few things here; first, that last merge is _EASY_, because I've already pretty much caught all the conflicts and fixed them with the periodic merges of trunk into add-nobbles.
- fullermd
- USER: Second, it's not just the equivalent of doing a big "diff | patch"; every single one of those hundreds of revs I put into add-nobbles is now in trunk, so if we need to track back into them for something later, they're ready and accessible.
- fullermd
- USER: In fact, I could "rm -rf /project1/add-nobbles" now if I wanted to; everything that was in it is in trunk now.
- fullermd
- USER: And third, all of that work was local. Nobody else even had to KNOW I was working on nobbles, and nobody needs to ever know or care that I did it in a separate branch.
- fullermd
- USER: So 95% of your team can totally ignore branching, and just work lockstep in trunk like SVN. And the 5% willing to shoulder the extra mental load of dealing with multiple branches can reap the benefits without getting in anyone else's way.
- fullermd
- USER: And that 95% can slowly start using branches if they want to. Maybe when you're working on something big in a branch, and want to share it with other people.
- fullermd
- USER: The two (or 3, 4, etc) of you can share the branch among yourselves, merge'ing each other. Or put it on a 'central' server, whether the same central server as trunk, or a different central server, and all create your own 'checkout's of it.
- fullermd
- USER: (and then your own local branches of THAT checkout, and... *headsplode*)
- fullermd
- So you can start working just like SVN, and scale up (individually or as a team), when you're ready to handle the extra complexity and have a situation where it's worth it.
- fullermd
- In my work, a lot of stuff still just happens on trunk. Small bugfixes, tiny features... roughly "anything that fits in one revision", I tend to just do straight on trunk. Simpler.
- fullermd
- The longer or more involved or more breakage-inducing it is, the bigger the advantages of making a branch for it.
- fullermd
- I can choose on a case-by-case basis.
- USER
- fullermd, hmm, that's fascinating,, i'm going to have to digest it before i use it more broadly, but i can imagine myself doing so in time..
- fullermd
- USER: Sure. Pick up new bits when they're helpful.
- fullermd
- For a diversion, imagine I checkout'd /project1/trunk without making a shared repo at /project1, because I'm just gonna use the checkout, not branches anyway.
- fullermd
- But later, I suddenly decide I want to use branches. But I don't want to keep copying all the history.
- USER
- yeh, that was a question that was on my mind throughout this discussion.
- fullermd
- I can "bzr init-repo /project1" to create a repo there (empty; trunk still has its stuff internally)
- fullermd
- Then I can "cd /project1/trunk ; bzr reconfigure --use-shared" to move the history into the shared repo, and switch trunk to using it (and throw away its internal copy)
- USER
- hah, that's neat
- fullermd
- (you can also go the other way around and "bzr reconfigure --standalone" to create an internal repo for the branch and copy the history out of the shared into it. But you don't need to do that too often.)
- fullermd
- (it's also not QUITE exactly the mirror operation, for reasons we don't need to go into 'cuz you'll probably never use it)
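The two reconfigure directions mentioned above can be sketched as a pair of helpers. The function names are hypothetical; the sketch assumes `bzr reconfigure` is given the branch location as an argument rather than being run from inside it.

```shell
#!/bin/sh
# Sketch: retrofit a shared repository around an existing branch or
# checkout, and the reverse. Function names are hypothetical.
adopt_shared_repo() {
    root=$1      # directory that will become the shared repo, e.g. /project1
    branch=$2    # existing branch directory below it, e.g. trunk
    bzr init-repo "$root" &&                        # new, empty shared repo
    bzr reconfigure --use-shared "$root/$branch"    # move the branch's history into it
}

detach_from_shared_repo() {
    bzr reconfigure --standalone "$1"               # give the branch its own internal repo again
}

# Usage (requires bzr): adopt_shared_repo /project1 trunk
```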
- USER
- i was thinking bzr co /projectco; then bzr init-repo /project1; cd project1; bzr co /projectco ./trunk;
- fullermd
- That would probably not work so well. Having a checkout of a checkout is icky.
- USER
- hmm, yeh, maybe not.
- fullermd
- It may not work at all. It may sorta work, sometimes. I don't actually know.
- fullermd
- With straight branches it's easier, since they stand by themselves.
- USER
- you'd have to redirect the parent / target repo of the second checkout back to the main repo.
- USER
- or main trunk.
- fullermd
- (before reconfigure grew those options, using 'branch' into and out of a repo to move things in/out was SOP)
- USER
- aha, ok...
- USER
- wow, brain melt..... in a good way though.
- fullermd
- Now, if you already had the [heavy] checkout at /projectco, and didn't want to re-transfer all the data across the network, I'd do something like this:
- fullermd
- "bzr init-repo /project1 ; bzr branch /projectco /project1/tmp (just to 'prime' the repo) ; rm -rf /project1/tmp (cleanup) ; bzr co bzr+ssh://server/project1/trunk /project1/trunk"
- fullermd
- Since that temporary 'branch' primed the repo with all the revisions, that last checkout didn't have to transfer much of anything; just check that we already had everything.
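The repo-priming trick above, written out as a function. This is a sketch: the function name and its parameters are hypothetical, and the usage line repeats the paths from the log.

```shell
#!/bin/sh
# Sketch: fill a new shared repo from an existing local checkout so the
# real checkout has little left to download. Names are hypothetical.
prime_repo_then_checkout() {
    old=$1     # existing heavy checkout, e.g. /projectco
    repo=$2    # new shared repo location, e.g. /project1
    url=$3     # branch on the server to check out
    bzr init-repo "$repo" &&
    bzr branch "$old" "$repo/tmp" &&     # copies the revisions into the shared repo
    rm -rf "${repo:?}/tmp" &&            # drop the temporary branch; revisions stay in the repo
    bzr co "$url" "$repo/trunk"          # finds most revisions already present locally
}

# Usage (requires bzr):
#   prime_repo_then_checkout /projectco /project1 bzr+ssh://server/project1/trunk
```

The `${repo:?}` guard makes the `rm -rf` abort if the repo argument is empty.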
- USER
- well you could do bzr init-repo project1; mv projectco ./project1/; cd project1; bzr reconfigure --use-shared; no?
- USER
- arrg, cd project1/projectco; sorry
- fullermd
- That would also work, yep.
- fullermd
- And if you decide "hey, that should be called 'trunk', not 'projectco'", you could also "cd /project1 ; mv projectco trunk"
- fullermd
- (mv, not bzr mv, note)
- USER
- ah, handy
- USER
- yeh, of course.
- fullermd
- As long as the branch stays _inside_ the repo, you can move it around to anything you want.
- USER
- same as svn, using mv subcommand is only for moving items inside WT
- fullermd
- If you mv it outside the repo, though, it'll start weeping the first time you do something that makes it look for a rev.
- USER
- oh really.
- USER
- that's interesting.
- fullermd
- Yah. 'cuz it'll start looking for its repo, and... umm.... where'd it go?
- fullermd
- (that's one of the main cases you use reconfigure --standalone; to prepare for mv'ing it out of the repo)
- USER
- so you can move it into the shared repo. reconfigure to --use-shared and after that just mv ./projectco ./trunk
- USER
- etc etc.. as much as you like, and it's still tied to the shared repo.
- fullermd
- Right. A branch finds its shared repo implicitly, by looking at .., then ../../, then ../../../, etc.
- USER
- ahh, that's very simplistic, but good i suppose.
- fullermd
- So you could mv it to /project1/foo/bar/baz/quux/trunk if you wanted.
- fullermd
- So you can mv the branch around inside the repo, or even mv the repo around as a whole. As long as the branches stay 'under' the repo, everything's cool.
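The parent-directory lookup described above can be modelled in a few lines of shell. This is a toy model, not bzr's actual code: it assumes a repository is marked by a `.bzr/repository` directory, and `find_repo` is a hypothetical name. It checks the starting directory first (the standalone case) and then each parent in turn.

```shell
#!/bin/sh
# Toy model of how a branch locates its repository: walk from the branch
# directory toward / until a directory containing .bzr/repository turns up.
# Assumption: .bzr/repository marks a repository; real bzr checks more.
find_repo() {
    dir=$(cd "$1" && pwd) || return 1
    while :; do
        if [ -d "$dir/.bzr/repository" ]; then
            printf '%s\n' "$dir"         # found: print the repository's directory
            return 0
        fi
        [ "$dir" = "/" ] && return 1     # reached the root without finding one
        dir=$(dirname "$dir")            # otherwise try the parent
    done
}

# Usage: find_repo /project1/foo/bar/baz/quux/trunk
```

In this model a branch moved anywhere below the repository still resolves to it, and a branch moved outside resolves to nothing, which matches the behaviour described in the log.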
- USER
- that's nice .. i like this shared repo idea..
- fullermd
- A thing to note here, too, is that shared repos are entirely local.
- USER
- really handy for tying branches of related work together.
- fullermd
- Imagine you have the trunk and 2 feature branches in a shared repo on the central server.
- fullermd
- /project1/trunk, /project1/feat1, /project1/feat2 (/project1 being a shared repo)
- fullermd
- You could have co's of each of them in a shared repo at /myproject1 on your box.
- fullermd
- Or you could have co's of them in /myproject1 _without_ a shared repo, each having its own copy of the history.
- USER
- they are totally ignorant of each other's shared repos
- fullermd
- Or have trunk at /myproject1, and feat1 and feat2 in a repo at /myproject1features.
- USER
- gotcha,, yeh, that makes sense.
- fullermd
- Right. This is an aspect of "Repositories aren't semantic". Everybody's setup can be different.
- USER
- not bad, not bad.
- fullermd
- The only thing you ever directly look at is a Branch.
- USER
- yeh, i get that concept now..
- fullermd
- (now, a downside of this is that we don't currently have something like "mirror-repo", or "pull-repo", to mirror/update the whole set of branches)
- fullermd
- Doesn't necessarily mean we _can't_, but it's more involved. Some future work is looking in that direction.
- fullermd
- But one reason we've gotten by this far without it is that you often don't need it, so...
- USER
- fullermd: couldn't you just cp the shared repo,
- fullermd
- I of course use 'we' to mean 'somebody competent, not me' :p
- USER
- you don't need a specific mirror command,
- fullermd
- Sure. You could tar it up and move it around, or rsync it, or whatever.
- USER
- so that's a simple way to mirror the repo.
- fullermd
- Gets more expensive than it has to be for updates though.
- fullermd
- (and of course you can't mirror just a subset that way)
- USER
- ok, but in situation where you only have http access to the repo, you may not have permissions to edit, so you may want to mirror the code base to fork the project or something.
- fullermd
- Right. You'd generally use 'branch' rather than 'checkout' for that.
- USER
- especially if you still want to pull the downstream updates back from the master project.
- USER
- yeh,
- USER
- upstream, downstream... umm which ever way that indicates....
- fullermd
- Then you could just "bzr branch http://where/ever upstream", and never do anything in that upstream branch except 'pull' updates, and use it as a source for 'merge'ing them into your codebase.
- USER
- so still reaching for a real reason you'd need pull-repo
- fullermd
- (never doing any merge's or push's into it, or committing, or etc)
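The pristine-mirror arrangement above might look like this in practice. This is a sketch: `http://where/ever` is the placeholder URL from the log, the function names are hypothetical, and the name of your own working branch is passed in as an argument.

```shell
#!/bin/sh
# Sketch: keep an untouched mirror of upstream and merge from it.
# UPSTREAM_URL is the log's placeholder; function names are hypothetical.
UPSTREAM_URL=http://where/ever

make_mirror() {
    bzr branch "$UPSTREAM_URL" upstream     # local copy that only ever receives pulls
}

refresh_and_merge() {                       # run from the directory holding both branches
    mine=$1                                 # your own branch, a sibling of upstream
    bzr pull -d upstream &&                 # update the mirror from where it was branched
    ( cd "$mine" && bzr merge ../upstream ) # then check the result and commit in your branch
}

# Usage (requires bzr): make_mirror, then later: refresh_and_merge mywork
```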
- fullermd
- It would be a lot more efficient for updating it than rsync and friends, for one.
- USER
- well, you may want to pull-repo if say you have multiple branches for an app that's across say, different hardware architectures or something?
- fullermd
- For another, you might want all the branches in that repo in a repo locally, but ALSO have your own local branches in that same repo (sharing the storage)
- fullermd
- Yeah, there are a number of good reasons for it. Just takes some planning in a VCS that's Branch-oriented like bzr.
- USER
- ok, cool.
- USER
- bleugh... my brain is full i think
- fullermd
- My work here is done