Bazaar

Bazaar

 




Wiki Tools

  • Find Page
  • Recent Changes
  • Page History
  • Attachments

<<TableOfContents: execution failed [Argument "maxdepth" must be an integer value, not "[]"] (see also the log)>>

Overview

The bzr high performance server framework is complete and now we need to implement callouts for each object level method. In the bzr client a connection to the smart server is represented by a SmartClientMedium object. This client represents the bottom of the protocol stack - its able to put bytes on a wire and deliver them to the smart server at the far end.

To expose the smart server internally we then build a client on top of the medium. For example RemoteTransport instances are clients that implement a bzrlib.transport.Transport using a SmartClientMedium.

To merge the full server we need to finish implementations of the RemoteBzrDir, RemoteBranch, and RemoteRepository remote objects. This may require some changes to SmartClientProtocol and SmartServerProtocol but these should be minimal at best. The bulk of the work should be adding additional tests and request methods to match these new objects. The current implementations of these RemoteObjects are implemented from the client, using the Transport VFS layer calls. The implementations will be finished when all tests can pass without the remote VFS enabled. As we progress from here to there, we expect to have all the tests passing at any point in time as long as the VFS is enabled.

RemoteObjects

A RemoteSomething class is one that represents Something which is being access via the smart protocol. For instance, a RemoteBranch is a branch which is being accessed over the smart protocol.

Instead of manipulating the disk data for these RemoteObjects using filesystem apis like we do when we use a Branch object over NFS/SFTP/webDAV/FTP, the RemoteObjects implement a higher level protocol, so we can directly request that a 'new branch be made' or 'please garbage collect this repository'.

Each request in the client API may be made up of a number of requests sent over the wire: each request must be strictly processed with all the incoming data read before any outbound data is sent. That is, you cannot change what you are sending in a request based on data from the same requests' response.

The request flow for a method you call on (say) RemoteBranch makes a number of requests, being transformed from an api call, into structured data by the RemoteBranch, and then into bytes-on-a-wire by the current Protocol version object, and then sent by the specific Medium in use. (HTTP is a Medium, as are sockets, pipes, and potentially other things like email, message queues etc). At the far end, a Medium passes the bytes to a Protocol object that converts them to simple structured data, and passes that to the request specific handler which will open up the Branch or Repository or whatever, perform the request, and return the result as simple structured data, where it begins the reverse traversal. In some concrete detail:

RemoteBranch.revision_history()
  |
  v
SmartClientProtocol.call('revision_history', 'path-to-branch')
  |
  v
SmartClientMedium.accept_bytes('revision_history\x01path-to-branch\n')
  |
  v
socket.write('revision_history\x01path-to-branch\n')
  |
  v
socket.read() -> 'revision_history\x01path-to-branch\n')
  |
  v
SmartServerMedium.accept_bytes('revision_history\x01path-to-branch\n')
  |
  v
SmartServerProtocol.accept_bytes('revision_history\x01path-to-branch\n')
  |
  v
SmartServerProtocol.method_handlers['revision_history']()('path-to-branch')
  |
  v
RevisionHistoryHandler.handler('path-to-branch')
  |
  v
SmartServerProtocol.send_bulk_data(('ok', ), '\n'.join(revision_history_here))
  |
  v
SmartServerMedium.write('ok\n...') (the bulk data is encoded here)
  |
  v
socket.write('ok\n....')
  |
  v
SmartClientMedium.read() -> 'ok\n...'
  |
  v
SmartClientProtocol.read() -> ('ok', ))
  |
  v
SmartClientProtocol.decode_bulk_data.split('\n')
  |
  v
RemoteBranch.revision_history now returns the result.

Adding additional functionality involves two key steps:

  • Addition of a user method to the RemoteObject that users will call.

  • Addition of a request handler to the protocol on the server end which will serve the request.

RemoteBzrDir

This class represents the '.bzr' concept remotely. We will probably put static methods on this, or perhaps on RemoteBzrDirFormat, for creating bzrdirs, branches and repositories remotely. RemoteBzrDir should be pretty simple when finished, as all its methods will be remotely implemented. It is probably the first one we want to implement.

RemoteBranch

This class represents the '.bzr/branch' structure for a remote branch. It will hopefully have relatively few remote methods, and have much of its api implemented locally in terms of just a couple of calls to its core api. This should be done with care to avoid performance problems - see below in Helping Out for more details.

RemoteRepository

This class represents the core data storage we work with, accessed remotely. It has a rich api, dealing with complex data types and large amounts of data. It is probably the last type we will finish converting, but it is also the most important. Our plan is that revision and text data being sent to a repository will be encoded as a binary bundle format - but we can start with regular bundles today. To avoid having massive request /response sizes, the RemoteRepository will need to implement operations like pull or push in such a way that a number of small requests are sent : for instance, one per 1000 revisions. As each is a roundtrip, performance profiling to determine the best number will be important - and we may even want it to be runtime selectable, to all dynamically determining the best number for a given scenario.

We will want InterRepository specialisations done for RemoteRepository to RemoteRepository and also RemoteRepository to anything else, and anything else to RemoteRepository. For instance, RemoteRepository to a RemoteRepository on the same server can potentially avoid downloading any data and do a local copy at the far end - though that requires some knowledge of the VFS mapping that we have not fine tuned yet - so that particular optimisation is not an immediate priority.

RemoteTransport

This is a Transport implementation which makes calls to the smart server that perform low level virtual-file-system actions - this is the VFS that we need to be able to disable by choice when all the smart server work is ready to merge.

Helping Out

The branch where current development is being carried out is a shared branch on the supermirror - sftp://bazaar.launchpad.net/~bzr/bzr/hpss. Any member of the bzr team in launchpad can commit to this branch (just use bzr checkout sftp://bazaar.launchpad.net/~bzr/bzr/hpss to get a checkout, then bzr commit will commit to it). If you are not a member of the bzr team, you can join the team easily - just head to https://launchpad.net/people/bzr/ and sign up.

Determining what needs to be worked on

If you run NO_SMART_VFS= ./bzr selftest you will cause the smart server to disable all the VFS api calls. This will make the current emulated implementations of the RemoteObjects fail - and you will get a test failure for each attempt to use a VFS method over the smart server protocol.

For each of these failures we need to add support for the needed method - first a request handler and then a user api.

Addition of a request handler to the protocol on the server end which will serve the request

When the call fails, the backtrace will look something like:

Traceback (most recent call last):
  File "bzrlib/commands.py", line 623, in run_bzr_catch_errors
    return run_bzr(argv)
  File "bzrlib/commands.py", line 585, in run_bzr
    ret = run(*run_argv)
  File "bzrlib/commands.py", line 291, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "bzrlib/builtins.py", line 605, in run
    to_transport.mkdir(relurl)
  File "bzrlib/transport/remote.py", line 202, in mkdir
    self._translate_error(resp)
  File "bzrlib/transport/remote.py", line 337, in _translate_error
    raise errors.SmartProtocolError(unicode(resp[1]))
SmartProtocolError: Generic bzr smart protocol error: The smart server method MkdirRequest is disabled.

Note the topmost frame that is in bzrlib.transport. This is the VFS API the failing test is attempting to run. Immediately above it will be the api that is calling it, and thats the one we need to teach to use a smarter, more explicit call.

Firstly we should identify why that VFS API was being used. Looking at the demonstration backtrace above you can see that it is a call to mkdir and is being called from bzrlib.builtins.cmd_push.run. So looking at the code doing the call, we can see that it is in fact simply part of the 'create a branch' intent: it's creating the parent directory of the one needed to create the branch.

Secondly, we need to decide what the right smart server api should look like for this. This is probably a matter of taste, but this is intended to be a secure, high performance API. Because of the impact of latency of performance, having few primitives and building up functionality be undesirable. So we are currently building on the basis that:

  • Precise methods that achieve one user story are good.
  • Methods that can be coerced into doing many different things with unclear data-access or read/write requirements are bad.

For this example, I'm going to decide that I want a single API call to initialise the bzrdir, repository if needed, and branch called create_branch.

Now we need to add that API to the smart server SmartServerRequestHandler registry. We want this code to be extremely well tested, so we start this with tests. For each request handler we expect to see about 4 tests:

  • If it creates objects (revisions, branches, repositories, etc):
    • Test it fails with a semantic error and does not raise an exception.
  • If it returns existing objects, or derived data:
    • Test it returns an appropriate error if the object is not there, or if the derived data is missing a predicate.
  • Test that it does what it is meant to for each type of input parameter. I.e. if it takes an optional parameter, test it without that parameter, and with it.

  • Test boundary conditions if it is vulnerable to them (for instance, zero length paths), paths with too many '..' etc.
  • If it accepts or returns bulk data that it reads all the data/writes all the data in that order - it must not try to write some and then read some: this is a strictly message based protocol.

  • Test it is registered as anticipated in the registry.

create_branch will accept a path to create a branch at, a flag to control creating the prefix for the branch, and a branch format tuple. The branch format tuple will be the class names of the internal format objects to use. I'm choosing to use the class names because they are unique, and not all branches have a format string on disk which I can refer to. (For instance, pre-metadir branches only have one string for the entire format description). So I want to be able to make calls like :

   create_branch('missing_dir/new_branch', True)
   create_branch('new_branch', format=(None, 'KnitRepositoryFormat2', None))

but the smart protocol is not as rich as python. It particular it can only natively represent strings and tuples - so it cannot handle None in a tuple, nor does it have keyword or default-value functions. To implement default-value functions you need to explicitly cater for that in the API. So here are those calls made as the client will need to make them:

   create_branch('missing_dir/new_branch', 'T', '', '', '')
   create_branch('new_branch', 'F', '', 'KnitRepositoryFormat2', '')

In the smart protocol, errors are implemented by returning a normal response that is interpreted as an error by the client. For instance, ('ok',) is a regular response, and ('failure',) is a regular response that would trigger an error in the client (in this case because it would be an 'unknown response').

It does not accept bulk data, nor does it return it. This means that I will need the following tests (decided by inspecting the api for corner cases):

  • create a branch where the prefix exists and the branch can be created under it successfully returns ('ok',) with no format specifier. This should also check that the branch created has the current default format, as it is the trivial default behaviour case.
  • create a branch where there is a branch fails by sending ('branch exists',).
  • create a branch where the prefix does exist and the create_prefix flag is not 'T' works (check the corner case of 'T' being unneeded).
  • create a branch where the prefix directory does not exist and the create_prefix flag is not 'T' fails by sending ('prefix does not exist').
  • create a branch where the prefix directory does not exist and the create_prefix flag is 'T' succeeds and returns ('ok',).
  • create a branch with a specified format ('', 'KnitRepositoryFormat2', '') and check the format is what I requested

Additionally, for unicode sanity we should have an additional test that uses unicode paths, or perhaps just use a unicode path when possible according to the test environment. Need to speak with John about the nicest way to do this that will work on Mac and Windows. Until then, we just use a regular path.

This tutorial walks through making two of these tests - hopefully the rest are obvious, but if not, just ask on IRC for me (RobertCollins) or AndrewBennetts and we can happily guide you.

Adding the test case - the class to contain all the tests is straight forward: In bzrlib/tests/test_smart.py, add a new subclass of TestSmartServerRequestHandler like so:

class TestSmartServerRequest_create_branch(TestSmartServerRequestHandler)
    """Test the create_branch API which is meant to encapsulate creating a new branch."""

Now we can add the registration test method:

class TestSmartServerRequest_create_branch(TestSmartServerRequestHandler)
    """Test the create_branch API which is meant to encapsulate creating a new branch."""

    def test_create_branch_registration(self):
        # the create_branch method should be registered to create CreateBranchRequest objects.
        self.assertEqual(smart.server.CreateBranchRequest,
            smart.protocol.ProtocolOne().method_handlers['create_branch'])

And the first of the unit tests for the handler itself:

class TestSmartServerRequest_create_branch(TestSmartServerRequestHandler)
    """Test the create_branch API which is meant to encapsulate creating a new branch."""

    def test_create_branch_registration(self):
        # the create_branch method should be registered to create CreateBranchRequest objects.
        self.assertEqual(smart.server.CreateBranchRequest,
            smart.protocol.ProtocolOne().method_handlers['create_branch'])

    def test_create_branch(self):
        # Test that creating a new branch under an existing prefix with the a format specifier of
        # ('', '', ''), a create_prefix flag of 'F' and a branch name of 'new branch' returns ('ok',)
        # and creates a branch with the current default format.
        handler = smart.server.getrequestHandler()
        result = handler.handle('F', '', '', '')
        self.assertEqual(('ok', ), result)
        created_branch = branch.Branch.open('new branch')
        default_format = bzrdir.BzrDirFormat.get_default_format()
        self.assertEqual(default_format.__class__,
            created_branch.bzrdir._format.__class__)
        self.assertEqual(default_format.repository_format.__class__,
            created_branch.repository._format.__class__)
        self.assertEqual(default_format.branch_format.__class__,
            created_branch._format.__class__)
        # and it should have no history
        self.assertEqual([], branch.revision_history())

The other tests will look fairly similar, but they dont all need to check the format - they should be focused on testing the described tests above.

Addition of a user method to the `RemoteObject` that users will call

With the request handler established we now need to add a local call to perform this task. As before we need to start by defining what it will do and what tests we will want to see for it.

Each user method will fall into one of two groups: it will either be an implementation of an existing API on BzrDir/Branch/WorkingTree/Repository or it will be a new API offering an abstraction that is not currently in use. We deal with these a little differently:

  • For existing API's, there are tests already. All we need to test here is that the new implementation hands the requests off to the client protocol with a structure that matchines that expected by the request handler - which both tests for the specific structure so we will detect version skew if it occurs in the future, and that the new implementation is implemented using the smart protocol, not using the VFS layer. For one of these API's we expect the following tests:
    • Does the API call a protocol with the expected structure. (Use the same sample data we use for the request handler tests). Like with the request handler, we need a few tests - one for each different parameter to check they serialise correctly.
  • For new API's we need to do the tests we will do for existing API's, as well as tests for the API itself - which will typically involve an interface conformance test like those in bzrlib.tests.workingtree_implementations. For now, if you think you have a new API, chat with someone (random of course on IRC or send a message to the bzr mailing list.

In our example we need to have a create_branch API that replaces all the transport manipulation that we need done. At the moment we have a BzrDir.create_branch_convenience method that is a static method on BzrDir and very close to our needs. So we can consider extending its API. create_branch_convenience currently operates on URLs alone which is fine, except that creating a bzrdir remotely, *then* doing create_branch on it will be more round trips, and still manages to use the VFS. It looks like we will need a whole new API, or the ability to have a RemoteBzrDir refer to a URL that does not currently exist, let alone being a valid BzrDir. I've chosen to do this by making a BzrDirLocation abstraction, which will represent a bzr dir location, and will have all the methods that are currently static methods on bzrdir. We will then get the location via a factory method that can perform the current probing needed to detect a smart server, and fall back to the current Transport based implementation when there is no smart server.

What tests do I need? I'll need tests for success and failure of the new functionality, but not for simple relocation of existing functions : I can clone the existing tests for that [and mark the original copies as being tests of deprecated functions].

What new functionality is there? I plan to use the existing probe-on-transport hook to generate the BzrDirLocation instances - its a good generalisation of what is already there.

Theres a general pattern I'm going to follow here which is 'Refactor to make it easy to add new functionality'. So the next section is the refactoring, and we'll finish with the 'add the new functionality' which will count as just a new implementation of an existing API.

Refactor BzrDir to move create_branch_convenience to a BzrDirLocation class

The easiest immediate thing to do is to extend the existing create_branch_convenience API to perform create_prefix (still on regular transports).

For this, the existing tests cover the use of the existing API, so we just need to test that:

  • when create_prefix=True is passed in it creates the prefix, and
  • when create_prefix=False is passed in it raises the right error if the prefix does not exist.

The next thing that is easiest to do is to make a new BzrDirLocation class and move this method onto it. So we need tests for:

  • The existence of a base BzrDirLocation class - testing the API for extenders.

    • In bzrlib.tests.test_bzrdir:
      class TestBzrDirLocation(TestCaseWithTransport):
          """Tests for BzrDirLocation base class."""
      
          def test_subclassing(self):
              """BzrDirLocation should be subclassable without special effort."""
              class a_subclass(BzrDirLocation):
                  """A subclass."""
              # check it does not assert on construction
              a_subclass(self.get_transport())
  • All the existing create_branch_convenience tests from bzrlib.tests.test_bzrdir need to be ported. Skipped as an exercise for the reader :).
  • Move the functionality from BzrDir.create_branch_convenience to BzrDirLocation.create_branch

        @staticmethod
        def create_branch_convenience(base, force_new_repo=False,
                                      force_new_tree=None, format=None):
            """Deprecated method. Please see BzrDirLocation.create_branch."""
            t = get_transport(safe_unicode(base))
            return BzrDirLocation(t).create_branch('.',
                force_new_repo=force_new_repo, force_new_tree=force_new_tree,
                 format=format)
     
    ...
    
    class BzrDirLocation...
    
        def create_branch(self, relative_url, force_new_repo=False,
                          force_new_tree=None, format=None):
            """Create a new BzrDir, Branch and Repository at the relative url given.
    
            This is a convenience function - it will use an existing repository
            if possible, can be told explicitly whether to create a working tree or
            not.
    
            This will use the current default BzrDirFormat, and use whatever 
            repository format that that uses via bzrdir.create_branch and
            create_repository. If a shared repository is available that is used
            preferentially. Whatever repository is used, its tree creation policy
            is followed.
    
            The created Branch object is returned.
            If a working tree cannot be made due to base not being a file:// url,
            no error is raised unless force_new_tree is True, in which case no 
            data is created on disk and NotLocalUrl is raised.
    
            :param relative_url: The URL to create the branch at.
            :param force_new_repo: If True a new repository is always created.
            :param force_new_tree: If True or False force creation of a tree or 
                                   prevent such creation respectively.
            :param format: Override for the for the bzrdir format to create
            """
            if force_new_tree:
                # check for non local urls
                if not isinstance(self.transport, LocalTransport):
                    raise errors.NotLocalUrl(self.transport.base)
            if format is None:
                bzrdir = BzrDir.create_on_transport(transport.clone(relative_url))
            else:
                bzrdir = format.initialize_on_transport(transport.clone(relative_url))
            repo = bzrdir._find_or_create_repository(force_new_repo)
            result = bzrdir.create_branch()
            if force_new_tree or (repo.make_working_trees() and
                                  force_new_tree is None):
                try:
                    bzrdir.create_workingtree()
                except errors.NotLocalUrl:
                    pass
            return result

We now have a BzrDirLocation with a create_branch method that does what create_branch_convenience used to do. We should deprecate create_branch_convenience at this point [and update all its callers to use BzrDirLocation].

  • Convert the existing tests to use applyDeprecated
    •     def test_create_branch_convenience(self):
              # outside a repo the default convenience output is a repo+branch_tree
              old_format = bzrdir.BzrDirFormat.get_default_format()
              bzrdir.BzrDirFormat.set_default_format(bzrdir.BzrDirMetaFormat1())
              try:
                  branch = self.applyDeprecated(symbol_versioning.zero_thirteen, 
                      bzrdir.BzrDir.create_branch_convenience, '.'))
                  branch.bzrdir.open_workingtree()
                  branch.bzrdir.open_repository()
              finally:
                  bzrdir.BzrDirFormat.set_default_format(old_format)
  • And deprecate the function:
    •     @deprecated_method(zero_thirteen)
          @staticmethod
          def create_branch_convenience(base, force_new_repo=False,

I've skipped updating the callers - but it should be obvious how to do so. (Hint: make a BzrDirLocation with a transport; use it).

At this point all tests should still pass and we are ready to starting making this interface more powerful.

We now need to add the ability for BzrDirLocation to have a list of implementation and try them serially. The first step for this is to move the current implementation into a subclass, and use that as the seed member of the list. We can start by adding the internal helper functions which wont break anything.

  • We need a routine that will take a method name to call, its *args and **kwargs, a possible existing instance, and a list of factories to use as things fail.
    • So how can this fail? It could be a bad method name, or bad arguments, but python will catch these for us. The factories could fail to work, but again python will tell us and these are not regular conditions. We could run off the end of the list, so that needs an exception of its own - even though that should never happen, so a plain BzrError will be sufficient. The routine we want could fail to fallback though, or if a specific implementation fails with a different error than expected it might catch too much. So we need three tests.. success, fallback and passing through exceptions.

    • Testing this is best done with a little simulator to let us catch all the corner cases:
      def make_sample_bzrdirlocation(result, bad_args):
          """Generate a class on the fly for testing the BzrDirLocation provider list."""
          class SampleBzrDirLocation(object):
              """A sample BzrDirLocation for testing.
      
              when 'method' on this is called, it will return result. If result
              is an exception it will be raised instead of called. If bad_args is True,
              then TypeError is raised to simulate a python runtime parameter call with bad
              arguments.
              """
              calls = []
              def __init__(self, at_transport):
                  pass
              def method(self, *args, **kwargs):
                  self.calls.append((args, kwargs))
                  if bad_args:
                      raise TypeError
                  if isinstance(result, Exception):
                      raise result
                  return result
          return SampleBzrDirLocation
    • Now we can write the tests. Success:
      class TestBzrDirLocation...
      
          def test_apply_method(self):
              # trivial case: apply a method to a list with one provider and cache the result.
              location = BzrDirLocation(at_transport=MemoryTransport())
              sample_location = make_sample_bzrdirlocation(1, False)
              location.providers = [sample_location]
              self.assertEqual(1, location.apply_method('method', ['foo'], {'bar':'thing'}))
              self.assertEqual([(['foo'], {'bar':'thing'})], sample_location.calls)
              self.assertIsInstance(location.selected_provider, sample_location)
      Fallback:
          def test_fallback_on_incompatible_transport(self):
              # when errors.BzrDirLocationIncompatibleWithTransport is raised, apply_method should try the next provider
              location = BzrDirLocation(at_transport=MemoryTransport())
              incompatible_sample_location = make_sample_bzrdirlocation(errors.BzrDirLocationIncompatibleWithTransport(), False)
              sample_location = make_sample_bzrdirlocation(2, False)
              location.providers = [incompatible_sample_location, sample_location]
              self.assertEqual(2, location.apply_method('method', [], {}))
              self.assertEqual([([], {})], incompatible_sample_location.calls)
              self.assertIsInstance(location.selected_provider, sample_location)

      Raise other exceptions. We cant prove a negative, but checking for TypeError (cannot call the function) is sufficient unless someone deliberately tries to be difficult.

          def test_raise_other_exceptions(self):
              # when a random exception is raised, apply_method should just fail, but still cache the selected provider
              location = BzrDirLocation(at_transport=MemoryTransport())
              bad_type_sample_location = make_sample_bzrdirlocation(None, True)
              sample_location = make_sample_bzrdirlocation(2, False)
              location.providers = [bad_type_sample_location, sample_location]
              self.assertRaises(TypeError, location.apply_method('method', [], {}))
              self.assertEqual([([], {})], bad_type_sample_location)
              # no calls must be made to the fallback in this case.
              self.assertEqual([], sample_location)
              # and we must have cached the first one in the list.
              self.assertIsInstance(location.selected_provider, bad_type_sample_location)
      No providers available gets its own error:
          def test_raise_on_end_of_list(self):
              location = BzrDirLocation(at_transport=MemoryTransport())
              location.providers = []
              self.assertRaises(errors.BzrError, location.apply_method, 'method', [], {})
      Finally, use the selected provider if its set, even on failures that would cause fallback normally.
          def test_selected_provider_wins_and_there_is_no_further_fallback(self):
              # location.selected_provider should just be used, with no further probing.
              location = BzrDirLocation(at_transport=MemoryTransport())
              incompatible_sample_location = make_sample_bzrdirlocation(errors.BzrDirLocationIncompatibleWithTransport(), False)
              location.selected_provider = incompatible_sample_location()
              sample_location = make_sample_bzrdirlocation(2, False)
              location.providers = [sample_location]
              self.assertRaises(errors.BzrDirLocationIncompatibleWithTransport, location.apply_method, 'method', [], {})
              self.assertEqual([], sample_location.calls)
    • With those in place the code is easy:
          def apply_method(self, methodname, args, kwargs):
              """Run methodname in as many location providers as needed until one works.
      
              If there is a selected provider already present, its methodname is directly
              invoked with no try..except around it. Otherwise, each provider is constructed
              with self.transport and methodname invoked until one returns normally
              or raises anything other than errors.BzrDirLocationIncompatibleWithTransport.
              That one is then saved as self.selected_provider.
      
              :param methodname: The method to run.
              :param args: The args to pass as positional args to the method.
              :param kwargs: The args to pass as keywords args to the method.
              :return: the methods return value.
              """
              if self.selected_provider is not None:
                  return getattr(self.selected_provider, methodname)(*args, **kwargs)
              for provider_factory in self.providers:
                  provider = provider_factory(at_transport=self.transport)
                  try:
                      result = getattr(provider, methodname)(*args, **kwargs)
                      self.selected_provider = provider
                      return result
                  except errors.BzrDirLocationIncompatibleWithTransport:
                      pass # incompatible, try the next one
                  except: 
                      # all other exceptions stash the provider for future reference.
                      self.selected_provider = provider
                      raise
              raise errors.BzrError("No BzrDirLocation providers")
  • I need a subclass of BzrDirLocation to hold the provider for regular Transport based operations.

    • So a test that we can construct a TransportBzrDirLocation with a transport (goes in bzrlib.tests.test_bzrdir).

      class TestTransportBzrDirLocation(TestCaseWithTransport):
      
          def test_construct(self):
              """Construction should take a single parameter - a transport."""
              location = TransportBzrDirLocation(at_transport=get_transport(self.get_url()))
    • as we're using a duck typing idiom in bzrlib, checking that it rejects a non-transport is inappropriate.
  • We need to set the providers list on BzrDirLocation. As this can be dynamically extended, we can only test that TransportBzrDirLocation is in it, not that it has a specific value.

    class TestBzrDirLocation...
    
        def test_default_providers(self)
            """The default providers list must include TransportBzrDirLocation."""
            self.assertTrue(TransportBzrDirLocation in BzrDirLocation.providers)
  • Now we can push the create_branch implementation down to TransportBzrDirLocation:

    class BzrDirLocation...
    
        providers = []
    
        def create_branch(self, relative_url, force_new_repo=False,
                          force_new_tree=None, format=None):
            """Create a new BzrDir, Branch and Repository at the relative url given.
    
            This is a convenience function - it will use an existing repository
            if possible, can be told explicitly whether to create a working tree or
            not.
    
            This will use the current default BzrDirFormat, and use whatever 
            repository format that that uses via bzrdir.create_branch and
            create_repository. If a shared repository is available that is used
            preferentially. Whatever repository is used, its tree creation policy
            is followed.
    
            The created Branch object is returned.
            If a working tree cannot be made due to base not being a file:// url,
            no error is raised unless force_new_tree is True, in which case no 
            data is created on disk and NotLocalUrl is raised.
    
            :param relative_url: The URL to create the branch at.
            :param force_new_repo: If True a new repository is always created.
            :param force_new_tree: If True or False force creation of a tree or 
                                   prevent such creation respectively.
            :param format: Override for the for the bzrdir format to create
            """
            return self.apply_method('create_branch', args=[relative_url],
                kwargs={'force_new_repo':force_new_repo,
                        'force_new_tree':force_new_tree,
                        'format':format})
    
    
    class TransportBzrDirLocation(BzrDirLocation):
        """Transport-api using BzrDirLocation implementation."""
    
        def create_branch(self, relative_url, force_new_repo=False,
                          force_new_tree=None, format=None):
            """See BzrDirLocation.create_branch().
    
            This implementation uses Transport APIs alone.
            """
            if force_new_tree:
                # check for non local urls
                if not isinstance(self.transport, LocalTransport):
                    raise errors.NotLocalUrl(self.transport.base)
            if format is None:
                bzrdir = BzrDir.create_on_transport(transport.clone(relative_url))
            else:
                bzrdir = format.initialize_on_transport(transport.clone(relative_url))
            repo = bzrdir._find_or_create_repository(force_new_repo)
            result = bzrdir.create_branch()
            if force_new_tree or (repo.make_working_trees() and
                                  force_new_tree is None):
                try:
                    bzrdir.create_workingtree()
                except errors.NotLocalUrl:
                    pass
            return result
    
    
    BzrDirLocation.providers.append(TransportBzrDirLocation)

We've now fully refactored create_branch_convenience and all we need to do is introduce the smart client implementation. Generally you wont have anything anywhere near as big as this to do - but I hope this demonstrates the approach needed if it does occur!.

  • As this grows I'll want interface conformance tests for BzrDirLocation as it is an entirely new abstraction.

    • This is an uncommon step so I'm going to skip it for brevity. Ask on IRC or email for details if you need to do this (or look at the code [the code for this specific example is being written now])
  • We need a RemoteBzrDirLocation which will be tried before the TransportBzrDirLocation.

    • This should also construct with just a transport, for conformity - so we can have a list of location types to try. It can obtain its medium and url from the transport. It should not throw any exceptions on construction, even if given a transport without smart tunnelling support, because having a transport's smart client medium does not guarantee the presence of a real server connected to it - just the ability in principal to tunnel the protocol. So...
      1. construct with a transport
        class TestRemoteBzrDirLocation(TestCase):
        
            def test_construct_with_non_smart_transport(self):
                location = RemoteBzrDirLocation(at_transport=MemoryTransport())
      2. fail on the first operation it cannot perform. I.e. it cannot get a smart medium, or there is no server running.
            def test_fail_with_no_smart_medium(self):
                transport = MemoryTransport()
                location = RemoteBzrDirLocation(at_transport=transport)
                # We would normally add a test for this errors formatting to test_errors.py - skipped for
                # brevity
                self.assertRaises(errors.BzrDirLocationIncompatibleWithTransport, location.create_branch, '.')
        We have not tested that it gives the same error when no server is running yet, so lets add that.
            def test_fail_when_no_server_available(self):
                http_server = bzrlib.transport.http.HttpServer()
                # bring up the http server, so there is a server running, just no smart tunneling available
                http_server.setUp()
                self.addCleanup(http_server.tearDown)
                transport = bzrlib.transport.get_transport(http_server.get_url())
                location = RemoteBzrDirLocation(at_transport=transport)
                self.assertRaises(errors.BzrDirLocationIncompatibleWithTransport, location.create_branch, '.')
    • Now that we have tested the basic machinery we can test the actual functionality:
      • class TestWithServer(TestCaseWithTransport):
        
            def setUp(self):
                 super(TestWithServer, self).setUp()
                 self.transport_server = smart.TCPServer_for_testing
        
            def test_smoke_create_branch(self):
                # the interface conformance tests cover detailed testing of this api, this test is an example for this tutorial. 
                # we test the implementation directly, without the wrapper, as we are testing for precision.
                location = RemoteBzrDirLocation(self.get_transport())
                bzrdir = location.create_branch('.')
                self.assertIsInstance(bzrdir, bzrlib.remote.RemoteBzrDir)
                branch = bzrdir.open_branch()
                # a new branch should have no history.
                self.assertEqual([], branch.revision_history())
    • Register the implementation, and it must be registered before TransportBzrDirLocation.

      class TestRemoteBzrDirLocation...
      
          def test_default_providers(self)
              """The default providers list must include TransportBzrDirLocation after RemoteBzrDirLocation."""
              self.assertTrue(BzrDirLocation.providers.index(RemoteBzrDirLocation)
                   < BzrDirLocation.providers.index(TransportBzrDirLocation))

We've already refactored, so with this complete, we have finished the implementation.