Launchpad Entry: https://blueprints.launchpad.net/products/bzr/+spec/smart-server
The smart server in 1.0 is a usable alternative to vfs-based operations, and substantially faster in some cases. Further improvements are a major focus for the next several releases, particularly:
- removing roundtrips
- eliminating vfs-type operations
- streaming rather than buffering large transfers
- better remote graph operations
We are writing a high performance smart server to optimise performance over the network and offer low-overhead centralised triggers like 'email on commit'.
The smart server will operate over either HTTP or SSH, providing read-write access in most situations.
It's very hard to create a passive-server model with constant-time access - there is a strong tendency for either the number of records that need to be accessed to increase, or the number of round trips to increase. A smart server allows network latency and bandwidth to be removed from the scaling equation as tree size and history size increase. This does not reduce the desire to have highly efficient storage - but it does improve performance over the network massively.
- emacs dvc could start bzr server as a subprocess, with pipes (or unix-domain sockets) connected to its stdin and stdout. The user does a bzr operation in emacs (e.g. showing a diff, status, doing a commit). dvc composes a bzr remote command, writes it to the server, and reads and parses the output, then displays it to the user. (Similarly for meld, vim, eclipse, and anything else that can't or doesn't want to run bzr in-process because of language, licence or concurrency issues.)
- bzr connects over ssh to a remote machine and starts the server, and uses this to push changesets to or pull them from a remote branch.
- bzr connects to a smart server running within an http server. By probing a url it discovers that there's more than just a passive server. It passes bzr server commands through HTTP POST requests.
- Johanna pushes changes to a branch up onto the Launchpad Supermirror. The changes are published into the branch, but other Launchpad code is also notified so that it can send email, update indexes, etc etc.
- Ewen maintains a web site in bzr, and has bzr installed on the server. When he does a bzr push his history is mirrored up and the working copy is also updated; the working copy is served up as the contents of the web site. (This mode of operation is popular with users of darcs.)
We can add remote commands over time. The local GUI use case is likely to require many detailed operations, whereas pushing to a server probably requires just a 'push' command to start with.
The desired uses constrain the protocol design.
Sending requests over HTTP POST means that the request must be entirely transmitted before any reply can be sent, and that we cannot rely on sending more than one POST across a single connection.
This seems to imply each request must be self-contained; e.g. locks cannot be held across multiple operations. Many of the existing Branch operations are too fine-grained to be useful in this way. Operations may be (very approximately) at the level of bzr commands, which also do not hold locks across commands.
The data transmitted may be quite large, when e.g. fetching the complete history of a large branch. This may imply some kind of streaming.
(Straw man) Remote operations have inputs and outputs similar to shell commands:
- commands are identified by a string name
- commands have a number of string parameters
- commands can take streamed input equivalent to xml
- commands can either succeed or fail
- if they succeed, they can return streamed output
- if they fail, they return some structured representation of an exception object, and a human-readable message
Requests and responses are encoded as a basic_io header, followed by streaming data similar to http chunked encoding.
First stanza gives the protocol version; it will be literally
Second stanza gives the command name and named arguments:
command COMMAND ARGNAME ARGVAL ... ...
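As a rough illustration of the command stanza, the helper below builds such a line. It is only a sketch: `format_command` is a hypothetical name, and it assumes plain space-separated tokens with no quoting, which is simpler than whatever quoting rules basic_io would actually apply.

```python
def format_command(name, args):
    """Build a 'command NAME ARGNAME ARGVAL ...' line.

    Assumption: tokens are separated by single spaces and contain no
    whitespace themselves; quoting is out of scope for this sketch.
    """
    tokens = ["command", name]
    for arg_name, arg_value in args:
        tokens.extend([arg_name, arg_value])
    for token in tokens:
        if not token or any(ch.isspace() for ch in token):
            raise ValueError("token must be non-empty and free of whitespace: %r" % (token,))
    return " ".join(tokens) + "\n"


# Example with made-up argument names.
line = format_command("push", [("branch", "/srv/project")])
```

Rejecting whitespace inside tokens keeps the line unambiguous without committing to a quoting scheme.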
Following this, we have a series of input bulk data chunks,
chunk = chunk-size [ chunk-extension ] CRLF chunk-data CRLF
chunk-size = 1*HEX
last-chunk = 1*("0") [ chunk-extension ] CRLF
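The chunk grammar above can be sketched in Python as a pair of generators. This is a minimal illustration, not the bzr implementation: the function names are hypothetical, and it assumes that a chunk-extension, if present, begins with ";" as it does in HTTP/1.1 chunked encoding.

```python
import io

CRLF = b"\r\n"


def encode_chunks(pieces):
    """Yield each non-empty piece framed as a chunk, then the last-chunk."""
    for piece in pieces:
        if piece:  # a zero-length chunk would be read as the terminator
            yield b"%x" % len(piece) + CRLF + piece + CRLF
    yield b"0" + CRLF


def decode_chunks(stream):
    """Yield chunk-data from a binary file-like object until last-chunk."""
    while True:
        size_line = stream.readline()
        if not size_line.endswith(CRLF):
            raise ValueError("truncated chunk-size line")
        # Assumption: anything after ';' is a chunk-extension and is ignored.
        size_text = size_line[:-2].split(b";", 1)[0].strip()
        size = int(size_text, 16)
        if size == 0:
            return
        data = stream.read(size)
        if len(data) != size or stream.read(2) != CRLF:
            raise ValueError("truncated or malformed chunk")
        yield data


wire = b"".join(encode_chunks([b"hello", b"world!"]))
received = list(decode_chunks(io.BytesIO(wire)))
```

Because the decoder stops right after the last-chunk line, the same stream can be handed straight to whatever reads the next request.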
The request is completely transmitted before the response comes back. We have the option of adding pipelined requests or responses in the future.
Following the last chunk another request may be sent.
The response also starts with a version identifier in a stanza by itself:
If an error occurred,
error ERRNAME ARG ARGVALUE ... ...
The response to the next request follows the error.
If there is no error,
success VALUE ARG ARGVALUE ... ...
and then chunked data.
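A client needs to tell the two response forms apart before it decides whether to read chunked data. The sketch below parses the status line only; `parse_status_line` is a hypothetical helper, and it assumes whitespace-separated tokens with no quoting. The error name in the example is purely illustrative.

```python
def parse_status_line(line):
    """Split a status line into (kind, value, args).

    kind is 'success' or 'error'; value is VALUE or ERRNAME; args is a
    dict built from the remaining ARG ARGVALUE pairs.
    Assumption: whitespace-separated tokens with no quoting.
    """
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] not in ("success", "error"):
        raise ValueError("unrecognised status line: %r" % (line,))
    kind, value, rest = tokens[0], tokens[1], tokens[2:]
    if len(rest) % 2:
        raise ValueError("argument without a value: %r" % (line,))
    args = dict(zip(rest[0::2], rest[1::2]))
    return kind, value, args


kind, value, args = parse_status_line("error NotBranchError path /srv/x\n")
if kind == "success":
    pass  # chunked data follows and would be read here
```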
The protocol should start with an easily recognizable version number.
The versioning might want to allow for different encodings (basic_io, xml, etc.)
It is likely that people will want to push to a server running an old bzr release from a newer bzr release, or vice versa. We should try as much as possible to support this, or at worst to fail gracefully.
A new URL scheme is available for bzr branches, say bzr+ssh://host/path, bzr+http://host/path, bzr://host:port/path.
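To show how a client might pick a medium from these URLs, here is a small sketch using the standard library's `urllib.parse`. The `SMART_SCHEMES` table and `classify_smart_url` are hypothetical names, and the port number in the example is just a placeholder value.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from URL scheme to the medium carrying the protocol.
SMART_SCHEMES = {"bzr": "tcp", "bzr+ssh": "ssh", "bzr+http": "http"}


def classify_smart_url(url):
    """Return (medium, host, port, path) for a smart-server URL."""
    parts = urlsplit(url)
    medium = SMART_SCHEMES.get(parts.scheme)
    if medium is None:
        raise ValueError("not a smart-server URL: %r" % (url,))
    # parts.port is None when the URL does not name a port.
    return medium, parts.hostname, parts.port, parts.path


medium, host, port, path = classify_smart_url("bzr://example.com:4155/branch")
```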
takes a revision, inserts into the remote store, appends to the revision history, updates working directory
send complete history of branch (e.g. as tarball)
Push should send a changeset in packed-up form; this will fail to append if the branches have diverged. The client should send the bulk of the changeset as the bulk input to the request.
Is the server required to read all of the input, even if it knows that the request has failed? In that case the client might do better to check the revision history first to get some early indication if the push will even be possible.
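The early check suggested here could look something like the following, assuming the client can fetch the remote revision history as a linear list of revision ids, oldest first. `revisions_to_push` is a hypothetical helper, not an existing bzrlib function.

```python
def revisions_to_push(local_history, remote_history):
    """Return the revisions a push would append, or raise if diverged.

    Both arguments are lists of revision ids, oldest first. The push can
    only append if the remote history is a prefix of the local history.
    """
    shared = len(remote_history)
    if local_history[:shared] != remote_history:
        raise ValueError("branches have diverged")
    return local_history[shared:]


# Made-up revision ids for illustration.
to_send = revisions_to_push(["rev-1", "rev-2", "rev-3"], ["rev-1", "rev-2"])
```

This only gives an early indication: the remote branch can still change between the check and the push, so the server has to repeat the test when it appends.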
Bound branches will form a new revision from the working directory, attempt to push it, and if that succeeds then also add the revision to the local bound branch.
- Packing/unpacking of server requests/responses.
- Format for packing up whole revisions, including all text.
- Insert a layer for these high-level operations. For example, commit must form up a complete revision including the texts from the working tree, then push that to a second (possibly remote) function that actually stores it.
With this change, users will push data onto the Launchpad SuperMirror using the smartserver protocol rather than sftp to the conch server. This requires running some bzrlib code there, which needs security consideration. It will be easier for Launchpad to tell directly when changes have been pushed, and to make sure that users don't do invalid operations like corrupting the repository.
Where should these operations fit into the object model? Should there be a SmartServerBranch? Many of the operations on branch may be too low-level to work well over the remote protocol.
Should we go through this operation layer even locally?
This design does not allow the server to report an error after it starts transmitting bulk data. This may mean the server needs to buffer the response until it's sure it has totally succeeded.
Questions and Answers
[Matthieu Moy regarding 'emacs dvc could start...'] How could this approach support concurrent calls to bzr? There are at least two use-cases for this: running missing for several remote locations is faster if you do it concurrently, and when running a slow operation in the background (say, bzr branch on a large, remote project), I want short operations to be run in parallel, quickly. --MatthieuMoy
- You could start any number of bzr servers, though only one per branch at any time. This is like the way that emacs can have multiple cvs processes running.
- That means implementing a scheduler inside DVC. Since launching a synchronous process in Emacs means freezing Emacs, any non-trivial operation is done through an asynchronous process. With your proposal, running a bzr command would mean
- look for a running bzr server for the current branch. If none is found, start it
- If there is a running process, then put the current request in a queue
- otherwise, send the request to the server
- when a request terminates, see if the pending requests queue for this branch is empty. If not, process the next request.
Believe me, that's much more work on the Emacs side than using bzr service. Another problem is that we'd have to parse the output of bzr service looking for some kind of termination signal, whereas the process sentinel is called automatically when an asynchronous process terminates.
- emacs wouldn't be required to make sure there's only one process per branch; that'd be done by locking on the branch. You would have the option of keeping the processes around for later use, or running just one for each request.
- Then, it doesn't answer the question. If I have several concurrent processes, and only one server (with therefore only one stdin/stdout), then I can't run concurrent requests. They will be queued either by Emacs or by bzr. I still don't see any added value compared to bzr service, but there are several disadvantages. Using bzr service in DVC is zero effort for me: I just have to point bzr-executable to the client instead of bzr itself. If the communication protocol between Emacs and bzr server is more than just stdin/stdout, then we'll end up reimplementing the client of bzr service into Emacs.
[Matthieu Moy regarding 'emacs dvc could start...'] Is this really different from what bzr service offers? I believe the bzr service approach is superior in this case (but the use-cases below are perfectly valid and sufficient to justify bzr server anyway).
John's bzr service doesn't provide structured output, so dvc would still have some issues with parsing output that's intended for human consumption. It might be good to redo the C client on top of this protocol.
That's two distinct problems. If you add a --xml flag to some bzr commands, then bzr service will automatically benefit from it. No need for a smart server for that. If you don't add it, then /I/ can add it with a plugin if needed. FYI, I've implemented the parser for "bzr status" in DVC. It's 28 lines of lisp. Using a parser for a generic language such as XML, we'd still have to fill in our internal data structure. That would hardly save 10 lines of lisp, and wouldn't be better performance-wise.
- OK, perhaps just quoted or XML output is enough for this use.
Mercurial's network protocol might be of interest, especially how it sends "changegroups", which are more or less changesets represented as chunks. See http://www.selenic.com/mercurial/wiki/index.cgi/WireProtocol