Bazaar


Try the Plugin

A plugin implementing this feature is now available for testing. See https://launchpad.net/bzr-keywords. As explained on that page, the plugin requires a branch supporting content-filtering, a feature planned for inclusion in a future version of Bazaar. Please help test this plugin so we can maximise its usefulness and quality.

Plugin features include:

  • enabled per file pattern in selected branches via rules
  • for different patterns, keywords can be set to off, on or xml_escape; the last of these escapes values for inclusion in XML and HTML files

  • keywords supported include: Date, Author, Author-Email, Revision-Id, Path, Filename, Directory, File-Id, Now, User, User-Email
  • users can configure how they want date-time keyword values formatted by adding settings to bazaar.conf
  • keywords are optionally expanded by bzr cat and bzr export including various publishing styles that drop the $ markers to improve readability

  • to assist those working with multiple VCS tools on the one tree, unknown keywords are still stored in compressed form during a commit and their expanded value is left unchanged in the working tree
  • other plugins can register additional keywords
  • keyword values are lazily evaluated so there is limited impact on performance and you only "pay" for the keywords you use.
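
As a rough illustration of the off/on/xml_escape modes, lazy evaluation and the handling of unknown keywords listed above, here is a minimal sketch. It is not the plugin's code: the names expand, providers and KEYWORD_RE are hypothetical, and the "$Name: value $" expanded form is assumed by analogy with CVS and svn.

```python
import re
from xml.sax.saxutils import escape

# Matches $Name$ and $Name: value $ (names may contain hyphens, e.g. Author-Email).
KEYWORD_RE = re.compile(r"\$([A-Za-z][A-Za-z0-9_-]*)(?::[^$\n]*)?\$")


def expand(text, providers, mode="on"):
    """Expand known keywords in text.

    providers is a hypothetical mapping of keyword name to a zero-argument
    callable; mode is one of "off", "on" or "xml_escape".
    """
    if mode == "off":
        return text
    cache = {}

    def replace(match):
        name = match.group(1)
        if name not in providers:
            return match.group(0)  # unknown keyword: leave it untouched
        if name not in cache:
            # The provider runs only when its keyword is first seen.
            cache[name] = str(providers[name]())
        value = cache[name]
        if mode == "xml_escape":
            value = escape(value)
        return "$%s: %s $" % (name, value)

    return KEYWORD_RE.sub(replace, text)
```

A provider whose keyword never appears in a file is never called, which is the sense in which you only "pay" for the keywords you use.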

Design notes and comments leading to the development of this plugin are shown below ...

Introduction

We wish to support expansion of $Id$ style keywords, as used in cvs and svn. Expansion should be off by default, and optionally turned on per file.

PRCS has a nice system for doing the expansion on the line following the keyword, which helps when you want to, for example, get only the version into a string without the dollar signs.

Implementation overview

  • Require per-file versioned properties (like svn) to record whether expansion should be on or off for a particular file.
  • All access to the working file should be through the WorkingTree class.

  • Expand and unexpand methods in the workingtree object.
  • Consideration of how this will interact with hashcache.
  • Anything that alters Branch.last_revision() results (e.g. commit, pull) must update all keyword files.
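
To make the hashcache concern above concrete, here is a small sketch, assuming the stored form of a keyword is "$Name$" and the expanded form is "$Name: value $". The helper names unexpand and canonical_sha1 are hypothetical and do not describe how bzr's hashcache actually works; the point is only that hashing the unexpanded text keeps an expanded working file from looking modified.

```python
import hashlib
import re

# Matches an expanded keyword such as "$Date: 2009-01-01 $".
EXPANDED_RE = re.compile(r"\$([A-Za-z][A-Za-z0-9_-]*):[^$\n]*\$")


def unexpand(text):
    """Collapse "$Name: value $" back to "$Name$" (the assumed stored form)."""
    return EXPANDED_RE.sub(r"$\1$", text)


def canonical_sha1(working_text):
    """Hash the stored form, so expansion alone never changes the hash."""
    return hashlib.sha1(unexpand(working_text).encode("utf-8")).hexdigest()
```

With this approach, a file whose only difference from the committed text is an expanded keyword hashes the same as the committed text, while a real edit still changes the hash.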

JanHudec:

  • In LineEndings I suggested that pluggable filters for file contents should exist. That would also take care of keyword expansion. One issue to solve is how to specify which filters to use for a particular file. I believe there should be a local default, a versioned property and a local override.

IanClatworthy:

  • I'm putting together a plugin for this to test out the workingtree content filtering feature. See the Discussion section below.

Alternatives

JohnArbashMeinel writes:

  • Oh, and if you want to do this sort of thing the same way I do, you might look at the "version-info" command. It is designed to have various (pluggable) output formats, so that you can make your build script as simple as:

    bzr version-info --all --format=python > version.py

    I have a personal C++ format, but my trees tend to have >1 version files for all the libraries, so it has to be a custom format for each library. Python creates a namespace for variables based on what file they are in. In C++ you need to create a new namespace *inside* the file text. And in C, you pretty much need to create a custom prefix to have a namespace. So I don't know if we can create a universal --format=C, that doesn't require extra parameters.
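
For the Python case, a build script might consume the generated file roughly as sketched below. The dict name version_info and its keys are assumptions made for illustration, as is the helper describe_build; inspect the file that bzr version-info actually generates before relying on any of them.

```python
# Stand-in for the contents of a generated version.py. The dict name and its
# keys are assumptions for illustration only; check the real generated file.
version_info = {
    "branch_nick": "example-branch",
    "revno": 42,
    "revision_id": "alice@example.com-20090101000000-abcdef",
    "clean": True,
}


def describe_build(info):
    """Build a short human-readable version string from the metadata dict."""
    suffix = "" if info.get("clean", True) else "+modified"
    return "%s r%s%s" % (info["branch_nick"], info["revno"], suffix)
```

Because the metadata lives in its own module, each library can import its own version file without name clashes, which is the namespace advantage mentioned above.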

Discussion

  • Discussion of keyword expansion on the list, April 2006.

IanClatworthy writes:

  • Reading through the discussion, I think it makes sense to use different solutions for different problems:
    • version-info is good for formatting tree-wide metadata into selected files
    • the $Foo$ keyword expansion should be file specific metadata only.
    Looking at other systems:
    • Hg has a keyword extension (http://www.selenic.com/mercurial/wiki/index.cgi/KeywordExtension) that supports all the Subversion keywords in a "text compatible" way, right down to having meaningless ,v bits in among the strings. I guess that improves compatibility with some third-party tools. You can plug in new keywords as well if you want. Having said that, their design notes (http://www.selenic.com/mercurial/wiki/index.cgi/KeywordPlan) explicitly discourage keyword expansion altogether.

    • In comparison, Git supports just a single keyword $Id$. This is a 40 character string, i.e. it's a Git-specific piece of metadata which is not "text compatible" with what RCS/CVS/Subversion would insert.

    I think this area is one where "less is more", though the Git approach is too spartan IMO. Both Hg and Git require explicit nomination (via patterns) of the files that will have expansion performed. That's wise as it avoids accidental expansion when none was expected.

MartinMarcher writes:

  • As said on KeywordExpansion, I'd like the option to format the Date according to strftime. Maybe even some simple hooks where one could write something like this:

    Methods/Hooks to modify KeywordExpansion:

    """
    All methods that influence KeywordExpansion
    need to have a signature of `<methodname>()` and return a string
    """
    def product():
        return "BuzzWordProduct: %s, by %s" % (release(), who())

    def release():
        return "5.3"

    def who():
        return "alice"
    
    Original Source:
    #!/usr/bin/env python
    # %$product%$
    # %$release%$

    def our_cool_new_function():
        pass
    
    Expanded Source:
    #!/usr/bin/env python
    # %$Product: BuzzWordProduct: 5.3, by alice %$
    # %$Release: 5.3 %$

    def our_cool_new_function():
        pass
    
    Use Case

    Some of our repositories are only committed to by a single person, so every commit would show the same author; we'd like to be able to specify how the keyword is expanded according to our own logic. We also often give the code to non-technical customers, who then complain about "unlocalized" content in it. For example, instead of mm/dd/yyyy (or UTC) dates we'd like to apply our own formatting using something like the above, rather than being forced to set up another system that then replaces the values. In our remaining SVN repositories we write something like %%User%% (so that it isn't picked up by SVN keywords) and run our own replacement over it. The expansion markers may not be optimal, but I hope you get the idea.

    JohnArbashMeinel responds:

    • (moved since the discussion should really be here) I really think the API should pass a bit of context into the called function, rather than taking no parameters. That way you can expand "$author$" into the real author of the commit, rather than a fixed string.
    martin.marcher responds:
    • I honestly have no idea whether this is even possible, so forget the signature I proposed above. It was more a suggestion of a feature I really miss. The point of the method signature is just to:
      1. have a consistent signature throughout the "hook"
      2. keep it simple. I guess if one really needs to hack in more functionality, there is no way around diving into bzr anyway.

      Compare SVN's $author$ with something that would expand to the output of bzr whoami, or even more information: for example, automagically generating Python's author, version, <future metadata>, without actually modifying bzr, just by adding simple methods that extract the information from an appropriate place, e.g. $$lp:martin.marcher$$ --> $$lp: https://launchpad.net/~martin.marcher $$. I'm not a programmer, so I can only speak from the point of view of a user who occasionally does some scripting.
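
The two positions in this exchange (a single consistent signature, but with some context passed in) could be combined roughly as follows. This is a sketch only: KeywordContext, KEYWORDS and the keyword decorator are hypothetical names invented for illustration, not part of bzr or of the plugin.

```python
from collections import namedtuple

# Hypothetical context handed to every keyword function.
KeywordContext = namedtuple("KeywordContext", "author committer path")

# Hypothetical registry mapping keyword names to provider functions.
KEYWORDS = {}


def keyword(name):
    """Register the decorated function as the provider for name."""
    def decorator(func):
        KEYWORDS[name] = func
        return func
    return decorator


@keyword("author")
def author(context):
    # Uses the real author from the context rather than a fixed string.
    return context.author


@keyword("release")
def release(context):
    return "5.3"


@keyword("product")
def product(context):
    return "BuzzWordProduct: %s, by %s" % (release(context), author(context))
```

Every hook still has one signature, `<methodname>(context)`, so simple hooks can ignore the argument while others read the commit's author or the file path from it.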

Keywords Plugin Design Notes

I'm planning to support a small number of useful, meaningful file-specific keywords, namely:

  • Date - the date and time in UTC when this file was last modified
    • Please: do provide an option to output the Date formatted as wanted by the user.
  • Author - the author or committer of the change.
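
The requested date-formatting option could look something like the sketch below, which formats a UTC timestamp with a user-supplied strftime pattern. The function name format_date and the default pattern are assumptions for illustration; the plugin's actual setting names are not shown here.

```python
import time


def format_date(timestamp, fmt="%Y-%m-%d %H:%M:%SZ"):
    """Format a POSIX timestamp in UTC using a user-supplied strftime pattern."""
    return time.strftime(fmt, time.gmtime(timestamp))
```

For example, format_date(0) gives "1970-01-01 00:00:00Z", and format_date(86400, "%d/%m/%Y") gives "02/01/1970".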

If you really must have something else, please let me (IanClatworthy) know the exact Use Case. We can then have the debate re whether relpath is better or worse than file-id, revno is better or worse than revision-id, etc.