So this week I started submitting a branch seventy-odd commits long, where every commit was machine generated (but hand reviewed) with the amazing commit message of "component: refresh patches". Whilst this was easy to automate, the message isn't acceptable to merge, and I was facing the prospect of copy/pasting the same commit message over and over during an interactive rebase. That did not sound like fun. I ended up writing a tiny tool to do this and thought I'd do my annual blog post about it, mainly so I can find it again when I need to do it again next year...
Wise readers will know that Git can rewrite all sorts of things in commits programmatically using git-filter-branch, and this has a --msg-filter argument which sounds like just what I need. But first a note: git-filter-branch can destroy your branches if you're not careful!
git filter-branch --msg-filter has a simple behaviour: give it a command to be executed by the shell, the old commit message is piped in via standard input, and whatever appears on standard output is the new commit message. Sounds simple but in a way it's too simple, as even the example in the documentation has a glaring problem.
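That contract is easy to imitate outside of Git, which is handy for trying a filter on a sample message before letting it near a real branch. A minimal sketch, assuming a POSIX shell is available; run_msg_filter is a made-up helper name and not part of Git:

```python
import subprocess


def run_msg_filter(command, old_message):
    """Pipe old_message into a shell command and return what it prints.

    This only imitates the --msg-filter contract (message in on stdin,
    new message out on stdout); it does not touch any repository.
    """
    result = subprocess.run(
        command,
        shell=True,           # the filter is a shell command line
        input=old_message,    # old commit message goes in on stdin
        capture_output=True,  # new commit message comes back on stdout
        text=True,
        check=True,           # raise if the filter itself fails
    )
    return result.stdout
```

For example, run_msg_filter("tr a-z A-Z", "component: refresh patches\n") hands back the message in upper case, and any candidate filter command can be checked the same way.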
Anyway, this should work. I have a commit message in a predictable format (
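#!/usr/bin/env python3
import re
import sys

# argv[1]: file containing the regular expression
# argv[2]: file containing the template for the new message
input_re = re.compile(open(sys.argv[1]).read())
template = open(sys.argv[2]).read()

original_message = sys.stdin.read()
match = input_re.match(original_message)
if match:
    # Named groups from the regex fill the {placeholders} in the template.
    # write() is used instead of print() so no extra newline is appended.
    sys.stdout.write(template.format(**match.groupdict()))
else:
    # No match: pass the message through untouched
    sys.stdout.write(original_message)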
Invoke this with two filenames: a regular expression to match on the input, and a template for the new commit message. If the regular expression matches then any named groups are extracted and passed to the template, which is output using the new-style format() operation. If it doesn't match then the input is simply output to preserve commit messages.
This is my input regular expression:
^(?P<recipe>.+): refresh patches
And this is my output template:
{recipe}: refresh patches
The patch tool will apply patches by default with "fuzz", which is where if the
hunk context isn't present but what is there is close enough, it will force the
patch in.
Whilst this is useful when there's just whitespace changes, when applied to
source it is possible for a patch applied with fuzz to produce broken code which
still compiles (see #10450). This is obviously bad.
We'd like to eventually have do_patch() rejecting any fuzz on these grounds. For
that to be realistic the existing patches with fuzz need to be rebased and
reviewed.
Signed-off-by: Ross Burton <ross.burton@intel.com>
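To see how the expression and the template fit together, here is a small sketch of the same match-then-format step as a plain function. The recipe name "busybox" and the shortened template body are made up for illustration, and rewrite_message is a hypothetical helper rather than part of the script above:

```python
import re


def rewrite_message(pattern, template, message):
    """Return the rewritten message, or the original if the pattern does not match."""
    match = re.match(pattern, message)
    if match is None:
        return message
    # groupdict() maps each named group to the text it captured,
    # e.g. {"recipe": "busybox"}, and format() fills in {recipe} with it.
    return template.format(**match.groupdict())


pattern = r"^(?P<recipe>.+): refresh patches"
# Shortened, illustrative template; the real one carries the full explanation.
template = "{recipe}: refresh patches\n\nRebase patches that only applied with fuzz.\n"

# A matching subject line is expanded through the template...
print(rewrite_message(pattern, template, "busybox: refresh patches\n"))
# ...and anything else passes through unchanged.
print(rewrite_message(pattern, template, "busybox: fix the build\n"))
```

One thing to watch with format(): a literal { or } in the template has to be doubled ({{ or }}), otherwise it is treated as a placeholder.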
A quick run through filter-branch and I'm ready to send:
git filter-branch --msg-filter 'rewriter.py input output' origin/master...HEAD
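Given how destructive filter-branch can be, it may be worth listing which commits the expression will actually catch before rewriting anything. A rough sketch, assuming it is run from inside the repository; both function names are made up, and the revision range is passed in by the caller:

```python
import re
import subprocess


def commit_messages(rev_range):
    """Return the full message of each commit in rev_range, newest first."""
    # -z makes git separate commits with NUL bytes, so multi-line
    # messages can be split apart safely.
    log = subprocess.run(
        ["git", "log", "-z", "--format=%B", rev_range],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [message for message in log.split("\0") if message.strip()]


def subjects_to_rewrite(pattern, messages):
    """Return the subject line of every message the pattern matches."""
    regex = re.compile(pattern)
    return [m.splitlines()[0] for m in messages if regex.match(m)]
```

Calling subjects_to_rewrite(pattern, commit_messages("origin/master..HEAD")) then gives the subject lines that would be replaced; the two-dot range here is an assumption about where the branch forked. Anything missing from that list would be left alone by the rewriter.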