Solving buildhistory slowness

Posted by Ross Burton on November 28, 2013

The buildhistory class in oe-core is incredibly useful for analysing the changes in packages and images over time, but when doing frequently builds all of this metadata builds up and the resulting git repository can be quite unwieldy. I recently noticed that updating my buildhistory repository was often taking several minutes, with git frantically doing huge amounts of I/O. This wasn't surprising after realising that my buildhistory repository was now 2.9GB, covering every build I've done since April. Historical metrics are useful but I only ever go back a few days, so this is slightly over the top. Deleting the entire repository is one idea, but a better solution would be to drop everything but the last week or so.

Luckily Paul Eggleton had already been looking into this so pointed me at a StackOverflow page which used "git graft points" to erase history. The basic theory is that it's possible to tell git that a certain commit has specific parents, or in this case no parent, so it becomes the end of history. A quick git filter-branch and a re-clone later to clean out the stale history and the repository is far smaller.

$ git rev-parse "HEAD@{1 month ago}" > .git/info/grafts

This tells git that the commit a month before HEAD has no parents. The documentation for graft points explains the syntax, but for this purpose that's all you need to know.

$ git filter-branch

This rewrites the repository from the new start of history. This isn't a quick operation: the manpage for filter-branch suggests using a tmpfs as a working directory and I have to agree it would have been a good idea.

$ git clone file:///your/path/here/buildhistory buildhistory.new
$ rm -rf buildhistory
$ mv buildhistory.new buildhistory

After filter-branch all of the previous objects still exist in reflogs and so on, so this is the easiest way of reducing the repository to just the objects needed for the revised history. My newly shrunk repository is a fraction of the original size, and more importantly doesn't take several minutes to run git status in.

tags: git, tech, yocto