Better bash completion?

Bash completion is great and everything, but I spend more time than is advisable dealing with numerous timestamped files.

$ mv core-image-sato-qemux86-64-20140204[tab]
core-image-sato-qemux86-64-20140204194448.rootfs.ext3
core-image-sato-qemux86-64-20140204202414.rootfs.ext3
core-image-sato-qemux86-64-20140204203642.rootfs.ext3

This isn’t an obvious choice as I now need to remember long sequences of numbers. Does anyone know if bash can be told to highlight the bit I’m being asked to pick from, something like this:

$ mv core-image-sato-qemux86-64-20140204[tab]
core-image-sato-qemux86-64-20140204194448.rootfs.ext3
core-image-sato-qemux86-64-20140204202414.rootfs.ext3
core-image-sato-qemux86-64-20140204203642.rootfs.ext3

Remote X11 on OS X

I thought I’d blog this just in case someone else is having problems using XQuartz on OS X as a server to remote X11 applications (i.e. using ssh -X somehost).

At first this works but after some time (20 minutes, to be exact) you’ll get “can’t open display: localhost:10.0″ errors when applications attempt to connect to the X server. This is because the X forwarding is “untrusted” and that has a 20 minute timeout. There are two solution here: increase the X11 timeout (the maximum is 596 hours) or enable trusted forwarding.

It’s probably only best to enable trusted forwarding if you’re connecting to machines you, well, trust. The option is ForwardX11Trusted yes and this can be set globally in /etc/ssh_config or per host in ~/.ssh/config.

Network Oddity

This is… strange. Two machines, connected through cat5 and gigabit adaptors/hub.

$ iperf -c melchett.local -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to melchett.local, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.7 port 35197 connected with 192.168.1.10 port 5001
[  5] local 192.168.1.7 port 5001 connected with 192.168.1.10 port 33692
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.08 GBytes   926 Mbits/sec
[  5]  0.0-10.0 sec  1.05 GBytes   897 Mbits/sec

Simultaneous transfers get ~900MBits/s.

$ iperf -c melchett.local -r
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to melchett.local, TCP port 5001
TCP window size: 22.9 KByte (default)
------------------------------------------------------------
[  5] local 192.168.1.7 port 35202 connected with 192.168.1.10 port 5001
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec   210 MBytes   176 Mbits/sec
[  4] local 192.168.1.7 port 5001 connected with 192.168.1.10 port 33693
[  4]  0.0-10.0 sec  1.10 GBytes   941 Mbits/sec

Testing each direction independently results in only 176MBits/sec on the transfer to the iperf server (melchett). This is 100% reproducible, and the same results appear if I swap iperf client and servers.

I’ve swapped one of the cables involved but the other is harder to get to, but I don’t see how physical damage could cause this sort of performance issue. Oh Internet, any ideas?

Solving buildhistory slowness

The buildhistory class in oe-core is incredibly useful for analysing the changes in packages and images over time, but when doing frequently builds all of this metadata builds up and the resulting git repository can be quite unwieldy. I recently noticed that updating my buildhistory repository was often taking several minutes, with git frantically doing huge amounts of I/O. This wasn’t surprising after realising that my buildhistory repository was now 2.9GB, covering every build I’ve done since April. Historical metrics are useful but I only ever go back a few days, so this is slightly over the top. Deleting the entire repository is one idea, but a better solution would be to drop everything but the last week or so.

Luckily Paul Eggleton had already been looking into this so pointed me at a StackOverflow page which used “git graft points” to erase history. The basic theory is that it’s possible to tell git that a certain commit has specific parents, or in this case no parent, so it becomes the end of history. A quick git filter-branch and a re-clone later to clean out the stale history and the repository is far smaller.

$ git rev-parse "HEAD@{1 month ago}" > .git/info/grafts

This tells git that the commit a month before HEAD has no parents. The documentation for graft points explains the syntax, but for this purpose that’s all you need to know.

$ git filter-branch

This rewrites the repository from the new start of history. This isn’t a quick operation: the manpage for filter-branch suggests using a tmpfs as a working directory and I have to agree it would have been a good idea.

$ git clone file:///your/path/here/buildhistory buildhistory.new
$ rm -rf buildhistory
$ mv buildhistory.new buildhistory

After filter-branch all of the previous objects still exist in reflogs and so on, so this is the easiest way of reducing the repository to just the objects needed for the revised history. My newly shrunk repository is a fraction of the original size, and more importantly doesn’t take several minutes to run git status in.

Using netconsole in the Yocto Project

Debugging problems which mean init goes crazy is tricky, especially so on modern Intel hardware that doesn’t have anything resembling a serial port you can connect to.

Luckily this isn’t a new problem, as Linux supports a network console which will send the console messages over UDP packets to a specific machine. This is mostly easy to use but there are some caveats that are not obvious.

The prerequisites are that netconsole support is enabled, and your ethernet driver is built in to the kernel and not a module. Luckily, the stock Yocto kernels have netconsole enabled and the atom-pc machine integrates the driver for my hardware.

Then, on the target machine, you pass netconsole=... to the kernel. The kernel documentation explains this quite well:

netconsole=[src-port]@[src-ip]/[],[tgt-port]@/[tgt-macaddr]

   where
        src-port      source for UDP packets (defaults to 6665)
        src-ip        source IP to use (interface address)
        dev           network interface (eth0)
        tgt-port      port for logging agent (6666)
        tgt-ip        IP address for logging agent
        tgt-macaddr   ethernet MAC address for logging agent (broadcast)

The biggest gotcha is that you (obviously) need a source IP address, and netconsole starts before the networking normally comes up. Apart from that you can generally get away with minimal settings:

netconsole=@192.168.1.21/,@192.168.1.17/

Note that apparently some routers may not forward the broadcast packets correctly, so you may need to specify the target MAC address.

On the target machine run something like netcat to capture the packets:

$ netcat -l -u -p 6666 | tee console.log

If you get the options wrong the kernel will tell you why, so if you don’t get any logging iterate a working argument on an image that works, using dmesg to see what the problem is.

Finally instead of typing in this argument every time you boot, you can add it to the boot loader in your local.conf:

APPEND += "netconsole=@192.168.1.21/,@192.168.1.17/"

(APPEND being the name of the variable that is passed to the kernel by the boot loader.)

Update: when the journal starts up systemd will stop logging to the console, so if you want to get all systemd messages also pass systemd.log_target=kmsg.

Mutually Exclusive PulseAudio streams

The question of mutually exclusive streams in PulseAudio came to mind earlier, and thanks to Arun and Jens in #gupnp I discovered the PulseAudio supports them already. The use-case here is a system where there are many ways of playing music, but instead of mixing them PA should pause the playing stream when a new one starts.

Configuring this with PulseAudio is trivial, using the module-role-cork module:

$ pactl load-module module-role-cork trigger_roles=music cork_roles=music

This means that when a new stream with the “music” role starts, “cork” (pause to everyone else) all streams that have the “music” role. Handily this module won’t pause the trigger stream, so this implements exclusive playback.

Testing is simple with gst-launch

$ PULSE_PROP='media.role=music' gst-launch-0.10 audiotestsrc ! pulsesink
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstPulseSinkClock

At this point, starting another gst-launch results in this stream being paused:

Setting state to PAUSED as requested by /GstPipeline:pipeline0/GstPulseSink:pulsesink0...

Note that it won’t automatically un-cork when the newer stream disappears, but for what I want this is the desired behaviour anyway.

Yocto Project Build Times

Last month our friends at Codethink were guests on FLOSS Weekly, talking about Baserock. Baserock is a new embedded build system with some interesting features/quirks (depending on your point of view) that I won’t go into. What caught my attention was the discussion about build times for various embedded build systems.

Yocto, again, if you want to do a clean build it will take days to build your system, even if you do an incremental build, even if you just do a single change and test it, that will take hours.

(source: FLOSS Weekly #230, timestamp 13:21, slightly edited for clarity)

Now “days” for a clean build and “hours” for re-building an image with a single change is quite excessive for the Yocto Project, but also quite specific. I asked Rob Taylor where he was getting these durations from, and he corrected himself on Twitter:

I’m not sure if he meant “hours” for both a full build and an incremental build, or whether by “hours” for incremental he actually meant “minutes”, but I’ll leave this for now and talk about real build times.

Now, my build machine is new but nothing special. It’s built around an Intel Core i7-3770 CPU (quad-core, 3.4GHz) with 16GB of RAM (which is overkill, but more RAM means more disk cache which is always good), and two disks: a 250GB Western Digital Blue for /, and a 1TB Western Digital Green for /data (which is where the builds happen). This was built by PC Specialist for around ¬£600 (the budget was $1000 without taxes) and happily sits in my home study running a nightly build without waking the kids up. People with more money stripe /data across multiple disks, use SSDs, or 10GB tmpfs filesystems, but I had a budget to stick to.

So, let’s wipe my build directory and do another build from scratch (with sources already downloaded). As a reference image I’ll use core-image-sato, which includes an X server, GTK+, the Matchbox window manager suite and some demo applications. For completeness, this is using the 1.3 release – I expect the current master branch to be slightly faster as there’s some optimisations to the housekeeping that have landed.

$ rm -rf /data/poky-master/tmp/
$ time bitbake core-image-sato
Pseudo is not present but is required, building this first before the main build
Parsing of 817 .bb files complete (0 cached, 817 parsed). 1117 targets, 18 skipped, 0 masked, 0 errors.
...
NOTE: Tasks Summary: Attempted 5393 tasks of which 4495 didn't need to be rerun and all succeeded.

real 9m47.289s

Okay, that was a bit too fast. What happened is that I wiped my local build directory, but it’s pulling build components from the “shared state cache”, so it spent six minutes reconstructing a working tree from shared state, and then three minutes building the image itself. The shared state cache is fantastic, especially as you can share it between multiple machines. Anyway, by renaming the sstate directory it won’t be found, and then we can do a proper build from scratch.

$ rm -rf /data/poky-master/tmp/
$ mv /data/poky-master/sstate /data/poky-master/sstate-old
$ time bitbake core-image-sato
Pseudo is not present but is required, building this first before the main build
...
NOTE: Tasks Summary: Attempted 5117 tasks of which 352 didn't need to be rerun and all succeeded.

real 70m37.298s
user 326m45.417s
sys 37m13.304s

That’s a full build from scratch (with downloaded sources, we’re not benchmarking my ADSL) in just over an hour on affordable commodity hardware. As I said this isn’t some “minimal” image that boots straight to busybox, this is building a complete cross-compiling toolchain, the kernel, X.org, GTK+, GStreamer, the Matchbox window manager/panel/desktop, and finally several applications. In total, 431 source packages were built and packaged, numerous QA tests executed and flashable images generated.

My configuration is to build for Intel Atom but a build for an ARM, MIPS, or PowerPC target would also take a similar amount of time, as even what could be considered “native” targets (targeting Atom, building on i7) doesn’t always turn out to be native: for example carrier-grade Xeon’s have instructions that my i7 doesn’t have, and if you were building carrier-grade embedded software you’d want to ensure they were used.

So, next time someone claims Yocto Project/OpenEmbedded takes “days” or even “hours” to do a build, you can denounce that as FUD and point them here!

Devil’s Pie 0.23

This may come as a shock to some, but I’ve just tagged Devil’s Pie 0.23 (tarball here).

tl;dr: don’t use this, but if you insist it now works with libwnck3

The abridged changelog:

  • Port to libwnck3 (Christian Persch)
  • Add unfullscreen action (Mathias Dalh)
  • Remove exec action (deprecated by spawn)

I probably wouldn’t have ever released this as I’m generally not maintaining it and tend to push people towards Devil’s Pie 2 which funnily enough had a 0.23 release two days ago, but Christian asked nicely and I was waiting on a Yocto build to finish.

Guacamayo Media Server

Last night I merged the work by our lovely interns Emilia and Mihai to add a media server image to Guacamayo. Basically this is a DLNA “Digital Media Server” implemented using Rygel’s media-export plugin.

It’s early days, mainly because it only exposes the demo content so far and there is no easy way to add or remove media (SFTP as root doesn’t count!), but it’s certainly a solid step in the right direction.

Thanks Emilia and Mihai!

Stop oe-core complaining about development distros

As anyone who runs oe-corePoky on a development distribution, you’ll get a warning when you start bitbake because the distribution is unsupported:

WARNING: Host distribution "Debian GNU/Linux unstable (sid)" has not been validated with this version of the build system; you may possibly experience unexpected failures. It is recommended that you use a tested distribution.

Fair enough, but I’m tough enough to deal with that. Luckily you can silence this warning. Take the distribution identifier out of the warning and then append it to SANITY_TESTED_DISTROS in your local.conf, for example:

SANITY_TESTED_DISTROS += "Debian GNU/Linux unstable (sid)"

Voilà, no more spurious warnings.