Dear Interweb: GCC and Arbitrary Binary Sections
Mono/C♯ has this nice feature where arbitrary files can be linked into the final binary and accessed programmatically. I'd like to be able to do that in C too. I'm sure it is possible; I just don't know an easy way. I know that if you have a section foo, then ld will create __start_foo and __stop_foo symbols which point to the start and end of the section, so all I really want is an easy way to get ld to use the contents of an arbitrary file (say, ui.xml) as a section.
Anybody know how to do this?
Update: thanks to Daniel Jacobowitz for giving enough clues to reach a clean, working solution. I'll blog this shortly.
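For anyone who hasn't met the __start_foo/__stop_foo symbols before, here is a minimal sketch of that mechanism on its own, assuming GCC and GNU ld on an ELF target. It does not pull in an external file: the ui_xml array and its contents are a made-up stand-in for ui.xml, and foo_data()/foo_size() are hypothetical helper names.

```c
#include <stddef.h>

/* Hypothetical payload standing in for the contents of ui.xml.
 * The section name has to be a valid C identifier, otherwise GNU ld
 * does not generate the __start_/__stop_ symbols for it. */
static const char ui_xml[] __attribute__((section("foo"), used)) =
    "<ui><button label=\"OK\"/></ui>";

/* These two symbols are defined by the linker, not by any C source. */
extern const char __start_foo[];
extern const char __stop_foo[];

/* Address of the first byte of section foo. */
const char *foo_data(void)
{
    return __start_foo;
}

/* Number of bytes in section foo (here that includes the string's
 * terminating NUL, since the whole array lives in the section). */
size_t foo_size(void)
{
    return (size_t)(__stop_foo - __start_foo);
}
```

A caller can then treat foo_data() and foo_size() as a pointer/length pair, for example handing them to fwrite() or to an XML parser.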
NP: The Sound Of A Handshake, cLOUDDEAD
http://ktown.kde.org/~frerich/elfrc.html
You can even do it with ld in the same way as objcopy, more or less:
http://sourceware.org/ml/binutils/2007-02/msg00360.html
http://0pointer.de/public/bin-to-c.py
A few compilers have trouble dealing with really large strings, but GCC on ELF is fine. And this is still a lot more portable than munging ELF directly.
To access the string just use the symbol name, and for the size use sizeof()-1 (the -1 drops the terminating NUL). Also, just #include the generated .c file once wherever it is needed.
I have used something like this in a couple of projects in the past and it works quite well.
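To make the access pattern concrete, here is a rough sketch in C. The array below is a hand-written stand-in for the generated file; the actual output of bin-to-c.py may be laid out differently, and the names ui_xml, ui_xml_data() and ui_xml_size() are assumptions made for illustration only.

```c
#include <stddef.h>

/* Stand-in for the generated file. In a real build this definition
 * would come from something like:  #include "ui-xml.c"
 * (file name and symbol name are hypothetical). */
static const char ui_xml[] =
    "<ui>\n"
    "  <button label=\"OK\"/>\n"
    "</ui>\n";

/* The data is reached through the symbol name. */
const char *ui_xml_data(void)
{
    return ui_xml;
}

/* sizeof() counts the terminating NUL that the string literal adds,
 * so subtract one to get the length of the original file. */
size_t ui_xml_size(void)
{
    return sizeof(ui_xml) - 1;
}
```

Note that sizeof() only gives the right answer in the translation unit that sees the array definition, which is one reason to #include the generated file rather than declare the symbol extern.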
I'd only consider the link-in approach if you have users who are likely to go deleting required files and then complaining to you about the breakage.
Having to handle cases in the code where bad packaging has meant that files are missing, or where the user didn't install things properly, is a pain, and for some files the only useful way of editing the file is from the source itself.
Since I picked up on this post via Planet Debian you're probably talking about a package to be installed via a .deb that you or your makefile builds, so why are you concerned about by-hand misinstallations? You are in control of those things. And suggesting that some files might not be easily editable without separate tools by the end-user is a strawman, since that's an orthogonal issue and anyway they would be easily user-editable in the example you gave.
Think further. Right now maybe you have one application that needs to know about a certain GUI window-definition; you translate it to a blob and link it in statically. Later someone develops a complementary application which wants to use the same GUI-object, for consistency. The OS can't detect that it's the same thing, so memory gets wasted. If you and the other guy had just mmap'd the thing, the OS could have detected the shared need and given you the same RAM, copy-on-write if need be.
"Handling the cases in the code" is also a lame excuse. You do the mmap() once... test it, abort if it fails. You don't have to test it every time you ever think of using it. And since you're doing the packaging, the one test shouldn't fail to begin with.
Dave
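The "mmap() once, test it, abort if it fails" pattern described in the comment above could look roughly like this on a POSIX system. This is only a sketch: map_resource_or_die() is a hypothetical helper name, and exiting with an error message is just one way to "abort".

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a resource file read-only, once, at startup. Any failure is
 * fatal, so callers never have to re-check the result later.
 * An empty file also counts as a failure, because mmap() rejects a
 * zero-length mapping. */
const void *map_resource_or_die(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror(path);
        exit(EXIT_FAILURE);
    }

    struct stat st;
    if (fstat(fd, &st) < 0) {
        perror(path);
        exit(EXIT_FAILURE);
    }

    /* Read-only private mapping: untouched pages are backed by the
     * page cache and can be shared with other processes mapping the
     * same file. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        perror(path);
        exit(EXIT_FAILURE);
    }

    close(fd); /* the mapping stays valid after the descriptor is closed */
    *len = (size_t)st.st_size;
    return p;
}
```

The program would call this once during initialisation with the installed path of the resource and keep the returned pointer and length around for the rest of its lifetime.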