Dear Interweb: GCC and Arbitrary Binary Sections

Mono/C♯ has this nice feature where arbitrary files can be linked into the final binary, and you can programmatically access them. I'd like to be able to do that in C too. I'm sure it is possible; I just don't know an easy way. I know that if you have a section foo, then ld will create __start_foo and __stop_foo symbols which point to the start and end of the section, so all I really want is an easy way to get ld to use the contents of an arbitrary file (say, ui.xml) as a section.

Anybody know how to do this? Update: thanks to Daniel Jacobowitz for giving me enough clues to reach a working, clean solution. I'll blog this shortly.
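For reference, the __start_foo/__stop_foo behaviour mentioned above can be seen in a small sketch. It assumes GCC (or Clang) targeting ELF with a GNU-compatible linker; the variable and function names are made up for illustration, and only the section name foo comes from the post. It does not solve the "arbitrary file as a section" part, it only shows how the linker-provided symbols are used.

```c
#include <assert.h>
#include <stddef.h>

/* Two illustrative values placed in a section named "foo". Because "foo"
 * is a valid C identifier, the linker defines __start_foo and __stop_foo
 * around the section's contents. */
static const int value_a __attribute__((section("foo"), used)) = 10;
static const int value_b __attribute__((section("foo"), used)) = 32;

/* Defined by the linker, not anywhere in the source. */
extern const int __start_foo[];
extern const int __stop_foo[];

/* Sum every int in section "foo" and report how many were found. */
int foo_sum(size_t *count)
{
    assert(count != NULL);

    int sum = 0;
    for (const int *p = __start_foo; p < __stop_foo; p++)
        sum += *p;

    *count = (size_t)(__stop_foo - __start_foo);
    return sum;
}
```

Calling foo_sum(&n) walks the section from its start symbol to its stop symbol; nothing in the code needs to know how many objects were placed there.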

NP: The Sound Of A Handshake, cLOUDDEAD

12:30 Friday, 13 Jul 2007 [#] [computers] (10 comments)

Posted by Ed Page at Fri Jul 13 13:05:12 2007:
I have not personally used this, but a while ago I came across elfrc
http://ktown.kde.org/~frerich/elfrc.html
Posted by Ross at Fri Jul 13 13:15:09 2007:
Oh, interesting.  I was hoping that ld would have a way of doing it directly, so that I didn't need to build a tool first...
Posted by Daniel Jacobowitz at Fri Jul 13 13:40:34 2007:
Use objcopy to convert the binary to an ELF object, and then you can simultaneously tell objcopy to rename its start and end symbols however you want.  Or use gas's .incbin directive.  I recommend objcopy.  Try the -I binary option; see the description in the man page for --binary-architecture (though you may not need to specify that manually).

You can even do it with ld in the same way as objcopy, more or less:

http://sourceware.org/ml/binutils/2007-02/msg00360.html
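A rough sketch of the consuming side of the objcopy approach, under some assumptions: the file is called ui.xml as in the post, the target is x86-64 (the -O and -B values in the comment are for that target only), and the function name ui_xml_data is invented here. The symbols are declared weak purely so that this sketch still links when ui.o has not been built; in a real build you would likely drop that attribute.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed build steps (x86-64 shown; adjust -O and -B for other targets):
 *   objcopy -I binary -O elf64-x86-64 -B i386:x86-64 ui.xml ui.o
 *   gcc main.c ui.o
 *
 * objcopy derives the symbol names from the input path, turning
 * characters that are not valid in identifiers into underscores.
 */
extern const char _binary_ui_xml_start[] __attribute__((weak));
extern const char _binary_ui_xml_end[]   __attribute__((weak));

/* Return a pointer to the embedded bytes and store their length in *len.
 * The data is NOT NUL-terminated. If ui.o was not linked in, both weak
 * symbols resolve to address zero and the reported length is 0. */
const char *ui_xml_data(size_t *len)
{
    assert(len != NULL);
    *len = (size_t)(_binary_ui_xml_end - _binary_ui_xml_start);
    return _binary_ui_xml_start;
}
```

Since the embedded data is raw bytes with no terminator, callers should always pass the length along rather than treating the pointer as a C string.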
Posted by Benjamin Otte at Fri Jul 13 14:03:17 2007:
You can always just go the boring way of gdk-pixbuf-csource and convert your file to a char array. Guaranteed portable.
Posted by blah at Fri Jul 13 14:18:11 2007:
More portable is simply serializing the file into a const C string, possibly with a script like this one:

http://0pointer.de/public/bin-to-c.py

A few compilers have trouble dealing with really large strings, but GCC on ELF is fine. And this is still a lot more portable than munging with ELF directly.

To access the string just use the symbol name, and for the size use sizeof()-1. Also, just include the generated .c file with #include once wherever it is needed.

I have used something like this in a couple of projects in the past and it works quite well.
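As an illustration of the string approach described in this comment, here is a minimal sketch. The array contents are a made-up stand-in for what a generator script might emit (the actual output format of the linked script is not shown here), and the names ui_xml and ui_xml_data are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a generated file; the XML itself is invented. */
static const char ui_xml[] =
    "<ui>\n"
    "  <menubar name=\"MainMenu\"/>\n"
    "</ui>\n";

/* Return the embedded data and store its length in *len. sizeof counts
 * the terminating NUL that the compiler appends to a string literal,
 * hence the "- 1". */
const char *ui_xml_data(size_t *len)
{
    assert(len != NULL);
    *len = sizeof(ui_xml) - 1;
    return ui_xml;
}
```

Note that sizeof only gives the right answer where the array definition is visible; through a plain pointer it would report the pointer size, which is one reason to wrap the access in a function like this.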
Posted by Dave at Fri Jul 13 16:27:05 2007:
This approach guarantees you need to relink anytime your datafile changes.  Why not just use mmap?  Then you can still access the contents through a pointer, but you and your users can easily tweak it.

I'd only consider the link-in approach if you have users who are likely to go deleting required files and then complaining to you about the breakage.
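For comparison, the mmap alternative might look roughly like the sketch below. It assumes a POSIX system; map_file is a hypothetical helper name, and the error handling is deliberately minimal (it just returns NULL and leaves the decision to abort to the caller).

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only and store its size in *len.
 * Returns NULL on any failure, including an empty file, since a
 * zero-length mapping is not allowed. */
const char *map_file(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return p;
}
```

The caller checks the result once at startup and releases the mapping with munmap when it is no longer needed. As with the objcopy route, the mapped bytes are not NUL-terminated.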
Posted by Ross at Fri Jul 13 16:30:03 2007:
I don't care about link time here.

Having to handle cases in the code where bad packaging has meant that files are missing, or the user didn't install it properly is a pain, and for some files the only useful way of editing the file is from the source itself.
Posted by Matthew W. S. Bell at Fri Jul 13 17:47:20 2007:
elfembed does this poorly.
Posted by Dave at Sat Jul 14 08:31:45 2007:
You don't care about link time here because you think you're the only one who's going to build {some package} on a regular basis?  Might I suggest you could be being shortsighted?

Since I picked up on this post via Planet Debian you're probably talking about a package to be installed via a .deb that you or your makefile builds, so why are you concerned by improper by-hand misinstallations?  You are in control of those things.  And suggesting that some files might not be easily editable without separate tools by the end-user is a strawman, since that's an orthogonal issue and anyway they would be easily user-editable in the example you gave.

Think further.  Right now maybe you have one application that needs to know about a certain GUI window-definition; you translate it to a blob and link it in statically.  Later someone develops a complementary application which wants to use the same GUI-object, for consistency.  The OS can't detect that it's the same thing, so memory gets wasted.  If you and the other guy had just mmap'd the thing, the OS could have detected the shared need and given you the same RAM, copy-on-write if need be.

"handling the cases in the code" is also a lame excuse.  You do the mmap() once... test it, abort if it fails.  You don't have to test it every time you ever think of using it.  And since you're doing the packaging, the one test shouldn't fail to begin with.

Dave
Posted by Ross at Sat Jul 14 10:20:10 2007:
I can't believe so many people are angry that I'm considering inlining the GtkUIManager XML file in a binary, when half of applications have it as an inline string and it's impossible to reuse from another application.

Please, people, have some perspective.
