Monday, June 18, 2012
Linux: ELF shared library versioning
http://plan99.net/~mike/writing-shared-libraries.html
For info on @ and @@:
http://www.trevorpounds.com/blog/?tag=symbol-versioning
Sunday, May 13, 2012
GCC demangling and stack traces
http://www.acsu.buffalo.edu/~charngda/backtrace.html
In fact, it's so useful, I'm afraid to lose that page, so I've copied it here:
Call stack trace generation
A call stack trace ("backtrace") is very useful in debugging. Here are several ways to retrieve the backtrace in a user program. (The contents are mostly from here and here.)
Easiest approach: __builtin_return_address
GCC has a built-in function to retrieve the return addresses on the call stack. For example:
    #include <stdio.h>

    void do_backtrace()
    {
        /* The argument must be a constant integer. */
        printf("Frame 0: PC=%p\n", __builtin_return_address(0));
        printf("Frame 1: PC=%p\n", __builtin_return_address(1));
        printf("Frame 2: PC=%p\n", __builtin_return_address(2));
        printf("Frame 3: PC=%p\n", __builtin_return_address(3));
    }
__builtin_return_address(0) is the return address of the current function, that is, the address in the caller where execution resumes; it is always available. On the other hand, __builtin_return_address(1), __builtin_return_address(2), ... may not be available on all platforms, and asking for a frame that does not exist can crash the program.
What to do with these addresses?
Addresses can be mapped to the binary executable or dynamic link libraries. This is always doable, even if the binary executable has been stripped of its symbols. To see the mapping at run time, parse the following plain-text file on the /proc file system:
    /proc/self/maps
A utility called pmap can do the same.
If the address belongs to a DLL, it is possible to obtain the function name since DLLs are usually not stripped.
Addresses can be mapped to function names. Even if a binary executable is compiled without the -g option, it still contains function names (unless it has been stripped). To see the function names in the binary executable, do
    nm -C -n a.out
To see the function names in the binary executable programmatically at run time, read the later paragraphs.
Addresses can be mapped to line numbers in source files. This extra information (in DWARF format) is added to the binary executable if it is compiled with the -g option. To see line numbers in source files, do either of
    objdump -WL a.out
    objdump --dwarf=decodedline a.out
or even better:
    addr2line -ifC -e a.out 0x123456
where 0x123456 is the address of interest. (The -e option names the executable; without it, addr2line reads a.out by default and treats every argument as an address.) To see line numbers in source files programmatically at run time, read the later paragraphs.
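The /proc/self/maps lookup described above can be sketched in C as follows. This is only a minimal illustration, assuming the usual Linux line layout "start-end perms offset dev inode [path]"; find_mapping is a hypothetical helper name, not part of any library.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: find the mapping in /proc/self/maps that contains
 * addr. On success, store the mapping's start address in *base, copy the
 * backing path (empty for anonymous mappings) into path, and return 1.
 * Return 0 if nothing matches or the file cannot be opened. */
int find_mapping(const void *addr, char *path, size_t path_len,
                 uintptr_t *base)
{
    FILE *fp = fopen("/proc/self/maps", "r");
    char line[4096];
    int found = 0;

    if (fp == NULL)
        return 0;

    while (!found && fgets(line, sizeof line, fp) != NULL) {
        uint64_t start, end;
        char name[4096] = "";

        /* Assumed layout: start-end perms offset dev inode [path] */
        int n = sscanf(line,
                       "%" SCNx64 "-%" SCNx64 " %*s %*s %*s %*s %4095[^\n]",
                       &start, &end, name);
        if (n < 2)
            continue;   /* not a mapping line we understand */

        if ((uintptr_t)addr >= start && (uintptr_t)addr < end) {
            snprintf(path, path_len, "%s", name);
            *base = (uintptr_t)start;
            found = 1;
        }
    }

    fclose(fp);
    return found;
}
```

Passing a value obtained from __builtin_return_address(0) reports which executable or shared library the caller lives in. For a position-independent object, the address minus the reported base (adjusted by the mapping's file offset, which this sketch does not return) is roughly the value to hand to addr2line.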
Approach 2: backtrace
backtrace and backtrace_symbols are functions in Glibc. For backtrace_symbols to report function names, one must build the program with the -rdynamic option, which is passed on to the linker. One does not need to compile with the -g option, since backtrace_symbols cannot retrieve line number information. (Note that -rdynamic cannot be used together with -static.) Actually, one can even strip the symbols, and backtrace_symbols will still work. This is because when -rdynamic is used, all global symbols are also stored in the .dynsym section of the ELF executable, and this section cannot be stripped away. (To see the content of the .dynsym section, use readelf -s a.out; readelf -p .dynstr a.out dumps the name strings it refers to.)
backtrace_symbols obtains symbol information from .dynsym section.
(The main purpose of .dynsym section is for dynamic link libraries to expose their symbols so the runtime linker ld.so can find them.)
Here is the sample program:
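    #include <execinfo.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define BACKTRACE_SIZ 100

    void do_backtrace()
    {
        void *array[BACKTRACE_SIZ];
        int size, i;
        char **strings;

        /* Collect return addresses, then translate them to strings. */
        size = backtrace(array, BACKTRACE_SIZ);
        strings = backtrace_symbols(array, size);
        if (strings == NULL)    /* allocation failed */
            return;

        for (i = 0; i < size; ++i) {
            printf("%p : %s\n", array[i], strings[i]);
        }

        free(strings);    /* a single free releases the whole array */
    }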
For C++ programs, use abi::__cxa_demangle (declared in the header cxxabi.h) to get demangled names.
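Before a name can be demangled, it has to be cut out of the string that backtrace_symbols returns. The sketch below assumes the common Glibc shape "module(symbol+0xoffset) [0xaddress]"; the exact format is not guaranteed, and extract_symbol is a hypothetical helper name used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the symbol name out of a string shaped like
 * "module(symbol+0xoffset) [0xaddress]".
 * Returns 1 on success, 0 if the entry carries no symbol name
 * (for example "module() [0x...]" or "module(+0x1234) [0x...]"). */
int extract_symbol(const char *entry, char *out, size_t out_len)
{
    /* Use the last '(' so that parentheses in the module path do not
     * confuse the search. */
    const char *open = strrchr(entry, '(');
    const char *end;
    size_t len;

    if (open == NULL || out_len == 0)
        return 0;
    open++;                             /* first character of the symbol */

    end = strpbrk(open, "+)");          /* symbol ends at '+' or ')' */
    if (end == NULL || end == open)
        return 0;                       /* malformed or empty symbol */

    len = (size_t)(end - open);
    if (len >= out_len)
        len = out_len - 1;              /* truncate to fit the buffer */
    memcpy(out, open, len);
    out[len] = '\0';
    return 1;
}
```

In a C++ program, the extracted name is what would be passed to abi::__cxa_demangle; entries for which the helper returns 0 are best printed unchanged.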
Approach 3: Improved backtrace
The backtrace_symbols in Glibc uses dladdr to obtain function names, but it cannot retrieve line numbers. Jeff Muizelaar has an improved version here which can do line numbers.
If the user program is compiled without any special command-line options, then one can obtain function names (provided, of course, that the binary executable is not stripped). Better yet, the -rdynamic compiler option is not needed.
If the user program is compiled with -g option, one can obtain both line numbers and function names.
Note that to compile Jeff Muizelaar's backtrace_symbols implementation, make sure the following two macros are defined as the first two lines of the user program (they must precede all #include directives):
    #define __USE_GNU
    #define _GNU_SOURCE
One also needs the Binary File Descriptor (BFD) library, which is now part of GNU binutils, when linking Jeff's code into the user program.
Approach 4: libunwind
libunwind does pretty much what the original backtrace/backtrace_symbols do. Its main purpose, however, is to unwind the stack programmatically (even more powerful than the setjmp/longjmp pair) through unw_step and unw_resume calls. One can also peek at and modify the saved register values on the stack via unw_get_reg, unw_get_freg, unw_set_reg, and unw_set_freg calls.
If one just wants to retrieve the backtrace, use the following code:
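    #include <libunwind.h>
    #include <stdio.h>

    void do_backtrace()
    {
        unw_cursor_t cursor;
        unw_context_t context;

        /* Capture the current register state and start a cursor on it. */
        unw_getcontext(&context);
        unw_init_local(&cursor, &context);

        /* Step outward one frame at a time until the stack ends. */
        while (unw_step(&cursor) > 0) {
            unw_word_t offset = 0, pc = 0;
            char fname[64];

            unw_get_reg(&cursor, UNW_REG_IP, &pc);

            fname[0] = '\0';
            (void) unw_get_proc_name(&cursor, fname, sizeof(fname), &offset);

            /* unw_word_t is an integer type, so cast it for printf. */
            printf("%p : (%s+0x%lx) [%p]\n",
                   (void *) pc, fname, (unsigned long) offset, (void *) pc);
        }
    }
Link the user program with -lunwind -lunwind-x86_64.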
There is no need to compile the user program with -g option.
Wednesday, April 25, 2012
Blekko's NoSQL Database
http://highscalability.com/blog/2012/4/25/the-anatomy-of-search-technology-blekkos-nosql-database.html
Saturday, March 31, 2012
Micheal Dunn: How console games can reach the Facebook audience
Tuesday, February 7, 2012
Git at large companies
https://news.ycombinator.com/item?id=3548824
That mentions Amazon, which has embraced git without prohibiting other VCSs. That helped us to solve several problems with scale (100k small repos vs. 3 large repos). I can't comment much, but I can re-post what was said:
Amazon uses Perforce at the moment, and for the most part developers are unhappy with it, as well as the team that has to support it (single giant server prone to outages which block up a couple thousand developers, etc). We're in the process of moving to Git for all of our source.
There you go. SOA (Service Oriented Architecture) solves the repo problem along with a bunch of others.
On the other hand, what you're describing as a problem (what Facebook is describing as going to be a problem) is less likely to be one for Amazon as, with some exceptions that are in the process of fixing the issue, the majority of software at Amazon is developed as a service. Services are segregated into their own package, with most services being broken up into cohesive subpackages (a service my team is building will probably have ~10-13 packages when done), and we have a dependency modeling system for packages baked into everything, from build through deploy, which eliminates most of the cognitive overhead of breaking our services up this way.
All of this translates very well into different Git repositories. What we lose is cohesive atomic commits across packages, which we do get with Perforce. The upshot is we have a team developing a system to handle that specific case.
The problems I've had with git at Amazon are minor, and we do have a very large code-base.