Original question: http://stackoverflow.com/questions/6940384/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
How to deal with symbol collisions between statically linked libraries?
Asked by datenwolf
One of the most important rules and best practices when writing a library is to put all of the library's symbols into a library-specific namespace. C++ makes this easy, thanks to the namespace keyword. In C the usual approach is to prefix the identifiers with some library-specific prefix.
The rules of the C standard put some constraints on those (for safe compilation): a C compiler may look at only the first 8 characters of an identifier, so foobar2k_eggs and foobar2k_spam may validly be interpreted as the same identifier. However, every modern compiler allows arbitrarily long identifiers, so in our times (the 21st century) we should not have to bother about this.
But what if you're facing some libraries whose symbol names / identifiers you cannot change? Maybe you have only a static binary and the headers, or you don't want to, or are not allowed to, adjust and recompile them yourself.
Answered by datenwolf
At least in the case of static libraries you can work around it quite conveniently.
Consider these headers of the libraries foo and bar. For the sake of this tutorial I'll also give you the source files.
examples/ex01/foo.h
int spam(void);
double eggs(void);
examples/ex01/foo.c (this may be opaque/not available)
int the_spams;
double the_eggs;

int spam(void)
{
    return the_spams++;
}

double eggs(void)
{
    return the_eggs--;
}
example/ex01/bar.h
int spam(int new_spams);
double eggs(double new_eggs);
examples/ex01/bar.c (this may be opaque/not available)
int the_spams;
double the_eggs;

int spam(int new_spams)
{
    int old_spams = the_spams;
    the_spams = new_spams;
    return old_spams;
}

double eggs(double new_eggs)
{
    double old_eggs = the_eggs;
    the_eggs = new_eggs;
    return old_eggs;
}
We want to use those in a program foobar:
example/ex01/foobar.c
#include <stdio.h>
#include "foo.h"
#include "bar.h"

int main()
{
    const int new_bar_spam = 3;
    const double new_bar_eggs = 5.0f;

    printf("foo: spam = %d, eggs = %f\n", spam(), eggs() );
    printf("bar: old spam = %d, new spam = %d ; old eggs = %f, new eggs = %f\n",
           spam(new_bar_spam), new_bar_spam,
           eggs(new_bar_eggs), new_bar_eggs );
    return 0;
}
One problem becomes apparent immediately: C has no function overloading. So we have two pairs of functions with identical names but different signatures, and we need some way to distinguish them. Anyway, let's see what the compiler has to say about this:
example/ex01/ $ make
cc -c -o foobar.o foobar.c
In file included from foobar.c:4:
bar.h:1: error: conflicting types for ‘spam’
foo.h:1: note: previous declaration of ‘spam’ was here
bar.h:2: error: conflicting types for ‘eggs’
foo.h:2: note: previous declaration of ‘eggs’ was here
foobar.c: In function ‘main’:
foobar.c:11: error: too few arguments to function ‘spam’
foobar.c:11: error: too few arguments to function ‘eggs’
make: *** [foobar.o] Error 1
Okay, this was no surprise; it just told us what we already knew, or at least suspected.
So can we somehow resolve that identifier collision without modifying the original libraries' source code or headers? In fact we can.
First let's resolve the compile-time issues. For this we surround the header includes with a bunch of preprocessor #define directives that prefix all the symbols exported by the library. Later we'll do this with a nice cozy wrapper header, but just for the sake of demonstrating what's going on we're doing it verbatim in the foobar.c source file:
example/ex02/foobar.c
#include <stdio.h>

#define spam foo_spam
#define eggs foo_eggs
# include "foo.h"
#undef spam
#undef eggs

#define spam bar_spam
#define eggs bar_eggs
# include "bar.h"
#undef spam
#undef eggs

int main()
{
    const int new_bar_spam = 3;
    const double new_bar_eggs = 5.0f;

    printf("foo: spam = %d, eggs = %f\n", foo_spam(), foo_eggs() );
    printf("bar: old spam = %d, new spam = %d ; old eggs = %f, new eggs = %f\n",
           bar_spam(new_bar_spam), new_bar_spam,
           bar_eggs(new_bar_eggs), new_bar_eggs );
    return 0;
}
Now if we compile this...
example/ex02/ $ make
cc -c -o foobar.o foobar.c
cc foobar.o foo.o bar.o -o foobar
bar.o: In function `spam':
bar.c:(.text+0x0): multiple definition of `spam'
foo.o:foo.c:(.text+0x0): first defined here
bar.o: In function `eggs':
bar.c:(.text+0x1e): multiple definition of `eggs'
foo.o:foo.c:(.text+0x19): first defined here
foobar.o: In function `main':
foobar.c:(.text+0x1e): undefined reference to `foo_eggs'
foobar.c:(.text+0x28): undefined reference to `foo_spam'
foobar.c:(.text+0x4d): undefined reference to `bar_eggs'
foobar.c:(.text+0x5c): undefined reference to `bar_spam'
collect2: ld returned 1 exit status
make: *** [foobar] Error 1
... it first looks like things got worse. But look closely: the compilation stage actually went just fine. It's just the linker that is now complaining that there are colliding symbols, and it tells us where this happens (source file and offset into the .text section). And as we can see, those symbols are unprefixed: the #define directives only renamed what foobar.c sees, while the library object files still carry the original names.
Let's take a look at the symbol tables with the nm utility:
example/ex02/ $ nm foo.o
0000000000000019 T eggs
0000000000000000 T spam
0000000000000008 C the_eggs
0000000000000004 C the_spams
example/ex02/ $ nm bar.o
000000000000001e T eggs
0000000000000000 T spam
0000000000000008 C the_eggs
0000000000000004 C the_spams
So now we're challenged with the exercise of prefixing those symbols in some opaque binary. Yes, I know that in this example we have the sources and could change the names there. But for now, just assume you have only those .o files, or a .a (which actually is just a bunch of .o files).
objcopy to the rescue
There is one tool that is particularly interesting for us: objcopy
When no output file is given, objcopy writes to a temporary file and then replaces the input file with it, so we can use it as if it were operating in place. There is one option/operation called --prefix-symbols, and you get three guesses as to what it does.
So let's throw this fella at our stubborn libraries:
example/ex03/ $ objcopy --prefix-symbols=foo_ foo.o
example/ex03/ $ objcopy --prefix-symbols=bar_ bar.o
nm shows us that this seemed to work:
example/ex03/ $ nm foo.o
0000000000000019 T foo_eggs
0000000000000000 T foo_spam
0000000000000008 C foo_the_eggs
0000000000000004 C foo_the_spams
example/ex03/ $ nm bar.o
000000000000001e T bar_eggs
0000000000000000 T bar_spam
0000000000000008 C bar_the_eggs
0000000000000004 C bar_the_spams
Let's try linking this whole thing:
example/ex03/ $ make
cc foobar.o foo.o bar.o -o foobar
And indeed, it worked:
example/ex03/ $ ./foobar
foo: spam = 0, eggs = 0.000000
bar: old spam = 0, new spam = 3 ; old eggs = 0.000000, new eggs = 5.000000
Now I leave it as an exercise to the reader to implement a tool/script that automatically extracts the symbols of a library using nm, writes a wrapper header file of the structure
/* wrapper header wrapper_foo.h for foo.h */
#define spam foo_spam
#define eggs foo_eggs
/* ... */
#include <foo.h>
#undef spam
#undef eggs
/* ... */
and applies the symbol prefix to the static library's object files using objcopy.
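As a starting point for that exercise, here is a minimal sketch in C. It assumes the default nm output format shown above (address, one-letter type, name). The helper names parse_nm_line and write_wrapper_header are made up for this illustration and are not part of any existing tool. The objcopy step is still done separately on the command line.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line of default `nm` output, e.g. "0000000000000019 T eggs".
 * Returns 1 and copies the symbol name into `name` if the line describes a
 * defined global symbol (type T, D, B, R or C); returns 0 otherwise. */
int parse_nm_line(const char *line, char *name, size_t cap)
{
    char addr[64], type[8], sym[256];

    /* Undefined symbols ("U printf") have no address column, so fewer than
     * three fields match, or the type field is longer than one letter. */
    if (sscanf(line, "%63s %7s %255s", addr, type, sym) != 3)
        return 0;
    if (strlen(type) != 1 || strchr("TDBRC", type[0]) == NULL)
        return 0;
    if (strlen(sym) >= cap)
        return 0;
    strcpy(name, sym);
    return 1;
}

/* Write a wrapper header that renames every symbol in `syms` while
 * `header` is being included, and removes the macros again afterwards. */
void write_wrapper_header(FILE *out, const char *header, const char *prefix,
                          const char *const *syms, size_t n)
{
    size_t i;

    fprintf(out, "/* wrapper header for %s */\n", header);
    for (i = 0; i < n; i++)
        fprintf(out, "#define %s %s%s\n", syms[i], prefix, syms[i]);
    fprintf(out, "#include <%s>\n", header);
    for (i = 0; i < n; i++)
        fprintf(out, "#undef %s\n", syms[i]);
}
```

A driver would read the output of nm foo.o line by line with fgets, collect the names accepted by parse_nm_line, and pass them to write_wrapper_header together with the same prefix that is given to objcopy --prefix-symbols.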
What about shared libraries?
In principle the same could be done with shared libraries. However, shared libraries, as the name says, are shared among multiple programs, so messing with a shared library in this way is not such a good idea.
You will not get around writing a trampoline wrapper. Even worse, you cannot link against the shared library at the object file level, but are forced to do dynamic loading. But this deserves its very own article.
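To give a rough idea of what dynamic loading looks like, here is a minimal sketch using the POSIX dlopen/dlsym interface. It only shows the lookup step, not a complete trampoline wrapper. The helper name load_symbol and the library path in the usage note are assumptions made for this illustration. On older glibc versions the program has to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Open the shared library at `path` with RTLD_LOCAL, so that its symbols
 * are not added to the global scope, and look up `name` in that library
 * only. Returns NULL if the library or the symbol cannot be found. On
 * success the library handle is stored in *handle_out so that the caller
 * can dlclose() it later. */
void *load_symbol(const char *path, const char *name, void **handle_out)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    void *sym;
    const char *err;

    if (handle == NULL) {
        err = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return NULL;
    }
    sym = dlsym(handle, name);
    if (sym == NULL) {
        err = dlerror();
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "unknown error");
        dlclose(handle);
        return NULL;
    }
    *handle_out = handle;
    return sym;
}
```

A trampoline such as foo_spam would call load_symbol("./libfoo.so", "spam", &handle) once, cast the result to int (*)(void), and forward every call to that pointer. Doing the same for a hypothetical libbar.so yields a second, independent spam.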
Stay tuned, and happy coding.
Answered by R.. GitHub STOP HELPING ICE
The rules of the C standard put some constraints on those (for safe compilation): a C compiler may look at only the first 8 characters of an identifier, so foobar2k_eggs and foobar2k_spam may validly be interpreted as the same identifier. However, every modern compiler allows arbitrarily long identifiers, so in our times (the 21st century) we should not have to bother about this.
This is not just an extension of modern compilers; the current C standard also requires the compiler to support reasonably long external names. I forget the exact length but it's something like 31 characters now if I remember right.
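For reference, C99 raised the guaranteed minimum for external identifiers to 31 significant initial characters (C89 only guaranteed 6). A small sketch of what that means for the two names from the question:

```c
/* The two names share their first 9 characters ("foobar2k_"). A compiler
 * that honored only 8 significant characters could treat them as the same
 * identifier. C99 and later require at least 31 significant characters for
 * external names, so there they are guaranteed to be distinct. */
int foobar2k_eggs(void) { return 1; }
int foobar2k_spam(void) { return 2; }
```

Prefixes of ordinary length therefore stay well within what the standard guarantees.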
But what if you're facing some libraries whose symbol names / identifiers you cannot change? Maybe you have only a static binary and the headers, or you don't want to, or are not allowed to, adjust and recompile them yourself.
Then you're stuck. Complain to the author of the library. I once encountered such a bug where users of my application were unable to build it on Debian due to Debian's libSDL linking libsoundfile, which (at least at the time) polluted the global namespace horribly with variables like dsp (I kid you not!). I complained to Debian, and they fixed their packages and sent the fix upstream, where I assume it was applied, since I never heard of the problem again.
I really think this is the best approach, because it solves the problem for everyone. Any local hack you do will leave the problem in the library for the next unfortunate user to encounter and fight with again.
If you really do need a quick fix, and you have the source, you could add a bunch of -Dfoo=crappylib_foo -Dbar=crappylib_bar etc. to the makefile to fix it. If not, use the objcopy solution you found.
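To illustrate what those -D flags do, here is a small sketch that applies them to the foo.c from the question. The #define lines at the top merely stand in for passing -Dspam=crappylib_spam -Deggs=crappylib_eggs on the compiler command line. The crappylib_ prefix is just the placeholder used above.

```c
/* Stand-in for -Dspam=crappylib_spam -Deggs=crappylib_eggs on the compiler
 * command line when building the library source. */
#ifndef spam
#define spam crappylib_spam
#endif
#ifndef eggs
#define eggs crappylib_eggs
#endif

/* Unmodified library source (foo.c from the question). */
int the_spams;
double the_eggs;

int spam(void)      /* compiled as crappylib_spam */
{
    return the_spams++;
}

double eggs(void)   /* compiled as crappylib_eggs */
{
    return the_eggs--;
}

#undef spam
#undef eggs
```

Callers have to be compiled with the same flags (or call the prefixed names directly). Globals such as the_spams keep their original names unless they get a -D flag of their own.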
Answered by JJJSchmidt
If you're using GCC, the --allow-multiple-definition linker switch (passed through the gcc driver as -Wl,--allow-multiple-definition) is a handy debugging tool. This hogties the linker into using the first definition (and not whining about it). More about it here.
This has helped me during development when I have the source to a vendor-supplied library available and need to trace into a library function for some reason or other. The switch allows you to compile and link in a local copy of a source file and still link to the unmodified static vendor library. Don't forget to yank the switch back out of the make symbols once the voyage of discovery is complete. Shipping release code with intentional name space collisions is prone to pitfalls, including unintentional name space collisions.
