Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/229069/

Date: 2020-08-27 13:53:45  Source: igfitidea

Dead code detection in legacy C/C++ project

Tags: c++, automation, static-analysis, legacy-code, dead-code

Asked by Nazgob

How would you go about dead code detection in C/C++ code? I have a pretty large code base to work with, and at least 10-15% of it is dead code. Is there any Unix-based tool to identify these areas? Some pieces of code still make heavy use of the preprocessor; can an automated process handle that?

Accepted answer by Johan

You could use a code coverage analysis tool for this and look for unused spots in your code.

A popular tool for the gcc toolchain is gcov, together with the graphical frontend lcov (http://ltp.sourceforge.net/coverage/lcov.php).

If you use gcc, you can compile with gcov support, which is enabled by the '--coverage' flag. Next, run your application or run your test suite with this gcov enabled build.

Basically gcc will emit some extra files during compilation (.gcno files), and the application will also emit coverage data while running (.gcda files). You have to collect all of these. I'm not going into full detail here, but you probably need to set two environment variables to collect the coverage data in a sane way: GCOV_PREFIX and GCOV_PREFIX_STRIP...

After the run, you can put all the coverage data together and run it through the lcov toolsuite. Merging of all the coverage files from different test runs is also possible, albeit a bit involved.

Anyhow, you end up with a nice set of webpages showing some coverage information, pointing out the pieces of code that have no coverage and hence, were not used.

Of course, you need to double-check whether those portions of code are really unused in every situation, and a lot depends on how well your tests exercise the codebase. But at least this will give you an idea of possible dead-code candidates...
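
To make the workflow above concrete, here is a minimal sketch, not a definitive recipe. The file name demo.cpp and the functions shipping_cost and run_workload are made up for illustration, and the commands in the trailing comment assume a GNU toolchain with gcov installed and a main() of your own that calls run_workload().

```cpp
#include <cassert>

// Hypothetical pricing routine with one branch that the workload below never reaches.
int shipping_cost(int weight_kg, bool express) {
    assert(weight_kg >= 0);
    if (express) {
        return 20 + 3 * weight_kg;  // only runs when a caller asks for express
    }
    return 5 + weight_kg;
}

// Hypothetical stand-in for "run your application or your test suite".
// It never requests express shipping, so that branch stays uncovered.
int run_workload() {
    int total = 0;
    for (int kg = 1; kg <= 3; ++kg) {
        total += shipping_cost(kg, false);
    }
    return total;
}

// Sketch of the commands, assuming this file is demo.cpp and a main() calls run_workload():
//   g++ --coverage -O0 -c demo.cpp      # compile time: writes demo.gcno
//   g++ --coverage demo.o -o demo       # link against the coverage runtime
//   ./demo                              # run time: writes demo.gcda
//   gcov demo.cpp                       # writes demo.cpp.gcov
// In the .gcov listing, lines that never executed are marked with "#####".
```

In a run like this, the express branch would show up as unexecuted. That makes it a candidate for dead code, not proof of it: a different workload could still reach it.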

Answer by Steve Jessop

Compile it under gcc with -Wunreachable-code.

I think that the more recent the version, the better results you'll get, but I may be wrong in my impression that it's something they've been actively working on. Note that this does flow analysis, but I don't believe it tells you about "code" which is already dead by the time it leaves the preprocessor, because that's never parsed by the compiler. It also won't detect e.g. exported functions which are never called, or special-case handling code which just so happens to be impossible to reach because nothing ever calls the function with that parameter. You need code coverage for that (and run the functional tests, not the unit tests: unit tests are supposed to have 100% code coverage, and hence execute code paths which are 'dead' as far as the application is concerned). Still, with these limitations in mind, it's an easy way to get started finding the most completely bollixed routines in the code base.
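
The three situations described above can be put side by side in a small sketch. All names here (LEGACY_PROTOCOL, legacy_checksum, scale, scale_all) are invented for illustration. Also note that, as far as I know, GCC releases from 4.5 onward still accept -Wunreachable-code but no longer perform the analysis, while Clang implements a warning of the same name, so what you actually see depends on your compiler.

```cpp
#include <cassert>

// (1) Dead before the compiler sees it: LEGACY_PROTOCOL is a hypothetical
// macro that no build defines, so the preprocessor drops this function and
// no compiler warning can ever mention it.
#ifdef LEGACY_PROTOCOL
int legacy_checksum(int x) { return x ^ 0x5a; }
#endif

int scale(int value, int mode) {
    assert(mode == 1 || mode == 2);
    // (2) Dead only by convention: the branch is reachable in principle, but
    // no caller in this sketch ever passes mode 2. Flow analysis inside the
    // function cannot know that; only coverage data from real runs can hint at it.
    if (mode == 2) {
        return value * 100;
    }
    int result = value * 10;
    return result;
    // (3) Unreachable by control flow: the kind of code an
    // unreachable-code warning is meant to flag.
    result += 1;
    return result;
}

// The only caller, and it always uses mode 1.
int scale_all(int a, int b) {
    return scale(a, 1) + scale(b, 1);
}
```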

This CERT advisory lists some other tools for static dead code detection.

Answer by Max Lybbert

Both Mozilla and Open Office have home-grown solutions.

Answer by Thomas L Holaday

g++ 4.01 -Wunreachable-code warns about code that is unreachable within a function, but does not warn about unused functions.

int foo() {
    return 21;  // point a
}

int bar() {
    int a = 7;
    return a;
    a += 9;  // point b
    return a;
}

int main(int, char **) {
    return bar();
}

g++ 4.01 will issue a warning about point b, but say nothing about foo() (point a) even though it is unreachable in this file. This behavior is correct although disappointing, because a compiler cannot know that function foo() is not declared extern in some other compilation unit and invoked from there; only a linker can be sure.
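
The linkage argument above can be illustrated with a small variation on the example. This is only a sketch: foo_internal, foo_external and entry_point are names invented here, and the linker commands in the trailing comment assume GNU ld.

```cpp
#include <cassert>

// Internal linkage: only this translation unit can call it, so the compiler
// can prove it is unused. GCC and Clang report this under -Wunused-function,
// which -Wall enables.
static int foo_internal() { return 21; }

// External linkage: some other translation unit might declare and call it,
// so the compiler has to stay silent.
int foo_external() { return 21; }

int bar() {
    int a = 7;
    return a;
}

// Stand-in for main(): the only caller in this file.
int entry_point() {
    int result = bar();
    assert(result == 7);
    return result;
}

// To ask the linker instead, assuming a GNU toolchain:
//   g++ -ffunction-sections -c file.cpp
//   g++ -Wl,--gc-sections -Wl,--print-gc-sections file.o ... -o app
// --print-gc-sections lists the sections the linker discarded as unreferenced.
```

Marking file-local helpers static (or putting them in an unnamed namespace) is therefore a cheap way to let the compiler find part of the dead code for you.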

Answer by andreas buykx

Your approach depends on the availability of (automated) tests. If you have a test suite that you trust to cover a sufficient amount of functionality, you can use a coverage analysis, as previous answers already suggested.

If you are not so fortunate, you might want to look into source code analysis tools like SciTools' Understand, which can help you analyse your code using a lot of built-in analysis reports. My experience with that tool dates from 2 years ago, so I can't give you much detail, but what I do remember is that they had impressive support, with very fast turnaround times for bug fixes and answers to questions.

I found a page on static source code analysis that lists many other tools as well.

If that doesn't help you sufficiently either, and you're specifically interested in finding out the preprocessor-related dead code, I would recommend you post some more details about the code. For example, if it is mostly related to various combinations of #ifdef settings you could write scripts to determine the (combinations of) settings and find out which combinations are never actually built, etc.
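
As a rough idea of what such a script could look like, here is a minimal sketch in C++. Everything in it is an assumption made for illustration: the function names guard_macros and never_defined, the sample source text, and the idea that you already have the set of macros your real builds define (for example, collected from the -D flags in your makefiles). It only looks at plain #ifdef lines and ignores #if defined(...), #elif, comments and line continuations.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Collect the macro names tested by plain "#ifdef NAME" lines in a source text.
std::set<std::string> guard_macros(const std::string& source) {
    std::set<std::string> names;
    std::istringstream lines(source);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream words(line);
        std::string directive, name;
        if ((words >> directive >> name) && directive == "#ifdef") {
            names.insert(name);
        }
    }
    return names;
}

// Guard macros that no known build configuration defines: the code they
// protect is never compiled, so it is a dead-code candidate.
std::set<std::string> never_defined(const std::set<std::string>& guards,
                                    const std::set<std::string>& defined_in_builds) {
    std::set<std::string> result;
    for (const std::string& name : guards) {
        if (defined_in_builds.count(name) == 0) {
            result.insert(name);
        }
    }
    return result;
}

// Usage sketch with made-up input.
void usage_example() {
    const std::string source =
        "#ifdef USE_SSL\n"
        "int tls_init();\n"
        "#endif\n"
        "#ifdef VAX_SUPPORT\n"
        "int vax_init();\n"
        "#endif\n";
    const std::set<std::string> defined_in_builds = {"USE_SSL", "NDEBUG"};
    const std::set<std::string> expected = {"VAX_SUPPORT"};
    assert(never_defined(guard_macros(source), defined_in_builds) == expected);
}
```

A real code base would need a proper preprocessor-aware scan, but even a crude list like this tells you which #ifdef settings to investigate first.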

Answer by Pascal Cuoq

For C code only and assuming that the source code of the whole project is available, launch an analysis with the Open Source tool Frama-C. Any statement of the program that displays red in the GUI is dead code.

If you have "dead code" problems, you may also be interested in removing "spare code": code that is executed but does not contribute to the end result. This requires you to provide an accurate model of I/O functions (you wouldn't want to remove a computation that appears to be "spare" but that is used as an argument to printf). Frama-C has an option for pointing out spare code.
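
To show the difference between dead code and spare code, here is a small made-up example (sum_positive is not from any real project). It sticks to the subset shared by C and C++, apart from the <cassert> header name.

```cpp
#include <cassert>

// Sums the positive entries of an array. The running maximum is updated on
// every iteration, so it is not dead code, but it never influences the
// return value or any output: it is "spare".
int sum_positive(const int* values, int count) {
    int sum = 0;
    int max_seen = 0;  // spare: written, never read for the result
    for (int i = 0; i < count; ++i) {
        if (values[i] > max_seen) {
            max_seen = values[i];  // spare: executed, but the result is unused
        }
        if (values[i] > 0) {
            sum += values[i];
        }
    }
    return sum;
}

// Usage sketch with made-up data.
void usage_example() {
    const int data[] = {3, -1, 4};
    assert(sum_positive(data, 3) == 7);
}
```

If max_seen were passed to printf at the end of the function, it would contribute to the program's observable output and would no longer be spare, which is why the analysis needs to know what the I/O functions do.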

Answer by Ira Baxter

Dead code analysis like this requires a global analysis of your entire project. You can't get this information by analyzing translation units individually (well, you can detect dead entities if they are entirely within a single translation unit, but I don't think that's what you are really looking for).

We've used our DMS Software Reengineering Toolkit to implement exactly this for Java code, by parsing all the compilation-units involved at once, building symbol tables for everything and chasing down all the references. A top level definition with no references and no claim of being an external API item is dead. This tool also automatically strips out the dead code, and at the end you can choose what you want: the report of dead entities, or the code stripped of those entities.
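
The reference-chasing idea can be sketched in a few lines, with the caveat that this is not how DMS works internally, just an illustration under simplifying assumptions. The hard part (parsing everything and building accurate symbol tables) is assumed to have already produced a table mapping each top-level definition to the names it references. The names ReferenceTable and dead_definitions are invented here. The sketch also goes one step beyond "no references": it computes reachability from the roots (main and anything declared as external API), so definitions referenced only by other dead definitions are reported too.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical result of the symbol-table phase:
// each top-level definition mapped to the names it references.
using ReferenceTable = std::map<std::string, std::vector<std::string>>;

// Returns every definition that cannot be reached from the given roots.
std::set<std::string> dead_definitions(const ReferenceTable& references,
                                       const std::set<std::string>& roots) {
    std::set<std::string> live;
    std::vector<std::string> pending(roots.begin(), roots.end());
    while (!pending.empty()) {
        std::string name = pending.back();
        pending.pop_back();
        if (!live.insert(name).second) continue;  // already visited
        auto it = references.find(name);
        if (it == references.end()) continue;     // symbol defined elsewhere
        for (const std::string& target : it->second) {
            pending.push_back(target);
        }
    }
    std::set<std::string> dead;
    for (const auto& entry : references) {
        if (live.count(entry.first) == 0) dead.insert(entry.first);
    }
    return dead;
}

// Usage sketch with a made-up project.
void usage_example() {
    const ReferenceTable references = {
        {"main", {"parse", "report"}},
        {"parse", {}},
        {"report", {"format"}},
        {"format", {}},
        {"old_export", {"old_helper"}},
        {"old_helper", {}},
        {"api_init", {}},
    };
    const std::set<std::string> roots = {"main", "api_init"};
    const std::set<std::string> expected = {"old_export", "old_helper"};
    assert(dead_definitions(references, roots) == expected);
}
```

Note that old_helper does have a reference, but only from old_export, which is itself unreachable; a plain "zero references" check would miss it on the first pass.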

DMS also parses C++ in a variety of dialects (EDIT Feb 2014: including MS and GCC versions of C++14 [EDIT Nov 2017: now C++17]) and builds all the necessary symbol tables. Tracking down the dead references would be straightforward from that point. DMS could also be used to strip them out. See http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html

Answer by Ashwin

The Bullseye coverage tool would help. It is not free, though.
