Disclaimer: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2394017/

Time: 2020-08-27 23:19:10  Source: igfitidea

Remove comments from C/C++ code

c++ c comments

Asked by Mike

Is there an easy way to remove comments from a C/C++ source file without doing any preprocessing? (That is, I think you can use gcc -E, but this will expand macros.) I just want the source code with comments stripped; nothing else should be changed.

EDIT:

I would prefer an existing tool. I don't want to have to write this myself with regexes; I foresee too many surprises in the code.

Answered by Josh Lee

Run the following command on your source file:

gcc -fpreprocessed -dD -E test.c

Thanks to KennyTM for finding the right flags. Here's the result for completeness:

test.c:

#define foo bar
foo foo foo
#ifdef foo
#undef foo
#define foo baz
#endif
foo foo
/* comments? comments. */
// c++ style comments

gcc -fpreprocessed -dD -E test.c:

#define foo bar
foo foo foo
#ifdef foo
#undef foo
#define foo baz
#endif
foo foo

Answered by Jonathan Leffler

It depends on how perverse your comments are. I have a program, scc, to strip C and C++ comments. I also have a test file for it, and I tried GCC (4.2.1 on MacOS X) with the options in the currently selected answer, and GCC doesn't seem to do a perfect job on some of the horribly butchered comments in the test case.

NB: This isn't a real-life problem - people don't write such ghastly code.

Consider this subset (36 of 135 lines total) of the test case:

/\
*\
Regular
comment
*\
/
The regular C comment number 1 has finished.

/\
\/ This is not a C++/C99 comment!

This is followed by C++/C99 comment number 3.
/\
\
\
/ But this is a C++/C99 comment!
The C++/C99 comment number 3 has finished.

/\
\* This is not a C or C++ comment!

This is followed by regular C comment number 2.
/\
*/ This is a regular C comment *\
but this is just a routine continuation *\
and that was not the end either - but this is *\
\
/
The regular C comment number 2 has finished.

This is followed by regular C comment number 3.
/\
\
\
\
* C comment */

On my Mac, the output from GCC (gcc -fpreprocessed -dD -E subset.c) is:

/\
*\
Regular
comment
*\
/
The regular C comment number 1 has finished.

/\
\/ This is not a C++/C99 comment!

This is followed by C++/C99 comment number 3.
/\
\
\
/ But this is a C++/C99 comment!
The C++/C99 comment number 3 has finished.

/\
\* This is not a C or C++ comment!

This is followed by regular C comment number 2.
/\
*/ This is a regular C comment *\
but this is just a routine continuation *\
and that was not the end either - but this is *\
\
/
The regular C comment number 2 has finished.

This is followed by regular C comment number 3.
/\
\
\
\
* C comment */

The output from 'scc' is:

The regular C comment number 1 has finished.

/\
\/ This is not a C++/C99 comment!

This is followed by C++/C99 comment number 3.
/\
\
\
/ But this is a C++/C99 comment!
The C++/C99 comment number 3 has finished.

/\
\* This is not a C or C++ comment!

This is followed by regular C comment number 2.

The regular C comment number 2 has finished.

This is followed by regular C comment number 3.

The output from 'scc -C' (which recognizes double-slash comments) is:

The regular C comment number 1 has finished.

/\
\/ This is not a C++/C99 comment!

This is followed by C++/C99 comment number 3.

The C++/C99 comment number 3 has finished.

/\
\* This is not a C or C++ comment!

This is followed by regular C comment number 2.

The regular C comment number 2 has finished.

This is followed by regular C comment number 3.
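
The test case above is hard because C deletes every backslash that is immediately followed by a newline (line splicing) before it recognizes comments, so a `/\` at the end of one line and a `*` at the start of the next really do open a comment. A minimal sketch of that splicing step follows. The name `splice_lines` is hypothetical, and this is not taken from scc or GCC; it only illustrates why a stripper has to look through backslash-newline pairs. Trigraphs are ignored.

```c
#include <stddef.h>

/* Copy src to dst, dropping every backslash that is immediately followed by
 * a newline, together with that newline. dst must hold at least
 * strlen(src) + 1 bytes. Returns the number of characters written, not
 * counting the terminating NUL. (Hypothetical helper, for illustration.) */
static size_t splice_lines(const char *src, char *dst)
{
    size_t n = 0;
    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            src += 2;               /* skip the backslash-newline pair */
        } else {
            dst[n++] = *src++;      /* ordinary character: copy it */
        }
    }
    dst[n] = '\0';
    return n;
}
```

Running the spliced text through an ordinary comment scanner finds the comments correctly, but it also joins the spliced lines in the output, so a tool that wants to leave non-comment text untouched has to track the splices rather than delete them up front.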


Source for SCC now available on GitHub

The current version of SCC is 6.60 (dated 2016-06-12), though the Git versions were created on 2017-01-18 (in the US/Pacific time zone). The code is available from GitHub at https://github.com/jleffler/scc-snapshots. You can also find snapshots of the previous releases (4.03, 4.04, 5.05) and two pre-releases (6.16, 6.50) — these are all tagged release/x.yz.

The code is still primarily developed under RCS. I'm still working out how I want to use sub-modules or a similar mechanism to handle common library files like stderr.c and stderr.h (which can also be found in https://github.com/jleffler/soq).

SCC version 6.60 attempts to understand C++11, C++14 and C++17 constructs such as binary constants, numeric punctuation, raw strings, and hexadecimal floats. It defaults to C11 mode operation. (Note that the meaning of the -C flag, mentioned above, flipped between version 4.0x described in the main body of the answer and version 6.60, which is currently the latest release.)

Answered by lhf

gcc -fpreprocessed -dD -E did not work for me but this program does it:

#include <stdio.h>

static void process(FILE *f)
{
 int c;
 while ( (c=getc(f)) != EOF )
 {
  if (c=='\'' || c=='"')            /* literal */
  {
   int q=c;
   do
   {
    putchar(c);
    if (c=='\\') putchar(getc(f));
    c=getc(f);
   } while (c!=q);
   putchar(c);
  }
  else if (c=='/')              /* opening comment ? */
  {
   c=getc(f);
   if (c!='*')                  /* no, recover */
   {
    putchar('/');
    ungetc(c,f);
   }
   else
   {
    int p;
    putchar(' ');               /* replace comment with space */
    do
    {
     p=c;
     c=getc(f);
    } while (c!='/' || p!='*');
   }
  }
  else
  {
   putchar(c);
  }
 }
}

int main(int argc, char *argv[])
{
 process(stdin);
 return 0;
}
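
The program above only strips /* ... */ comments. As a rough sketch of how the same character-by-character approach could be extended to // comments as well, here is a variant that works on an in-memory string. The name `strip_comments` is hypothetical, and the sketch assumes no backslash-newline splices, raw strings, or trigraphs in the input.

```c
#include <stddef.h>

/* Copy src to dst with comments removed. dst needs strlen(src) + 1 bytes.
 * A block comment is replaced by one space; a line comment is dropped up to,
 * but not including, the newline that ends it. (Hypothetical helper.) */
static void strip_comments(const char *src, char *dst)
{
    size_t i = 0, n = 0;
    while (src[i] != '\0') {
        char c = src[i];
        if (c == '"' || c == '\'') {                /* string or char literal */
            char q = c;
            dst[n++] = src[i++];                    /* opening quote */
            while (src[i] != '\0' && src[i] != q) {
                if (src[i] == '\\' && src[i + 1] != '\0')
                    dst[n++] = src[i++];            /* copy the backslash */
                dst[n++] = src[i++];                /* copy the character */
            }
            if (src[i] == q)
                dst[n++] = src[i++];                /* closing quote */
        } else if (c == '/' && src[i + 1] == '*') { /* block comment */
            i += 2;
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/'))
                i++;
            if (src[i] != '\0')
                i += 2;                             /* skip closing delimiter */
            dst[n++] = ' ';
        } else if (c == '/' && src[i + 1] == '/') { /* line comment */
            while (src[i] != '\0' && src[i] != '\n')
                i++;
        } else {
            dst[n++] = src[i++];
        }
    }
    dst[n] = '\0';
}
```

Unlike the original, this version stops at the end of the input instead of looping when a literal or comment is left unterminated.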

Answered by che

There is a stripcmt program that can do this:

StripCmt is a simple utility written in C to remove comments from C, C++, and Java source files. In the grand tradition of Unix text processing programs, it can function either as a FIFO (First In - First Out) filter or accept arguments on the command line.

(per hlovdal's answer to a question about Python code for this)

Answered by Vladimir

This is a Perl script to remove // one-line and /* multi-line */ comments:

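  #!/usr/bin/perl

  # Slurp the whole input at once so that comments can span lines.
  undef $/;
  $text = <>;

  # Remove // one-line comments (string literals are not protected).
  $text =~ s/\/\/[^\n\r]*(\n\r)?//g;
  # Remove /* multi-line */ comments.
  $text =~ s/\/\*+([^*]|\*(?!\/))*\*+\///g;

  print $text;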

It requires your source file as a command line argument. Save the script to a file, say remove_comments.pl, and call it using the following command: perl -w remove_comments.pl [your source file]

Hope it will be helpful

Answered by Halil Kaskavalci

I had this problem as well. I found this tool (Cpp-Decomment), which worked for me. However, it does not handle a comment line that continues onto the next line. E.g.:

// this is my comment \
comment continues ...

In this case, I couldn't find a way to handle it in the program, so I just searched for the missed lines and fixed them manually. I believe there may be an option for that, or you could change the program's source to do so.
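
In C and C++, a backslash directly before the newline splices the next line onto a // comment, which is the case described above. A minimal sketch of how a scanner could find the real end of such a comment is shown here. The name `skip_line_comment` is hypothetical and is not part of Cpp-Decomment; the sketch assumes the backslash is the very last character on the line.

```c
#include <stddef.h>

/* s[i] and s[i + 1] are the two slashes that open a line comment.
 * Returns the index of the newline (or terminating NUL) that ends the
 * comment, treating backslash-newline as a continuation of the comment.
 * (Hypothetical helper, for illustration.) */
static size_t skip_line_comment(const char *s, size_t i)
{
    i += 2;                                   /* skip the two slashes */
    while (s[i] != '\0') {
        if (s[i] == '\\' && s[i + 1] == '\n') {
            i += 2;                           /* continuation: still a comment */
        } else if (s[i] == '\n') {
            break;                            /* real end of the comment */
        } else {
            i++;
        }
    }
    return i;
}
```

A caller would copy everything before the comment, call this function, and resume copying at the returned index so the terminating newline is kept.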

Answered by Christian Hujer

Because you use C, you might want to use something that's "natural" to C. You can use the C preprocessor to just remove comments. The examples given below work with the C preprocessor from GCC. They should work the same or in similar ways with other C preprocessors as well.

For C, use

cpp -dD -fpreprocessed -o output.c input.c

It also works for removing comments from JSON, for example like this:

cpp -P -o - - <input.json >output.json

In case your C preprocessor is not accessible directly, you can try replacing cpp with cc -E, which calls the C compiler and tells it to stop after the preprocessor stage. If your C compiler binary is not cc, you can replace cc with the name of your C compiler binary, for example clang. Note that not all preprocessors support -fpreprocessed.

Answered by qeatzy

I wrote a C program of around 200 lines, using only the standard C library, that removes comments from a C source file: qeatzy/removeccomments

behavior

  1. A C-style comment that spans multiple lines or occupies an entire line gets zeroed out.
  2. A C-style comment in the middle of a line remains unchanged, e.g. void init(/* do initialization */) {...}
  3. A C++-style comment that occupies an entire line gets zeroed out.
  4. C string literals are respected, by checking for " and \".
  5. Line continuation is handled: if the previous line ends with \, the current line is part of the previous line.
  6. Line numbers remain the same. Zeroed-out lines or parts of lines become empty.
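
The last point in the list, keeping line numbers stable, can be sketched as follows. This is not the code from removeccomments; `strip_keep_lines` is a hypothetical name, and as a loudly stated simplification the sketch does not recognize string or character literals and removes every block comment, including mid-line ones.

```c
#include <stddef.h>

/* Copy src to dst without block comments, but keep every newline that was
 * inside a comment so that line numbers do not shift. dst needs
 * strlen(src) + 1 bytes.
 * Simplification: string and character literals are NOT recognized here.
 * (Hypothetical helper, for illustration.) */
static void strip_keep_lines(const char *src, char *dst)
{
    size_t i = 0, n = 0;
    while (src[i] != '\0') {
        if (src[i] == '/' && src[i + 1] == '*') {
            i += 2;                           /* skip the opening delimiter */
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/')) {
                if (src[i] == '\n')
                    dst[n++] = '\n';          /* preserve the line break */
                i++;
            }
            if (src[i] != '\0')
                i += 2;                       /* skip the closing delimiter */
        } else {
            dst[n++] = src[i++];
        }
    }
    dst[n] = '\0';
}
```

Because every newline survives, compiler diagnostics for the stripped file still point at the same line numbers as in the original.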

testing & profiling

I tested with the largest CPython source file, which contains many comments. In this case it does the job correctly and fast, 2-5 times faster than gcc:

time gcc -fpreprocessed -dD -E Modules/unicodeobject.c > res.c 2>/dev/null
time ./removeccomments < Modules/unicodeobject.c > result.c

usage

/path/to/removeccomments < input_file > output_file

Answered by chunyang.wen

Recently I wrote some Ruby code to solve this problem. I considered the following special cases:

  • comments inside strings
  • a multi-line comment on a single line (fixing the greedy match)
  • multi-line comments spanning multiple lines

Here is the code:

It uses the following placeholder strings to preprocess each line in case comment markers appear inside strings. If one of these placeholders appears in your code, uh, bad luck. You can replace them with more complex strings.

  • MUL_REPLACE_LEFT = "MUL_REPLACE_LEFT"
  • MUL_REPLACE_RIGHT = "MUL_REPLACE_RIGHT"
  • SIG_REPLACE = "SIG_REPLACE"

USAGE: ruby -w inputfile outputfile

Answered by Poseidon_Geek

I believe you can easily remove comments from C with a single statement:

perl -i -pe 's{/\*.*?\*/}{}g' file.c     # removes /* C-style */ comments that fit on one line
perl -i -pe 's{//.*}{}' file.cpp         # removes // C++-style comments

The only problem with these commands is that they can't remove comments that span more than one line, but using this regex you can easily implement the logic for removing multi-line comments.