C language: scanf regular expressions - C

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/15664664/

Time: 2020-09-02 05:50:47  Source: igfitidea

scanf regex - C

c regex scanf

Asked by pasadinhas

I need to read a string until the following sequence is entered: \nx\n

(.....)\n
x\n

\n is the newline character, and (.....) can be any characters, possibly including other \n characters.

As far as I know, scanf allows regular expressions, but I can't make it read a string up to this pattern. Can you help me with the scanf format string?

I was trying something like:

char input[50000];
scanf(" %[^(\nx\n)]", input);

but it doesn't work.

Answered by dasblinkenlight

"scanf allows regular expressions as far as I know"

Unfortunately, it does not allow regular expressions: the syntax is misleadingly close, but there is nothing even remotely similar to regex in the implementation of scanf. All that is there is support for regex-style character classes, so %[<something>] is treated implicitly as [<something>]* (more precisely, at least one character must match, so it behaves like [<something>]+). That's why your call of scanf translates into "read a string consisting of characters other than '(', ')', 'x', and '\n'".
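To make the scanset behavior concrete, here is a minimal sketch that feeds the question's format string to sscanf. The helper name scan_with_question_format and the 63-character width limit are assumptions made for this illustration; they are not part of the original question or answer.

```c
#include <stdio.h>

/* Hypothetical helper: applies the question's format string to `text`
 * and stores whatever the scanset matched in `out`, which must hold at
 * least 64 bytes. Returns sscanf's result: 1 if the scanset matched,
 * 0 on a matching failure, EOF if the input ran out first. */
int scan_with_question_format(const char *text, char *out)
{
    out[0] = '\0';
    /* The set after ^ is just the individual characters
     * '(', '\n', 'x' and ')'; it is not the sequence "\nx\n".
     * The width 63 keeps sscanf from writing past `out`. */
    return sscanf(text, " %63[^(\nx\n)]", out);
}
```

With the input "a box\nx\n" the conversion stops at the first 'x' and stores "a bo", which shows that the parentheses and the "\nx\n" sequence have no special meaning inside the brackets.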

To solve your problem at hand, you can set up a loop that reads the input character by character. Every time you get a '\n', check that

  • you have seen at least three characters of input so far,
  • the character immediately before the '\n' is an 'x', and
  • the character before the 'x' is another '\n'.

If all of the above are true, you have reached the end of your anticipated input sequence; otherwise, your loop should continue.
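The loop described above could be sketched roughly as follows. The function name read_until_x_line, its signature, and the choice to strip the trailing "x\n" from the result are assumptions for illustration, not something the answer specifies.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: reads characters from `in` into `buf` (capacity
 * `cap`) until the input seen so far ends with "\nx\n".
 * Returns 1 if that terminator was found, 0 if EOF was reached or the
 * buffer filled up first. `buf` is always NUL-terminated when cap > 0. */
int read_until_x_line(FILE *in, char *buf, size_t cap)
{
    size_t len = 0;
    int c;

    if (cap == 0)
        return 0;
    while (len + 1 < cap && (c = fgetc(in)) != EOF) {
        buf[len++] = (char)c;
        /* The three checks from the answer: at least three characters,
         * an 'x' before this '\n', and another '\n' before the 'x'. */
        if (c == '\n' && len >= 3 &&
            buf[len - 2] == 'x' && buf[len - 3] == '\n') {
            buf[len - 2] = '\0';   /* drop the trailing "x\n" */
            return 1;
        }
    }
    buf[len] = '\0';
    return 0;
}
```

It could be called as read_until_x_line(stdin, input, sizeof input). Because the sketch follows the three conditions literally, input that begins directly with "x\n" is not treated as the terminator.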

Answered by zwol

scanf does not support regular expressions. It has limited support for character classes, but that's not at all the same thing.

Never use scanf, fscanf, or sscanf, because:

  1. Numeric overflow triggers undefined behavior. The C runtime is allowed to crash your program just because someone typed too many digits.
  2. Some format specifiers (notably %s) are unsafe in exactly the same way gets is unsafe, i.e. they will cheerfully write past the end of the provided buffer and crash your program.
  3. They make it extremely difficult to handle malformed input robustly.
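The answer does not say what to use for numbers instead. One commonly used alternative is to read a line as text and convert it with strtol, which reports overflow through errno rather than invoking undefined behavior. The helper name parse_long and its return convention below are assumptions for this sketch.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses a decimal long from the string `s`.
 * Returns 1 and stores the value in *out on success; returns 0 on
 * empty input, trailing junk, or a value that does not fit in a long. */
int parse_long(const char *s, long *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s)            /* no digits were consumed */
        return 0;
    if (errno == ERANGE)     /* too many digits for a long */
        return 0;
    while (*end == ' ' || *end == '\t' || *end == '\n')
        end++;               /* tolerate trailing whitespace */
    if (*end != '\0')        /* malformed input such as "12abc" */
        return 0;
    *out = v;
    return 1;
}
```

Here both overflow and malformed input become an ordinary return value that the caller can check.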

You don't need regular expressions for this case; read a line at a time with getline and stop when the line read is just "x". If you do need them, the standard (not ISO C, but POSIX) regular expression library routines are called regcomp and regexec.
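A minimal sketch of the getline approach, assuming a POSIX system. The function name read_block, the found flag, and the decision to return everything before the "x" line as one heap-allocated string are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* getline is POSIX, not ISO C */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads lines from `in` until a line that is
 * exactly "x". Returns a malloc'd string holding all earlier lines
 * (the caller frees it), or NULL if allocation fails.
 * *found is set to 1 if the "x" line was seen, 0 if EOF came first. */
char *read_block(FILE *in, int *found)
{
    char *line = NULL;   /* buffer owned and resized by getline */
    size_t cap = 0;
    ssize_t n;
    char *text = malloc(1);
    size_t len = 0;

    *found = 0;
    if (text == NULL)
        return NULL;
    text[0] = '\0';

    while ((n = getline(&line, &cap, in)) != -1) {
        /* Accept "x" with or without a trailing newline (last line). */
        if (strcmp(line, "x\n") == 0 || strcmp(line, "x") == 0) {
            *found = 1;
            break;
        }
        char *grown = realloc(text, len + (size_t)n + 1);
        if (grown == NULL) {
            free(text);
            free(line);
            return NULL;
        }
        text = grown;
        memcpy(text + len, line, (size_t)n + 1);  /* copies the '\0' too */
        len += (size_t)n;
    }
    free(line);
    return text;
}
```

Since getline grows its buffer as needed, this sketch has no fixed limit like the 50000-byte array in the question. Unlike a character-level check for "\nx\n", it also treats an "x" on the very first line as the terminator.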