bash: using SED to delete lines that match a REGEX pattern

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/40241433/

Time: 2020-09-18 15:20:51  Source: igfitidea

SED to remove a Line with REGEX Pattern

regex bash unix sed

Asked by Imkls

I've got hundreds of files with thousands of lines each, and I need to delete the lines that follow a certain pattern, so I turned to SED with a regex. The structure of the files is something like this:

A,12121212121212,foo,bar,lorem
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,21212121212121,foo,bar,lorem
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,9999,88888,77777

I need to delete all the lines that start with "A" and end with "lorem".

Expected output:

C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,9999,88888,77777

I wrote this regex:

^(A).*(lorem)

It matches in my text editors (Sublime, UltraEdit).

In the UNIX shell I ran:

sed '/^(A).*(lorem)/d' file.txt

But somehow it doesn't work: it prints the whole file, and I can't figure out why.

Can someone help me please?

Answered by Aaron

The others gave you correct solutions but didn't explain why your regex didn't work. The () were certainly unnecessary, but if you had used the regex with other tools or languages, you might very well have had the expected result.

It didn't work with sed because sed uses POSIX basic regular expressions by default, where the characters for grouping are \( and \), while ( and ) match literal characters. There were no such brackets in your input text, so it didn't match.
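One way to see this for yourself is to feed sed one line without brackets and one line that really contains them. This is only a sketch: the two input lines are made up for illustration and do not come from the question's files.

```shell
# Two made-up input lines: only the second contains literal ( and ).
input='A,foo,lorem
(A),foo,(lorem)'

# Basic regular expressions (the default): ( and ) match themselves,
# so only the line that really contains brackets is deleted.
bre_out=$(printf '%s\n' "$input" | sed '/^(A).*(lorem)/d')
printf '%s\n' "$bre_out"    # prints: A,foo,lorem
```

The line the asker wanted to delete survives, which is the same behavior that made sed appear to print the whole file.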

Your regular expression would have worked if you had used GNU's sed -r or BSD's sed -E; either flag switches to POSIX extended regular expressions, where ( and ) are used to group and \( and \) match the literal brackets.

In conclusion, the following commands all do the same thing:

  • sed '/^A.*lorem$/d' file.txt
  • sed -r '/^(A).*(lorem)$/d' file.txt (with GNU sed)
  • sed -E '/^(A).*(lorem)$/d' file.txt (with BSD sed and modern GNU sed)
  • sed '/^\(A\).*\(lorem\)$/d' file.txt
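The equivalence of these commands can be checked against the sample data from the question. The sketch below assumes a temporary file created with mktemp (the variable name sample is hypothetical) and leaves out the -r form, which the answer ties to GNU sed.

```shell
# Recreate the question's sample data in a temporary file.
sample=$(mktemp "${TMPDIR:-/tmp}/sample.XXXXXX")
cat > "$sample" <<'EOF'
A,12121212121212,foo,bar,lorem
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,21212121212121,foo,bar,lorem
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,9999,88888,77777
EOF

plain=$(sed '/^A.*lorem$/d' "$sample")            # no grouping at all
ere=$(sed -E '/^(A).*(lorem)$/d' "$sample")       # extended regex, ( ) group
bre=$(sed '/^\(A\).*\(lorem\)$/d' "$sample")      # basic regex, \( \) group

# Compare the three results, then show one of them.
if [ "$plain" = "$ere" ] && [ "$plain" = "$bre" ]; then
    echo "all three commands produce the same output"
fi
printf '%s\n' "$plain"
rm -f "$sample"
```

The printed result should be the expected output from the question: the seven C lines followed by A,9999,88888,77777.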

Answered by James Brown

$ sed '/^A.*lorem$/d' file.txt
  • ^A: starts with an A
  • .*: stuff in the middle
  • lorem$: ends with lorem
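Before deleting anything, it can help to preview which lines the pattern selects. A minimal sketch, using three lines taken from the question's sample: sed's -n option suppresses the default output and the p command prints only the matching lines.

```shell
# Three lines from the question's sample data.
input='A,12121212121212,foo,bar,lorem
C,32JL,JL
A,9999,88888,77777'

# -n plus p lists exactly the lines that the d command would delete.
matched=$(printf '%s\n' "$input" | sed -n '/^A.*lorem$/p')
printf '%s\n' "$matched"    # prints: A,12121212121212,foo,bar,lorem
```

If the preview lists the right lines, swapping -n and p for d gives the deletion command from this answer.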

Answered by Chem-man17

Remove the brackets.

Using your code, the appropriate one-liner becomes:

sed '/^A.*lorem/d' file.txt

If you want to be more rigorous, you can look at James's answer, which anchors the end of the regex with $:

sed '/^A.*lorem$/d' file.txt

Both will work on the sample data.
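The two patterns only differ on lines where "lorem" appears somewhere other than at the end. The sketch below uses made-up lines (not from the question) to show that difference.

```shell
# Made-up lines: the second has "lorem" in the middle, not at the end.
input='A,1,foo,lorem
A,2,lorem,bar
C,32JL,JL'

# Without $, any line starting with A that contains lorem is deleted.
loose=$(printf '%s\n' "$input" | sed '/^A.*lorem/d')
# With $, the line must also end with lorem to be deleted.
strict=$(printf '%s\n' "$input" | sed '/^A.*lorem$/d')

printf 'without $:\n%s\n' "$loose"
printf 'with $:\n%s\n' "$strict"
```

Here the unanchored pattern also removes A,2,lorem,bar, while the anchored one keeps it, so the $ version is the safer match for "ends with lorem".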