Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1212799/

Time: 2020-09-09 00:30:27  Source: igfitidea

How do I extract lines between two line delimiters in Perl?

perl string extract delimiter

Asked by jbatista

I have an ASCII log file with some content I would like to extract. I've never taken time to learn Perl properly, but I figure this is a good tool for this task.

The file is structured like this:

... 
... some garbage 
... 
... garbage START
what i want is 
on different
lines 
END 
... 
... more garbage ...
next one START 
more stuff I want, again
spread 
through 
multiple lines 
END 
...
more garbage

So, I'm looking for a way to extract the lines between each pair of START and END delimiter strings. How can I do this?

So far, I've only found some examples of how to print a line containing the START string, and other documentation items that are only loosely related to what I'm looking for.

Answered by Telemachus

You want the flip-flop operator (better known as the range operator) ..

#!/usr/bin/env perl
use strict;
use warnings;

while (<>) {
  if (/START/../END/) {
    next if /START/ || /END/;
    print;
  }
}

Replace the call to print with whatever you actually want to do (e.g., push the line into an array, edit it, format it, whatever). I'm next-ing past the lines that actually contain START or END, but you may not want that behavior. See this article for a discussion of this operator and other useful Perl special variables.
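As an illustration of the "push the line into an array" variant, here is a minimal sketch that collects each START..END block into its own array. The sub name extract_blocks and the sample lines are made up for this example. It assumes every START is eventually followed by an END, and that START and END never share a line.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): returns a list of
# array refs, one per START..END block, without the delimiter lines.
sub extract_blocks {
    my @lines = @_;
    my (@blocks, $current);
    for (@lines) {
        if (/START/ .. /END/) {
            if (/START/) { $current = []; next; }            # open a new block
            if (/END/)   { push @blocks, $current; next; }   # close the block
            push @$current, $_;                              # a line we want
        }
    }
    return @blocks;
}

# Sample input modelled on the log layout in the question.
my @sample = (
    "... garbage START\n",
    "what i want is\n",
    "on different\n",
    "lines\n",
    "END\n",
    "... more garbage ...\n",
);

my @blocks = extract_blocks(@sample);
print "Block ", $_ + 1, ":\n", @{ $blocks[$_] } for 0 .. $#blocks;
```

Keeping the blocks separate makes it easy to post-process each one later, for instance writing each to its own file.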

Answered by brian d foy

From perlfaq6's answer to "How can I pull out lines between two patterns that are themselves on different lines?"

You can use Perl's somewhat exotic .. operator (documented in perlop):

perl -ne 'print if /START/ .. /END/' file1 file2 ...
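Note that this one-liner also prints the START and END lines themselves. As a hedged sketch (the sub name inner_lines is made up for illustration), the value returned by .. can be used to drop them: perlop documents that it is a sequence number starting at 1, with "E0" appended on the last line of the range.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: keeps only the lines strictly between START and END.
sub inner_lines {
    my @kept;
    for (@_) {
        my $seq = /START/ .. /END/;   # '' outside a range; 1, 2, ... inside
        next unless $seq;             # not inside a range
        next if $seq == 1;            # first line of the range (START)
        next if $seq =~ /E0$/;        # last line of the range (END)
        push @kept, $_;
    }
    return @kept;
}

# Sample input modelled on the log layout in the question.
print inner_lines(
    "... garbage START\n",
    "what i want is\n",
    "on different\n",
    "END\n",
    "more garbage\n",
);
```

This avoids matching each line against the delimiter patterns a second time, at the cost of relying on a less familiar feature of the operator.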

If you wanted text and not lines, you would use

perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ...

But if you want nested occurrences of START through END, you'll run up against the problem described in the question in this section on matching balanced text.
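For the non-nested case, the slurp-mode one-liner can also be written as a small script. This is only a sketch: the sub name text_between and the sample string are assumptions for illustration. It shows the two pieces that matter, the /s modifier (so . also matches newlines) and the non-greedy .*? (so each START pairs with the nearest END).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns the text found between each START/END pair
# in a string that holds the whole file contents.
sub text_between {
    my ($content) = @_;
    # In list context, a /g match returns every captured group in order.
    my @chunks = $content =~ /START(.*?)END/gs;
    return @chunks;
}

# Sample input modelled on the log layout in the question.
my $log = "... garbage START\nwhat i want is\non different\nlines\nEND\n... more garbage ...\n";
print "[$_]\n" for text_between($log);
```

Unlike the line-based versions, this also keeps any text that follows START on the same line.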

Here's another example of using ..:

while (<>) {
    $in_header =   1  .. /^$/;
    $in_body   = /^$/ .. eof;
# now choose between them
} continue {
    $. = 0 if eof;  # fix $.
}
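To make the "choose between them" step concrete, here is a minimal sketch, assuming a mail-like message whose header and body are separated by one blank line. The sub name split_message, the in-memory filehandle, and the sample text are assumptions for illustration, not part of perlfaq6. Because only one handle is read, the $. reset from the continue block is not needed here.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: splits a message into header lines and body lines,
# leaving out the blank separator line.
sub split_message {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my (@header, @body);
    while (<$fh>) {
        my $in_header =   1  .. /^$/;   # a constant operand is compared with $.
        my $in_body   = /^$/ .. eof;    # bare eof tests the handle last read
        # The blank line ends the header range ("E0" suffix) and is
        # sequence number 1 of the body range, so both tests skip it.
        push @header, $_ if $in_header && $in_header !~ /E0$/;
        push @body,   $_ if $in_body   && $in_body > 1;
    }
    close $fh;
    return (\@header, \@body);
}

my ($header, $body) = split_message("From: a\nTo: b\n\nhello\nworld\n");
print "Header:\n", @$header, "Body:\n", @$body;
```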

Answered by dala

Not too bad, coming from a "virtual newcomer". One thing you could do is put the "$found=1" inside the "if($found == 0)" block, so that you don't repeat that assignment on every line between $start and $stop.

Another thing that is a bit ugly, in my opinion, is that you open the same filehandle on every line inside the $start/$stop block.

This shows a way around that:

#!/usr/bin/perl

use strict;
use warnings;

my $start='CINFILE=$';
my $stop='^#$';
my $filename;
my $output;
my $counter=1;
my $found=0;

while (<>) {

    # Find block of lines to extract
    if( /$start/../$stop/ ) {

        # Start of block: open one output file per block
        if( /$start/ ) {
            $filename=sprintf("boletim_%06d.log",$counter);
            open($output,'>>',$filename) or die $!;
        }
        # End of block
        elsif ( /$stop/ ) {
            close($output);
            $counter++;
            $found = 0;
        }
        # Middle of block
        else{
            if($found == 0) {
                # First line of the block: print only the second space-separated field
                print $output (split(/ /))[1];
                $found=1;
            }
            else {
                print $output $_;
            }
        }

    }

}

Answered by Dirk

How can I grab multiple lines after a matching line in Perl?

How's that one? In that one, the END string is $^, you can change it to your END string.

I am also a novice, but the solutions there provide quite a few methods... let me know more specifically what it is you want that differs from the above link.

Answered by ghostdog74

while (<>) {
    chomp;      # strip record separator
    if(/END/) { $f=0;}
    if (/START/) {
        s/.*START//g;
        $f=1;
    }
    print $_ ."\n" if $f;
}

Try to write some code next time round.

Answered by jbatista

After Telemachus' reply, things started pouring out. This works as the solution I'm looking at after all.

  1. I'm trying to extract the lines delimited by two strings, each on its own line (one is a line ending with "CINFILE="; the other is a line containing a single "#"), excluding the delimiter lines themselves. This I can do with Telemachus' solution.
  2. The first line has a space I want to remove. I'm also including it.
  3. I'm also trying to extract each line-set into separate files.

This works for me, although the code can be classified as ugly; this is because I'm currently a virtual newcomer to Perl. Anyway, here goes:

#!/usr/bin/env perl
use strict;
use warnings;

my $start='CINFILE=$';
my $stop='^#$';
my $filename;
my $output;
my $counter=1;
my $found=0;

while (<>) {
  if (/$start/../$stop/) {
    $filename=sprintf("boletim_%06d.log",$counter);
    open($output,'>>'.$filename) or die $!;
    next if /$start/ || /$stop/;
    if($found == 0) { print $output (split(/ /))[1]; }
    else { print $output $_; }
    $found=1;
  } else { if($found == 1) { close($output); $counter++; $found=0; } }
}

I hope it benefits others as well. Cheers.
