Extract the required substring from another string - Perl

Note: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/11294116/

Time: 2020-09-09 01:31:40  Source: igfitidea

Tags: string, perl, split, grep

Asked by Amey

I want to extract a substring from a line in Perl. Let me explain giving an example:

fhjgfghjk3456mm   735373653736
icasd 666666666666
111111111111

In the above lines, I only want to extract the 12-digit number. I tried using the split function:

my @cc = split(/[0-9]{12}/,$line);
print @cc;

But what it does is remove the matched part of the string and store the residue in @cc. I want the part matching the pattern to be printed. How do I do that?

Accepted answer by PinkElephantsOnParade

The built-in variable $1 holds the text captured by the first pair of parentheses in the last successful regex match. A match used as a condition does not hand back the matched text; it only reports whether the match succeeded. The best solution here is to put parentheses around the part you want to match and then print $1.

my $strn = "fhjgfghjk3456mm 735373653736\nicasd\n666666666666 111111111111";
# Capture the first run of twelve digits; print it only if the match succeeded
if ($strn =~ m/([0-9]{12})/) {
    print "$1\n";
}

This makes our regex capture just the first twelve-digit number in the string, and we then read that capture from $1.
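As a minimal sketch of the same idea applied line by line, assuming the input arrives as separate lines like those in the question: the helper name first_twelve_digits is hypothetical and not part of the answer. Since $1 is only meaningful after a successful match, the helper tests the match before reading it.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first run of twelve digits in a line,
# or undef when the line contains none.
sub first_twelve_digits {
    my ($line) = @_;
    # $1 is only set by a successful match, so test the match first
    return $line =~ /([0-9]{12})/ ? $1 : undef;
}

# Sample lines taken from the question
my @lines = (
    'fhjgfghjk3456mm   735373653736',
    'icasd 666666666666',
    '111111111111',
);

for my $line (@lines) {
    my $number = first_twelve_digits($line);
    print defined $number ? "$number\n" : "no match\n";
}
```

Returning undef for a line without a match keeps a stale $1 from an earlier match from leaking into the output.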

Answer by simbabque

You can do it with regular expressions:

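#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';  # say() is not enabled by default

my $string = 'fhjgfghjk3456mm 735373653736 icasd 666666666666 111111111111';
# In scalar context, /g resumes after the previous match on each iteration
while ($string =~ m/\b(\d{12})\b/g) {
  say $1;
}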

Test the regex here: http://rubular.com/r/Puupx0zR9w

use YAPE::Regex::Explain;
print YAPE::Regex::Explain->new(qr/\b(\d+)\b/)->explain();

The regular expression:

(?-imsx:\b(\d+)\b)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
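To illustrate what the \b anchors contribute, here is a small sketch comparing the pattern with and without them. The two helper names and the sample text are assumptions made up for this comparison and do not come from the answer.

```perl
use strict;
use warnings;

# Hypothetical helper: twelve digits that form a whole "word"
sub twelve_digit_words {
    my ($text) = @_;
    # \b on both sides: the digits must not touch another word character
    return $text =~ /\b(\d{12})\b/g;
}

# Hypothetical helper: any twelve consecutive digits
sub twelve_digit_runs {
    my ($text) = @_;
    # no boundaries: a longer run of digits also yields a twelve-digit match
    return $text =~ /(\d{12})/g;
}

# Made-up sample: one 12-digit number and one 16-digit number
my $text = 'id 735373653736 and 1234567890123456';
print join(',', twelve_digit_words($text)), "\n";
print join(',', twelve_digit_runs($text)), "\n";
```

With the boundaries, only 735373653736 is printed. Without them, the first twelve digits of the 16-digit number (123456789012) are reported as well, which is probably not wanted when the goal is to find exactly 12-digit numbers.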

Answer by Rawkode

#!/bin/perl
my $var = 'fhjgfghjk3456mm 735373653736 icasd 666666666666 111111111111';
# A successful match sets $1 to the captured twelve digits
if ($var =~ m/(\d{12})/) {
  print "Twelve digits: $1.";
}

Answer by cdtits

#!/usr/bin/env perl

undef $/;                        # slurp mode: read all of DATA in one go
$text = <DATA>;
@res = $text =~ /\b\d{12}\b/g;   # list context with /g returns every match
print "@res\n";

__DATA__
fhjgfghjk3456mm   735373653736
icasd 666666666666
111111111111