Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/196520/

Date: 2020-08-24 21:54:19  Source: igfitidea

PHP: Best way to extract text within parenthesis?

php, parsing, string

Asked by Wilco

What's the best/most efficient way to extract text set between parentheses? Say I wanted to get the string "text" from the string "ignore everything except this (text)" in the most efficient manner possible.

So far, the best I've come up with is this:

$fullString = "ignore everything except this (text)";
$start = strpos('(', $fullString);
$end = strlen($fullString) - strpos(')', $fullString);

$shortString = substr($fullString, $start, $end);

Is there a better way to do this? I know in general using regex tends to be less efficient, but unless I can reduce the number of function calls, perhaps this would be the best approach? Thoughts?

Answered by Owen

I'd just do a regex and get it over with. Unless you are doing enough iterations that it becomes a huge performance issue, it's just easier to code (and to understand when you look back on it).

$text = 'ignore everything except this (text)';
preg_match('#\((.*?)\)#', $text, $match);
print $match[1];
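
A minimal sketch of the same regex wrapped in a function that checks whether anything matched before reading $match[1]. The function name extractParenthesized is hypothetical and not part of the answer.

```php
// Hypothetical helper (the name is an assumption): returns the text inside
// the first pair of parentheses, or null when nothing matches.
function extractParenthesized(string $text): ?string
{
    // preg_match() returns 1 on a match, 0 on no match, false on error
    if (preg_match('#\((.*?)\)#', $text, $match) === 1) {
        return $match[1];
    }
    return null;
}

var_dump(extractParenthesized('ignore everything except this (text)')); // string(4) "text"
var_dump(extractParenthesized('no parentheses here'));                  // NULL
```

Returning null keeps "no match" distinguishable from an empty pair such as "()", which yields an empty string.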

Answered by Edward Z. Yang

So, actually, the code you posted doesn't work: substr()'s parameters are $string, $start and $length, and strpos()'s parameters are $haystack, $needle. Slightly modified:

$str = "ignore everything except this (text)";
$start  = strpos($str, '(');
$end    = strpos($str, ')', $start + 1);
$length = $end - $start;
$result = substr($str, $start + 1, $length - 1);

Some subtleties: I used $start + 1 in the offset parameter in order to help PHP out while doing the strpos() search on the second parenthesis; we increment $start by one and reduce $length by one to exclude the parentheses from the match.

Also, there's no error checking in this code: you'll want to make sure $start and $end are not === false before performing the substr().
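
A sketch of what that error checking might look like, assuming you want null back when either parenthesis is missing. The function name extractBetweenParens is hypothetical.

```php
// Hypothetical helper (the name is an assumption): strpos()/substr()
// extraction that returns null when a parenthesis is missing.
function extractBetweenParens(string $str): ?string
{
    $start = strpos($str, '(');
    if ($start === false) {
        return null; // no opening parenthesis
    }
    // Search for ")" only after the opening parenthesis
    $end = strpos($str, ')', $start + 1);
    if ($end === false) {
        return null; // no closing parenthesis after the opening one
    }
    // Skip the "(" itself and stop just before the ")"
    return substr($str, $start + 1, $end - $start - 1);
}

var_dump(extractBetweenParens('ignore everything except this (text)')); // string(4) "text"
var_dump(extractBetweenParens('open ( only'));                          // NULL
```

The strict === false comparisons matter because strpos() can legitimately return 0, which is falsy.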

As for using strpos/substr versus regex: performance-wise, this code will beat a regular expression hands down. It's a little wordier though. I eat and breathe strpos/substr, so I don't mind this too much, but someone else may prefer the compactness of a regex.

Answered by Rob

Use a regular expression:

if( preg_match( '!\(([^\)]+)\)!', $text, $match ) )
    $text = $match[1];

Answered by Sachin Murali G

This is sample code to extract all the text between '[' and ']' and store it in 2 separate arrays (i.e. text inside the brackets in one array and text outside the brackets in another array).

   function extract_text($string)
   {
    $text_outside=array();
    $text_inside=array();
    $t="";
    for($i=0;$i<strlen($string);$i++)
    {
        if($string[$i]=='[')
        {
            $text_outside[]=$t;
            $t="";
            $t1="";
            $i++;
            while($i<strlen($string) && $string[$i]!=']') // stop at end of string if ']' is missing
            {
                $t1.=$string[$i];
                $i++;
            }
            $text_inside[] = $t1;

        }
        else {
            if($string[$i]!=']')
            $t.=$string[$i];
            else {
                continue;
            }

        }
    }
    if($t!="")
    $text_outside[]=$t;

    var_dump($text_outside);
    echo "\n\n";
    var_dump($text_inside);
  }

Output: extract_text("hello how are you?"); will produce:

array(1) {
  [0]=>
  string(18) "hello how are you?"
}

array(0) {
}

extract_text("hello [http://www.google.com/test.mp3] how are you?"); will produce

array(2) {
  [0]=>
  string(6) "hello "
  [1]=>
  string(13) " how are you?"
}


array(1) {
  [0]=>
  string(30) "http://www.google.com/test.mp3"
}
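
For comparison, a rough regex-based sketch of the same split. This is a different technique from the loop above, not the answer's method, and the function name splitBracketed is hypothetical. It returns the two arrays instead of dumping them.

```php
// Hypothetical helper (the name is an assumption): regex-based variant that
// returns the text inside and outside [...] pairs.
function splitBracketed(string $string): array
{
    // Group 1 collects the text inside each [...] pair
    preg_match_all('/\[([^\]]*)\]/', $string, $m);
    // Splitting on the pairs leaves the text outside them;
    // PREG_SPLIT_NO_EMPTY drops empty pieces
    $outside = preg_split('/\[[^\]]*\]/', $string, -1, PREG_SPLIT_NO_EMPTY);
    return ['inside' => $m[1], 'outside' => $outside];
}

print_r(splitBracketed('hello [http://www.google.com/test.mp3] how are you?'));
```

Unlike the loop version, this sketch never stores empty strings in the "outside" array, so results can differ for inputs that start with a bracket or contain adjacent brackets.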

Answered by vijay

This function may be useful.

    function getStringBetween($str, $from, $to, $withFromAndTo = false)
    {
       // Everything after the first occurrence of $from
       $sub = substr($str, strpos($str, $from) + strlen($from), strlen($str));
       // Cut at the last occurrence of $to
       if ($withFromAndTo)
         return $from . substr($sub, 0, strrpos($sub, $to)) . $to;
       else
         return substr($sub, 0, strrpos($sub, $to));
    }
    $inputString = "ignore everything except this (text)";
    $outputString = getStringBetween($inputString, '(', ')');
    echo $outputString;
    // output will be: text

    $outputString = getStringBetween($inputString, '(', ')', true);
    echo $outputString;
    // output will be: (text)

strpos() => finds the position of the first occurrence of a substring in a string.

strrpos() => finds the position of the last occurrence of a substring in a string.
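
A small sketch of the difference: strpos() reports the first occurrence and strrpos() the last, which matters once the input contains more than one pair of parentheses. The sample string is made up for illustration.

```php
$s = 'a (b) c (d)'; // illustrative input, not from the answer

$first = strpos($s, ')');  // position of the first ")"
$last  = strrpos($s, ')'); // position of the last ")"
var_dump($first); // int(4)
var_dump($last);  // int(10)

// Cutting from the first "(" to the last ")" therefore spans both groups:
$open = strpos($s, '(');
var_dump(substr($s, $open + 1, $last - $open - 1)); // string(7) "b) c (d"
```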

Answered by user628176

function getStringsBetween($str, $start = '[', $end = ']', $with_from_to = true) {
    $arr = [];
    $last_pos = strpos($str, $start);
    while ($last_pos !== false) {
        // Find the closing delimiter that follows the current opening one
        $t = strpos($str, $end, $last_pos);
        if ($t === false) {
            break; // unmatched opening delimiter: stop
        }
        $arr[] = ($with_from_to ? $start : '') . substr($str, $last_pos + 1, $t - $last_pos - 1) . ($with_from_to ? $end : '');
        $last_pos = strpos($str, $start, $last_pos + 1);
    }
    return $arr;
}

this is a little improvement to the previous answer that will return all patterns in array form:

getStringsBetween('[T]his[] is [test] string [pattern]') will return:

Array ( [0] => [T] [1] => [] [2] => [test] [3] => [pattern] )

Answered by Wiktor Stribiżew

The already posted regex solutions - \((.*?)\) and \(([^\)]+)\) - do not return the innermost strings between an opening and a closing bracket. If the string is Text (abc(xyz 123) they both return (abc(xyz 123) as a whole match, and not (xyz 123).
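
A quick check of that claim, using the two earlier patterns with the delimiters from their original answers:

```php
$s = 'Text (abc(xyz 123)';

preg_match('#\((.*?)\)#', $s, $lazy);       // lazy dot pattern
preg_match('!\(([^\)]+)\)!', $s, $negated); // negated character class

var_dump($lazy[0]);    // string(13) "(abc(xyz 123)"
var_dump($negated[0]); // string(13) "(abc(xyz 123)"
```

Both patterns start matching at the first "(" and happily run across the second one, because nothing in them forbids an opening parenthesis inside the match.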

The pattern that matches substrings in parentheses without other open and close parentheses in between (use with preg_match to fetch the first and preg_match_all to fetch all occurrences) is, if the match should include parentheses:

\([^()]*\)

Or, if you want to get values without parentheses:

\(([^()]*)\)        // get Group 1 values after a successful call to preg_match_all, see code below
\(\K[^()]*(?=\))    // this and the one below get the values without parentheses as whole matches 
(?<=\()[^()]*(?=\)) // less efficient, not recommended

Replace * with + if there must be at least 1 char between ( and ).

Details:

  • \( - an opening round bracket (must be escaped to denote a literal parenthesis as it is used outside a character class)
  • [^()]* - zero or more characters other than ( and ) (note that ( and ) do not have to be escaped inside a character class: there they cannot be used to specify a grouping and are treated as literal parentheses)
  • \) - a closing round bracket (must be escaped to denote a literal parenthesis as it is used outside a character class).

The \(\K part in an alternative regex matches ( and omits it from the match value (with the \K match reset operator). (?<=\() is a positive lookbehind that requires a ( to appear immediately to the left of the current location, but the ( is not added to the match value since lookbehind (lookaround) patterns are not consuming. (?=\)) is a positive lookahead that requires a ) char to appear immediately to the right of the current location.
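
A short sketch comparing the three ways of getting the values without parentheses. The ~ delimiters and variable names are choices made for this example.

```php
$s = 'ignore everything except this (text) and (that (text here))';

preg_match_all('~\(([^()]*)\)~', $s, $group);         // values in capture group 1
preg_match_all('~\(\K[^()]*(?=\))~', $s, $reset);     // \K drops the "(" from the match
preg_match_all('~(?<=\()[^()]*(?=\))~', $s, $around); // lookbehind + lookahead

print_r($group[1]);  // text, text here
print_r($reset[0]);  // text, text here
print_r($around[0]); // text, text here
```

All three give the same values here; they differ only in whether the result sits in a capture group or in the whole match.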

PHP code:

$fullString = 'ignore everything except this (text) and (that (text here))';
if (preg_match_all('~\(([^()]*)\)~', $fullString, $matches)) {
    print_r($matches[0]); // Get whole match values
    print_r($matches[1]); // Get Group 1 values
}

Output:

Array ( [0] => (text)  [1] => (text here) )
Array ( [0] => text    [1] => text here   )

Answered by rüff0

I think this is the fastest way to get the words between the first pair of parentheses in a string.

$string = 'ignore everything except this (text)';
$string = explode(')', (explode('(', $string)[1]))[0];
echo $string;
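
The one-liner assumes the string contains a "(". A guarded sketch of the same explode() idea, with a hypothetical function name, might look like this:

```php
// Hypothetical helper (the name is an assumption): explode()-based extraction
// that returns null when either parenthesis is missing.
function firstParenthesized(string $string): ?string
{
    // Split once at the first "("; a single piece means there was none
    $afterOpen = explode('(', $string, 2);
    if (count($afterOpen) < 2) {
        return null;
    }
    // Split the remainder once at the first ")"
    $beforeClose = explode(')', $afterOpen[1], 2);
    if (count($beforeClose) < 2) {
        return null;
    }
    return $beforeClose[0];
}

var_dump(firstParenthesized('ignore everything except this (text)')); // string(4) "text"
var_dump(firstParenthesized('nothing to extract'));                   // NULL
```

Returning null for an unclosed parenthesis is a design choice of this sketch; the one-liner would return everything after the "(" in that case.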