PHP: matching the SRC attribute of an IMG tag using preg_match

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2180255/

Time: 2020-08-25 05:22:46  Source: igfitidea

Matching SRC attribute of IMG tag using preg_match

Tags: php, regex, parsing, preg-match, src

Asked by KyokoHunter

I'm attempting to run preg_match to extract the SRC attribute from the first IMG tag in an article (in this case, stored in $row->introtext).

preg_match('/\< *[img][^\>]*[src] *= *[\"\']{0,1}([^\"\']*)/i', $row->introtext, $matches);

Instead of getting something like

images/stories/otakuzoku1.jpg

from

<img src="images/stories/otakuzoku1.jpg" border="0" alt="Inside Otakuzoku's store" />

I get just

0

The regex should be right, but I can't tell why it appears to be matching the border attribute and not the src attribute.

Alternatively, if you've had the patience to read this far without skipping straight to the reply field and typing 'use an HTML/XML parser', can anyone recommend a good tutorial for one? I'm having trouble finding any that apply to PHP 4.

PHP 4.4.7

Answered by CalebD

Your expression is incorrect. Try:

preg_match('/< *img[^>]*src *= *["\']?([^"\']*)/i', $row->introtext, $matches);

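Note the removal of the square brackets around img and src, plus some other cleanups. In a regex, [img] is a character class that matches a single character (i, m, or g), not the literal string img.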
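To see why the original pattern returned 0, here is a minimal sketch. It assumes the sample tag from the question as input; the variable names ($html, $broken, $fixed) are only illustrative. Because [src] matches any one of "s", "r", or "c", and the greedy [^>]* backtracks from the end of the tag, the last such character followed by "=" wins, which is the "r" in border.

```php
<?php
// Sample tag taken from the question.
$html = '<img src="images/stories/otakuzoku1.jpg" border="0" alt="Inside Otakuzoku\'s store" />';

// Original pattern: [img] and [src] are character classes, so each
// matches ONE character. The greedy [^\>]* then backtracks to the last
// "s", "r" or "c" that is followed by "=", i.e. the "r" of border.
$broken = '/\< *[img][^\>]*[src] *= *[\"\']{0,1}([^\"\']*)/i';
preg_match($broken, $html, $m);
$brokenResult = $m[1];

// Corrected pattern: img and src are matched as literal strings.
$fixed = '/< *img[^>]*src *= *["\']?([^"\']*)/i';
preg_match($fixed, $html, $m);
$fixedResult = $m[1];

echo $brokenResult, "\n"; // 0
echo $fixedResult, "\n";  // images/stories/otakuzoku1.jpg
```

The captured value is always in index 1 of the matches array; index 0 holds the whole matched text.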

Answered by GZipp

Here's a way to do it with built-in functions (PHP >= 4):

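// $html holds the markup to scan (for example $row->introtext).
// This is an XML parser, so malformed HTML may stop parsing early.
$first_src = null;  // stays null if no IMG tag with a SRC is found
$parser = xml_parser_create();
xml_parse_into_struct($parser, $html, $values);
// Case folding is on by default, so tag and attribute names are upper-cased.
foreach ($values as $val) {
    if ($val['tag'] == 'IMG' && isset($val['attributes']['SRC'])) {
        $first_src = $val['attributes']['SRC'];
        break;
    }
}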

echo $first_src;  // images/stories/otakuzoku1.jpg

Answered by Ajmal Salim

If you need to use preg_match() itself, try this:

 preg_match('/(?<!_)src=([\'"])?(.*?)\1/',$content, $matches);
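As a rough usage sketch of the pattern above: the first capture group holds the opening quote and the second holds the attribute value, so the URL is read from index 2. The input string and the lazy_src attribute are made up for illustration; they show how the (?<!_) lookbehind skips attribute names that merely end in _src.

```php
<?php
// Pattern from the answer above, unchanged.
$pattern = '/(?<!_)src=([\'"])?(.*?)\1/';

// Hypothetical input: lazy_src is an invented attribute that the
// negative lookbehind (?<!_) is meant to skip.
$content = '<img lazy_src="placeholder.png" src="images/stories/otakuzoku1.jpg" border="0" />';

$src = null;
if (preg_match($pattern, $content, $matches)) {
    // $matches[1] is the quote character, $matches[2] is the value.
    $src = $matches[2];
}

echo $src, "\n"; // images/stories/otakuzoku1.jpg
```

Note that this pattern looks for src= anywhere in the string, so it is not limited to img tags.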

Answered by Bart Kiers

Try:

include ("htmlparser.inc"); // from: http://php-html.sourceforge.net/

$html = 'bla <img src="images/stories/otakuzoku1.jpg" border="0" alt="Inside Otakuzoku\'s store" /> noise <img src="das" /> foo';

$parser = new HtmlParser($html);

while($parser->parse()) {
    if($parser->iNodeName == 'img') {
        echo $parser->iNodeAttributes['src'];
        break;
    }
}

which will produce:

images/stories/otakuzoku1.jpg

It should work with PHP 4.x.

Answered by WNRosenberg

The regex I used was much simpler. My code assumes that the string being passed to it contains exactly one img tag with no other markup:

$pattern = '/src="([^"]*)"/';

See my answer here for more info: How to extract img src, title and alt from html using php?
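The assumption stated in this answer matters in practice. The sketch below, with made-up input strings, shows the pattern working on a lone img tag and then picking up the wrong value when another tag with a src attribute comes first.

```php
<?php
// Pattern from the answer above, unchanged.
$pattern = '/src="([^"]*)"/';

// Case 1: the string contains exactly one img tag (the stated assumption).
$only_img = '<img src="pic.png" alt="A picture">';
preg_match($pattern, $only_img, $m);
$from_img = $m[1];

// Case 2 (hypothetical): other markup with a src attribute comes first,
// so the pattern captures that value instead of the image's.
$mixed = '<script src="app.js"></script><img src="pic.png">';
preg_match($pattern, $mixed, $m);
$from_mixed = $m[1];

echo $from_img, "\n";   // pic.png
echo $from_mixed, "\n"; // app.js
```

The pattern also requires double quotes around the value, so single-quoted or unquoted attributes are not matched.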

Answered by mickmackusa

This task should be executed by a DOM parser, because regex is DOM-ignorant.

Code: (Demo)

$row = (object)['introtext' => '<div>test</div><img src="source1"><p>text</p><img src="source2"><br>'];

$dom = new DOMDocument();
$dom->loadHTML($row->introtext);
echo $dom->getElementsByTagName('img')->item(0)->getAttribute('src');

Output:

source1

This says:

  1. Parse the whole html string
  2. Isolate all of the img tags
  3. Isolate the first img tag
  4. Isolate its src attribute value

Clean, appropriate, easy to read and manage.
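For input that may be empty or contain no image, the same four steps could be wrapped in a small helper. This is only a sketch: the function name first_img_src is invented here, it assumes the DOM extension is available (PHP 5 or later, so not the asker's PHP 4.4.7), and it silences libxml warnings about sloppy HTML while parsing.

```php
<?php
// Hypothetical helper: returns the src of the first img tag, or null.
function first_img_src($html)
{
    // loadHTML() rejects an empty string, so bail out early.
    if ($html === '') {
        return null;
    }

    $dom = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // item(0) is null when the document has no img element.
    $img = $dom->getElementsByTagName('img')->item(0);
    if ($img === null || !$img->hasAttribute('src')) {
        return null;
    }

    return $img->getAttribute('src');
}

$sample = '<div>test</div><img src="source1"><p>text</p><img src="source2"><br>';
echo first_img_src($sample), "\n"; // source1
```

Returning null rather than an empty string lets the caller tell "no image found" apart from an image whose src is empty.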
