PHP: matching the SRC attribute of an IMG tag using preg_match
Original question: http://stackoverflow.com/questions/2180255/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Matching SRC attribute of IMG tag using preg_match
Asked by KyokoHunter
I'm attempting to run preg_match to extract the SRC attribute from the first IMG tag in an article (in this case, stored in $row->introtext).
preg_match('/\< *[img][^\>]*[src] *= *[\"\']{0,1}([^\"\']*)/i', $row->introtext, $matches);
Instead of getting something like
images/stories/otakuzoku1.jpg
from
<img src="images/stories/otakuzoku1.jpg" border="0" alt="Inside Otakuzoku's store" />
I get just
0
The regex should be right, but I can't tell why it appears to be matching the border attribute and not the src attribute.
Alternatively, if you've had the patience to read this far without skipping straight to the reply field and typing 'use an HTML/XML parser', can you recommend a good tutorial for one? I'm having trouble finding any that applies to PHP 4.
PHP 4.4.7
Answer by CalebD
Your expression is incorrect. Try:
preg_match('/< *img[^>]*src *= *["\']?([^"\']*)/i', $row->introtext, $matches);
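Note that the brackets around img and src are removed: in a regex, [img] is a character class that matches a single character (i, m, or g), not the literal string img. A few other cleanups are included as well.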
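As a quick check of the explanation above, the following sketch runs both patterns against the sample tag from the question. The variable names ($html, $brokenMatches, $fixedMatches) are illustrative and do not come from the original posts. Because [^>]* is greedy, the engine backtracks from the end of the tag, and the last place where a single character from [src] is followed by = is the r in border, which is why the original pattern captures 0.

```php
<?php
// Sample tag from the question.
$html = '<img src="images/stories/otakuzoku1.jpg" border="0" alt="Inside Otakuzoku\'s store" />';

// Original pattern: [img] and [src] are character classes, each matching ONE character.
$broken = '/\< *[img][^\>]*[src] *= *[\"\']{0,1}([^\"\']*)/i';
preg_match($broken, $html, $brokenMatches);

// Corrected pattern from the answer: img and src are matched as literal strings.
$fixed = '/< *img[^>]*src *= *["\']?([^"\']*)/i';
preg_match($fixed, $html, $fixedMatches);

echo $brokenMatches[1], "\n"; // 0
echo $fixedMatches[1], "\n";  // images/stories/otakuzoku1.jpg
```

Note that the corrected pattern still relies on greedy matching inside a single tag, so it picks the last src-like attribute of the first img tag it can match.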
Answer by GZipp
Here's a way to do it with built-in functions (PHP >= 4):
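// $html holds the markup to search, e.g. $row->introtext; the XML parser expects well-formed markup
$first_src = null;
$parser = xml_parser_create();
xml_parse_into_struct($parser, $html, $values);
// Case folding is on by default, so tag and attribute names are upper-cased
foreach ($values as $key => $val) {
    if ($val['tag'] == 'IMG') {
        $first_src = $val['attributes']['SRC'];
        break;
    }
}
echo $first_src; // images/stories/otakuzoku1.jpg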
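A self-contained variant of the same idea might look like the sketch below. The helper name first_img_src_xml and the dummy root wrapper are assumptions added for illustration, not part of the answer. Wrapping the fragment in a root element is one way to let the XML parser accept text and several top-level tags, but the markup still has to be well-formed XML (closed tags, quoted attributes, no HTML-only entities).

```php
<?php
// Hypothetical helper (not from the answer): returns the first IMG SRC, or null.
function first_img_src_xml($fragment)
{
    // Assumption: a dummy root element lets the parser accept text and
    // several top-level tags. The fragment must still be well-formed XML.
    $xml = '<root>' . $fragment . '</root>';

    $parser = xml_parser_create();
    $ok = xml_parse_into_struct($parser, $xml, $values);
    if (!$ok) {
        return null; // the parser reported an error (not well-formed)
    }

    // Case folding is on by default, so names arrive upper-cased.
    foreach ($values as $val) {
        if ($val['tag'] == 'IMG' && isset($val['attributes']['SRC'])) {
            return $val['attributes']['SRC'];
        }
    }
    return null;
}

echo first_img_src_xml('bla <img src="images/stories/otakuzoku1.jpg" border="0" /> noise <img src="das" /> foo'), "\n";
// images/stories/otakuzoku1.jpg
```

An HTML-style unclosed tag such as <img src="a.jpg"> is not well-formed XML, so this helper returns null for it.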
Answer by Ajmal Salim
If you need to use preg_match() itself, try this:
preg_match('/(?<!_)src=([\'"])?(.*?)\1/', $content, $matches); // the src value ends up in $matches[2]
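To make the behavior of this pattern concrete, here is a small sketch. The data_src attribute and the file names lazy.jpg and plain.jpg are made up for illustration. It shows that the value is captured in group 2 (group 1 holds the quote character), that the lookbehind skips names ending in _src, and that unquoted values are not matched because \1 then refers to a group that did not participate.

```php
<?php
$pattern = '/(?<!_)src=([\'"])?(.*?)\1/';

// Group 1 holds the quote character, so the URL is in group 2.
// The lookbehind (?<!_) skips attribute names such as data_src.
$content = '<img data_src="lazy.jpg" src="images/stories/otakuzoku1.jpg" border="0" />';
preg_match($pattern, $content, $matches);
echo $matches[2], "\n"; // images/stories/otakuzoku1.jpg

// Unquoted values are not matched: \1 points at a group that was skipped.
$found = preg_match($pattern, '<img src=plain.jpg>', $unquoted);
echo $found, "\n"; // 0
```

The pattern has no i modifier, so it is case-sensitive and would not match an upper-case SRC attribute.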
Answer by Bart Kiers
Try:
include ("htmlparser.inc"); // from: http://php-html.sourceforge.net/
$html = 'bla <img src="images/stories/otakuzoku1.jpg" border="0" alt="Inside Otakuzoku\'s store" /> noise <img src="das" /> foo';
$parser = new HtmlParser($html);
while ($parser->parse()) {
    if ($parser->iNodeName == 'img') {
        echo $parser->iNodeAttributes['src'];
        break;
    }
}
which will produce:
images/stories/otakuzoku1.jpg
It should work with PHP 4.x.
Answer by WNRosenberg
The regex I used was much simpler. My code assumes that the string being passed to it contains exactly one img tag with no other markup:
$pattern = '/src="([^"]*)"/';
See my answer here for more info: How to extract img src, title and alt from html using php?
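For completeness, a minimal usage sketch of that pattern, under the same assumption that the input is a single img tag. The variable name $tag is illustrative.

```php
<?php
$pattern = '/src="([^"]*)"/';

// Assumption from the answer: the string is exactly one img tag.
$tag = '<img src="images/stories/otakuzoku1.jpg" border="0" alt="Inside Otakuzoku\'s store" />';

if (preg_match($pattern, $tag, $matches)) {
    echo $matches[1], "\n"; // images/stories/otakuzoku1.jpg
}
```

This version only handles double-quoted attribute values. It would also match any other attribute whose name ends in src, which is why the single-tag assumption matters.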
Answer by mickmackusa
This task should be handled by a DOM parser, because regex is DOM-ignorant: it sees only characters, not the document structure.
Code:
$row = (object)['introtext' => '<div>test</div><img src="source1"><p>text</p><img src="source2"><br>'];
$dom = new DOMDocument();
$dom->loadHTML($row->introtext);
echo $dom->getElementsByTagName('img')->item(0)->getAttribute('src');
Output:
source1
In plain terms, the code does the following:
- Parse the whole HTML string
- Isolate all of the img tags
- Isolate the first img tag
- Isolate its src attribute value
Clean, appropriate, easy to read and manage.
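The DOM approach above can be wrapped in a small helper that also copes with input containing no img tag, where item(0) returns null. This is only a sketch: the function name first_img_src_dom is hypothetical, and silencing libxml warnings with libxml_use_internal_errors() is a design choice added here, not something the answer prescribes. DOMDocument needs PHP 5 or later.

```php
<?php
// Hypothetical helper: returns the first img src, or null when there is none.
function first_img_src_dom($html)
{
    if (trim($html) === '') {
        return null; // loadHTML() rejects empty input
    }

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // item(0) is null when the document has no img element.
    $img = $dom->getElementsByTagName('img')->item(0);
    return $img === null ? null : $img->getAttribute('src');
}

echo first_img_src_dom('<div>test</div><img src="source1"><p>text</p><img src="source2"><br>'), "\n"; // source1
var_dump(first_img_src_dom('<p>no images here</p>')); // NULL
```

If the first img tag has no src attribute, getAttribute() returns an empty string, so callers may want to treat '' and null the same way.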

