PHP preg_match_all <a href

Question

Asked by streetparade

Hello, I want to extract links such as <a href="/portal/clients/show/entityId/2121" > and I want a regex which gives me /portal/clients/show/entityId/2121. The number at the end (2121) is different in other links. Any ideas?

Answer 1

Accepted answer by Yacoby

A regex for parsing links looks something like this:

'/<a\s+(?:[^"'>]+|"[^"]*"|'[^']*')*href=("[^"]+"|'[^']+'|[^<>\s]+)/i'

Given how horrible that is, I would recommend using Simple HTML Dom, at least for getting the links. You could then check each link with a very basic regex on its href.
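As a minimal sketch of that last step: once the href values have been collected by a parser, a short anchored pattern should be enough to keep only the client links and pull out the trailing number. The function name extractEntityId and the sample hrefs are made up for illustration; the path prefix is the one from the question.

```php
<?php
// Hypothetical helper: returns the numeric ID at the end of a client link,
// or null if the href does not have the expected shape.
function extractEntityId(string $href): ?string
{
    // '#' is used as the delimiter so the slashes in the path need no escaping.
    if (preg_match('#^/portal/clients/show/entityId/(\d+)$#', $href, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Sample hrefs (assumed input, e.g. collected by an HTML parser).
$hrefs = [
    '/portal/clients/show/entityId/2121',
    '/about',
    '/portal/clients/show/entityId/77',
];

foreach ($hrefs as $href) {
    $id = extractEntityId($href);
    if ($id !== null) {
        echo $href . ' => ' . $id . "\n";
    }
}
```

Anchoring the pattern with ^ and $ keeps it from matching links that merely contain the path somewhere in the middle.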

Answer 2

Answer by karim79

Simple PHP HTML DOM Parser example:

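// Load the library (assumes simple_html_dom.php is on the include path)
include 'simple_html_dom.php';

// Create DOM from string
$html = str_get_html($links);

// or from a URL (the scheme is required)
$html = file_get_html('http://www.example.com');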

foreach($html->find('a') as $link) {
    echo $link->href . '<br />';
}

Answer 3

Answer by soulmerge

Don't use regular expressions for processing XML/HTML. This can be done very easily using the built-in DOM parser:

$doc = new DOMDocument();
$doc->loadHTML($htmlAsString);
$xpath = new DOMXPath($doc);
$nodeList = $xpath->query('//a/@href');
for ($i = 0; $i < $nodeList->length; $i++) {
    # Xpath query for attributes gives a NodeList containing DOMAttr objects.
    # http://php.net/manual/en/class.domattr.php
    echo $nodeList->item($i)->value . "<br/>\n";
}
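The same approach can be narrowed to the links from the question. The sketch below is one possible variation, not part of the original answer: the helper name findHrefsWithPrefix and the sample markup are assumptions, and libxml_use_internal_errors() is used so that sloppy markup does not emit warnings.

```php
<?php
// Hypothetical helper: collect href values that start with a given prefix.
function findHrefsWithPrefix(string $html, string $prefix): array
{
    // Collect libxml warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $hrefs = [];
    // starts-with() is an XPath 1.0 function.
    // Assumption: $prefix contains no double quotes.
    $query = '//a[starts-with(@href, "' . $prefix . '")]/@href';
    foreach ($xpath->query($query) as $attr) {
        $hrefs[] = $attr->value;
    }
    return $hrefs;
}

// Sample markup (assumed); note the unclosed <p> element.
$html = '<p><a href="/portal/clients/show/entityId/2121" >Client</a> '
      . '<a href="/about">About</a>';

print_r(findHrefsWithPrefix($html, '/portal/clients/show/entityId/'));
```

Filtering inside the XPath query means no regex is needed to separate the client links from the rest.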

Answer 4

Answer by streetparade

This is my solution:

<?php
// Download the contents of www.example.com
$website = file_get_contents("http://www.example.com");

// Capture the href value of every link. Explicit / delimiters are used here;
// in the earlier version, < and > were silently treated as the delimiters.
preg_match_all('/<a\s+href="(.+?)"/i', $website, $matches);

// $matches[1] already holds only the captured hrefs (no "a href=" and no quotes),
// so no further str_replace() cleanup is needed.
print_r($matches[1]);
?>

I recommend avoiding XML-based parsers, because you will not always know whether the document/website is well formed.

Best regards

Answer 5

Answer by Max

When "parsing" HTML I mostly rely on PHPQuery: http://code.google.com/p/phpquery/ rather than regex.

Answer 6

Answer by Bart Kiers

Parsing links from HTML can be done using an HTML parser.

When you have all the links, simply get the index of the last forward slash, and you have your number. No regex needed.
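A minimal sketch of the last-slash idea, using PHP's strrpos() and substr(). The helper name lastPathSegment is made up for illustration, and the sample link is the one from the question.

```php
<?php
// Hypothetical helper: return everything after the last "/" in a link.
function lastPathSegment(string $href): string
{
    $pos = strrpos($href, '/');
    // No slash at all: the whole string is the last segment.
    if ($pos === false) {
        return $href;
    }
    return substr($href, $pos + 1);
}

echo lastPathSegment('/portal/clients/show/entityId/2121'), "\n"; // prints 2121
```

This assumes the link has no trailing slash or query string; if it might, trimming those first would be needed.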
