Parsing HTML code in PHP

Question

Asked by Francisc

Possible Duplicate:
Best methods to parse HTML

How can I parse HTML code held in a PHP variable if it is something like:

<h1>T1</h1>Lorem ipsum.<h1>T2</h1>The quick red fox...<h1>T3</h1>... jumps over the lazy brown FROG!

I only want to get the text that's between the headings, and I understand that it's not a good idea to use regular expressions.

Answer 1

Answered by shamittomar

Use the PHP Document Object Model:

<?php
   $str = '<h1>T1</h1>Lorem ipsum.<h1>T2</h1>The quick red fox...<h1>T3</h1>... jumps over the lazy brown FROG';
   $DOM = new DOMDocument;
   $DOM->loadHTML($str);

   //get all H1
   $items = $DOM->getElementsByTagName('h1');

   //display all H1 text
   for ($i = 0; $i < $items->length; $i++)
        echo $items->item($i)->nodeValue . "<br/>";
?>

This outputs:

 T1
 T2
 T3

[EDIT]: After the OP's clarification:

If you want the content itself ("Lorem ipsum." and so on), you can strip the headings directly with this regex:

<?php
   $str = '<h1>T1</h1>Lorem ipsum.<h1>T2</h1>The quick red fox...<h1>T3</h1>... jumps over the lazy brown FROG';
   echo preg_replace("#<h1.*?>.*?</h1>#", "", $str);
?>

This outputs:

Lorem ipsum.The quick red fox...... jumps over the lazy brown FROG

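Since the question asks to avoid regular expressions, the same result can probably be reached with the DOM alone. The following is only a sketch under a few assumptions: `text_between_headings()` is a hypothetical helper name (not part of PHP), the input uses `<h1>` as the only section delimiter, and each section's text is everything up to the next `<h1>` sibling.

```php
<?php
// Hypothetical helper: maps each <h1> title to the text that follows it.
function text_between_headings(string $html): array
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sections = [];
    foreach ($dom->getElementsByTagName('h1') as $h1) {
        $text = '';
        // Walk the siblings after this heading until the next <h1>.
        for ($node = $h1->nextSibling; $node !== null; $node = $node->nextSibling) {
            if ($node->nodeName === 'h1') {
                break;
            }
            $text .= $node->textContent;
        }
        $sections[trim($h1->textContent)] = trim($text);
    }
    return $sections;
}

$str = '<h1>T1</h1>Lorem ipsum.<h1>T2</h1>The quick red fox...<h1>T3</h1>... jumps over the lazy brown FROG';
print_r(text_between_headings($str));
```

Unlike the `preg_replace` version, this keeps each piece of text associated with its heading, so the result is an array keyed by `T1`, `T2`, and `T3` rather than one concatenated string. Two headings with the same title would overwrite each other in this sketch.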
