How to parse HTML or convert HTML to XML so I can extract information from a website (in C#)

Question

Asked by Jerry

Possible Duplicate:
What is the best way to parse html in C#?

Is there a way to parse HTML, or convert HTML to XML, so that I can easily extract information from a website?

I'm working with C#.

Thank you,

Answer 1

Accepted answer by Habib

HTMLAgilityPack is what you are looking for. Check out this tutorial: Parsing HTML Document with HTMLAgilityPack
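As a rough illustration of what that looks like in practice, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced. The class name HtmlExtractionSketch and the methods ExtractLinks and ToXml are made up for this example and are not part of the library. It shows the two things the question asks about: pulling data out with XPath, and saving the parsed HTML as XML.

```csharp
// Sketch only: assumes the HtmlAgilityPack NuGet package is referenced.
// HtmlExtractionSketch, ExtractLinks and ToXml are hypothetical names.
using System.Collections.Generic;
using System.IO;
using HtmlAgilityPack;

public static class HtmlExtractionSketch
{
    // Returns a (link text, href) pair for every <a> element that has an href attribute.
    public static List<KeyValuePair<string, string>> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<KeyValuePair<string, string>>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return links;

        foreach (HtmlNode a in anchors)
        {
            string href = a.GetAttributeValue("href", string.Empty);
            links.Add(new KeyValuePair<string, string>(a.InnerText.Trim(), href));
        }
        return links;
    }

    // Re-serializes the parsed HTML as XML text.
    public static string ToXml(string html)
    {
        var doc = new HtmlDocument();
        doc.OptionOutputAsXml = true; // ask the library to emit XML when saving
        doc.LoadHtml(html);

        using (var writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

Calling HtmlExtractionSketch.ExtractLinks(html) on a downloaded page gives you the links directly. ToXml(html) is for when you would rather feed the result to the usual System.Xml classes. Note that InnerText is not HTML-decoded, so entities such as &amp;amp; come back as written in the page.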

Answer 2

Answered by Michael

You can use the COM objects in the Microsoft HTML Object Library to load HTML, and then use its object model to navigate around. An example is shown below:

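// Requires a COM reference to "Microsoft HTML Object Library" (mshtml).
using System;
using System.IO;
using System.Net;
using mshtml;

string html;
// Download the raw HTML of the page; WebClient is disposed along with the streams.
using (WebClient webClient = new WebClient())
using (Stream stream = webClient.OpenRead(new Uri("http://www.google.com")))
using (StreamReader reader = new StreamReader(stream))
{
  html = reader.ReadToEnd();
}
// Load the HTML into an MSHTML document and print the tag name of every element.
IHTMLDocument2 doc = (IHTMLDocument2)new HTMLDocument();
doc.write(html);
foreach (IHTMLElement el in doc.all)
  Console.WriteLine(el.tagName);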
