C# Html Agility Pack: loading and scraping a web page

Question

Asked by thatsIT

Is this the best way to get a web page when scraping?

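// Requires: using System.Net; and a reference to the HtmlAgilityPack package.
HttpWebRequest oReq = (HttpWebRequest)WebRequest.Create(url);
var doc = new HtmlAgilityPack.HtmlDocument();

// Dispose the response so the underlying connection is released.
using (HttpWebResponse resp = (HttpWebResponse)oReq.GetResponse())
{
    doc.Load(resp.GetResponseStream());
}

var element = doc.GetElementbyId("start-left"); // takes an element id, not an XPath expression
var element2 = doc.DocumentNode.SelectSingleNode("//body");
string html = doc.DocumentNode.OuterHtml;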

I've seen HtmlWeb().Load used to get a web page. Is that a better alternative for loading and then scraping the page?

OK, I'll try that instead.

HtmlDocument doc = web.Load(url);

Now that I have my doc, I don't see as many members on it. There is nothing like SelectSingleNode. The only one I can use is GetElementbyId, and that works, but I want to select an element by its class.

Do I need to do it like this?

var htmlBody = doc.DocumentNode.SelectSingleNode("//body");
htmlBody.SelectSingleNode("//paging");

Answer 1

Answered by Jacob Proffitt

It is much easier to use HtmlWeb.

string Url = "http://something";
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(Url);
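
As a follow-up to the class question above, here is a minimal sketch of selecting by class once the document is loaded. In Html Agility Pack, GetElementbyId is a method on HtmlDocument, while the XPath methods SelectSingleNode and SelectNodes are on HtmlNode, so they are reached through doc.DocumentNode. The helper name TextByClass, the class name "paging", and the inline sample HTML are assumptions made up for illustration; they are not from the original post.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ScrapeSketch
{
    // Hypothetical helper: returns the trimmed inner text of every element
    // whose class attribute contains className as a whole token.
    public static List<string> TextByClass(HtmlDocument doc, string className)
    {
        // The concat/normalize-space idiom matches "paging" inside "paging nav"
        // but not inside "pagingx".
        string xpath = "//*[contains(concat(' ', normalize-space(@class), ' '), ' "
                       + className + " ')]";

        var result = new List<string>();
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return result; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode node in nodes)
        {
            result.Add(node.InnerText.Trim());
        }
        return result;
    }

    public static void Main()
    {
        // For a live page, load with: HtmlDocument doc = new HtmlWeb().Load(url);
        // The sample markup below is an assumption so the sketch runs offline.
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body><div id='start-left'>Left</div>"
                     + "<div class='paging nav'>1 2 3</div></body></html>");

        // Lookup by id is on the document itself.
        HtmlNode byId = doc.GetElementbyId("start-left");
        Console.WriteLine(byId.InnerText);

        // Lookup by class goes through DocumentNode and XPath.
        foreach (string text in TextByClass(doc, "paging"))
        {
            Console.WriteLine(text);
        }
    }
}
```

Note that an XPath starting with // searches from the document root even when called on a child node such as htmlBody; to search only beneath a node, start the expression with .// instead.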
