How can I download HTML source in C#
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/599275/
Asked by NotDan
How can I get the HTML source for a given web address in C#?
Accepted answer by CMS
You can download files with the WebClient class:
using System.Net;

using (WebClient client = new WebClient()) // WebClient implements IDisposable
{
    // Save the page to a local file
    client.DownloadFile("http://yoursite.com/page.html", @"C:\localfile.html");

    // Or you can get the file content without saving it
    string htmlCode = client.DownloadString("http://yoursite.com/page.html");
}
Answer by Diego Jancic
Basically:
using System;     // Console
using System.IO;  // StreamReader
using System.Net;
using System.Net.Http; // in LINQPad, also add a reference to System.Net.Http.dll

WebRequest req = WebRequest.Create("http://google.com");
req.Method = "GET";

string source;
// Dispose the response as well as the reader so the connection is released
using (WebResponse response = req.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    source = reader.ReadToEnd();
}

Console.WriteLine(source);
Answer by Xilmiki
@CMS's approach is the more recent one and is the one suggested on the Microsoft website, but I ran into a hard problem with both methods posted here, so I am posting the solution for everyone.
Problem: if you use a URL like www.somesite.it/?p=1500, in some cases you get an internal server error (500), although the same URL works perfectly in a web browser.
Solution: you have to move the parameters out of the URL. Working code:
using System.Net;
//...
using (WebClient client = new WebClient())
{
    client.QueryString.Add("p", "1500"); // add parameters
    // The address needs a scheme (http:// or https://) to be treated as a web URL
    string htmlCode = client.DownloadString("http://www.somesite.it");
    //...
}
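To check what address is actually being requested, it can help to build the full URL by hand and log it. The following is only a sketch: `QueryUrl` and `Build` are hypothetical names that do not appear in the answer above, and whether a hand-built URL avoids the 500 error depends entirely on the server in question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryUrl
{
    // Hypothetical helper: appends escaped key=value pairs to a base URL.
    public static string Build(string baseUrl, IDictionary<string, string> parameters)
    {
        var builder = new UriBuilder(baseUrl);

        // Escape each key and value so characters such as spaces or '&' stay intact.
        builder.Query = string.Join("&", parameters.Select(p =>
            Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));

        // builder.Uri normalizes the result (for example, it omits the default port).
        return builder.Uri.AbsoluteUri;
    }
}
```

The returned string can be passed to `client.DownloadString(...)` in place of the `QueryString` collection, which makes it easy to compare the requested address with the one the browser uses.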
Answer by Xenon
You can get it with:
var html = new System.Net.WebClient().DownloadString(siteUrl);
Answer by Hakan Fıstık
The newest, most up-to-date answer
This post is really old (it was 7 years old when I answered it), so none of the other answers used the new and recommended way, which is the HttpClient class.
HttpClient is considered the new API, and it should replace the old ones (WebClient and WebRequest).

using System.Net.Http;

string url = "page url";
HttpClient client = new HttpClient();
// .Result blocks the calling thread until the request completes
using (HttpResponseMessage response = client.GetAsync(url).Result)
{
    using (HttpContent content = response.Content)
    {
        // result holds the HTML source of the page
        string result = content.ReadAsStringAsync().Result;
    }
}
For more information about how to use the HttpClient class (especially in async cases), you can refer to this question.
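For the async case mentioned above, a minimal sketch might look like the following. The class name `HtmlDownloader` and the method `DownloadHtmlAsync` are hypothetical and not part of the original answer; the sketch assumes one shared HttpClient instance is acceptable for your application and that a non-success status code should be treated as an error.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class HtmlDownloader
{
    // One shared instance: HttpClient is designed to be reused across requests.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: fetches the body of the page at the given absolute URL.
    public static async Task<string> DownloadHtmlAsync(string url)
    {
        using (HttpResponseMessage response = await Client.GetAsync(url))
        {
            // Throws HttpRequestException if the status code is not a success code.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

From an async method you would call it as `string html = await HtmlDownloader.DownloadHtmlAsync("http://yoursite.com/page.html");`. Awaiting instead of reading `.Result` avoids blocking the calling thread while the request is in flight.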