Note: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/599275/

Date: 2020-08-04 09:46:53  Source: igfitidea

How can I download HTML source in C#

c#

Asked by NotDan

How can I get the HTML source of a given web address in C#?

Accepted answer by CMS

You can download files with the WebClient class:

using System.Net;

using (WebClient client = new WebClient()) // WebClient implements IDisposable
{
    client.DownloadFile("http://yoursite.com/page.html", @"C:\localfile.html");

    // Or you can get the file content without saving it
    string htmlCode = client.DownloadString("http://yoursite.com/page.html");
}

Answer by Diego Jancic

Basically:

using System;      // Console
using System.IO;   // StreamReader
using System.Net;  // WebRequest, WebResponse

WebRequest req = WebRequest.Create("http://google.com");
req.Method = "GET";

string source;
// Dispose the response as well as the reader once the body has been read
using (WebResponse response = req.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    source = reader.ReadToEnd();
}

Console.WriteLine(source);

Answer by Xilmiki

@CMS's approach is the more recent one and is the one suggested on the Microsoft website, but I ran into a hard problem with both methods posted here, so I am posting the solution for everyone!

Problem: if you use a URL like www.somesite.it/?p=1500, in some cases you get an internal server error (500), although the same URL works perfectly in a web browser.

Solution: you have to move the parameters out of the URL string. Working code:

using System.Net;
//...
using (WebClient client = new WebClient ()) 
{
    client.QueryString.Add("p", "1500"); //add parameters
    string htmlCode = client.DownloadString("http://www.somesite.it"); // the address needs a scheme to be treated as a web URL
    //...
}
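
As a rough illustration of the same idea without relying on WebClient.QueryString, the sketch below builds the final address from a base URL and a set of parameters, URL-encoding each key and value. The QueryUrl class and its Build method are hypothetical names made up for this example, not part of the answer above or of .NET, and the sketch assumes the base address is an absolute URL whose own query string (if any) can be discarded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryUrl
{
    // Hypothetical helper: appends URL-encoded query parameters to a base address.
    // Any query string already present in baseAddress is dropped.
    public static string Build(string baseAddress,
                               IEnumerable<KeyValuePair<string, string>> parameters)
    {
        // Scheme, host and path only, e.g. "http://www.somesite.it/"
        string left = new Uri(baseAddress).GetLeftPart(UriPartial.Path);

        // Encode keys and values so characters such as '#', '&' or spaces are safe
        string query = string.Join("&", parameters.Select(p =>
            Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));

        return query.Length == 0 ? left : left + "?" + query;
    }

    public static void Main()
    {
        var parameters = new[] { new KeyValuePair<string, string>("p", "1500") };
        Console.WriteLine(Build("http://www.somesite.it", parameters));
        // prints "http://www.somesite.it/?p=1500"
    }
}
```

The resulting string can then be passed to DownloadString or to any of the other download methods shown on this page.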

See the official documentation.

Answer by Xenon

You can get it with:

var html = new System.Net.WebClient().DownloadString(siteUrl);

Answer by Hakan Fıstık

The newest, most up-to-date answer
This post is really old (it was 7 years old when I answered it), so none of the other answers used the new and recommended way, which is the HttpClient class.

HttpClient is considered the new API, and it should replace the old ones (WebClient and WebRequest).

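using System.Net.Http;

string url = "page url";
HttpClient client = new HttpClient();
// .Result blocks the calling thread until the request completes
using (HttpResponseMessage response = client.GetAsync(url).Result)
{
   using (HttpContent content = response.Content)
   {
      // result holds the HTML source of the page
      string result = content.ReadAsStringAsync().Result;
   }
}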

For more information about how to use the HttpClient class (especially in async cases), you can refer to this question.
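
For the async case mentioned above, a minimal sketch might look like the following. It is only one possible shape, not the answer author's code: HtmlDownloader and DownloadHtmlAsync are hypothetical names, http://example.com/ is a placeholder address, and the async Main assumes C# 7.1 or later. The HttpClient is passed in so that a single instance can be reused across requests.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class HtmlDownloader
{
    // Hypothetical helper: downloads the body of a page as a string without blocking.
    public static async Task<string> DownloadHtmlAsync(HttpClient client, string url)
    {
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            // Throws HttpRequestException for non-success status codes (e.g. 404, 500)
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }

    public static async Task Main()
    {
        // One HttpClient instance is meant to be reused for many requests
        using (var client = new HttpClient())
        {
            string html = await DownloadHtmlAsync(client, "http://example.com/");
            Console.WriteLine(html.Length);
        }
    }
}
```

Compared with calling .Result, awaiting the tasks keeps the calling thread free while the request is in flight, which matters most in UI and server code.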