How to download HTML source in C#

Question

Asked by NotDan

How can I get the HTML source for a given web address in C#?

Answer 1

Accepted answer by CMS

You can download files with the WebClient class:

using System.Net;

using (WebClient client = new WebClient()) // WebClient implements IDisposable
{
    client.DownloadFile("http://yoursite.com/page.html", @"C:\localfile.html");

    // Or you can get the file content without saving it
    string htmlCode = client.DownloadString("http://yoursite.com/page.html");
}

Answer 2

Answered by Diego Jancic

Basically:

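using System;      // Console
using System.IO;   // StreamReader
using System.Net;  // WebRequest, WebResponse
using System.Net.Http;  // in LINQPad, also add a reference to System.Net.Http.dll

WebRequest req = WebRequest.Create("http://google.com");
req.Method = "GET";

string source;
// Dispose the response as well as the reader so the connection is released
using (WebResponse response = req.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    source = reader.ReadToEnd();
}

Console.WriteLine(source);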

Answer 3

Answered by Xilmiki

@CMS's approach is the more recent one and is the one suggested on the Microsoft website, but I ran into a hard problem with both methods posted here, so I am posting the solution for everyone.

Problem: if you use a URL like www.somesite.it/?p=1500, in some cases you get an internal server error (500), although the same URL works perfectly in a web browser.

Solution: you have to move the parameters out of the URL. Working code:

using System.Net;
//...
using (WebClient client = new WebClient())
{
    client.QueryString.Add("p", "1500"); // sent as ?p=1500
    // Include the scheme: a bare "www.somesite.it" is not an absolute URL
    string htmlCode = client.DownloadString("http://www.somesite.it");
    //...
}

See the official documentation for details.

Answer 4

Answered by Xenon

You can get it with:

var html = new System.Net.WebClient().DownloadString(siteUrl);

Answer 5

Answered by Hakan Fıstık

The newest, most up-to-date answer
This post is really old (it was 7 years old when I answered it), so none of the other answers use the new, recommended way, which is the HttpClient class.

HttpClient is considered the new API and is meant to replace the old ones (WebClient and WebRequest).

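using System.Net.Http;

string url = "page url";
HttpClient client = new HttpClient();
// .Result blocks the calling thread until the request completes
using (HttpResponseMessage response = client.GetAsync(url).Result)
{
   using (HttpContent content = response.Content)
   {
      // The body of the response (the page's HTML) as a string
      string result = content.ReadAsStringAsync().Result;
   }
}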

For more information about how to use the HttpClient class (especially in async cases), you can refer to this question.
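
For the async case, a minimal sketch might look like the following. The names HtmlDownloader and DownloadHtmlAsync are hypothetical and chosen only for illustration; the sketch assumes one HttpClient instance shared for the lifetime of the application and treats any non-success status code as an error.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of any library.
public static class HtmlDownloader
{
    // Assumption: a single HttpClient is reused instead of creating one per request.
    private static readonly HttpClient Client = new HttpClient();

    // Downloads the response body at 'url' as a string without blocking the caller.
    // Throws HttpRequestException if the server returns a non-success status code.
    public static async Task<string> DownloadHtmlAsync(string url)
    {
        using (HttpResponseMessage response = await Client.GetAsync(url))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

From an async method it could be called as `string html = await HtmlDownloader.DownloadHtmlAsync("http://yoursite.com/page.html");`. Awaiting the task instead of reading `.Result` avoids blocking the calling thread while the download is in progress.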
