.NET WebClient.DownloadString results in mangled characters due to encoding issues, but the browser is OK

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/7137165/

Date: 2020-09-03 15:50:21. Source: igfitidea

Tags: .net, unicode, utf-8, webclient

Asked by Domenic

The following code:

var text = (new WebClient()).DownloadString("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");

results in a variable text that contains, among many other things, the string

"$Îº$-Minkowski space, scalar field, and the issue of Lorentz invariance"

However, when I visit that URL in Firefox, I get

$κ$-Minkowski space, scalar field, and the issue of Lorentz invariance

which is actually correct. I also tried

var data = (new WebClient()).DownloadData("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
var text = System.Text.UTF8Encoding.Default.GetString(data);

but this gave the same problem.

I'm not sure where the fault lies here. Is the feed lying about being UTF-8-encoded, and the browser is smart enough to figure that out, but WebClient is not? Or is the feed properly UTF-8-encoded, but WebClient is failing in some other way? What can I do to mitigate this?

Answered by LostInComputer

It's not lying. You should set the WebClient's Encoding property before calling DownloadString.

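using System.Net;
using System.Text;

using (WebClient webClient = new WebClient())
{
    // Decode the response body as UTF-8 rather than with the default encoding.
    webClient.Encoding = Encoding.UTF8;
    string s = webClient.DownloadString("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
}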

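As for why your alternative isn't working, it's because the usage is incorrect. UTF8Encoding.Default is just the static Encoding.Default property inherited from the base class, which on .NET Framework returns the system's ANSI code page rather than UTF-8. It should be: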

var text = System.Text.Encoding.UTF8.GetString(data);
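
To see the effect without touching the network, here is a minimal sketch that decodes the same UTF-8 bytes two ways. The class and method names (EncodingDemo, DecodeAsUtf8, DecodeAsLatin1) are made up for illustration, and ISO-8859-1 is only an assumed stand-in for whatever single-byte ANSI code page the default encoding resolves to on a given machine.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Correct: interpret the bytes as UTF-8.
    public static string DecodeAsUtf8(byte[] data)
    {
        return Encoding.UTF8.GetString(data);
    }

    // Wrong on purpose: interpret UTF-8 bytes with a single-byte encoding.
    // ISO-8859-1 is an assumption standing in for the system ANSI code page.
    public static string DecodeAsLatin1(byte[] data)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(data);
    }

    public static void Main()
    {
        // U+03BA is the Greek letter kappa; in UTF-8 it is the two bytes 0xCE 0xBA.
        string original = "$\u03BA$-Minkowski space";
        byte[] data = Encoding.UTF8.GetBytes(original);

        // The UTF-8 decode round-trips the original text.
        Console.WriteLine(DecodeAsUtf8(data) == original);

        // The single-byte decode turns each byte into its own character
        // (U+00CE and U+00BA), which is the kind of mangling seen in the question.
        Console.WriteLine(DecodeAsLatin1(data) == "$\u00CE\u00BA$-Minkowski space");
    }
}
```

Both comparisons print True: the bytes in the feed are fine, and only the choice of decoder changes what ends up in the string.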