Original question: http://stackoverflow.com/questions/592824/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
C# Help reading foreign characters using StreamReader
Question
I'm using the code below to read a text file that contains foreign characters. The file is ANSI-encoded and looks fine in Notepad. The code doesn't work: when the file values are read and shown in the datagrid, the characters appear as squares. Could there be another problem elsewhere?
StreamReader reader = new StreamReader(inputFilePath, System.Text.Encoding.ANSI);
using (reader = File.OpenText(inputFilePath))
Thanks
Update 1: I have tried all encodings found under System.Text.Encoding, and all fail to show the file correctly.
Update 2: I've changed the file encoding (resaved the file) to Unicode and used System.Text.Encoding.Unicode, and it worked just fine. So why did Notepad read it correctly? And why didn't System.Text.Encoding.Unicode read the ANSI file?
Accepted answer by Quintin Robinson
Yes, it could be the actual encoding of the file, which is probably Unicode. Try UTF-8, as that is the most common form of Unicode encoding. Otherwise, if the file is ASCII, then the standard ASCII encoding should work.
Answer by Jakob Christensen
Try a different encoding such as Encoding.UTF8. You can also try letting StreamReader find the encoding itself:
StreamReader reader = new StreamReader(inputFilePath, System.Text.Encoding.UTF8, true);
Edit: Just saw your update. Try letting StreamReader do the guessing.
Answer by Jerome Laban
You may also try the Default encoding, which uses the current system's ANSI codepage.
StreamReader reader = new StreamReader(inputFilePath, Encoding.Default, true);
When you use the Notepad "Save As" menu with the original file, look at the encoding combo box. It will tell you which encoding Notepad guessed the file uses.
Also, if it is an ANSI file, the detectEncodingFromByteOrderMarks parameter will probably not help much.
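To illustrate why the detectEncodingFromByteOrderMarks parameter rarely helps with ANSI files, here is a minimal sketch. It assumes two temporary files created only for the demonstration, and the class and method names (BomDemo, Read) are made up for this example. A UTF-8 file saved with a byte order mark overrides the fallback encoding, while a single-byte file has no marker bytes, so StreamReader simply keeps whatever encoding it was given.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
    // Hypothetical helper: reads a file with a fallback encoding and BOM detection on,
    // and reports which encoding StreamReader ended up using.
    public static string Read(string path, Encoding fallback, out string encodingName)
    {
        using (var reader = new StreamReader(path, fallback, detectEncodingFromByteOrderMarks: true))
        {
            // CurrentEncoding is only meaningful after the first read.
            string text = reader.ReadToEnd();
            encodingName = reader.CurrentEncoding.WebName;
            return text;
        }
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string withBom = Path.GetTempFileName();
        string noBom = Path.GetTempFileName();
        try
        {
            // UTF-8 with a BOM: the marker bytes EF BB BF precede the text.
            File.WriteAllText(withBom, "caf\u00E9", new UTF8Encoding(true));
            // Single-byte file: no marker bytes at all.
            File.WriteAllBytes(noBom, latin1.GetBytes("caf\u00E9"));

            string name;
            Read(withBom, latin1, out name);
            Console.WriteLine(name); // utf-8: the BOM overrides the fallback
            Read(noBom, latin1, out name);
            Console.WriteLine(name); // iso-8859-1: nothing to detect, fallback is kept
        }
        finally
        {
            File.Delete(withBom);
            File.Delete(noBom);
        }
    }
}
```

In other words, the flag only recognizes marker bytes; it does not guess between single-byte code pages.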
Answer by Jon Skeet
Using Encoding.Unicode won't accurately decode an ANSI file in the same way that a JPEG decoder won't understand a GIF file.
I'm surprised that Encoding.Default didn't work for the ANSI file if it really was ANSI. If you ever find out exactly which code page Notepad was using, you could use Encoding.GetEncoding(int).
In general, where possible I'd recommend using UTF-8.
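The JPEG/GIF comparison can be made concrete with a small sketch. It assumes a four-byte sample ("café" stored one byte per character, as an ANSI-style file would hold it); the class and method names are hypothetical. The same bytes are decoded three ways, and only the decoder that matches how the bytes were written gives the original text back.

```csharp
using System;
using System.Text;

public static class DecodeDemo
{
    // Hypothetical helper: decode the same raw bytes with a caller-chosen encoding.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        // "café" stored as single bytes: 63 61 66 E9
        byte[] bytes = latin1.GetBytes("caf\u00E9");

        // Matching decoder: the text round-trips.
        Console.WriteLine(Decode(bytes, latin1) == "caf\u00E9");
        // UTF-8 decoder: the lone 0xE9 byte is not a valid sequence, so it is replaced.
        Console.WriteLine(Decode(bytes, Encoding.UTF8) == "caf\u00E9");
        // UTF-16 decoder: bytes are consumed in pairs, giving 2 unrelated characters.
        Console.WriteLine(Decode(bytes, Encoding.Unicode).Length);
    }
}
```

This is also why re-saving the file as Unicode made Encoding.Unicode work: the bytes on disk changed to match the decoder.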
Answer by Anonymous
File.OpenText() always uses a UTF-8 StreamReader implicitly. Create your own StreamReader instance instead and specify the desired encoding, like this:
using (StreamReader reader = new StreamReader(@"C:\test.txt", Encoding.Default))
{
// ...
}
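A minimal sketch of the difference, assuming a temporary file that holds single-byte text (the class and method names are made up for this example). Reading it through File.OpenText decodes the bytes as UTF-8 and mangles the accented character; an explicit StreamReader with the matching encoding reads it correctly.

```csharp
using System;
using System.IO;
using System.Text;

public static class OpenTextDemo
{
    // Reads the whole file via File.OpenText, which decodes as UTF-8.
    public static string ReadWithOpenText(string path)
    {
        using (StreamReader reader = File.OpenText(path))
        {
            return reader.ReadToEnd();
        }
    }

    // Reads the whole file with a caller-chosen encoding.
    public static string ReadWithEncoding(string path, Encoding encoding)
    {
        using (var reader = new StreamReader(path, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string path = Path.GetTempFileName();
        try
        {
            // Write "café" as single bytes, the way an ANSI-style file stores it.
            File.WriteAllBytes(path, latin1.GetBytes("caf\u00E9"));

            Console.WriteLine(ReadWithEncoding(path, latin1) == "caf\u00E9"); // True
            Console.WriteLine(ReadWithOpenText(path) == "caf\u00E9");         // False
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

This matches the code in the question: the reader created with an explicit encoding is overwritten by the one File.OpenText returns, so the encoding argument never takes effect.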
Answer by serop
I had the same problem and my solution was simple: instead of
Encoding.ASCII
use
Encoding.GetEncoding("iso-8859-1")
The answer was found here.
Edit: more solutions. This may be a more accurate one:
Encoding.GetEncoding(1252);
Also, in some cases this will work for you too, if your OS default encoding matches the file encoding:
Encoding.Default;
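A sketch of how iso-8859-1 and code page 1252 differ, with two caveats that are assumptions about your runtime rather than part of the original answer. On .NET Core and .NET 5+, Windows code pages such as 1252 are not built in, so Encoding.GetEncoding(1252) throws unless CodePagesEncodingProvider is registered first, and Encoding.Default is UTF-8 there rather than the OS ANSI code page. On .NET Framework the registration line is unnecessary and CodePagesEncodingProvider may require the System.Text.Encoding.CodePages package. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class CodePageDemo
{
    // Hypothetical helper: decode bytes using a numeric code page.
    public static string DecodeWithCodePage(byte[] bytes, int codePage)
    {
        // Assumption: running on .NET Core / .NET 5+, where Windows code pages
        // must be registered before GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(bytes);
    }

    public static void Main()
    {
        byte[] bytes = { 0x80, 0xE9 };

        // Windows-1252 maps 0x80 to the euro sign (U+20AC).
        string cp1252 = DecodeWithCodePage(bytes, 1252);
        // iso-8859-1 (code page 28591) maps 0x80 to a control character (U+0080).
        string latin1 = DecodeWithCodePage(bytes, 28591);

        Console.WriteLine(cp1252[0] == '\u20AC'); // True
        Console.WriteLine(latin1[0] == '\u0080'); // True
        Console.WriteLine(cp1252[1] == latin1[1]); // True: 0xE9 is é in both
    }
}
```

The two encodings agree on accented Latin letters and differ only in the 0x80 to 0x9F range, which is why either one can appear to work until the file contains characters such as the euro sign or curly quotes.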
Answer by Luís Ponciano
I solved my problem of reading Portuguese characters by changing the source file in Notepad++.
C#
var url = System.Web.HttpContext.Current.Server.MapPath(@"~/Content/data.json");
string s = string.Empty;
using (System.IO.StreamReader sr = new System.IO.StreamReader(url, System.Text.Encoding.UTF8,true))
{
s = sr.ReadToEnd();
}
Answer by Muhamad Suliman
For Arabic, I used Encoding.GetEncoding(1256). It works well.
Answer by jagge123
For Swedish Å Ä Ö, the only solution from the ones above that worked was:
Encoding.GetEncoding("iso-8859-1")
Hopefully this will save someone time.
Answer by A. Lartey
I'm also reading an exported file that contains French and German text. I used Encoding.GetEncoding("iso-8859-1"), true, which worked without any problems.