Original question: http://stackoverflow.com/questions/6947749/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
How to check if a .txt file is in ASCII or UTF-8 format in Windows environment?
Asked by rk1962
I have converted a .txt file from ASCII to UTF-8 using UltraEdit. However, I am not sure how to verify that it is actually in UTF-8 format in a Windows environment.
Thank you!
Accepted answer by Mark Ransom
Text files in Windows don't have a format. There's an unofficial convention that if the file starts with the BOM code point encoded in UTF-8, then the file is UTF-8, but that convention isn't universally supported. The BOM would be the 3-byte sequence "\xef\xbb\xbf", i.e. ï»¿ in the Latin-1 character set.
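As a rough illustration of the BOM convention described above, the following Python sketch tests whether a byte string or file begins with the UTF-8 BOM. The function names `has_utf8_bom` and `file_has_utf8_bom` are made up for this example; `codecs.BOM_UTF8` is the standard-library constant for the three BOM bytes.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def file_has_utf8_bom(path: str) -> bool:
    """Read only the first three bytes of the file and test them for the BOM."""
    with open(path, "rb") as f:
        return has_utf8_bom(f.read(3))
```

A result of False only means the file has no BOM. It may still be UTF-8, since many tools write UTF-8 without one.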
Answer by Ofer Zelig
Open the file in Notepad. Click 'Save As...'. In the 'Encoding:' combo box you will see the current file format.
Answer by Miguel Hermoso
Open the file in Notepad++ and look at the "Encoding" menu: it shows the current encoding and lets you convert the file to any of the available encodings.
Answer by SLaks
Answer by Luminator
If you use Windows 10 and have Windows Subsystem for Linux (WSL) installed, this is easily done by typing "file <filename>" in the shell.
For example:
$ file code.cpp
code.cpp: C source, UTF-8 Unicode (with BOM) text, with CRLF line terminators
Answer by Eric Moon
I had a directory of files that I wanted to check. I created an Excel macro to determine ANSI vs. UTF-8. This worked for me.
Sub GetTextFileEncoding()
    Dim sFile As String
    Dim sPath As String
    Dim iRow As Long

    'Set defaults and initial values
    iRow = 1
    sPath = "C:\textfiles\"    'Folder to scan; adjust to your own path
    sFile = Dir(sPath & "*.txt")

    Do While Len(sFile) > 0
        'Debug.Print sFile & " - " & FileEncodeType(sPath & sFile)
        'Show the file name and detected encoding on the active worksheet
        Cells(iRow, 1).Value = sFile
        Cells(iRow, 2).Value = FileEncodeType(sPath & sFile)
        'Get next file
        sFile = Dir
        'Increment row
        iRow = iRow + 1
    Loop
End Sub
Function FileEncodeType(sFile As String) As String
    'Returns "UTF-8" if the file starts with the UTF-8 BOM (EF BB BF),
    'otherwise "ANSI". Note that UTF-8 files saved without a BOM are
    'also reported as "ANSI", because only the BOM is checked.
    Dim iFile As Integer
    Dim bBOM(0 To 2) As Byte

    FileEncodeType = "ANSI"

    'Read the raw bytes so the result does not depend on the system code page
    iFile = FreeFile
    Open sFile For Binary Access Read As #iFile
    'Files shorter than 3 bytes cannot contain the BOM
    If LOF(iFile) >= 3 Then
        Get #iFile, , bBOM
        If bBOM(0) = &HEF And bBOM(1) = &HBB And bBOM(2) = &HBF Then
            FileEncodeType = "UTF-8"
        End If
    End If
    Close #iFile
End Function
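The macro above only looks for the BOM, so a UTF-8 file saved without one is reported as ANSI. One possible complement, sketched here in Python rather than VBA, is to inspect the content itself: pure 7-bit data is ASCII, and anything else is treated as UTF-8 only if it decodes strictly. The names `classify_bytes` and `classify_file` and the returned labels are assumptions made for this illustration, not part of any answer above.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def classify_bytes(data: bytes) -> str:
    """Classify raw bytes as 'utf-8-bom', 'ascii', 'utf-8', or 'other'."""
    has_bom = data.startswith(UTF8_BOM)
    body = data[len(UTF8_BOM):] if has_bom else data
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        body.decode("utf-8")
    except UnicodeDecodeError:
        return "other"  # e.g. an ANSI code page such as Windows-1252
    if has_bom:
        return "utf-8-bom"
    if body.isascii():  # bytes.isascii() requires Python 3.7+
        return "ascii"
    return "utf-8"


def classify_file(path: str) -> str:
    """Read the whole file as bytes and classify its content."""
    with open(path, "rb") as f:
        return classify_bytes(f.read())
```

This is a heuristic: text in another 8-bit encoding can occasionally happen to be valid UTF-8, although that is unlikely for files with more than a few non-ASCII characters. A pure-ASCII file is also valid UTF-8, so "ascii" here simply means no byte above 0x7F was found.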

