Determine a string's encoding in C#

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the CC BY-SA license and attribute it to the original authors (not the translator). Original question: http://stackoverflow.com/questions/1025332/

Tags: c#, string, encoding

Asked by krebstar

Is there any way to determine a string's encoding in C#?

Say I have a filename string, but I don't know whether it is encoded in Unicode (UTF-16) or the system-default encoding. How do I find out?

Accepted answer by devdimi

Check out Utf8Checker; it is a simple class that does exactly this in pure managed code: http://utf8checker.codeplex.com

Notice: as already pointed out, "determine encoding" only makes sense for byte streams. If you have a string, someone along the way already knew or guessed the encoding in order to decode the bytes into that string in the first place.

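To illustrate that point, here is a minimal sketch (the byte values are just an example, and using System.Text; is assumed): an encoding only comes into play when converting between bytes and text, never on the string itself.

// A .NET string is always a sequence of UTF-16 chars; it carries no encoding of its own.
// Encodings only matter at the byte boundary:
byte[] utf8Bytes = { 0xC3, 0xA9 };                    // "é" encoded as UTF-8
string s = Encoding.UTF8.GetString(utf8Bytes);        // decode: bytes -> string
byte[] backToBytes = Encoding.Unicode.GetBytes(s);    // encode: string -> UTF-16 bytes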

Answer by Mitch Wheat

It depends on where the string 'came from'. A .NET string is Unicode (UTF-16). It could only be something else if you, say, read the data from a database into a byte array.

This CodeProject article might be of interest: Detect Encoding for in- and outgoing text

Jon Skeet's Strings in C# and .NET is an excellent explanation of .NET strings.

Answer by Tao

Another option, very late in coming, sorry:

http://www.architectshack.com/TextFileEncodingDetector.ashx

This small C#-only class uses BOMs if present, otherwise tries to auto-detect possible Unicode encodings, and falls back if none of the Unicode encodings is possible or likely.

It sounds like UTF8Checker referenced above does something similar, but I think this is slightly broader in scope - instead of just UTF8, it also checks for other possible Unicode encodings (UTF-16 LE or BE) that might be missing a BOM.

Hope this helps someone!

Answer by Simon Bridge

I know this is a bit late - but to be clear:

A string doesn't really have an encoding... in .NET, a string is a collection of char objects. Essentially, if it is a string, it has already been decoded.

However, if you are reading the contents of a file, which is made of bytes, and wish to convert that to a string, then the file's encoding must be used.

.NET includes encoding and decoding classes for: ASCII, UTF7, UTF8, UTF32 and more.

Most of the Unicode encodings can begin with a byte-order mark (BOM) that can be used to distinguish which encoding was used.

The .NET class System.IO.StreamReader is able to determine the encoding used within a stream by reading those byte-order marks.

Here is an example:

    /// <summary>
    /// Returns the detected encoding and the contents of the file.
    /// </summary>
    /// <param name="fileName">The path of the file to read.</param>
    /// <param name="contents">Receives the decoded contents of the file.</param>
    /// <returns>The encoding detected by the StreamReader.</returns>
    public static Encoding DetectEncoding(String fileName, out String contents)
    {
        // open the file with the stream-reader:
        using (StreamReader reader = new StreamReader(fileName, true))
        {
            // read the contents of the file into a string
            contents = reader.ReadToEnd();

            // return the encoding (only reliable after the contents have been read).
            return reader.CurrentEncoding;
        }
    }
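
For completeness, a call site for the method above might look like this minimal sketch (the file name is hypothetical, and the method is assumed to be in scope):

// Hypothetical usage of the DetectEncoding() helper defined above.
string contents;
Encoding encoding = DetectEncoding(@"C:\temp\sample.txt", out contents);
Console.WriteLine(encoding.EncodingName);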

Answer by Dan W

The code below has the following features:

  1. Detection or attempted detection of UTF-7, UTF-8/16/32 (BOM, no BOM, little & big endian)
  2. Falls back to the local default codepage if no Unicode encoding was found.
  3. Detects (with high probability) Unicode files with the BOM/signature missing
  4. Searches for charset=xyz and encoding=xyz inside the file to help determine the encoding.
  5. To save processing, you can 'taste' the file (a definable number of bytes).
  6. Both the encoding and the decoded text are returned.
  7. Purely byte-based solution for efficiency

As others have said, no solution can be perfect (and certainly one can't easily differentiate between the various 8-bit extended ASCII encodings in use worldwide), but we can get 'good enough' results, especially if the developer also presents the user with a list of alternative encodings, as shown here: What is the most common encoding of each language?

A full list of Encodings can be found using Encoding.GetEncodings();

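For example, a quick sketch that enumerates them (on .NET Core / .NET 5+ the list may be shorter unless the System.Text.Encoding.CodePages provider is registered):

// Minimal sketch: list every encoding the runtime knows about.
foreach (EncodingInfo info in Encoding.GetEncodings())
{
    Console.WriteLine($"{info.CodePage}\t{info.Name}\t{info.DisplayName}");
}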

// Function to detect the encoding for UTF-7, UTF-8/16/32 (bom, no bom, little
// & big endian), and local default codepage, and potentially other codepages.
// 'taster' = number of bytes to check of the file (to save processing). Higher
// value is slower, but more reliable (especially UTF-8 with special characters
// later on may appear to be ASCII initially). If taster = 0, then taster
// becomes the length of the file (for maximum reliability). 'text' is simply
// the string with the discovered encoding applied to the file.
public Encoding detectTextEncoding(string filename, out String text, int taster = 1000)
{
    byte[] b = File.ReadAllBytes(filename);

    //////////////// First check the low hanging fruit by checking if a
    //////////////// BOM/signature exists (sourced from http://www.unicode.org/faq/utf_bom.html#bom4)
    if (b.Length >= 4 && b[0] == 0x00 && b[1] == 0x00 && b[2] == 0xFE && b[3] == 0xFF) { text = Encoding.GetEncoding("utf-32BE").GetString(b, 4, b.Length - 4); return Encoding.GetEncoding("utf-32BE"); }  // UTF-32, big-endian 
    else if (b.Length >= 4 && b[0] == 0xFF && b[1] == 0xFE && b[2] == 0x00 && b[3] == 0x00) { text = Encoding.UTF32.GetString(b, 4, b.Length - 4); return Encoding.UTF32; }    // UTF-32, little-endian
    else if (b.Length >= 2 && b[0] == 0xFE && b[1] == 0xFF) { text = Encoding.BigEndianUnicode.GetString(b, 2, b.Length - 2); return Encoding.BigEndianUnicode; }     // UTF-16, big-endian
    else if (b.Length >= 2 && b[0] == 0xFF && b[1] == 0xFE) { text = Encoding.Unicode.GetString(b, 2, b.Length - 2); return Encoding.Unicode; }              // UTF-16, little-endian
    else if (b.Length >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF) { text = Encoding.UTF8.GetString(b, 3, b.Length - 3); return Encoding.UTF8; } // UTF-8
    else if (b.Length >= 3 && b[0] == 0x2b && b[1] == 0x2f && b[2] == 0x76) { text = Encoding.UTF7.GetString(b,3,b.Length-3); return Encoding.UTF7; } // UTF-7


    //////////// If the code reaches here, no BOM/signature was found, so now
    //////////// we need to 'taste' the file to see if can manually discover
    //////////// the encoding. A high taster value is desired for UTF-8
    if (taster == 0 || taster > b.Length) taster = b.Length;    // Taster size can't be bigger than the filesize obviously.


    // Some text files are encoded in UTF8, but have no BOM/signature. Hence
    // the below manually checks for a UTF8 pattern. This code is based off
    // the top answer at: https://stackoverflow.com/questions/6555015/check-for-invalid-utf8
    // For our purposes, an unnecessarily strict (and terser/slower)
    // implementation is shown at: https://stackoverflow.com/questions/1031645/how-to-detect-utf-8-in-plain-c
    // For the below, false positives should be exceedingly rare (and would
    // be either slightly malformed UTF-8 (which would suit our purposes
    // anyway) or 8-bit extended ASCII/UTF-16/32 at a vanishingly long shot).
    int i = 0;
    bool utf8 = false;
    while (i < taster - 4)
    {
        if (b[i] <= 0x7F) { i += 1; continue; }     // If all characters are below 0x80, then it is valid UTF8, but UTF8 is not 'required' (and therefore the text is more desirable to be treated as the default codepage of the computer). Hence, there's no "utf8 = true;" code unlike the next three checks.
        if (b[i] >= 0xC2 && b[i] <= 0xDF && b[i + 1] >= 0x80 && b[i + 1] < 0xC0) { i += 2; utf8 = true; continue; }
        if (b[i] >= 0xE0 && b[i] <= 0xEF && b[i + 1] >= 0x80 && b[i + 1] < 0xC0 && b[i + 2] >= 0x80 && b[i + 2] < 0xC0) { i += 3; utf8 = true; continue; } // 3-byte sequences: lead byte 0xE0-0xEF (0xF0 and above are handled by the 4-byte check below).
        if (b[i] >= 0xF0 && b[i] <= 0xF4 && b[i + 1] >= 0x80 && b[i + 1] < 0xC0 && b[i + 2] >= 0x80 && b[i + 2] < 0xC0 && b[i + 3] >= 0x80 && b[i + 3] < 0xC0) { i += 4; utf8 = true; continue; }
        utf8 = false; break;
    }
    if (utf8 == true) {
        text = Encoding.UTF8.GetString(b);
        return Encoding.UTF8;
    }


    // The next check is a heuristic attempt to detect UTF-16 without a BOM.
    // We simply look for zeroes in odd or even byte places, and if a certain
    // threshold is reached, the text is 'probably' UTF-16.
    double threshold = 0.1; // proportion of chars (sampled every 2nd byte) which must be zero to be diagnosed as UTF-16. 0.1 = 10%
    int count = 0;
    for (int n = 0; n < taster; n += 2) if (b[n] == 0) count++;
    if (((double)count) / taster > threshold) { text = Encoding.BigEndianUnicode.GetString(b); return Encoding.BigEndianUnicode; }
    count = 0;
    for (int n = 1; n < taster; n += 2) if (b[n] == 0) count++;
    if (((double)count) / taster > threshold) { text = Encoding.Unicode.GetString(b); return Encoding.Unicode; } // (little-endian)


    // Finally, a long shot - let's see if we can find "charset=xyz" or
    // "encoding=xyz" to identify the encoding:
    for (int n = 0; n < taster-9; n++)
    {
        if (
            ((b[n + 0] == 'c' || b[n + 0] == 'C') && (b[n + 1] == 'h' || b[n + 1] == 'H') && (b[n + 2] == 'a' || b[n + 2] == 'A') && (b[n + 3] == 'r' || b[n + 3] == 'R') && (b[n + 4] == 's' || b[n + 4] == 'S') && (b[n + 5] == 'e' || b[n + 5] == 'E') && (b[n + 6] == 't' || b[n + 6] == 'T') && (b[n + 7] == '=')) ||
            ((b[n + 0] == 'e' || b[n + 0] == 'E') && (b[n + 1] == 'n' || b[n + 1] == 'N') && (b[n + 2] == 'c' || b[n + 2] == 'C') && (b[n + 3] == 'o' || b[n + 3] == 'O') && (b[n + 4] == 'd' || b[n + 4] == 'D') && (b[n + 5] == 'i' || b[n + 5] == 'I') && (b[n + 6] == 'n' || b[n + 6] == 'N') && (b[n + 7] == 'g' || b[n + 7] == 'G') && (b[n + 8] == '='))
            )
        {
            if (b[n + 0] == 'c' || b[n + 0] == 'C') n += 8; else n += 9;
            if (b[n] == '"' || b[n] == '\'') n++;
            int oldn = n;
            while (n < taster && (b[n] == '_' || b[n] == '-' || (b[n] >= '0' && b[n] <= '9') || (b[n] >= 'a' && b[n] <= 'z') || (b[n] >= 'A' && b[n] <= 'Z')))
            { n++; }
            byte[] nb = new byte[n-oldn];
            Array.Copy(b, oldn, nb, 0, n-oldn);
            try {
                string internalEnc = Encoding.ASCII.GetString(nb);
                text = Encoding.GetEncoding(internalEnc).GetString(b);
                return Encoding.GetEncoding(internalEnc);
            }
            catch { break; }    // If C# doesn't recognize the name of the encoding, break.
        }
    }


    // If all else fails, the encoding is probably (though certainly not
    // definitely) the user's local codepage! One might present to the user a
    // list of alternative encodings as shown here: https://stackoverflow.com/questions/8509339/what-is-the-most-common-encoding-of-each-language
    // A full list can be found using Encoding.GetEncodings();
    text = Encoding.Default.GetString(b);
    return Encoding.Default;
}
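
For reference, a call to the function above might look like the following sketch (the path and taster size are hypothetical, and the method is assumed to be called from the class where it is defined):

// Hypothetical usage of detectTextEncoding() defined above.
string text;
Encoding enc = detectTextEncoding(@"C:\temp\unknown.txt", out text, taster: 4000);
Console.WriteLine($"Detected encoding: {enc.WebName}, {text.Length} characters decoded");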

Answer by vilicvane

My solution is to use built-in features with some fallbacks.

I picked the strategy from an answer to another similar question on stackoverflow but I can't find it now.

It checks the BOM first using the built-in logic in StreamReader. If there is a BOM, the encoding will be something other than Encoding.Default, and we should trust that result.

If not, it checks whether the byte sequence is a valid UTF-8 sequence. If it is, it guesses UTF-8 as the encoding; if not, the system default encoding is the result.

static Encoding getEncoding(string path) {
    // First pass: let StreamReader's built-in BOM detection do its work.
    var stream = new FileStream(path, FileMode.Open);
    var reader = new StreamReader(stream, Encoding.Default, true);
    reader.Read();

    // If a BOM was found, CurrentEncoding is no longer Encoding.Default; trust it.
    if (reader.CurrentEncoding != Encoding.Default) {
        reader.Close();
        return reader.CurrentEncoding;
    }

    stream.Position = 0;

    // Second pass: try to read the whole file as strict UTF-8
    // (throwOnInvalidBytes = true); any invalid byte sequence throws.
    reader = new StreamReader(stream, new UTF8Encoding(false, true));
    try {
        reader.ReadToEnd();
        reader.Close();
        return Encoding.UTF8;
    }
    catch (Exception) {
        // Not valid UTF-8; fall back to the system default encoding.
        reader.Close();
        return Encoding.Default;
    }
}
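
A quick, hypothetical call site for the method above (the path is just an example):

// Hypothetical usage of getEncoding() defined above.
Encoding detected = getEncoding(@"C:\data\input.txt");
Console.WriteLine(detected.WebName);    // e.g. "utf-8", or the name of the system default code page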

Answer by Nyerguds

Note: this was an experiment to see how UTF-8 encoding worked internally. The solution offered by vilicvane, to use a UTF8Encoding object that is initialised to throw an exception on decoding failure, is much simpler, and basically does the same thing.



I wrote this piece of code to differentiate between UTF-8 and Windows-1252. It shouldn't be used for gigantic text files though, since it loads the entire thing into memory and scans it completely. I used it for .srt subtitle files, just to be able to save them back in the encoding in which they were loaded.

The encoding given to the function as ref should be the 8-bit fallback encoding to use in case the file is detected as not being valid UTF-8; generally, on Windows systems, this will be Windows-1252. This doesn't do anything fancy like checking actual valid ASCII ranges, though, and doesn't detect UTF-16, even when a byte-order mark is present.

The theory behind the bitwise detection can be found here: https://ianthehenry.com/2015/1/17/decoding-utf-8/

Basically, the bit pattern of the first byte determines how many of the bytes after it are part of the UTF-8 sequence. The continuation bytes after it are always in the same bit range (10xxxxxx).

/// <summary>
/// Reads a text file, and detects whether its encoding is valid UTF-8 or ascii.
/// If not, decodes the text using the given fallback encoding.
/// Bit-wise mechanism for detecting valid UTF-8 based on
/// https://ianthehenry.com/2015/1/17/decoding-utf-8/
/// </summary>
/// <param name="docBytes">The bytes read from the file.</param>
/// <param name="encoding">The default encoding to use as fallback if the text is detected not to be pure ascii or UTF-8 compliant. This ref parameter is changed to the detected encoding.</param>
/// <returns>The contents of the read file, as String.</returns>
public static String ReadFileAndGetEncoding(Byte[] docBytes, ref Encoding encoding)
{
    if (encoding == null)
        encoding = Encoding.GetEncoding(1252);
    Int32 len = docBytes.Length;
    // byte order mark for utf-8. Easiest way of detecting encoding.
    if (len > 3 && docBytes[0] == 0xEF && docBytes[1] == 0xBB && docBytes[2] == 0xBF)
    {
        encoding = new UTF8Encoding(true);
        // Note that even when initialising an encoding to have
        // a BOM, it does not cut it off the front of the input.
        return encoding.GetString(docBytes, 3, len - 3);
    }
    Boolean isPureAscii = true;
    Boolean isUtf8Valid = true;
    for (Int32 i = 0; i < len; ++i)
    {
        Int32 skip = TestUtf8(docBytes, i);
        if (skip == 0)
            continue;
        if (isPureAscii)
            isPureAscii = false;
        if (skip < 0)
        {
            isUtf8Valid = false;
            // if invalid utf8 is detected, there's no sense in going on.
            break;
        }
        i += skip;
    }
    if (isPureAscii)
        encoding = new ASCIIEncoding(); // pure 7-bit ascii.
    else if (isUtf8Valid)
        encoding = new UTF8Encoding(false);
    // else, retain given encoding. This should be an 8-bit encoding like Windows-1252.
    return encoding.GetString(docBytes);
}

/// <summary>
/// Tests if the bytes following the given offset are UTF-8 valid, and
/// returns the amount of bytes to skip ahead to do the next read if it is.
/// If the text is not UTF-8 valid it returns -1.
/// </summary>
/// <param name="binFile">Byte array to test</param>
/// <param name="offset">Offset in the byte array to test.</param>
/// <returns>The amount of bytes to skip ahead for the next read, or -1 if the byte sequence wasn't valid UTF-8</returns>
public static Int32 TestUtf8(Byte[] binFile, Int32 offset)
{
    // 7 bytes (so 6 added bytes) is the maximum the UTF-8 design could support,
    // but in reality it only goes up to 3, meaning the full amount is 4.
    const Int32 maxUtf8Length = 4;
    Byte current = binFile[offset];
    if ((current & 0x80) == 0)
        return 0; // valid 7-bit ascii. Added length is 0 bytes.
    Int32 len = binFile.Length;
    for (Int32 addedlength = 1; addedlength < maxUtf8Length; ++addedlength)
    {
        Int32 fullmask = 0x80;
        Int32 testmask = 0;
        // This code adds shifted bits to get the desired full mask.
        // If the full mask is [111]0 0000, then test mask will be [110]0 0000. Since this is
        // effectively always the previous step in the iteration I just store it each time.
        for (Int32 i = 0; i <= addedlength; ++i)
        {
            testmask = fullmask;
            fullmask += (0x80 >> (i+1));
        }
        // figure out bit masks from level
        if ((current & fullmask) == testmask)
        {
            if (offset + addedlength >= len)
                return -1;
            // Lookahead. Pattern of any following bytes is always 10xxxxxx
            for (Int32 i = 1; i <= addedlength; ++i)
            {
                if ((binFile[offset + i] & 0xC0) != 0x80)
                    return -1;
            }
            return addedlength;
        }
    }
    // Value is greater than the maximum allowed for utf8. Deemed invalid.
    return -1;
}
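
As a hedged example, the two methods above might be used together like this (the path is hypothetical; using System.IO; and using System.Text; are assumed):

// Hypothetical usage: read the raw bytes and let ReadFileAndGetEncoding() decide
// between ASCII, UTF-8 and the 8-bit fallback (Windows-1252 here).
byte[] docBytes = File.ReadAllBytes(@"C:\subs\movie.srt");
Encoding enc = Encoding.GetEncoding(1252);
string contents = ReadFileAndGetEncoding(docBytes, ref enc);
Console.WriteLine(enc.WebName);    // "us-ascii", "utf-8", or "windows-1252"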

Answer by Arithmomaniac

The SimpleHelpers.FileEncoding NuGet package wraps a C# port of the Mozilla Universal Charset Detector into a dead-simple API:

var encoding = FileEncoding.DetectFileEncoding(txtFile);

Answer by gsw945

I found a new library on GitHub: CharsetDetector/UTF-unknown

Charset detector built in C# - .NET Core 2-3, .NET Standard 1-2 & .NET 4+

It's also a port of the Mozilla Universal Charset Detector, based on other repositories.

CharsetDetector/UTF-unknown has a class named CharsetDetector.

CharsetDetector contains some static encoding detection methods:

  • CharsetDetector.DetectFromFile()
  • CharsetDetector.DetectFromStream()
  • CharsetDetector.DetectFromBytes()

The detection result is a DetectionResult instance; its Detected attribute is an instance of the DetectionDetail class, which has the attributes below:

  • EncodingName
  • Encoding
  • Confidence

Below is an example showing its usage:

// Program.cs
using System;
using System.Text;
using UtfUnknown;

namespace ConsoleExample
{
    public class Program
    {
        public static void Main(string[] args)
        {
            string filename = @"E:\new-file.txt";
            DetectDemo(filename);
        }

        /// <summary>
        /// Command line example: detect the encoding of the given file.
        /// </summary>
        /// <param name="filename">a filename</param>
        public static void DetectDemo(string filename)
        {
            // Detect from File
            DetectionResult result = CharsetDetector.DetectFromFile(filename);
            // Get the best Detection
            DetectionDetail resultDetected = result.Detected;

            // detected result may be null.
            if (resultDetected != null)
            {
                // Get the alias of the found encoding
                string encodingName = resultDetected.EncodingName;
                // Get the System.Text.Encoding of the found encoding (can be null if not available)
                Encoding encoding = resultDetected.Encoding;
                // Get the confidence of the found encoding (between 0 and 1)
                float confidence = resultDetected.Confidence;

                if (encoding != null)
                {
                    Console.WriteLine($"Detection completed: {filename}");
                    Console.WriteLine($"EncodingWebName: {encoding.WebName}{Environment.NewLine}Confidence: {confidence}");
                }
                else
                {
                    Console.WriteLine($"Detection completed: {filename}");
                    Console.WriteLine($"(Encoding is null){Environment.NewLine}EncodingName: {encodingName}{Environment.NewLine}Confidence: {confidence}");
                }
            }
            else
            {
                Console.WriteLine($"Detection failed: {filename}");
            }
        }
    }
}

Example result screenshot: (screenshot omitted)