Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/3710374/

Time: 2020-09-09 07:35:56  Source: igfitidea

Get encoding of a file in Windows

windows encoding

Asked by TheWebGuy

This isn't really a programming question: is there a command-line or Windows tool (Windows 7) to get the current encoding of a text file? Sure, I can write a little C# app, but I wanted to know if there is something already built in.

Answered by MikeTeeVee

Open up your file using regular old vanilla Notepad that comes with Windows.
It will show you the encoding of the file when you click "Save As...".

Whatever the default-selected encoding is, that is what your current encoding is for the file.
If it is UTF-8, you can change it to ANSI and click save to change the encoding (or vice versa).

I realize there are many different types of encoding, but this was all I needed when I was informed our export files were in UTF-8 and they required ANSI. It was a one-time export, so Notepad fit the bill for me.

FYI: As I understand it, "Unicode" (as listed in Notepad) is a misnomer for UTF-16.
More here on Notepad's "Unicode" option: Windows 7 - UTF-8 and Unicode
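
For more than a one-time export, the same UTF-8 to ANSI change can be scripted. The following is only a minimal sketch in Python, not part of the original answer. It assumes "ANSI" means Windows-1252 (cp1252), which is the usual Western European code page but varies by system locale. The function name convert_encoding and its parameters are hypothetical.

```python
from pathlib import Path


def convert_encoding(src, dst, src_encoding="utf-8-sig", dst_encoding="cp1252"):
    """Re-encode a text file, roughly what Notepad's "Save As" does.

    Assumption: "ANSI" is taken to be cp1252; change dst_encoding for
    other locales. "utf-8-sig" reads UTF-8 with or without a BOM.
    """
    # Work on raw bytes so that CRLF line endings are preserved untouched.
    text = Path(src).read_bytes().decode(src_encoding)
    # encode() raises UnicodeEncodeError if a character has no
    # representation in the target code page, instead of silently
    # replacing it.
    Path(dst).write_bytes(text.encode(dst_encoding))
```

Calling convert_encoding("export.csv", "export-ansi.csv") would write a cp1252 copy; the file names here are placeholders.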

Answered by Sybren

The (Linux) command-line tool 'file' is available on Windows via GnuWin32:

http://gnuwin32.sourceforge.net/packages/file.htm

If you have git installed, it's located in C:\Program Files\git\usr\bin.

Example:

    C:\Users\SH\Downloads\SquareRoot>file *
    _UpgradeReport_Files;         directory
    Debug;                        directory
    duration.h;                   ASCII C++ program text, with CRLF line terminators
    ipch;                         directory
    main.cpp;                     ASCII C program text, with CRLF line terminators
    Precision.txt;                ASCII text, with CRLF line terminators
    Release;                      directory
    Speed.txt;                    ASCII text, with CRLF line terminators
    SquareRoot.sdf;               data
    SquareRoot.sln;               UTF-8 Unicode (with BOM) text, with CRLF line terminators
    SquareRoot.sln.docstates.suo; PCX ver. 2.5 image data
    SquareRoot.suo;               CDF V2 Document, corrupt: Cannot read summary info
    SquareRoot.vcproj;            XML  document text
    SquareRoot.vcxproj;           XML document text
    SquareRoot.vcxproj.filters;   XML document text
    SquareRoot.vcxproj.user;      XML document text
    squarerootmethods.h;          ASCII C program text, with CRLF line terminators
    UpgradeLog.XML;               XML  document text

    C:\Users\SH\Downloads\SquareRoot>file --mime-encoding *
    _UpgradeReport_Files;         binary
    Debug;                        binary
    duration.h;                   us-ascii
    ipch;                         binary
    main.cpp;                     us-ascii
    Precision.txt;                us-ascii
    Release;                      binary
    Speed.txt;                    us-ascii
    SquareRoot.sdf;               binary
    SquareRoot.sln;               utf-8
    SquareRoot.sln.docstates.suo; binary
    SquareRoot.suo;               CDF V2 Document, corrupt: Cannot read summary infobinary
    SquareRoot.vcproj;            us-ascii
    SquareRoot.vcxproj;           utf-8
    SquareRoot.vcxproj.filters;   utf-8
    SquareRoot.vcxproj.user;      utf-8
    squarerootmethods.h;          us-ascii
    UpgradeLog.XML;               us-ascii
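
If the result is needed inside a script rather than on screen, the tool can be called from a program. This is a rough sketch in Python, not something from the answer. It assumes Git for Windows is installed in the default location quoted above when file is not already on the PATH. The names GIT_FILE_EXE, find_file_tool and mime_encoding are made up for illustration.

```python
import os
import shutil
import subprocess

# Assumption: default Git for Windows install path, as quoted in the answer.
GIT_FILE_EXE = r"C:\Program Files\Git\usr\bin\file.exe"


def find_file_tool():
    """Return the path to the `file` executable, or None if not found."""
    found = shutil.which("file")
    if found:
        return found
    return GIT_FILE_EXE if os.path.isfile(GIT_FILE_EXE) else None


def mime_encoding(path):
    """Return what `file --mime-encoding` reports for path, or None."""
    tool = find_file_tool()
    if tool is None:
        return None
    # -b (brief) omits the leading file name, so only the encoding
    # (for example "us-ascii" or "utf-8") is printed.
    result = subprocess.run(
        [tool, "-b", "--mime-encoding", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

The function returns None instead of raising when the tool is missing, so a caller can fall back to another method.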

Answered by George Ninan

If you have Git or Cygwin on your Windows machine, go to the folder where your file is located and run the command:

file *

This will give you the encoding details of all the files in that folder.

Answered by user961954

Another tool that I found useful: https://archive.codeplex.com/?p=encodingchecker
EXE can be found here

Answered by yzorg

Here's my take on how to detect the Unicode family of text encodings via BOM. The accuracy of this method is low, as it only works on text files (specifically Unicode files), and it defaults to ascii when no BOM is present (like most text editors; the default would be UTF8 if you want to match the HTTP/web ecosystem).

Update 2018: I no longer recommend this method. I recommend using file.exe from Git or *nix tools as recommended by @Sybren, and I show how to do that via PowerShell in a later answer.

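# from https://gist.github.com/zommarin/1480974
function Get-FileEncoding($Path) {
    # Read the first four bytes. -Encoding byte is Windows PowerShell 5.x syntax;
    # PowerShell 6+ uses -AsByteStream instead.
    $bytes = [byte[]](Get-Content $Path -Encoding byte -ReadCount 4 -TotalCount 4)

    if(!$bytes) { return 'utf8' }   # empty file: there is no BOM to inspect

    # Compare the leading bytes, as hex, against the known BOMs.
    # The 4-byte UTF-32 patterns come before '^fffe', which would otherwise match first.
    switch -regex ('{0:x2}{1:x2}{2:x2}{3:x2}' -f $bytes[0],$bytes[1],$bytes[2],$bytes[3]) {
        '^efbbbf'   { return 'utf8' }
        '^2b2f76'   { return 'utf7' }
        '^fffe0000' { return 'utf32' }             # UTF-32 little-endian
        '^0000feff' { return 'bigendianutf32' }    # UTF-32 big-endian
        '^fffe'     { return 'unicode' }           # UTF-16 little-endian
        '^feff'     { return 'bigendianunicode' }  # UTF-16 big-endian
        default     { return 'ascii' }
    }
}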

dir ~\Documents\WindowsPowershell -File | 
    select Name,@{Name='Encoding';Expression={Get-FileEncoding $_.FullName}} | 
    ft -AutoSize

Recommendation: This can work reasonably well if the dir, ls, or Get-ChildItem only checks known text files, and when you're only looking for "bad encodings" from a known list of tools. (For example, SQL Management Studio defaults to UTF-16, which broke Git auto-crlf for Windows, which was the default for many years.)
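
For readers outside PowerShell, the same BOM check can be sketched in Python. This is an illustration under stated assumptions, not the author's code. The names BOMS, sniff_bom and guess_encoding are hypothetical, and the returned strings are descriptive labels. The fallback in guess_encoding (a strict trial decode as ASCII, then UTF-8) is a different technique from the BOM check and is added only to show one way around the "no BOM" limitation described above.

```python
import codecs

# Longer BOMs first: the UTF-32 LE BOM (FF FE 00 00) begins with the
# UTF-16 LE BOM (FF FE), so checking UTF-16 first would misreport it.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return a label for the BOM at the start of data, or None."""
    for bom, label in BOMS:
        if data.startswith(bom):
            return label
    return None


def guess_encoding(data):
    """BOM first; otherwise a strict trial decode (ASCII, then UTF-8)."""
    label = sniff_bom(data)
    if label:
        return label
    for name in ("ascii", "utf-8"):
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            pass
    return None  # some 8-bit code page or binary data; cannot tell which
```

To check a file, read it in binary mode and pass the bytes in, for example guess_encoding(open(path, "rb").read()). When decoding afterwards, Python's "utf-8-sig", "utf-16" and "utf-32" codecs consume the BOM themselves.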

Answered by yzorg

I wrote the #4 answer (at time of writing). But lately I have Git installed on all my computers, so now I use @Sybren's solution. Here is a new answer that makes that solution handy from PowerShell (without putting all of git/usr/bin in the PATH, which is too much clutter for me).

Add this to your profile.ps1:

$global:gitbin = 'C:\Program Files\Git\usr\bin'
Set-Alias file.exe $gitbin\file.exe

Then use it like this: file.exe --mime-encoding *. You must include .exe in the command for the PS alias to work.

But if you don't customize your PowerShell profile.ps1, I suggest you start with mine: https://gist.github.com/yzorg/8215221/8e38fd722a3dfc526bbe4668d1f3b08eb7c08be0 and save it to ~\Documents\WindowsPowerShell. It's safe to use on a computer without Git, but it will write warnings when Git is not found.

Including .exe in the command is also how I use C:\WINDOWS\system32\where.exe from PowerShell (a bare where is PowerShell's alias for Where-Object), along with many other OS CLI commands that are "hidden by default" by PowerShell, *shrug*.

Answered by Just Shadow

A simple solution might be opening the file in Firefox.

  1. Drag and drop the file into Firefox
  2. Right click on the page
  3. Select "View Page Info"

The text encoding will appear in the "Page Info" window.

Note: If the file is not in txt format, just rename it to txt and try again.

P.S. For more info see this article.

Answered by phd_coder

Install Git (on Windows you have to use the Git Bash console). Type:

file *   

for all files in the current directory, or

file */*   

for the files one level down, in each immediate subdirectory (the */* glob does not recurse further).

Answered by Ville

You can use a free utility called Encoding Recognizer (requires Java). You can find it at http://mindprod.com/products2.html#ENCODINGRECOGNISER

Answered by JaykeBird

Similar to the solution listed above with Notepad, you can also open the file in Visual Studio, if you're using that. In Visual Studio, you can select "File > Advanced Save Options..."

The "Encoding:" combo box will tell you specifically which encoding is currently being used for the file. It has a lot more text encodings listed in there than Notepad does, so it's useful when dealing with various files from around the world and whatever else.

Just like Notepad, you can also change the encoding from the list of options there, and then save the file after hitting "OK". You can also select the encoding you want through the "Save with Encoding..." option in the Save As dialog (by clicking the arrow next to the Save button).