Notice: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/13675782/

Date: 2020-09-19 07:50:48  Source: igfitidea

Git Shell in Windows: patch's default character encoding is UCS-2 Little Endian - how to change this to ANSI or UTF-8 without BOM?

Tags: powershell, git, encoding, github

Asked by Sk8erPeter

When creating a diff patch with Git Shell in Windows (when using GitHub for Windows), the character encoding of the patch will be UCS-2 Little Endian according to Notepad++ (see the screenshots below).

How can I change this behavior and force Git to create patches encoded as ANSI or as UTF-8 without a BOM?

This causes a problem because UCS-2 Little Endian encoded patches cannot be applied; I have to manually convert the patch to ANSI. If I don't, I get a "fatal: unrecognized input" error.

[Screenshot: creating the git patch]

[Screenshot: Notepad++ showing the character encoding]

Since then, I also realized that I have to manually convert the EOL from Windows format (\r\n) to UNIX (\n) in Notepad++ (Edit > EOL Conversion > UNIX). If I don't do this, I get a "trailing whitespace" error (even if all trailing whitespace is trimmed: "TextFX" > "TextFX Edit" > "Trim Trailing Spaces").

So, the steps I need to do for the patch to be applied:

  1. create patch (here is the result)
  2. convert character encoding to ANSI
  3. EOL conversion to UNIX format
  4. apply patch
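
Steps 2 and 3 could also be scripted instead of being done by hand in Notepad++. The following is a minimal Python sketch, assuming Python 3 is available; the function name patch_to_utf8_lf is hypothetical and not part of Git or PowerShell. It decodes UTF-16 input (using the BOM to pick the byte order), turns \r\n into \n, and re-encodes the text as UTF-8 without a BOM.

```python
import codecs


def patch_to_utf8_lf(raw: bytes) -> bytes:
    """Return patch bytes as UTF-8 without a BOM and with LF line endings.

    Assumption: the input is either UTF-16 with a BOM (what PowerShell's
    `>` produced in the question) or already UTF-8/ASCII.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM, picks the byte order, and drops it.
        text = raw.decode("utf-16")
    else:
        # "utf-8-sig" accepts plain UTF-8 and strips a UTF-8 BOM if present.
        text = raw.decode("utf-8-sig")
    # Step 3 from the list: Windows EOL (\r\n) to UNIX EOL (\n).
    text = text.replace("\r\n", "\n")
    # Encoding with "utf-8" never writes a BOM.
    return text.encode("utf-8")
```

To convert a file, read it with open(path, "rb"), pass the bytes through the function, and write the result with open(path, "wb"). Note that this rewrites every \r\n, which would be wrong for a patch that touches files with genuine CRLF line endings.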

Please, take a look at this screenshot:

[Screenshot: applying a patch in Windows PowerShell with Git is problematic]

Accepted answer by Lars Noschinski

I'm not a Windows user, so take my answer with a grain of salt. According to the Windows PowerShell Cookbook, PowerShell preprocesses the output of git diff, splitting it into lines. The documentation of the Out-File cmdlet suggests that > is the same as | Out-File without parameters. We also find this comment in the PowerShell documentation:

The results of using the Out-File cmdlet may not be what you expect if you are used to traditional output redirection. To understand its behavior, you must be aware of the context in which the Out-File cmdlet operates.

By default, the Out-File cmdlet creates a Unicode file. This is the best default in the long run, but it means that tools that expect ASCII files will not work correctly with the default output format. You can change the default output format to ASCII by using the Encoding parameter:

[...]

Out-file formats file contents to look like console output. This causes the output to be truncated just as it is in a console window in most circumstances. [...]

To get output that does not force line wraps to match the screen width, you can use the Width parameter to specify line width.

So, apparently it is not Git which chooses the character encoding, but Out-File. This suggests a) that PowerShell redirection really should only be used for text and b) that

| Out-File -encoding ASCII -Width 2147483647 my.patch

will avoid the encoding problems. However, this still does not solve the problem with Windows vs. Unix line endings. There are cmdlets (see the PowerShell Community Extensions) that convert line endings.

However, all this recoding does not increase my confidence in a patch (which has no encoding itself, but is just a string of bytes). The aforementioned Cookbook contains a script, Invoke-BinaryProcess, which can be used to redirect the output of a command unmodified.
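
The "just a string of bytes" idea can be illustrated outside PowerShell as well. The sketch below is not the Cookbook's Invoke-BinaryProcess script; it is a hypothetical Python helper (save_raw_output is an invented name) that captures a command's stdout as bytes and writes it in binary mode, so nothing is re-encoded or re-wrapped on the way to the file.

```python
import subprocess
from typing import Sequence


def save_raw_output(cmd: Sequence[str], out_path: str) -> int:
    """Run cmd and write its stdout to out_path byte for byte.

    Returns the number of bytes written. Raises CalledProcessError
    if the command exits with a non-zero status.
    """
    # stdout=PIPE without text mode yields bytes, not decoded text.
    result = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
    # "wb" disables newline translation and any implicit encoding.
    with open(out_path, "wb") as handle:
        handle.write(result.stdout)
    return len(result.stdout)
```

Assuming git is on the PATH, a call such as save_raw_output(["git", "diff"], "my.patch") would store the diff exactly as Git emitted it.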

To sidestep this whole issue, an alternative would be to use git format-patch instead of git diff. format-patch writes directly to a file (and not to stdout), so its output is not recoded. However, it can only create patches from commits, not arbitrary diffs.

format-patch takes a commit range (e.g. master~10..master~5) or a single commit (e.g. X, meaning X..HEAD) and creates patch files of the form NNNN-SUBJECT.patch, where NNNN is an increasing 4-digit number and SUBJECT is the (mangled) subject of the patch. An output directory can be specified with -o.
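
As a rough illustration of that invocation, here is a small Python sketch that builds and runs the command line. The helper names format_patch_args and run_format_patch are hypothetical; only the -o option and the revision argument come from the answer above, and the sketch assumes git is on the PATH and relies on format-patch printing the names of the files it creates.

```python
import subprocess
from typing import List, Optional


def format_patch_args(revision: str, out_dir: Optional[str] = None) -> List[str]:
    """Build the argument list for `git format-patch`."""
    args = ["git", "format-patch"]
    if out_dir is not None:
        args += ["-o", out_dir]  # write the patch files into this directory
    args.append(revision)  # a range such as A..B, or a single commit X
    return args


def run_format_patch(revision: str, out_dir: Optional[str] = None) -> List[str]:
    """Run git format-patch and return the file names it reports."""
    done = subprocess.run(
        format_patch_args(revision, out_dir),
        stdout=subprocess.PIPE,
        check=True,
    )
    # Only the short list of file names is decoded here; the patch
    # contents are written to disk by git itself.
    return done.stdout.decode().splitlines()
```

For example, run_format_patch("master~10..master~5", "patches") would ask Git to write one patch file per commit in that range into the patches directory.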

Answer by ddiukariev

If you use powershell you can also just do:

cmd /c "git diff > patch.diff"

This runs the command through CMD, which writes the output to the file as is.

Answer by Daniel Liuzzi

In case this helps anyone, using the good old Command Prompt instead of PowerShell works flawlessly; it doesn't seem to suffer from any of the issues present in PowerShell in regards to character encoding and EOLs.

Answer by Lazy Badger

  1. Run the output of the diff through iconv
  2. For plain 7-bit patches (pure English) you can ignore the crazy Notepad++ detection: the patch content doesn't contain any charset definition
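
Whether a given patch file needs conversion at all can be checked from its first bytes rather than from an editor's guess. The following is a minimal Python sketch (describe_patch_encoding is a hypothetical name, and the returned labels are invented for illustration): it looks for a byte order mark and otherwise tests whether the content is pure 7-bit ASCII.

```python
import codecs


def describe_patch_encoding(raw: bytes) -> str:
    """Classify patch bytes by BOM, falling back to a 7-bit ASCII check."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le with BOM"  # what Notepad++ labels UCS-2 Little Endian
    if raw.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be with BOM"
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    if raw.isascii():
        return "7-bit ascii"  # safe to treat as ANSI or UTF-8 alike
    return "unknown 8-bit"  # no BOM and non-ASCII bytes: cannot tell from here
```

A file reported as "7-bit ascii" matches the case in point 2: it contains no bytes whose meaning depends on the charset.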

Answer by Vish

Running dos2unix on the diff generated in PowerShell seems to do the trick for me. I was then able to apply the diff successfully.

dos2unix.exe diff_file
git apply diff_file

Answer by Console

As mentioned by Lars Noschinski, you need to fix the output of Out-File. You can set the default parameters of Out-File using the following commands.

$PSDefaultParameterValues['Out-File:Encoding'] = 'ASCII'
$PSDefaultParameterValues['Out-File:Width'] = '2147483647'

After setting the default parameters, you can use > to export a patch file.

After adding those two lines to my Profile file everything works as expected.

λ git stash show -p > test3
C:\Users\..\Source\.. [master +1 ~0 -0 !]
λ git apply test3
C:\Users\..\Source\.. [master +1 ~2 -0 !]