Disclaimer: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/5581857/

Time: 2020-09-19 05:18:00  Source: igfitidea

Git and the Umlaut problem on Mac OS X

git macos versioning

Asked by LuckyMalaka

Today I discovered a bug in Git on Mac OS X.

For example, I will commit a file with the name Überschrift.txt, which begins with the German special character Ü. From the command git status I get the following output.

Users-iMac: user$ git status

# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#   "U\314\210berschrift.txt"
nothing added to commit but untracked files present (use "git add" to track)

It seems that Git 1.7.2 has a problem with German special characters on Mac OS X. Is there a way to get Git to read the file names correctly?

Answered by chicken

Enable core.precomposeunicode on the Mac:

git config --global core.precomposeunicode true

For this to work, you need to have at least Git 1.8.2.

Mountain Lion ships with 1.7.5. To get a newer Git, use either git-osx-installer or Homebrew (requires Xcode).

That's it.

Answered by Yuji

The cause is that file systems differ in how they store file names.

In Unicode, Ü can be represented in two ways: as the single precomposed character Ü, or as U followed by a "combining umlaut character". A Unicode string can contain both forms, but because having both is confusing, the file system normalizes the Unicode string so that every umlauted U is stored either as Ü or as U + "combining umlaut character".

Linux uses the former method, called Normal-Form-Composed (or NFC), and Mac OS X uses the latter method, called Normal-Form-Decomposed (NFD).
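
To make the two forms concrete, here is a minimal shell sketch (not part of the original answer) that prints the UTF-8 bytes of each one. The variable names are illustrative, and it assumes a POSIX shell with printf, od and tr.

```shell
# Composed form (NFC): U+00DC, UTF-8 bytes c3 9c (octal 303 234)
nfc=$(printf '\303\234')
# Decomposed form (NFD): "U" followed by U+0308 COMBINING DIAERESIS (octal 314 210)
nfd=$(printf 'U\314\210')

# Hex-dump each form; tr strips the padding that od adds
nfc_hex=$(printf '%s' "$nfc" | od -An -tx1 | tr -d ' \n')
nfd_hex=$(printf '%s' "$nfd" | od -An -tx1 | tr -d ' \n')
echo "NFC: $nfc_hex"
echo "NFD: $nfd_hex"

# A plain string comparison works on bytes, so the two forms do not match
if [ "$nfc" = "$nfd" ]; then same=yes; else same=no; fi
echo "byte-identical: $same"
```

Both strings render as the same glyph, but they differ at the byte level, which is why a tool that compares file names byte by byte treats them as two different names.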

Apparently Git doesn't care about this point and simply uses the byte sequence of the filename, which leads to the problem you're having.

The mailing list thread "Git, Mac OS X and German special characters" has a patch in it so that Git compares the file names after normalization.

Answered by el.nicko

Putting the following in ~/.gitconfig works for me on 10.12.1 Sierra for UTF-8 names:

[core]
	precomposeunicode = true
	quotepath = false

The first option is needed so that git 'understands' UTF-8 and the second one so that it doesn't escape the characters.
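
The same two settings can also be written with git config instead of editing the file by hand. The following is a minimal sketch, assuming git is installed. It writes to a scratch file (the hypothetical variable cfg) through --file so that it does not touch your real configuration; replacing --file "$cfg" with --global would write to ~/.gitconfig instead.

```shell
# Scratch config file, used only for this demonstration
cfg=$(mktemp)

# Both options live in the [core] section
git config --file "$cfg" core.precomposeunicode true
git config --file "$cfg" core.quotepath false

# Read the values back, then show the resulting file
git config --file "$cfg" --get core.precomposeunicode
git config --file "$cfg" --get core.quotepath
cat "$cfg"
```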

Answered by pete

To make git add file work with umlauts in file names on Mac OS X, you can convert file path strings from composed into canonically decomposed UTF-8 using iconv.

# test case

mkdir testproject
cd testproject

git --version    # git version 1.7.6.1
locale charmap   # UTF-8

git init
file=$'\303\234berschrift.txt'    # composed UTF-8 (Linux-compatible)
touch "$file"
echo 'Hello, world!' > "$file"

# convert composed into canonically decomposed UTF-8
# cf. http://codesnippets.joyent.com/posts/show/12251
# printf '%s' "$file" | iconv -f utf-8 -t utf-8-mac | LC_ALL=C vis -fotc 
#git add "$file"
git add "$(printf '%s' "$file" | iconv -f utf-8 -t utf-8-mac)"  

git commit -a -m 'This is my commit message!'
git show
git status
git ls-files '*'
git ls-files -z '*' | tr '\0' '\n'

touch $'caf\303\251 1' $'caf\303\251 2' $'caf\303\251 3'
git ls-files --other '*'
git ls-files -z --other '*' | tr '\0' '\n'

Answered by user1338062

Change the repository's OS X-specific core.precomposeunicode flag to true:

git config core.precomposeunicode true

To make sure new repositories get that flag, also run:

git config --global core.precomposeunicode true

Here is the relevant snippet from the manpage:

This option is only used by Mac OS implementation of Git. When core.precomposeunicode=true, Git reverts the unicode decomposition of filenames done by Mac OS. This is useful when sharing a repository between Mac OS and Linux or Windows. (Git for Windows 1.7.10 or higher is needed, or Git under cygwin 1.7). When false, file names are handled fully transparent by Git, which is backward compatible with older versions of Git.

Answered by laalto

The output is correct.

Your filename is in UTF-8, with Ü represented as LATIN CAPITAL LETTER U + COMBINING DIAERESIS (Unicode 0x0308, UTF-8 0xcc 0x88) instead of LATIN CAPITAL LETTER U WITH DIAERESIS (Unicode 0x00dc, UTF-8 0xc3 0x9c). The Mac OS X HFS+ file system decomposes Unicode in this way. Git in turn shows the octal-escape form of the non-ASCII filename bytes.
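
The escaping can be observed in a throwaway repository. This is a minimal sketch, assuming git and mktemp are available; the names repo, quoted and raw are illustrative, and the file is created with the decomposed name directly so the sketch does not depend on what the file system does.

```shell
# Scratch repository, used only for this demonstration
repo=$(mktemp -d)
cd "$repo" && git init -q .

# Create an empty file whose name uses the decomposed form (U + 0xcc 0x88)
: > "$(printf 'U\314\210berschrift.txt')"

# Default behaviour: non-ASCII bytes are shown as octal escapes inside quotes
quoted=$(git status --porcelain)
# With core.quotepath disabled for this one command, the name is printed as-is
raw=$(git -c core.quotepath=false status --porcelain)

printf '%s\n' "$quoted" "$raw"
```

On a file system that keeps the name exactly as given, the first line shows the name in double quotes with octal escapes and the second shows it unescaped. The exact bytes may differ when core.precomposeunicode is enabled.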

Note that Unicode filenames can make your repository non-portable. For example, msysgit has had problems dealing with Unicode filenames.

Answered by crysaz

I had a similar problem with my personal repository, so I wrote a helper script in Python 3. You can grab it here: https://github.com/sjtoik/umlaut-cleaner

The script needs a bit of manual labour, but not much.
