Linux SVN Error: Can't convert string from native encoding to 'UTF-8'

Note: this page is taken from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2116718/

Time: 2020-08-03 19:42:56  Source: igfitidea

SVN Error: Can't convert string from native encoding to 'UTF-8'

linux svn version-control

Asked by Camsoft

I've got a post-commit hook script that performs an SVN update of a working copy when commits are made to the repository.

When users commit to the repository from their Windows machines using TortoiseSVN they get the following error:

post-commit hook failed (exit code 1) with output:
svn: Error converting entry in directory '/home/websites/devel/website/guides/Images' to UTF-8
svn: Can't convert string from native encoding to 'UTF-8':
svn: Teneriffa-S?5?8d.jpg

The file in question above is Teneriffa-Süd.jpg (notice the accented u). This is because the site is German and the files have German names.

When executing an update on the working copy at the Linux command line, no errors are encountered. The above error only occurs when the post-commit hook is executed via a commit by a Windows SVN client.

Questions:

  1. Why would SVN try to change the encoding of a file?
  2. Are filenames allowed to contain chars that are outside the Windows standard ASCII ones?


Update:

It turns out that the filename in question displays correctly as Teneriffa-Süd.jpg when viewed from a Windows machine (via Samba), but when I view the filename on the Linux server where the file resides (using SSH and PuTTY) I get Teneriffa-S??d.jpg

Accepted answer by Bahbar

  1. It does not change the encoding of the file. It changes the encoding of the filename (to something that every client can hopefully understand).
  2. Allowed by whom? NTFS uses 16-bit code points, and Windows can expose the file names in various encodings, based on how you ask for them (it will try to convert them to the encoding you ask for). Now, that bit (how you ask) depends on the specific svn client you use. It sounds to me like a bug in TortoiseSVN.

Edit to add:

Ugh. I misunderstood the symptoms. The svn server stores everything in UTF-8 (and it seems that it did that successfully).

The post-commit hook is the bit that fails to convert from UTF-8. If I understand what you're saying correctly, the post-commit hook on the server triggers an svn update to a shared drive (so the svn server starts an svn client against itself)? This means that the configuration that needs to be fixed is the one for the client on the server. Check LANG / LC_ALL in the environment executing the svn server. As it happens, the hooks are run in an empty environment (see Tip), so you should set the variable in the hook itself.
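A minimal sketch of what setting the variable in the hook itself could look like. The locale name, the working copy path, and the svn binary path are all assumptions made for illustration; they are not given in the question and need to be adapted to the server.

```shell
#!/bin/sh
# Hypothetical post-commit hook sketch. Hooks run with an empty
# environment, so the locale has to be set inside the script.

# Assumption: a UTF-8 locale with this name is installed on the server
# (see `locale -a`); any installed UTF-8 locale should do.
HOOK_LOCALE="de_DE.UTF-8"

# Assumption: root of the working copy that the hook keeps up to date.
WORKING_COPY="/home/websites/devel/website"

# Assumption: absolute path of the svn client (PATH is unset in hooks).
SVN="${SVN:-/usr/bin/svn}"

update_working_copy() {
    # Export the locale so the svn client can convert the UTF-8 names
    # stored in the repository to the native encoding and back.
    LANG="$HOOK_LOCALE"
    LC_ALL="$HOOK_LOCALE"
    export LANG LC_ALL
    "$SVN" update --non-interactive "$WORKING_COPY"
}

# In the real hook the last line would simply be:
#   update_working_copy
```

The update is wrapped in a function only so the locale setup and the svn call stay together; a flat script with the same export lines before the svn command would behave the same way.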

See also this page for info on how svn handles localisation

Answer by Ignacio Vazquez-Abrams

  1. It changes the encoding to a location-neutral encoding in case someone with a different encoding checks it out.

  2. Of course. But it's not "Windows" ASCII (Windows actually uses some strange encoding like CP1251 or so).

The best way to fix this is to make sure that your system uses UTF-8 whenever possible (check $LANG).
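One way to check this from a shell is sketched below. The helper name `is_utf8_charset` is made up for this example; `locale charmap` is the standard command that prints the character encoding implied by the current LANG / LC_* settings.

```shell
# Return success when the given charmap name denotes UTF-8.
is_utf8_charset() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the current environment resolves to a UTF-8 encoding.
if is_utf8_charset "$(locale charmap 2>/dev/null)"; then
    echo "current locale uses UTF-8"
else
    echo "current locale is not UTF-8; LANG=${LANG:-unset}"
fi
```

Running the same lines from inside the hook (and writing the output to a log file) shows what the hook's environment actually resolves to, which can differ from an interactive login shell.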

Answer by n-sw-bit

Don't forget to generate those locales on your system (as root). Example for Ru:

locale-gen ru_RU.CP1251
locale-gen ru_RU.UTF-8
dpkg-reconfigure locales
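After generating, it can help to confirm that the locale really is available. The sketch below uses two helper names invented for this example; it assumes only that `locale -a` lists installed locales, typically spelled like `ru_RU.utf8` rather than `ru_RU.UTF-8`.

```shell
# Normalise a locale name so "ru_RU.UTF-8" and "ru_RU.utf8" compare equal.
normalize_locale_name() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# Return success when the named locale appears in the output of `locale -a`.
locale_is_installed() {
    wanted=$(normalize_locale_name "$1")
    locale -a 2>/dev/null \
        | tr '[:upper:]' '[:lower:]' \
        | sed 's/-//g' \
        | grep -qxF "$wanted"
}

# Example use:
#   locale_is_installed ru_RU.UTF-8 && echo "ru_RU.UTF-8 is available"
```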

Answer by wbszh

Put this in your post-commit hook: export LANG=xxxxx (your lang)

Answer by Nitin Srivastava

If the error is:

[abc@288832-web3 public_html]$ svn update
svn: Error converting entry in directory 'images' to UTF-8
svn: Valid UTF-8 data
(hex: 46 65 6e 65 72 62 61 68)
followed by invalid UTF-8 sequence
(hex: e7 65 2b 46)

Then do this.

[abc@288832-web3 public_html]$ printf "\x46\x65\x6e\x65\x72\x62\x61\x68\n"
Fenerbah  

(This means that the system has some file name starting with "Fenerbah" in that folder.)

[abc@288832-web3 public_html]$ cd  images
[abc@288832-web3 images]$ rm -rf Fenerbah?e+Forma+2.jpg

So you can see that the name contains a special character that SVN cannot convert to UTF-8.
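Instead of decoding the hex dump by hand, the offending names can also be listed directly. This is a sketch under the assumption that the `iconv` command is installed; the function names are invented for the example. It only reports names and does not delete or rename anything.

```shell
# Return success when the argument is a valid UTF-8 byte sequence.
# iconv exits with a non-zero status when it meets an invalid sequence.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Print every entry of the given directory whose name is not valid UTF-8.
list_non_utf8_names() {
    for entry in "$1"/*; do
        [ -e "$entry" ] || continue      # skip the unexpanded pattern
        name=${entry##*/}                # strip the directory part
        is_valid_utf8 "$name" || printf '%s\n' "$entry"
    done
}

# Example use:
#   list_non_utf8_names images
```

A name reported here is a candidate for renaming to a UTF-8 (or plain ASCII) name rather than deleting, if the file itself is still needed.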

Answer by Anonymous User

Yet another example:

$ svn update
svn: Error converting entry in directory '.' to UTF-8
svn: Can't convert string from native encoding to 'UTF-8':

$ export LC_CTYPE=en_US.UTF-8

$ svn update

(... and all is fine now)

Answer by Erik Aronesty

I got a similar problem when running "svn add" on a directory, but the solution was different. I couldn't see the "hex" digits using printf (actually no hex output was shown by svn), but this command allowed me to see the results, and fix it:

LC_ALL=C svn add probealign

I think, in general, sticking LC_ALL=C before your command allows you to see the offending files... and is a lot easier than pasting in a lot of \x72 stuff (which apparently may not be available).

Answer by deenfirdoush

Just use the following line in your script before executing any svn command. Use the appropriate language code; in the following example I used Japanese:

export LC_ALL=ja_JP.UTF8

Answer by user5155137

It seems that all LC_ variables need .UTF8 at the end. For example, I happened to have LC_ALL, LC_TIME, and LC_CTYPE defined. After setting LC_CTYPE the problem was not solved, so I needed to set LC_ALL as well, and then it worked:

LC_ALL=en_US.UTF-8
LC_TIME=en_DK.UTF-8
LC_CTYPE=en_US.UTF-8
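A likely reason why setting LC_CTYPE alone was not enough: under the POSIX rules, LC_ALL overrides every individual LC_* variable, and LANG is only a fallback. The sketch below spells out that lookup order for the character-encoding category; `effective_ctype` is a name invented for this example, and the final fallback to "C" is an assumption about the system default.

```shell
# Print the locale value that governs character encoding, following the
# POSIX precedence LC_ALL > LC_CTYPE > LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        # Assumption: an unset LANG falls back to the "C" locale.
        printf '%s\n' "${LANG:-C}"
    fi
}

# Example use:
#   effective_ctype
```

So when LC_ALL is already defined with a non-UTF-8 value, changing LC_CTYPE has no effect until LC_ALL is changed or unset.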

To avoid the problem in future, I copied the file to a different name, removed the old one from svn, added the new one to svn, and sent a message to a collaborator asking them not to do this.

Answer by Capitaine DALLE

For information, I got this error (native encoding to 'UTF-8') on commit with the Windows client TortoiseSVN,

when my repository URL was:

http://x.x.x.x/svn/myrepos

I changed my repository URL to:

svn://x.x.x.x/myrepos

and now all is perfect.

I think this information will be useful to some.
