bash: How to convert filenames from Unicode to ASCII

Question

Asked by zedwarth

I have a bunch of music files on an NTFS partition mounted on Linux that have filenames with Unicode characters. I'm having trouble writing a script to rename the files so that all of the filenames use only ASCII characters. I think that using the iconv command should work, but I'm having trouble escaping the characters for the mv command.

EDIT: It doesn't matter if there isn't a direct transliteration for the Unicode chars. I guess that I'll just replace those with a "?" character.

Answer 1

Accepted answer by Thanatos

I don't think iconv has any character replacement facilities. This in Python might help:

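#!/usr/bin/env python3
import sys

def unistrip(s):
    # Accept raw bytes as well as text; bytes are assumed to be UTF-8.
    if isinstance(s, bytes):
        s = s.decode('utf-8')
    # Replace every character outside 7-bit ASCII with '?'.
    chars = []
    for ch in s:
        if ord(ch) > 0x7f:
            chars.append('?')
        else:
            chars.append(ch)
    return ''.join(chars)

if __name__ == '__main__':
    print(unistrip(sys.argv[1]))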

Then call as:

$ ./unistrip.py "yikes__oh_look_a_file_火"
yikes_?_oh_look_a_file_?

Also:

$ mv "yikes__oh_look_a_file_火" "`./unistrip.py "yikes__oh_look_a_file_火"`"

You might test it a bit first. For large move operations, generating a list of mv commands (i.e., writing code that writes a script) is advisable, as you can look over the move commands before executing them.
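The "write code to write a script" idea could look roughly like the sketch below. It is only an illustration: `ascii_name` and `mv_commands` are names made up here, `ascii_name` applies the same rule as the unistrip script, and the filenames are assumed to decode cleanly as UTF-8.

```python
import os
import shlex

def ascii_name(name):
    # Same rule as the unistrip script: one '?' per non-ASCII character.
    return ''.join(ch if ord(ch) < 0x80 else '?' for ch in name)

def mv_commands(directory):
    """Yield one shell-quoted mv command per entry whose name would change."""
    for name in sorted(os.listdir(directory)):
        new_name = ascii_name(name)
        if new_name == name:
            continue  # already pure ASCII, nothing to do
        src = os.path.join(directory, name)
        dst = os.path.join(directory, new_name)
        # shlex.quote keeps spaces, quotes and the '?' glob character literal;
        # '--' stops mv from reading a leading '-' in a name as an option.
        yield 'mv -- {} {}'.format(shlex.quote(src), shlex.quote(dst))
```

Printing the result of `mv_commands('.')` into a file gives a script you can read through and then run with sh. Two different names can collapse to the same ASCII name, so check the list for duplicate targets before running it.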

Answer 2

Answer by Hefnawi

Sometimes mv will not be able to read the filename in a shell, so you can try referring to the file by its inode instead.

To get the inode of a file:

$ ls -il

Output will be something like this:

13377799 -rw-r--r--  1 draco  draco      11809 Apr 25 01:39 some_filename.ext
9340462  -rw-r--r--  1 draco  draco      81648 Apr 23 02:27 some_strange_filename.ext
9340480  -rw-r--r--  1 draco  draco       4717 Apr 23 03:54 yikes__oh_look_a_file_火

Then use find to get your file, perhaps together with the Python code by Thanatos:

$ find . -inum 9340480 -exec ./unistrip.py {} \;

You could also use the above command with iconv in a shell.
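If you would rather do the inode lookup and the rename in one place, something like the following might work. This is a sketch, not a tested tool: `find_by_inode` and `rename_by_inode` are hypothetical helper names, and it only looks at the entries directly inside the given directory.

```python
import os

def find_by_inode(directory, inode):
    """Return the name of the entry with this inode number, or None."""
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.inode() == inode:
                return entry.name
    return None

def rename_by_inode(directory, inode, new_name):
    """Rename the entry with this inode; return True if one was found."""
    old_name = find_by_inode(directory, inode)
    if old_name is None:
        return False
    # The awkward old name never has to be typed or escaped by hand.
    os.rename(os.path.join(directory, old_name),
              os.path.join(directory, new_name))
    return True
```

With the listing above, `rename_by_inode('.', 9340480, 'yikes_oh_look_a_file')` would be the call for the last file. Note that os.rename silently replaces an existing file of the same name on POSIX systems.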

Hope this helps someone out, and excuse me for any mistakes [first answer].

Answer 3

Answer by Florian Diesch

convmv is a good Perl script to convert file name encodings. But it can't handle characters that aren't in the destination encoding.

You can change any character not in ASCII to '?' using the rename utility distributed with Perl:

rename 's/[^ -~]/?/g' *

Unfortunately this replaces multi-byte characters with multiple '?'s. Depending on the Unicode encoding that is used and the characters involved, changing the regex may help, e.g.

rename 's/[^ -~]{2}/?/g' *

for 2-byte characters.
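To see why the '?' count depends on the byte length, here is a small Python sketch of the same substitutions, assuming the names are UTF-8. The sample name and the variable names are made up for illustration, and the last pattern (a lead byte followed by its continuation bytes) is an alternative I'm suggesting, not something rename provides.

```python
import re

name = 'caf\u00e9_\u706b'        # 'café_火'
raw = name.encode('utf-8')       # é takes 2 bytes, 火 takes 3 bytes

# Byte-wise, like the first rename command: one '?' per byte.
per_byte = re.sub(rb'[^ -~]', b'?', raw).decode('ascii')

# Decoded first: the same character class now matches whole characters.
per_char = re.sub(r'[^ -~]', '?', raw.decode('utf-8'))

# Still byte-wise, but matching a whole UTF-8 sequence at a time:
# a lead byte plus its continuation bytes, or a stray continuation byte.
per_sequence = re.sub(rb'[\xc0-\xff][\x80-\xbf]*|[\x80-\xbf]', b'?', raw).decode('ascii')

print(per_byte)      # caf??_???
print(per_char)      # caf?_?
print(per_sequence)  # caf?_?
```

A fixed `{2}` only fits characters that happen to be two bytes long, so mixed names are better served by decoding first or by matching whole sequences.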
