Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/10106161/

Encoding of file names in Java

Tags: java, encoding, jvm, openjdk

Asked by Roland Brand

I am running a small Java application on an embedded Linux platform. After replacing the Java VM JamVM with OpenJDK, file names with special characters are not stored correctly. Special characters like umlauts are replaced by question marks.

Here is my test code:

import java.io.File;
import java.io.IOException;

public class FilenameEncoding
{

        public static void main (String[] args) {
                String name = "umlaute-äöü";
                System.out.println("\nname = " + name);
                System.out.print("name in Bytes: ");
                for (byte b : name.getBytes()) {
                        System.out.print(Integer.toHexString(b & 255) + " ");
                }
                System.out.println();

                try {
                        File f = new File(name);
                        f.createNewFile();
                } catch (IOException e) {
                        e.printStackTrace();
                }
        }

}

Running it gives the following output:

name = umlaute-???
name in Bytes: 75 6d 6c 61 75 74 65 2d 3f 3f 3f

and a file called umlaute-??? is created.
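
The byte dump above comes from name.getBytes() with no argument, which uses the JVM's default charset. A minimal sketch of how the same string encodes under explicit charsets is shown below; the class name DefaultCharsetDemo and the helper toHex are made up for illustration, and the umlauts are written as Unicode escapes so the source file encoding cannot affect the result.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultCharsetDemo {

    // Formats bytes like the test program above: unsigned hex, space separated.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes for a-umlaut, o-umlaut, u-umlaut.
        String name = "umlaute-\u00e4\u00f6\u00fc";

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("default : " + toHex(name.getBytes()));
        System.out.println("US-ASCII: " + toHex(name.getBytes(StandardCharsets.US_ASCII)));
        System.out.println("UTF-8   : " + toHex(name.getBytes(StandardCharsets.UTF_8)));
    }
}
```

With US-ASCII each umlaut becomes the replacement byte 3f (a question mark), which matches the dump in the question; with UTF-8 each umlaut becomes two bytes. If the "default" line matches the US-ASCII line, the JVM is most likely running with an ASCII default charset.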

Setting the properties file.encoding and sun.jnu.encoding to UTF-8 gives the correct strings in the terminal, but the created file is still umlaute-???
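
When comparing runs with and without these properties, it can help to print what the JVM itself reports next to the locale variables of the process. The following is only a diagnostic sketch: the class name EncodingSettings is hypothetical, and sun.jnu.encoding is an implementation-specific property that may be absent on some VMs (it then prints as "null").

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

class EncodingSettings {

    // Collects the encoding-related settings that the running JVM reports.
    static Map<String, String> snapshot() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));
        settings.put("sun.jnu.encoding", String.valueOf(System.getProperty("sun.jnu.encoding")));
        settings.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        // Locale variables inherited from the environment that started the JVM.
        settings.put("LANG", String.valueOf(System.getenv("LANG")));
        settings.put("LC_ALL", String.valueOf(System.getenv("LC_ALL")));
        return settings;
    }

    public static void main(String[] args) {
        snapshot().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running this once plainly and once with the -D options makes it visible which values actually change and which ones still follow the locale.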

Running the VM with strace, I can see the system call

open("umlaute-???", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = 4

This shows that the problem is not a file system issue but a VM issue.

How can the encoding of the file name be set?

Answered by Youssef G.

If you are using Eclipse, then you can go to Window->Preferences->General->Workspace and select the "Text file encoding" option you want from the pull-down menu. By changing this setting I was able to reproduce your problem, and by changing it back I was able to fix it again.

If you are not, then you can add an environment variable in Windows (System Properties->Environment Variables, then select New... under system variables). The name should be (without quotes) JAVA_TOOL_OPTIONS and the value should be set to -Dfile.encoding=UTF8 (or whatever encoding gets yours to work).

I found the answer through this post, btw: Setting the default Java character encoding?

Linux Solutions

- (Permanent) Running env | grep LANG in the terminal will return one or two lines showing which encoding Linux is currently set up with. You can then set LANG to UTF8 (yours might be set to ASCII) in the /etc/sysconfig/i18n file (I tested this on 2.6.40 Fedora). Basically, I switched from UTF8 (where I had odd characters) to ASCII (where I had question marks) and back.

- (When starting the JVM, though this may not fix the problem) You can start the JVM with the encoding you want using java -Dfile.encoding=**** FilenameEncoding. Here is the output from the two ways:

[youssef@JoeLaptop bin]$ java -Dfile.encoding=UTF8 FilenameEncoding

name = umlaute-???
name in Bytes: 75 6d 6c 61 75 74 65 2d d7 94 d7 a6 ef bf bd 
UTF-8
UTF8

[youssef@JoeLaptop bin]$ java FilenameEncoding

name = umlaute-???????
name in Bytes: 75 6d 6c 61 75 74 65 2d 3f 3f 3f 3f 3f 3f 3f 
US-ASCII
ASCII

Here are some references for the Linux material: http://www.cyberciti.biz/faq/set-environment-variable-linux/

And here is one about -Dfile.encoding: Setting the default Java character encoding?

Answered by Stefan A

I know it's an old question but I had the same problem. All of the mentioned solutions did not work for me, but the following solved it:

  • Source encoding to UTF8 (project.build.sourceEncoding to UTF-8 in maven properties)
  • Program arguments: -Dfile.encoding=utf8 and -Dsun.jnu.encoding=utf8
  • Using java.nio.file.Path instead of java.io.File
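
The last point can be sketched as follows. This is an illustration under assumptions, not the answerer's actual code: the class name NioFileNameDemo, the helper tryCreate, and the temporary directory are all invented for the example. On the JVMs I am aware of, java.nio.file tends to reject a name it cannot convert to the platform's file name encoding with an InvalidPathException instead of silently substituting question marks, so the sketch reports that case separately.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.stream.Stream;

class NioFileNameDemo {

    // Tries to create dir/name via java.nio.file and describes the outcome.
    static String tryCreate(Path dir, String name) {
        try {
            Path created = Files.createFile(dir.resolve(name));
            return "created " + created.getFileName();
        } catch (InvalidPathException e) {
            // The name could not be converted to the platform's file name encoding.
            return "rejected: " + e.getReason();
        } catch (IOException e) {
            return "I/O error: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("filename-encoding");
        System.out.println(tryCreate(dir, "plain-ascii"));
        // Umlauts written as Unicode escapes, independent of the source encoding.
        System.out.println(tryCreate(dir, "umlaute-\u00e4\u00f6\u00fc"));

        // List what actually ended up on disk.
        try (Stream<Path> entries = Files.list(dir)) {
            entries.forEach(p -> System.out.println("on disk: " + p.getFileName()));
        }
    }
}
```

Whether the second call succeeds depends on the locale and the VM arguments the JVM was started with, which is why the sketch prints the result rather than assuming one.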

Answered by Christoffer Hammarström

Your problem is that javac is expecting a different encoding for your .java file than the one you saved it in. Didn't javac warn you when you compiled?

Maybe you have saved it with encoding ISO-8859-1 or windows-1252, and javac is expecting UTF-8.

Provide the correct encoding to javac with the -encoding flag, or the equivalent for your build tool.
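
To see what such a mismatch does to a string literal, the effect can be simulated at runtime. The sketch below is illustrative only (the class name SourceEncodingMismatch and both helpers are made up): it encodes the umlauts in one charset and decodes them in another, which is roughly what happens when javac reads a source file in the wrong encoding. Printing code points instead of characters keeps the output readable on any terminal.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SourceEncodingMismatch {

    // Renders each code point of s as U+XXXX.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Simulates text saved in one encoding and then decoded in another.
    static String misdecode(String text, Charset savedAs, Charset readAs) {
        return new String(text.getBytes(savedAs), readAs);
    }

    public static void main(String[] args) {
        // The umlauts from the file name, written as Unicode escapes.
        String intended = "\u00e4\u00f6\u00fc";

        System.out.println("intended           : " + codePoints(intended));
        System.out.println("ISO-8859-1 as UTF-8: " + codePoints(
                misdecode(intended, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8)));
        System.out.println("UTF-8 as ISO-8859-1: " + codePoints(
                misdecode(intended, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1)));
    }
}
```

In the first mismatch the bytes are invalid UTF-8, so the decoder substitutes the replacement character U+FFFD; in the second, each umlaut turns into two unrelated Latin-1 characters. Writing the literal with Unicode escapes, as in this sketch, is one way to check whether a problem comes from the source encoding or from the runtime.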
