Original question: http://stackoverflow.com/questions/1422655/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Java Unicode variable names
Asked by pg-robban
I got into an interesting discussion in a forum where we discussed the naming of variables.
Conventions aside, I noticed that it is legal for a variable name to be a Unicode character. For example, the following is legal:
int \u1234;
However, if I gave it the name #, for example, it produces an error. According to Sun's tutorial, it is valid if "beginning with a letter, the dollar sign "$", or the underscore character "_"."
But Unicode U+1234 is an Ethiopic character. So what is really defined as a "letter"?
Answered by Jon Skeet
The Unicode standard defines what counts as a letter.
From the Java Language Specification, section 3.8:
Letters and digits may be drawn from the entire Unicode character set, which supports most writing scripts in use in the world today, including the large sets for Chinese, Japanese, and Korean. This allows programmers to use identifiers in their programs that are written in their native languages.
A "Java letter" is a character for which the method Character.isJavaIdentifierStart(int) returns true. A "Java letter-or-digit" is a character for which the method Character.isJavaIdentifierPart(int) returns true.
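To make the JLS definition concrete, here is a minimal sketch that asks Character.isJavaIdentifierStart about the characters from the question. The class name IdentifierStartDemo and the helper isJavaLetter are hypothetical names chosen for illustration; only the Character and Character.UnicodeScript calls come from the JDK.

```java
import java.lang.Character.UnicodeScript;

class IdentifierStartDemo {
    // Hypothetical helper: a thin wrapper around the JLS "Java letter" test.
    static boolean isJavaLetter(int codePoint) {
        return Character.isJavaIdentifierStart(codePoint);
    }

    public static void main(String[] args) {
        // The declaration from the question; the escape is decoded before lexing.
        int \u1234 = 42;
        System.out.println(\u1234);                   // 42

        System.out.println(isJavaLetter(0x1234));     // true
        System.out.println(isJavaLetter('$'));        // true
        System.out.println(isJavaLetter('_'));        // true
        System.out.println(isJavaLetter('#'));        // false
        System.out.println(isJavaLetter('1'));        // false: digits cannot start an identifier

        // U+1234 belongs to the Ethiopic script, as the question observed.
        System.out.println(UnicodeScript.of(0x1234)); // ETHIOPIC
    }
}
```

This would explain why # is rejected: it is neither a letter, a currency symbol, nor connecting punctuation, so it is not a "Java letter".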
From the Character documentation for isJavaIdentifierPart:
Determines if the character (Unicode code point) may be part of a Java identifier as other than the first character. A character may be part of a Java identifier if any of the following are true:
- it is a letter
- it is a currency symbol (such as '$')
- it is a connecting punctuation character (such as '_')
- it is a digit
- it is a numeric letter (such as a Roman numeral character)
- it is a combining mark
- it is a non-spacing mark
- isIdentifierIgnorable(codePoint) returns true for the character
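The rules listed above can be combined with isJavaIdentifierStart into a rough identifier check. This is only a sketch: looksLikeIdentifier is a hypothetical helper, not a JDK method, and it deliberately ignores reserved words such as int, which pass the character tests but are not legal identifiers.

```java
class IdentifierPartDemo {
    // Hypothetical helper: true if the first code point is a "Java letter"
    // and every later code point is a "Java letter-or-digit".
    // Assumption: keywords and literals (int, true, null) are NOT rejected here.
    static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Step by code point so supplementary characters are handled correctly.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIdentifier("x1"));        // true: digit after a letter
        System.out.println(looksLikeIdentifier("1x"));        // false: digit cannot come first
        System.out.println(looksLikeIdentifier("a\u0301"));   // true: non-spacing mark as a part
        System.out.println(looksLikeIdentifier("\u0301a"));   // false: a mark cannot come first
        System.out.println(looksLikeIdentifier("\u2160"));    // true: Roman numeral is a numeric letter
        System.out.println(looksLikeIdentifier("a#b"));       // false: # is never allowed
    }
}
```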
Answered by Vinay Sajip
Unicode characters fall into character classes. There's a set of Unicode characters which fall into the class "letter".
In Java, membership in that class is determined by Character.isLetter(c). But for identifiers, Character.isJavaIdentifierStart(c) and Character.isJavaIdentifierPart(c) are more relevant.
For the relevant Unicode spec, see this.
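To illustrate why the identifier methods are more relevant than Character.isLetter, the sketch below compares the three checks on a few characters. The class name LetterVsIdentifierDemo and the describe helper are hypothetical; the three Character methods are the ones named in this answer.

```java
class LetterVsIdentifierDemo {
    // Hypothetical helper: summarises the three checks for one code point.
    static String describe(int cp) {
        return String.format("U+%04X letter=%b start=%b part=%b",
                cp,
                Character.isLetter(cp),
                Character.isJavaIdentifierStart(cp),
                Character.isJavaIdentifierPart(cp));
    }

    public static void main(String[] args) {
        System.out.println(describe('a'));    // U+0061 letter=true start=true part=true
        System.out.println(describe(0x1234)); // U+1234 letter=true start=true part=true
        System.out.println(describe('$'));    // U+0024 letter=false start=true part=true
        System.out.println(describe('_'));    // U+005F letter=false start=true part=true
        System.out.println(describe('1'));    // U+0031 letter=false start=false part=true
        System.out.println(describe('#'));    // U+0023 letter=false start=false part=false
    }
}
```

The $ and _ rows show the gap: neither is a Unicode letter, yet both may start a Java identifier.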

