Notice: this page is taken from a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/7656937/

Retrieved: 2020-10-22 03:32:10. Source: igfitidea

Valid identifier characters in Scala

Tags: scala, operators

Asked by Luigi Plinge

One thing I find quite confusing is knowing which characters and combinations I can use in method and variable names. For instance:

val #^ = 1 // legal
val #  = 1 // illegal
val +  = 1 // legal
val &+ = 1 // legal
val &2 = 1 // illegal
val £2 = 1 // legal
val √  = 1 // legal

As I understand it, there is a distinction between alphanumeric identifiers and operator identifiers. You can mix and match characters within one kind or the other, but not across both, unless they are separated by an underscore (a mixed identifier).

From Programming in Scala, section 6.10:

An operator identifier consists of one or more operator characters. Operator characters are printable ASCII characters such as +, :, ?, ~ or #.

More precisely, an operator character belongs to the Unicode set of mathematical symbols (Sm) or other symbols (So), or to the 7-bit ASCII characters that are not letters, digits, parentheses, square brackets, curly braces, single or double quote, or an underscore, period, semi-colon, comma, or back tick character.
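
A minimal sketch of that definition as a predicate may make it easier to experiment with. The object name `OpCharCheck` and the method `isOpChar` are hypothetical, and excluding `$` is an extra assumption beyond the quoted text (the language specification treats `$` as a letter). It only handles characters in the Basic Multilingual Plane.

```scala
object OpCharCheck {
  // ASCII punctuation the quoted definition rules out explicitly.
  // '$' is added as an assumption: the spec classifies it as a letter.
  private val excludedAscii: Set[Char] = "()[]{}'\"_.;,`$".toSet

  /** Hypothetical helper: true if `c` is an operator character under the quoted definition. */
  def isOpChar(c: Char): Boolean =
    if (c < 128) {
      // Printable 7-bit ASCII (no space, no DEL) that is neither
      // a letter, a digit, nor one of the excluded characters.
      c > ' ' && c < 127 && !c.isLetterOrDigit && !excludedAscii(c)
    } else {
      // Outside ASCII only the Unicode categories Sm and So qualify.
      val category = Character.getType(c)
      category == Character.MATH_SYMBOL.toInt || category == Character.OTHER_SYMBOL.toInt
    }
}
```

For example, `OpCharCheck.isOpChar('+')` and `OpCharCheck.isOpChar('#')` hold, while `OpCharCheck.isOpChar('_')` does not.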

So we are excluded from using ( ) [ ] { } ' " _ . ; , and `

I looked up Unicode mathematical symbols on Wikipedia, but the ones I found didn't include +, :, ? etc. Is there a definitive list somewhere of what the operator characters are?

Also, any ideas why Unicode mathematical operators (rather than symbols) do not count as operators?

Answered by huynhjl

Working from the EBNF syntax in the spec:

upper ::= ‘A’ | ... | ‘Z’ | ‘$’ | ‘_’ and Unicode category Lu
lower ::= ‘a’ | ... | ‘z’ and Unicode category Ll
letter ::= upper | lower and Unicode categories Lo, Lt, Nl
digit ::= ‘0’ | ... | ‘9’
opchar ::= “all other characters in \u0020-007F and Unicode
            categories Sm, So except parentheses ([]) and periods”

But also taking into account the very beginning of the Lexical Syntax chapter, which defines:

Parentheses ‘(’ | ‘)’ | ‘[’ | ‘]’ | ‘{’ | ‘}’.
Delimiter characters ‘`’ | ‘'’ | ‘"’ | ‘.’ | ‘;’ | ‘,’

Here is what I come up with. Working by elimination in the range \u0020-007F, eliminating letters, digits, parentheses and delimiters, we have for opchar... (drumroll):

! # % & * + - / : < = > ? @ \ ^ | ~ and also Sm and So, except for parentheses and periods.
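
The elimination can be reproduced mechanically. This is only a sketch of the reasoning above, not anything taken from the compiler: the object name `AsciiOpChars` is hypothetical, and dropping the space and DEL characters is an assumption (they fall inside \u0020-007F but cannot appear in an identifier).

```scala
object AsciiOpChars {
  // Character classes to eliminate, following the grammar quoted above.
  private val parentheses: Set[Char] = "()[]{}".toSet
  private val delimiters: Set[Char]  = "`'\".;,".toSet
  private val digits: Set[Char]      = ('0' to '9').toSet
  // The spec counts '$' and '_' as (upper-case) letters.
  private val letters: Set[Char] =
    ('A' to 'Z').toSet ++ ('a' to 'z') ++ Set('$', '_')

  /** Whatever survives the elimination in \u0020-007F, in code point order. */
  val opChars: String =
    (0x20 to 0x7F)
      .map(_.toChar)
      .filterNot(c => c == ' ' || c == 0x7F) // assumption: skip space and DEL
      .filterNot(letters)
      .filterNot(digits)
      .filterNot(parentheses)
      .filterNot(delimiters)
      .mkString
}
```

Evaluating `AsciiOpChars.opChars` yields the same 18 characters as the list above.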

(Edit: adding valid examples here.) In summary, here are some valid examples that highlight all cases. Watch out for \ in the REPL; I had to escape it as \\:

val !#%&*+-/:<=>?@\^|~ = 1 // all simple opchars
val simpleName = 1 
val withDigitsAndUnderscores_ab_12_ab12 = 1 
val wordEndingInOpChars_!#%&*+-/:<=>?@\^|~ = 1
val !^©® = 1 // opchars and symbols
val abcαβγ_!^©® = 1 // mixing unicode letters and symbols


Note 1:

I found this Unicode category index to figure out Lu, Ll, Lo, Lt, Nl:

  • Lu (uppercase letters)
  • Ll (lowercase letters)
  • Lo (other letters)
  • Lt (titlecase)
  • Nl (letter numbers like roman numerals)
  • Sm (symbol math)
  • So (symbol other)
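
On the JVM these categories can also be queried directly with `java.lang.Character.getType`. The lookup below is a small sketch; the object name `UnicodeCategories` and the method `abbreviation` are hypothetical, and Sc (currency symbol) is included only because £ comes up in the question. It covers single `Char` values, so characters outside the Basic Multilingual Plane are out of scope.

```scala
object UnicodeCategories {
  // Maps the JVM's category constants to the two-letter abbreviations listed above.
  private val names: Map[Int, String] = Map(
    Character.UPPERCASE_LETTER.toInt -> "Lu",
    Character.LOWERCASE_LETTER.toInt -> "Ll",
    Character.OTHER_LETTER.toInt     -> "Lo",
    Character.TITLECASE_LETTER.toInt -> "Lt",
    Character.LETTER_NUMBER.toInt    -> "Nl",
    Character.MATH_SYMBOL.toInt      -> "Sm",
    Character.OTHER_SYMBOL.toInt     -> "So",
    Character.CURRENCY_SYMBOL.toInt  -> "Sc"
  )

  /** Hypothetical helper: abbreviation of the category of `c`, or "other". */
  def abbreviation(c: Char): String =
    names.getOrElse(Character.getType(c), "other")
}
```

For instance, `UnicodeCategories.abbreviation('\u221A')` (the square root sign) gives "Sm", while `UnicodeCategories.abbreviation('\u00A3')` (the pound sign) gives "Sc".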

Note 2:

val #^ = 1 // legal   - two opchars
val #  = 1 // illegal - reserved word like class or => or @
val +  = 1 // legal   - opchar
val &+ = 1 // legal   - two opchars
val &2 = 1 // illegal - opchar and letter do not mix arbitrarily
val £2 = 1 // working - £ is part of Sc (Symbol currency) - undefined by spec
val √  = 1 // legal   - part of Sm

Note 3:

Other operator-looking things that are reserved words: _ : = => <- <: <% >: # @ and also \u21D2 (⇒) and \u2190 (←).
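
Read as a lookup table, this explains why `#` is rejected while `#^` is accepted: only an exact match with a reserved word is a problem. A minimal sketch, with the hypothetical names `ReservedOps` and `isReserved`, and with the set copied from the list above rather than from the compiler:

```scala
object ReservedOps {
  // Operator-looking reserved words, as listed in Note 3.
  val reserved: Set[String] = Set(
    "_", ":", "=", "=>", "<-", "<:", "<%", ">:", "#", "@",
    "\u21D2", // the Unicode arrow equivalent to =>
    "\u2190"  // the Unicode arrow equivalent to <-
  )

  /** Hypothetical helper: true if `name` is exactly one of the reserved words. */
  def isReserved(name: String): Boolean = reserved(name)
}
```

So `ReservedOps.isReserved("#")` is true, whereas `ReservedOps.isReserved("#^")` and `ReservedOps.isReserved("+")` are false.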

Answered by Didier Dupont

The language specification gives the rule in Chapter 1, Lexical Syntax (on page 3):

  1. Operator characters. These consist of all printable ASCII characters \u0020-\u007F which are in none of the sets above, mathematical symbols (Sm) and other symbols (So).

This is basically the same as your extract from Programming in Scala. + qualifies regardless of its Unicode category: it is definitely a printable ASCII character not listed above (not a letter, including _ or $, a digit, a parenthesis, or a delimiter).

In your list:

  1. # is illegal not because the character is not an operator character (#^ is legal), but because it is a reserved word (on page 4), for type projection.
  2. &2 is illegal because you mix an operator character, &, and a non-operator character, the digit 2.
  3. £2 is legal because £ is not an operator character: it is not seven-bit ASCII, but 8-bit extended ASCII. It is not nice, as $ is not one either (it is considered a letter).
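
These rules can be put together in a rough classifier. The sketch below follows only the grammar as described in this thread (an operator identifier is all operator characters; an alphanumeric identifier is a letter followed by letters and digits; a mixed identifier is an alphanumeric part ending in an underscore, followed by operator characters). The names `IdentifierKind` and `classify` are hypothetical. It ignores reserved words, backquoted identifiers, and characters outside the Basic Multilingual Plane, and it does not try to mirror what the compiler actually accepts: `£2` compiles, but is reported as "invalid" here because £ is neither a letter nor an operator character in the grammar.

```scala
object IdentifierKind {
  private val asciiOpChars: Set[Char] = "!#%&*+-/:<=>?@\\^|~".toSet

  // Unicode categories the grammar counts as letters: Lu, Ll, Lo, Lt, Nl.
  private val letterCategories: Set[Int] = Set(
    Character.UPPERCASE_LETTER, Character.LOWERCASE_LETTER,
    Character.OTHER_LETTER, Character.TITLECASE_LETTER,
    Character.LETTER_NUMBER
  ).map(_.toInt)

  private def isOpChar(c: Char): Boolean =
    if (c < 128) asciiOpChars(c)
    else {
      val t = Character.getType(c)
      t == Character.MATH_SYMBOL.toInt || t == Character.OTHER_SYMBOL.toInt
    }

  // '$' and '_' count as letters in the grammar.
  private def isLetter(c: Char): Boolean =
    c == '$' || c == '_' || letterCategories(Character.getType(c))

  private def isDigit(c: Char): Boolean = c >= '0' && c <= '9'

  /** Returns "operator", "alphanumeric", "mixed" or "invalid". */
  def classify(name: String): String = {
    // Split into an alphanumeric prefix and a trailing run of operator characters.
    val opSuffix = name.reverse.takeWhile(isOpChar).reverse
    val prefix   = name.dropRight(opSuffix.length)
    val prefixOk = prefix.nonEmpty && isLetter(prefix.head) &&
      prefix.forall(c => isLetter(c) || isDigit(c))

    if (name.isEmpty) "invalid"
    else if (prefix.isEmpty) "operator"          // only operator characters
    else if (!prefixOk) "invalid"                // e.g. &2: '&' cannot start the prefix
    else if (opSuffix.isEmpty) "alphanumeric"
    else if (prefix.last == '_') "mixed"         // underscore joins the two parts
    else "invalid"                               // e.g. abc+: no underscore separator
  }
}
```

With this sketch, `IdentifierKind.classify("#^")` gives "operator", `IdentifierKind.classify("&2")` gives "invalid", and `IdentifierKind.classify("wordEndingInOpChars_!#")` gives "mixed".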