What is an elegant way to check if a character is an ASCII letter (a-Z) in Scala?
Original source: http://stackoverflow.com/questions/15439765/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by r0estir0bbe
I am currently working with Scanners and Parsers and need a Parser that accepts only ASCII letters, so I can't use char.isLetter, which also accepts non-ASCII letters.
I came up with two solutions myself. I don't like either of them.
Regex
def letter = elem("ascii letter", _.toString.matches("""[a-zA-Z]"""))
Using a regex to check such a simple thing seems like overkill.
Range check
def letter = elem("ascii letter", c => ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z'))
In my opinion, this would be the way to go in Java. But it's not really readable.
Is there a cleaner, more Scala-like solution to this problem? I do not really worry about performance, as it doesn't matter in this case.
Answered by DaoWen
You say you can't use Char.isLetter because you only want ASCII letters. Why not just restrict it to the 7-bit ASCII character range?
def isAsciiLetter(c: Char) = c.isLetter && c <= 'z'
To check for any ASCII character, including non-letters:
def isAscii(c: Char) = c.toInt <= 127
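A quick sanity check of the two helpers above, as a sketch. The first one works because the only ASCII characters between 'Z' and 'a' are punctuation, so nothing in that gap passes isLetter. The object name AsciiCheck and the sample characters are chosen purely for illustration.

```scala
object AsciiCheck {
  // Letter test restricted to characters up to 'z' (0x7A).
  def isAsciiLetter(c: Char): Boolean = c.isLetter && c <= 'z'

  // Any 7-bit ASCII character, letter or not.
  def isAscii(c: Char): Boolean = c.toInt <= 127

  def main(args: Array[String]): Unit = {
    // '\u00e9' (e with acute accent) is a letter outside ASCII;
    // '[' sits between 'Z' and 'a' in the ASCII table.
    for (c <- Seq('a', 'Z', '[', '0', '\u00e9'))
      println(s"'$c' letter=${isAsciiLetter(c)} ascii=${isAscii(c)}")
  }
}
```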
Answered by Reimer Behrends
Regardless of what you choose in the end, I suggest abstracting out the definition of "is an ASCII letter" for readability and performance. E.g.:
object Program extends App {
  implicit class CharProperties(val ch: Char) extends AnyVal {
    def isASCIILetter: Boolean =
      (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')
  }
  println('x'.isASCIILetter)
  println('0'.isASCIILetter)
}
Or if you want to describe ASCII letters as a set:
object Program extends App {
  object CharProperties {
    val ASCIILetters = ('a' to 'z').toSet ++ ('A' to 'Z').toSet
  }
  implicit class CharProperties(val ch: Char) extends AnyVal {
    def isASCIILetter: Boolean =
      CharProperties.ASCIILetters.contains(ch)
  }
  println('x'.isASCIILetter)
  println('0'.isASCIILetter)
}
Once you're using an explicit function with an understandable name, your intent should be clear either way and you can choose the implementation with the better performance (though any performance differences between the two versions above should be rather minimal).
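If you want some reassurance that the two implementations really are interchangeable, a brute-force comparison over every Char value is cheap. This is only a sketch: CompareImpls, byRange, bySet and agree are names made up here, and plain methods stand in for the implicit class.

```scala
object CompareImpls {
  // Range-comparison version, as in the first example.
  def byRange(ch: Char): Boolean =
    (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')

  // Set-membership version; the set is built once and reused.
  private val asciiLetters: Set[Char] =
    (('a' to 'z') ++ ('A' to 'Z')).toSet

  def bySet(ch: Char): Boolean = asciiLetters.contains(ch)

  // True if both versions agree on every possible Char value.
  def agree: Boolean =
    (0 to 0xFFFF).forall { i =>
      val c = i.toChar
      byRange(c) == bySet(c)
    }

  def main(args: Array[String]): Unit = println(agree)
}
```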
Answered by om-nom-nom
The second one (the range check) could be written as:
def letter = elem("ascii letter", c => ('a' to 'z') ++ ('A' to 'Z') contains c)
It is more readable, but less performant, since the combined range is rebuilt and searched on every call.
Or, if you're terrified of ++, in almost plain English:
c => ('a' to 'z') union ('A' to 'Z') contains c
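If the per-call cost ever did matter, one option is to build the collection once and reuse it. A minimal sketch, assuming a holder object called LetterSet (a name invented here); it relies on the fact that a Set[Char] is itself a Char => Boolean:

```scala
object LetterSet {
  // Built once at initialization instead of on every call.
  val asciiLetters: Set[Char] = (('a' to 'z') ++ ('A' to 'Z')).toSet

  // Set[Char] extends Char => Boolean, so it can serve as a predicate directly.
  val isAsciiLetter: Char => Boolean = asciiLetters

  def main(args: Array[String]): Unit = {
    println(isAsciiLetter('x'))
    println(isAsciiLetter('0'))
  }
}
```

Since the set is already a predicate, it could presumably be handed to elem in place of the lambda.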
Answered by michael_s
Another (well, elegant) solution could be to use min/max:
c => 'A'.max(c.toUpper) == 'Z'.min(c.toUpper)
or
c => 'A'.max(c) == 'Z'.min(c) || 'a'.max(c) == 'z'.min(c)
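One caveat worth checking before picking between the two: toUpper follows Unicode case mapping, so the first variant may accept a few non-ASCII characters whose uppercase form is an ASCII letter (for example U+0131, dotless i, which uppercases to 'I'). The sketch below compares both variants against a plain range check over all Char values. MinMaxCheck and its method names are invented for illustration.

```scala
object MinMaxCheck {
  // Plain range check, used as the reference.
  def byRange(c: Char): Boolean =
    ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z')

  // First min/max variant: folds case with toUpper before clamping.
  def viaToUpper(c: Char): Boolean =
    'A'.max(c.toUpper) == 'Z'.min(c.toUpper)

  // Second min/max variant: clamps against each range separately.
  def viaTwoRanges(c: Char): Boolean =
    'A'.max(c) == 'Z'.min(c) || 'a'.max(c) == 'z'.min(c)

  // All Char values on which a candidate disagrees with the range check.
  def mismatches(f: Char => Boolean): Seq[Char] =
    (0 to 0xFFFF).map(_.toChar).filter(c => f(c) != byRange(c))

  def main(args: Array[String]): Unit = {
    println(mismatches(viaTwoRanges).isEmpty)
    // Print any disagreeing characters as code points.
    println(mismatches(viaToUpper).map(c => f"U+${c.toInt}%04X"))
  }
}
```

If strict ASCII matters, the second variant (no case folding) looks like the safer of the two.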

