Original question: http://stackoverflow.com/questions/18304804/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not to this page): Stack Overflow
What is the difference between Character.isAlphabetic and Character.isLetter in Java?
Asked by Simon Kissane
What is the difference between Character.isAlphabetic() and Character.isLetter() in Java? When should one use one and when should one use the other?
Accepted answer by Simon Kissane
According to the API docs, isLetter() returns true if the character's general category is one of the following: UPPERCASE_LETTER (Lu), LOWERCASE_LETTER (Ll), TITLECASE_LETTER (Lt), MODIFIER_LETTER (Lm), OTHER_LETTER (Lo). isAlphabetic() accepts the same categories, plus LETTER_NUMBER (Nl), plus any character that has the Other_Alphabetic property.
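To see the category rules in action, here is a minimal sketch that prints both results for a few hand-picked code points. The class name LetterVsAlphabetic and the helper describe are made up for this illustration; the sample characters are my own choice (U+2160 is a Roman numeral in category Nl, U+0345 is a combining mark with Other_Alphabetic), not taken from the API docs.

```java
public class LetterVsAlphabetic {

    // Hypothetical helper: summarises how the two methods classify a code point.
    // Character.getType returns the numeric general-category constant
    // (for example Character.LETTER_NUMBER for category Nl).
    static String describe(int codePoint) {
        return String.format("U+%04X type=%d isLetter=%b isAlphabetic=%b",
                codePoint,
                Character.getType(codePoint),
                Character.isLetter(codePoint),
                Character.isAlphabetic(codePoint));
    }

    public static void main(String[] args) {
        int[] samples = {
            'A',     // Lu: a letter, so also alphabetic
            0x2160,  // ROMAN NUMERAL ONE, Nl: alphabetic but not a letter
            0x0345,  // COMBINING GREEK YPOGEGRAMMENI, Other_Alphabetic: alphabetic but not a letter
            '1'      // Nd: neither
        };
        for (int cp : samples) {
            System.out.println(describe(cp));
        }
    }
}
```

Note that isAlphabetic() only has an int (code point) overload, while isLetter() accepts either a char or an int.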
What does this mean in practice? Every letter is alphabetic, but not every alphabetic character is a letter. In Java 7 (which uses Unicode 6.0.0), there are 824 characters in the BMP that are alphabetic but not letters. Examples include U+0345 (a combining mark used in polytonic Greek), the Hebrew vowel points (niqqud) starting at U+05B0, Arabic honorifics such as U+0610 (abbreviated "saw", meaning "peace be upon him"), Arabic vowel points, and so on.
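A count like the one above can be reproduced with a simple scan of the BMP. This is only a sketch: the class and method names are invented here, and the number it prints depends on the Unicode version bundled with the JDK it runs on, so it will probably not be 824 on anything newer than Java 7.

```java
public class AlphabeticNotLetterCount {

    // True for code points that isAlphabetic accepts but isLetter rejects.
    static boolean alphabeticButNotLetter(int codePoint) {
        return Character.isAlphabetic(codePoint) && !Character.isLetter(codePoint);
    }

    // Counts such code points in the Basic Multilingual Plane (U+0000 to U+FFFF).
    static int countInBmp() {
        int count = 0;
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            if (alphabeticButNotLetter(cp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Alphabetic but not letters in the BMP: " + countInBmp());
        System.out.println("Running on Java " + System.getProperty("java.version"));
    }
}
```

Printing the Java version next to the count makes it easier to compare results across JDKs.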
For English text, the distinction makes no difference. For some other languages it might, but it is hard to predict in advance what the difference will be in practice. If one has a choice, the best answer may be isLetter(): one can always widen the check to permit additional characters in the future, whereas reducing the set of accepted characters later might be harder.