Original question: http://stackoverflow.com/questions/26555346/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Java - what is the best way to check if a STRING contains only certain characters?
Asked by Victor2748
I have this problem: I have a String, but I need to make sure that it only contains letters A-Z and numbers 0-9. Here is my current code:
    boolean valid = true;
    for (char c : string.toCharArray()) {
        int type = Character.getType(c);
        if (type == 2 || type == 1 || type == 9) {
            // the character is either a letter or a digit
        } else {
            valid = false;
            break;
        }
    }
But what is the best and the most efficient way to implement it?
Accepted answer by Michael Krause
Since no one else has worried about "fastest" yet, here is my contribution:
    boolean valid = true;
    char[] a = s.toCharArray();
    for (char c: a)
    {
        valid = ((c >= 'a') && (c <= 'z')) ||
                ((c >= 'A') && (c <= 'Z')) ||
                ((c >= '0') && (c <= '9'));
        if (!valid)
        {
            break;
        }
    }
    return valid;
Full test code below:
    public static void main(String[] args)
    {
        String[] testStrings = {"abcdefghijklmnopqrstuvwxyz0123456789", "", "00000", "abcdefghijklmnopqrstuvwxyz0123456789&", "1", "q", "test123", "(#*$))&v", "ABC123", "hello", "supercalifragilisticexpialidocious"};
        long startNanos = System.nanoTime();
        for (String testString: testStrings)
        {
            isAlphaNumericOriginal(testString);
        }
        System.out.println("Time for isAlphaNumericOriginal: " + (System.nanoTime() - startNanos) + " ns");
        startNanos = System.nanoTime();
        for (String testString: testStrings)
        {
            isAlphaNumericFast(testString);
        }
        System.out.println("Time for isAlphaNumericFast: " + (System.nanoTime() - startNanos) + " ns");
        startNanos = System.nanoTime();
        for (String testString: testStrings)
        {
            isAlphaNumericRegEx(testString);
        }
        System.out.println("Time for isAlphaNumericRegEx: " + (System.nanoTime() - startNanos) + " ns");
        startNanos = System.nanoTime();
        for (String testString: testStrings)
        {
            isAlphaNumericIsLetterOrDigit(testString);
        }
        System.out.println("Time for isAlphaNumericIsLetterOrDigit: " + (System.nanoTime() - startNanos) + " ns");
    }
    private static boolean isAlphaNumericOriginal(String s)
    {
        boolean valid = true;
        for (char c : s.toCharArray())
        {
            int type = Character.getType(c);
            if (type == 2 || type == 1 || type == 9)
            {
                // the character is either a letter or a digit
            }
            else
            {
                valid = false;
                break;
            }
        }
        return valid;
    }
    private static boolean isAlphaNumericFast(String s)
    {
        boolean valid = true;
        char[] a = s.toCharArray();
        for (char c: a)
        {
            valid = ((c >= 'a') && (c <= 'z')) ||
                    ((c >= 'A') && (c <= 'Z')) ||
                    ((c >= '0') && (c <= '9'));
            if (!valid)
            {
                break;
            }
        }
        return valid;
    }
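    // Requires: import java.util.regex.Pattern;
    private static boolean isAlphaNumericRegEx(String s)
    {
        // The backslash must be doubled inside a Java string literal.
        return Pattern.matches("[\\dA-Za-z]+", s);
    }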
    private static boolean isAlphaNumericIsLetterOrDigit(String s)
    {
        boolean valid = true;
        for (char c : s.toCharArray()) {
            if(!Character.isLetterOrDigit(c))
            {
                valid = false;
                break;
            }
        }
        return valid;
    }
Produces this output for me:
Time for isAlphaNumericOriginal: 164960 ns
Time for isAlphaNumericFast: 18472 ns
Time for isAlphaNumericRegEx: 1978230 ns
Time for isAlphaNumericIsLetterOrDigit: 110315 ns
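A caveat on reading these numbers: each method is timed over a single cold pass, so one-time costs such as JIT compilation and class loading (the regex classes in particular) are probably included in the figures. The following is only a rough sketch of a harness with an untimed warm-up phase; the class name AlnumTiming and the helper averageNanos are made up for illustration, and a dedicated benchmarking tool such as JMH would be the more reliable choice.

```java
import java.util.function.Predicate;

final class AlnumTiming {

    // Same range checks as isAlphaNumericFast in the answer above.
    static boolean isAlphaNumericFast(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean ok = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: runs the check untimed first, then times a second batch
    // and returns the average nanoseconds per pass over all inputs.
    static long averageNanos(Predicate<String> check, String[] inputs, int rounds) {
        int sink = 0; // count results so the calls are not trivially dead code
        for (int r = 0; r < rounds; r++) {      // warm-up, not timed
            for (String s : inputs) {
                if (check.test(s)) sink++;
            }
        }
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {      // timed
            for (String s : inputs) {
                if (check.test(s)) sink++;
            }
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) System.out.println(sink); // never true; keeps sink in use
        return elapsed / rounds;
    }

    public static void main(String[] args) {
        String[] inputs = {"abc123", "", "ABC123", "(#*$))&v"};
        long ns = averageNanos(AlnumTiming::isAlphaNumericFast, inputs, 100_000);
        System.out.println("Average per pass over inputs: " + ns + " ns");
    }
}
```

Even with a warm-up, treat the result as a rough indication rather than a precise measurement.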
Answer by M Anouti
Use a regular expression:
    Pattern.matches("[\\dA-Z]+", string)
[\\dA-Z]+: at least one occurrence (+) of digits or uppercase letters.
If you want to include lowercase letters, replace [\\dA-Z]+ with [\\dA-Za-z]+.
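If the check runs often, it may be worth compiling the pattern once instead of calling Pattern.matches each time, since that static method compiles the expression on every call. A minimal sketch, assuming a wrapper class named AlnumRegex (a name invented here for illustration):

```java
import java.util.regex.Pattern;

final class AlnumRegex {

    // Compiled once and reused across calls.
    private static final Pattern UPPER_ALNUM = Pattern.compile("[\\dA-Z]+");

    static boolean isUpperAlphaNumeric(String s) {
        // matches() requires the whole string to match, not just a part of it.
        return UPPER_ALNUM.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isUpperAlphaNumeric("ABC123")); // true
        System.out.println(isUpperAlphaNumeric("abc123")); // false
        System.out.println(isUpperAlphaNumeric(""));       // false, because + needs at least one character
    }
}
```

Note that the + quantifier rejects the empty string; use * instead if an empty string should count as valid.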
Answer by But I'm Not A Wrapper Class
If you want to avoid regex, then the Character class can help:
    boolean valid = true;
    for (char c : string.toCharArray()) {
        if(!Character.isLetterOrDigit(c))
        {
            valid = false;
            break;
        }
    }
If you only want to accept upper-case letters, use the following if statement instead:
if(!((Character.isLetter(c) && Character.isUpperCase(c)) || Character.isDigit(c)))
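One thing to keep in mind with the Character methods is that they are Unicode-aware, so they also accept letters and digits outside ASCII A-Z and 0-9. The sketch below contrasts the condition above with a plain ASCII range check; the class and method names are invented for illustration.

```java
final class AlnumCharacter {

    // Unicode-aware: same condition as the if statement above.
    static boolean isUpperAlnumUnicode(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!((Character.isLetter(c) && Character.isUpperCase(c)) || Character.isDigit(c))) {
                return false;
            }
        }
        return true;
    }

    // ASCII only: accepts exactly A-Z and 0-9.
    static boolean isUpperAlnumAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!((c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9'))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String accented = "\u00C9COLE"; // starts with an accented capital E
        System.out.println(isUpperAlnumUnicode(accented)); // true
        System.out.println(isUpperAlnumAscii(accented));   // false
    }
}
```

Which of the two is right depends on whether "letters A-Z" in the question is meant literally.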
Answer by Christoph Lühr
You could use Apache Commons Lang:
StringUtils.isAlphanumeric(String)
Answer by Hannes
In terms of maintainability and simplicity, the best way is the regular expression already posted. Once you are familiar with this technique, you know what to expect, and it is very easy to widen the criteria if needed. The downside is performance.
The fastest way is the array approach. Checking whether a character's numeric value falls within the wanted ASCII ranges A-Z and 0-9 is nearly speed of light. But maintainability is bad, and the simplicity is gone.
You could also use a Java 7 switch-case on char, but that's just as bad as the second approach.
In the end, since we are talking about Java, I would strongly suggest using regular expressions.
Answer by Kirill Rakhman
In addition to all the other answers, here's a Guava approach:
boolean valid = CharMatcher.JAVA_LETTER_OR_DIGIT.matchesAllOf(string);
More on CharMatcher: https://code.google.com/p/guava-libraries/wiki/StringsExplained#CharMatcher
Answer by Pier-Alexandre Bouchard
The following approach is not as quick to implement as a regular expression, but it is (I think) one of the most efficient solutions, because it uses bitwise operations, which are really fast.
My solution is more complex and harder to read and maintain but I think it is another simple way to do what you want.
A good way to test that a string only contains numbers or capital letters is with a simple 128-bit bitmask (2 longs) representing the ASCII table.
So, for the standard ASCII table, there is a 1 bit for every character we want to keep (bits 48 to 57 and bits 65 to 90).
Thus, you can test that a char is a:
- Number with this mask: 0x3FF000000000000L (if the character code < 65)
- Uppercase letter with this mask: 0x3FFFFFFL (if the character code >= 65)
So the following method should work:
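    public boolean validate(String aString) {
        for (int i = 0; i < aString.length(); i++) {
            char c = aString.charAt(i);
            // Reject non-ASCII characters first: a long shift uses only the low
            // 6 bits of the shift count, so larger codes would wrap onto valid bits.
            if (c > 127
                    || (c <= 64 && (0x3FF000000000000L & (1L << c)) == 0)
                    || (c > 64 && (0x3FFFFFFL & (1L << (c - 65))) == 0)) {
                return false;
            }
        }
        return true;
    }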
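To see where the two mask constants come from, they can be rebuilt from the character ranges themselves. This is only an illustrative sketch; the class AsciiMask and the helper maskFor are hypothetical names, not part of the answer.

```java
final class AsciiMask {

    // Builds a 64-bit mask with one bit set per character in [from, to],
    // where bit 0 corresponds to the character code 'base'.
    static long maskFor(char from, char to, int base) {
        long mask = 0L;
        for (char c = from; c <= to; c++) {
            mask |= 1L << (c - base);
        }
        return mask;
    }

    public static void main(String[] args) {
        long digits = maskFor('0', '9', 0);  // bits 48 to 57 of the first long
        long upper = maskFor('A', 'Z', 65);  // bits 0 to 25 of the second long (code minus 65)
        System.out.println(Long.toHexString(digits)); // 3ff000000000000
        System.out.println(Long.toHexString(upper));  // 3ffffff
    }
}
```

Building the masks this way also makes it easier to extend the accepted set, for example by OR-ing in a third range for lowercase letters.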
Answer by Mikael Vandmo
StringUtils in Apache Commons Lang 3 has a containsOnly method, https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html
The implementation should be fast enough.