Java: check whether a String contains only Latin characters?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/1911902/

Time: 2020-08-12 23:48:13  Source: igfitidea

Check String whether it contains only Latin characters?

Tags: java, string, validation, gwt

Asked by Ashika Umanga Umagiliya

Greetings,

I am developing a GWT application where users can enter their details in Japanese. But the 'userid' and 'password' should only contain English characters (Latin alphabet). How can I validate Strings for this?

Accepted answer by BalusC

You can use String#matches() with a bit of regex for this. Latin characters are covered by \w.

So this should do:

boolean valid = input.matches("\\w+"); // the backslash must be doubled inside a Java string literal

By the way, this also covers digits and the underscore _. Not sure if that does any harm. Otherwise you can just use [A-Za-z]+ instead.
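
To make the difference between the two patterns concrete, here is a minimal sketch. The class and method names (LatinRegexDemo, isWordChars, isAsciiLetters) are made up for illustration and are not part of the original answer; the Japanese sample is written with Unicode escapes so the file compiles regardless of source encoding.

```java
final class LatinRegexDemo {
    // \w is [a-zA-Z_0-9] by default, so digits and '_' are accepted too.
    static boolean isWordChars(String input) {
        return input.matches("\\w+");
    }

    // Only the letters A-Z and a-z are accepted.
    static boolean isAsciiLetters(String input) {
        return input.matches("[A-Za-z]+");
    }

    public static void main(String[] args) {
        String katakana = "\u30e6\u30fc\u30b6\u30fc"; // Japanese sample text
        System.out.println(isWordChars("user_01"));    // true
        System.out.println(isAsciiLetters("user_01")); // false
        System.out.println(isAsciiLetters("Umanga"));  // true
        System.out.println(isWordChars(katakana));     // false
    }
}
```

Note that matches() has to match the whole string, so an empty input is rejected by both patterns because of the +.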

If you want to cover diacritical characters as well (é, ò, and so on; those are by definition also Latin characters), then you need to normalize them first and get rid of the diacritical marks before matching, simply because there's no (documented) regex which covers diacriticals.

// needs: import java.text.Normalizer; import java.text.Normalizer.Form;
String clean = Normalizer.normalize(input, Form.NFD).replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
boolean valid = clean.matches("\\w+");
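
As a rough, self-contained sketch of the normalize-then-match idea, the snippet can be wrapped in a small class. The names DiacriticsDemo and isLatinIgnoringDiacritics are hypothetical and chosen only for this example.

```java
import java.text.Normalizer;

final class DiacriticsDemo {
    static boolean isLatinIgnoringDiacritics(String input) {
        // NFD splits a precomposed letter such as U+00E9 into 'e' plus a combining accent,
        // and the replaceAll then drops the combining accents (U+0300 to U+036F).
        String clean = Normalizer.normalize(input, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
        return clean.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(isLatinIgnoringDiacritics("caf\u00e9"));          // true
        System.out.println(isLatinIgnoringDiacritics("\u65e5\u672c\u8a9e")); // false (Japanese text)
        System.out.println(isLatinIgnoringDiacritics("stra\u00dfe"));        // false
    }
}
```

The last call shows a limit of this approach: a letter with no canonical decomposition, such as U+00DF (sharp s), is left untouched by NFD and therefore still fails the \w+ check.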

Update: there's an undocumented regex in Java which covers diacriticals as well, the \p{L}.

boolean valid = input.matches("\\p{L}+");

The above works on Java 1.6.
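
One caveat worth checking for the use case in the question: \p{L} matches letters from any script, so Japanese text passes it as well. The sketch below contrasts it with the script-based class \p{IsLatin}, which is not mentioned in the original answer and, as far as I know, requires Java 7 or later. The class and method names are invented for illustration.

```java
final class ScriptDemo {
    // Any Unicode letter, regardless of script.
    static boolean isAnyLetters(String input) {
        return input.matches("\\p{L}+");
    }

    // Only characters whose Unicode script is Latin (assumes Java 7+).
    static boolean isLatinScript(String input) {
        return input.matches("\\p{IsLatin}+");
    }

    public static void main(String[] args) {
        String japanese = "\u65e5\u672c\u8a9e";
        System.out.println(isAnyLetters(japanese));       // true
        System.out.println(isLatinScript(japanese));      // false
        System.out.println(isLatinScript("caf\u00e9"));   // true
    }
}
```

If the input may contain decomposed accents (a base letter followed by a combining mark), normalizing to NFC first is probably needed, since combining marks do not belong to the Latin script themselves.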

Answer by BalusC

There might be a better approach, but you could load a collection with whatever you deem to be acceptable characters, and then check each character in the username/password field against that collection.

Pseudo:

foreach (character in username)
{
    if !allowedCharacters.contains(character)
    {
        throw exception
    }
}
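
In Java, the pseudocode above could look roughly like this. It is only a sketch: the allowed set here contains just a-z and A-Z as an assumption, and the class name, the set name, and the choice of IllegalArgumentException are all illustrative.

```java
import java.util.HashSet;
import java.util.Set;

final class AllowedCharsDemo {
    // Assumed whitelist: plain ASCII letters. Extend it with whatever you deem acceptable.
    private static final Set<Character> ALLOWED = new HashSet<>();
    static {
        for (char c = 'a'; c <= 'z'; c++) ALLOWED.add(c);
        for (char c = 'A'; c <= 'Z'; c++) ALLOWED.add(c);
    }

    // Throws if any character of the input is not in the whitelist.
    static void validate(String username) {
        for (char character : username.toCharArray()) {
            if (!ALLOWED.contains(character)) {
                throw new IllegalArgumentException("Character not allowed: " + character);
            }
        }
    }

    public static void main(String[] args) {
        validate("Umanga"); // passes silently
        try {
            validate("user1");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Character not allowed: 1
        }
    }
}
```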

Answer by erickson

For something this simple, I'd use a regular expression.

// needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
private static final Pattern p = Pattern.compile("\\p{Alpha}+");

static boolean isValid(String input) {
  Matcher m = p.matcher(input);
  return m.matches();
}

There are other pre-defined classes like \w that might work better.

Answer by erickson

// needs: import java.nio.charset.Charset;
public static boolean isValidISOLatin1 (String s) {
    return Charset.forName("US-ASCII").newEncoder().canEncode(s);
} // or "ISO-8859-1" for ISO Latin 1

For reference, see the documentation on java.nio.charset.Charset.
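
To illustrate how the choice of charset changes the result, here is a small sketch. The class CharsetDemo and its helper canEncode are hypothetical names used only for this example.

```java
import java.nio.charset.Charset;

final class CharsetDemo {
    // True if every character of s can be represented in the named charset.
    static boolean canEncode(String s, String charsetName) {
        // A fresh encoder per call, since CharsetEncoder instances are not thread-safe.
        return Charset.forName(charsetName).newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        System.out.println(canEncode("user_01!", "US-ASCII"));              // true
        System.out.println(canEncode("caf\u00e9", "US-ASCII"));             // false
        System.out.println(canEncode("caf\u00e9", "ISO-8859-1"));           // true
        System.out.println(canEncode("\u65e5\u672c\u8a9e", "ISO-8859-1"));  // false (Japanese text)
    }
}
```

As the first call shows, this check accepts digits and punctuation as well, so it answers "is this ASCII?" rather than "is this only letters?".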

Answer by fir99

I successfully used a combination of the answers of user232624, Joachim Sauer and Tvaroh:

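// needs: import java.nio.charset.Charset; import java.nio.charset.CharsetEncoder;
static CharsetEncoder asciiEncoder = Charset.forName("US-ASCII").newEncoder(); // or "ISO-8859-1" for ISO Latin 1

boolean isValid(String input) {
    // every character must be a letter that the encoder can represent
    for (char ch : input.toCharArray()) {
        if (!Character.isLetter(ch) || !asciiEncoder.canEncode(ch)) return false;
    }
    return true;
}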

Answer by Aleksey Timoshchenko

Here is my solution, and it works well:

public static boolean isStringContainsLatinCharactersOnly(final String iStringToCheck)
{
    return iStringToCheck.matches("^[a-zA-Z0-9.]+$");
}