Java: regular expression to find the "lastname, firstname middlename" format
Original URL: http://stackoverflow.com/questions/25801247/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Regular Expression to find "lastname, firstname middlename" format
Asked by A Paul
I am trying to find the format "abc, def g", which is the name format "lastname, firstname middlename". I think the best-suited method is a regex, but I have no experience with regular expressions. I did some reading and tried a few expressions, but with no luck. One additional point: there may be more than one space between the words.
This is what I tried. But this is not working.
(([A-Z][,]\s?)*([A-Z][a-z]+\s?)+([A-Z]\s?[a-z]*)*)
Need help! Any idea how I can do this so that only the above format matches?
Thanks!
ANSWER
Finally I am using
([A-Za-z]+),\s*([A-Za-z]+)\s*([A-Za-z]+)
Thanks to everyone for the suggestions.
Accepted answer by Andreas Fester
Your sample input is "lastname, firstname middlename" - with that, you can use the following regexp to extract lastname, firstname and middlename. It allows multiple white spaces and both capital and non-capital letters in the strings, and all three parts are mandatory:
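import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "Lastname, firstname middlename";
// Backslashes must be doubled inside a Java string literal.
String regexp = "([A-Za-z]+),\\s+([A-Za-z]+)\\s+([A-Za-z]+)";
Pattern pattern = Pattern.compile(regexp);
Matcher matcher = pattern.matcher(input);
// group() throws IllegalStateException unless find() has succeeded.
if (matcher.find()) {
    System.out.println("Lastname  : " + matcher.group(1));
    System.out.println("Firstname : " + matcher.group(2));
    System.out.println("Middlename: " + matcher.group(3));
}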
Short summary:
([A-Za-z]+) First capture group - matches one or more letters to extract the last name
,\s+ Capture group is followed by a comma and one or more spaces
([A-Za-z]+) Second capture group - matches one or more letters to extract the first name
\s+ Capture group is followed by one or more spaces
([A-Za-z]+) Third capture group - matches one or more letters to extract the middle name
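The breakdown above can also be written with named capture groups, which keeps each piece of the pattern next to its meaning. This is only a sketch: the class name `NamedGroupDemo`, the method `parse` and the group names `last`, `first` and `middle` are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // Same structure as the summary above, with a name attached to each group.
    private static final Pattern NAME = Pattern.compile(
            "(?<last>[A-Za-z]+)"      // one or more letters: last name
            + ",\\s+"                 // a comma followed by one or more spaces
            + "(?<first>[A-Za-z]+)"   // one or more letters: first name
            + "\\s+"                  // one or more spaces
            + "(?<middle>[A-Za-z]+)"  // one or more letters: middle name
    );

    /** Returns {last, first, middle}, or null if the input does not contain the format. */
    static String[] parse(String input) {
        Matcher m = NAME.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group("last"), m.group("first"), m.group("middle") };
    }

    public static void main(String[] args) {
        String[] parts = parse("Lastname,   firstname    middlename");
        System.out.println(String.join("|", parts)); // Lastname|firstname|middlename
        System.out.println(parse("no comma here") == null); // true
    }
}
```

Returning null for a non-matching input is one possible design; throwing an exception would work just as well.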
This only works if your names contain Latin letters only - you should probably use a more open match for the characters:
String input = "Müller, firstname middlename";
String regexp = "(.+),\\s+(.+)\\s+(.+)";
This matches any character for lastname, firstname and middlename.
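One consequence of matching any character is that `.` also matches spaces, so with several spaces between the first and middle name the greedy second group keeps the extra ones. The sketch below shows that, next to a possible middle ground that is an assumption of this example and not part of the answer: the Unicode letter class `\p{L}`. The class name `UnicodeNameDemo` and the helper `firstName` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeNameDemo {
    // The open pattern from above: '.' matches any character, including spaces.
    static final Pattern OPEN = Pattern.compile("(.+),\\s+(.+)\\s+(.+)");
    // Assumption (not from the answer): restrict each part to Unicode letters.
    static final Pattern LETTERS =
            Pattern.compile("(\\p{L}+),\\s+(\\p{L}+)\\s+(\\p{L}+)");

    /** Returns the first-name group, or null when the pattern does not match. */
    static String firstName(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // \u00fc is 'ü'; three spaces separate the first and middle name.
        String input = "M\u00fcller, firstname   middlename";
        System.out.println("[" + firstName(OPEN, input) + "]");    // [firstname  ]
        System.out.println("[" + firstName(LETTERS, input) + "]"); // [firstname]
    }
}
```

Note that `\p{L}` is stricter than `.`: it would reject names containing an apostrophe or a hyphen, so which variant fits depends on the data.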
If the spaces are optional (only the first occurrence can be optional, otherwise we cannot distinguish between firstname and middlename), then use * instead of +:
String input = "Müller,firstname middlename";
String regexp = "(.+),\\s*(.+)\\s+(.+)";
As @Elliott mentions, there might be other possibilities, like using String.split() or String.indexOf() with String.substring() - regular expressions are often more flexible, but harder to maintain, especially for complex expressions.
In either case, implement unit tests with as many different inputs (including invalid ones) as possible so that you can verify that your algorithm is still valid after you modify it.
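As a rough illustration of that advice, the sketch below runs a small table of valid and invalid inputs against the letters-only pattern. It uses plain Java instead of a test framework to stay self-contained; the class name `NameFormatChecks`, the method `isValid` and the chosen inputs are assumptions made for this example. It uses `matches()`, so the whole input has to have the expected shape.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class NameFormatChecks {
    static final Pattern NAME =
            Pattern.compile("([A-Za-z]+),\\s+([A-Za-z]+)\\s+([A-Za-z]+)");

    /** True when the whole input has the "lastname, firstname middlename" shape. */
    static boolean isValid(String input) {
        return NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Input mapped to the expected verdict; insertion order is preserved.
        Map<String, Boolean> cases = new LinkedHashMap<>();
        cases.put("Lastname, firstname middlename", true);
        cases.put("Lastname,   firstname    middlename", true);
        cases.put("Lastname firstname middlename", false); // comma missing
        cases.put("Lastname, firstname", false);           // middle name missing
        cases.put("", false);                              // empty input

        for (Map.Entry<String, Boolean> c : cases.entrySet()) {
            if (isValid(c.getKey()) != c.getValue()) {
                throw new AssertionError("Unexpected result for: '" + c.getKey() + "'");
            }
        }
        System.out.println("all cases passed");
    }
}
```

In a real project the same table would more likely live in a parameterized test of whatever test framework is already in use.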
Answer by Elliott Frisch
I would try to avoid a complicated regex; I would use String.substring() and indexOf(). That is, something like
String name = "Last, First Middle";
int comma = name.indexOf(',');
int lastSpace = name.lastIndexOf(' ');
String lastName = name.substring(0, comma);
// trim() tolerates zero or several spaces after the comma and before the middle name
String firstName = name.substring(comma + 1, lastSpace).trim();
String middleName = name.substring(lastSpace + 1);
System.out.printf("first='%s' middle='%s' last='%s'%n", firstName, middleName, lastName);
Output is
first='First' middle='Middle' last='Last'
Answer by Breandán Dalton
As an alternative to matching the lastname, firstname middlename directly, you could use String.split and provide a regexp that matches the separators instead. For instance:
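// requires: import java.util.Arrays;
static String[] lastFirstMiddle(String input) {
    // Split on any run of commas and whitespace; "\\s" is the Java-literal form of \s.
    String[] result = input.split("[,\\s]+");
    System.out.println(Arrays.asList(result));
    return result;
}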
I tested this with the inputs
"Müller, firstname middlename"
"Müller,firstname middlename"
"O'Gara, Ronan Ramón"
Note: this approach fails with surnames that contain spaces, for instance "van der Heuvel", "de Valera", "mac Piarais" or "bin Laden". Then again, OP's original specification does not seem to allow for spaces in the surname, or in the other names. (I work with a "Mary Kate": that is her first name, not a first and middle name.) There's an interesting page about personal names at http://www.w3.org/International/questions/qa-personal-names
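If surnames with spaces do need to be supported, one possible variation (not part of the original answer) is to split on the first comma only and then split the remainder on whitespace. The class name `SplitNameDemo` and the decision to throw on a missing comma are assumptions of this sketch.

```java
import java.util.Arrays;

class SplitNameDemo {
    /** Returns {last, given names...}; the surname may contain spaces. */
    static String[] lastFirstMiddle(String input) {
        // Limit 2: at most two pieces, the text before and after the first comma.
        String[] halves = input.split(",", 2);
        if (halves.length < 2) {
            throw new IllegalArgumentException("Expected a comma in: " + input);
        }
        String last = halves[0].trim();
        // Any run of whitespace separates the given names.
        String[] given = halves[1].trim().split("\\s+");

        String[] result = new String[given.length + 1];
        result[0] = last;
        System.arraycopy(given, 0, result, 1, given.length);
        return result;
    }

    public static void main(String[] args) {
        // prints [van der Heuvel, Jan, Pieter]
        System.out.println(Arrays.asList(lastFirstMiddle("van der Heuvel,  Jan   Pieter")));
    }
}
```

This still cannot tell a compound first name such as "Mary Kate" from a first and middle name; that information is simply not in the string.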
Answer by NeverHopeless
Answer by vks
^([a-zA-Z]+)\s*,\s*([a-zA-Z]+)\s+([a-zA-Z]+)$
I think you are looking for this. Just grab the groups to get what you need. See demo.
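The `^` and `$` anchors are what make this pattern accept only the name format and nothing around it. The sketch below compares it with the same pattern without anchors; the class name `AnchoredNameDemo`, the helper `found` and the sample inputs are made up for illustration.

```java
import java.util.regex.Pattern;

class AnchoredNameDemo {
    // The pattern from this answer, with backslashes doubled for Java.
    static final Pattern ANCHORED = Pattern.compile(
            "^([a-zA-Z]+)\\s*,\\s*([a-zA-Z]+)\\s+([a-zA-Z]+)$");
    // The same pattern without the ^ and $ anchors, for comparison.
    static final Pattern UNANCHORED = Pattern.compile(
            "([a-zA-Z]+)\\s*,\\s*([a-zA-Z]+)\\s+([a-zA-Z]+)");

    /** True when the pattern is found somewhere in the input. */
    static boolean found(Pattern pattern, String input) {
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        String noisy = "see Smith, John Paul 123";
        System.out.println(found(UNANCHORED, noisy));                // true: matches inside the text
        System.out.println(found(ANCHORED, noisy));                  // false: extra text around the name
        System.out.println(found(ANCHORED, "Smith , John   Paul"));  // true: the whole input is a name
    }
}
```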