
Note: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1720503/

Date: 2020-11-03 22:56:33  Source: igfitidea

How do I parse a person's first and last name in Python?

Tags: python, parsing

Asked by y2k

So basically I need to parse a name and find the following info:


  • First Name

  • First Initial (if the employee has initials for a first name, like D.J., use both initials)

  • Last Name (including the suffix, such as Jr. or III, if the employee has one)


So here's the interface I'm working with:


Input:


names = ["D.J. Richies III", "John Doe", "A.J. Hardie Jr."]
for name in names:
    print(parse_name(name))

Expected Output:


{'FirstName': 'D.J.', 'FirstInitial': 'D.J.', 'LastName': 'Richies III' }
{'FirstName': 'John', 'FirstInitial': 'J.', 'LastName': 'Doe' }
{'FirstName': 'A.J.', 'FirstInitial': 'A.J.', 'LastName': 'Hardie Jr.' }

I'm not very good at regex, and it's probably overkill for this anyway. I'm just guessing:


if name[1] == ".":  # we have a name like D.J.?

Accepted answer by Daniel G

Well, for your simple example names, you can do something like this.


def parse_name(name):
    # This separates the first and last names at the first space
    parts = name.partition(" ")
    firstName = parts[0]
    # now figure out the first initial
    # we're assuming that if it has a dot it's an initialized name,
    # but this may not hold in general
    if "." in firstName:
        firstInitial = firstName
    else:
        firstInitial = firstName[0] + "."
    # everything after the first space (including any suffix) is the last name
    lastName = parts[2]
    return {"FirstName": firstName, "FirstInitial": firstInitial, "LastName": lastName}

I haven't tested it, but a function like that should do the job on the input example you provided.


Answer by Hamish Currie

I found this library quite useful for parsing names. https://code.google.com/p/python-nameparser/


It can also handle names written in "Lastname, Firstname" order.
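For readers who only need the comma-separated form and would rather not add a dependency, here is a minimal pure-Python sketch. It is not how python-nameparser works internally; it simply assumes that a comma always means "Lastname, Firstname" and that any suffix stays attached to the last name. The function name parse_name is taken from the question; everything else is illustrative.

```python
def parse_name(name):
    """Parse 'First Last [Suffix]' or 'Last [Suffix], First' into a dict."""
    if "," in name:
        # Assumption: a comma means the name is written "Lastname, Firstname"
        last_name, first_name = [part.strip() for part in name.split(",", 1)]
    else:
        # Otherwise split at the first space, as in the accepted answer
        first_name, _, last_name = name.strip().partition(" ")
        last_name = last_name.strip()

    # A dot in the first name is taken to mean it already consists of initials
    if "." in first_name:
        first_initial = first_name
    else:
        first_initial = first_name[:1] + "."

    return {"FirstName": first_name,
            "FirstInitial": first_initial,
            "LastName": last_name}


for name in ["Doe, John", "Hardie Jr., A.J.", "D.J. Richies III"]:
    print(parse_name(name))
```

A form such as "Hardie, A.J., Jr." (suffix after a second comma) is not handled by this sketch; that is the kind of case where a dedicated library is the better choice.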


Answer by Anurag Uniyal

There is no general solution; the right approach depends on the constraints you impose. For the specs you have given, here is a simple solution that gives exactly what you want:


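def parse_name(name):
    # Split on whitespace: the first token is the first name,
    # everything after it (including any suffix) is the last name
    fl = name.split()
    first_name = fl[0]
    last_name = ' '.join(fl[1:])
    # A dot in the first name means it already consists of initials
    if "." in first_name:
        first_initial = first_name
    else:
        first_initial = first_name[0] + "."

    return {'FirstName': first_name, 'FirstInitial': first_initial, 'LastName': last_name}

names = ["D.J. Richies III", "John Doe", "A.J. Hardie Jr."]
for name in names:
    print(parse_name(name))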

Output (key order may vary with the Python version):

{'LastName': 'Richies III', 'FirstInitial': 'D.J.', 'FirstName': 'D.J.'}
{'LastName': 'Doe', 'FirstInitial': 'J.', 'FirstName': 'John'}
{'LastName': 'Hardie Jr.', 'FirstInitial': 'A.J.', 'FirstName': 'A.J.'}
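To illustrate how much the result depends on the constraints: with the split-on-whitespace approach, a name such as "John Q. Public" ends up with the last name "Q. Public". The sketch below is one possible variant, assuming that middle names should be dropped and that suffixes come from a small fixed list. The function name parse_name_dropping_middle and the SUFFIXES set are hypothetical and not part of any answer here.

```python
# Hypothetical suffix list for illustration; extend it for your own data
SUFFIXES = {"Jr.", "Sr.", "II", "III", "IV"}


def parse_name_dropping_middle(name):
    """Keep the first token and the surname (plus suffix); drop middle names."""
    parts = name.split()  # assumes a non-empty name
    first_name = parts[0]

    if len(parts) > 2 and parts[-1] in SUFFIXES:
        # Surname followed by a recognised suffix, e.g. "Richies III"
        last_name = " ".join(parts[-2:])
    elif len(parts) > 1:
        # No suffix: the last token alone is the surname
        last_name = parts[-1]
    else:
        # Single-word name: no last name available
        last_name = ""

    # A dot in the first name is taken to mean it already consists of initials
    if "." in first_name:
        first_initial = first_name
    else:
        first_initial = first_name[0] + "."

    return {"FirstName": first_name,
            "FirstInitial": first_initial,
            "LastName": last_name}


for name in ["John Q. Public", "John Q. Public Jr.", "D.J. Richies III"]:
    print(parse_name_dropping_middle(name))
```

Multi-word surnames such as "van der Berg" would be truncated by this rule, which is another reason no single set of constraints fits every data set.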

Answer by Pär Wieslander

This is basically the same solution as the one Anurag Uniyal provided, only a little more compact:


import re

def parse_name(name):
    first_name, last_name = name.split(' ', 1)
    first_initial = re.search("^[A-Z.]+", first_name).group()
    if not first_initial.endswith("."):
        first_initial += "."
    return {"FirstName": first_name,
            "FirstInitial": first_initial,
            "LastName": last_name}
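It may help to see what the pattern ^[A-Z.]+ actually matches: the leading run of uppercase letters and dots. The short sketch below probes it with a few first names and wraps it in a guarded helper. The helper name first_initial_of and its lowercase fallback are assumptions added for illustration, not part of the answer above.

```python
import re

# Same pattern as in the answer: a leading run of uppercase letters and dots
PATTERN = re.compile(r"^[A-Z.]+")


def first_initial_of(first_name):
    """Return the first initial(s) of a first name, always ending in a dot."""
    match = PATTERN.search(first_name)
    if match is None:
        # search() returns None when the name does not start with an
        # uppercase letter or dot; assumed fallback: capitalise the first letter
        return first_name[:1].upper() + "."
    initial = match.group()
    return initial if initial.endswith(".") else initial + "."


for first_name in ["John", "D.J.", "JOHN", "john"]:
    print(first_name, "->", first_initial_of(first_name))
```

Note that an all-caps name such as "JOHN" matches in full and yields "JOHN.", and that name.split(' ', 1) in the answer raises ValueError for a single-word name, so both cases need extra handling if they can occur in your data.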