Best way to split a first and last name in PHP

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/8808902/

Time: 2020-08-26 05:33:12  Source: igfitidea

php

Asked by Shackrock

I am stuck with a NAME field, which typically is in the format:

FirstName LastName

However, I occasionally also have names in one of these formats (with a prefix or suffix):

Mr. First Last
First Last Jr.

What do people think is a safe way to split these into FIRST/LAST name variables in PHP? I can't really come up with anything that tends to work all of the time...

Answered by Francis Lewis

A regex is the best way to handle something like this. Try this piece - it pulls out the prefix, first name, last name and suffix:

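$array = array(
    'FirstName LastName',
    'Mr. First Last',
    'First Last Jr.',
    "Shaqueal O'neal",
    "D'angelo Hall",
);

foreach ($array as $name)
{
    $results = array();
    echo $name, PHP_EOL;
    // Groups: 1 = prefix, 2 = first name, 3 = last name, 4 = suffix.
    // The u modifier makes the typographic apostrophe (’) count as one character.
    preg_match('#^(\w+\.)?\s*([\'’\w]+)\s+([\'’\w]+)\s*(\w+\.?)?$#u', $name, $results);
    print_r($results);
}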

The result comes out like this:

FirstName LastName
Array
(
    [0] => FirstName LastName
    [1] => 
    [2] => FirstName
    [3] => LastName
)
Mr. First Last
Array
(
    [0] => Mr. First Last
    [1] => Mr.
    [2] => First
    [3] => Last
)
First Last Jr.
Array
(
    [0] => First Last Jr.
    [1] => 
    [2] => First
    [3] => Last
    [4] => Jr.
)
Shaqueal O'neal
Array
(
    [0] => Shaqueal O'neal
    [1] => 
    [2] => Shaqueal
    [3] => O'neal
)
D'angelo Hall
Array
(
    [0] => D'angelo Hall
    [1] => 
    [2] => D'angelo
    [3] => Hall
)

etc…

So in the $results array, $results[0] contains the entire string. $results[2] is always the first name and $results[3] is always the last name. $results[1] is the prefix and $results[4] (not always set) is the suffix. I also added code to handle both the straight (') and the typographic (’) apostrophe for names like Shaqueal O'neal and D'angelo Hall.
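
The numbered groups described above can be easier to work with when they have names. The following is a minimal sketch, not part of the original answer: the function name splitNameWithRegex and the keys prefix, first, last and suffix are illustrative assumptions. It uses the same pattern, rewritten with named groups and the u modifier.

```php
<?php
// Hypothetical helper (not from the original answer): the same pattern,
// but with named groups so callers do not depend on numeric indexes.
function splitNameWithRegex(string $name): ?array
{
    $pattern = '#^(?<prefix>\w+\.)?\s*(?<first>[\'’\w]+)\s+(?<last>[\'’\w]+)\s*(?<suffix>\w+\.?)?$#u';

    if (preg_match($pattern, trim($name), $m) !== 1) {
        return null; // the name does not fit the "prefix first last suffix" shape
    }

    // preg_match omits trailing groups that did not match, so default them to ''.
    return array(
        'prefix' => $m['prefix'] ?? '',
        'first'  => $m['first'],
        'last'   => $m['last'],
        'suffix' => $m['suffix'] ?? '',
    );
}

print_r(splitNameWithRegex('Mr. First Last'));
var_dump(splitNameWithRegex('Oscar de la Hoya')); // NULL: four words do not fit the pattern
```

Returning null for names that do not match makes the limits of the pattern visible to the caller instead of silently producing a wrong split.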

Answered by Still don't know everything

The accepted answer doesn't work for languages other than English, or for names such as "Oscar de la Hoya".

Here's something I did that I think is UTF-8 safe and works for all of those cases, building on the accepted answer's assumption that a prefix and suffix will contain a period:

/**
 * splits single name string into salutation, first, last, suffix
 * 
 * @param string $name
 * @return array
 */
public static function doSplitName($name)
{
    $results = array();

    $r = explode(' ', $name);
    $size = count($r);

    //check first for period, assume salutation if so
    if (mb_strpos($r[0], '.') === false)
    {
        $results['salutation'] = '';
        $results['first'] = $r[0];
    }
    else
    {
        $results['salutation'] = $r[0];
        $results['first'] = $r[1];
    }

    //check last for period, assume suffix if so
    if (mb_strpos($r[$size - 1], '.') === false)
    {
        $results['suffix'] = '';
    }
    else
    {
        $results['suffix'] = $r[$size - 1];
    }

    //combine remains into last
    $start = ($results['salutation']) ? 2 : 1;
    $end = ($results['suffix']) ? $size - 2 : $size - 1;

    $last = '';
    for ($i = $start; $i <= $end; $i++)
    {
        $last .= ' '.$r[$i];
    }
    $results['last'] = trim($last);

    return $results;
}

Here's the PHPUnit test:

public function testDoSplitName()
{
    $array = array(
        'FirstName LastName',
        'Mr. First Last',
        'First Last Jr.',
        'Shaqueal O\'neal',
        'D\'angelo Hall',
        'Václav Havel',
        'Oscar De La Hoya',
        'АБВГ?Д ??Е?Ё?ЖЗ', //cyrillic
        '????? ????????', //yiddish
    );

    $assertions = array(
            array(
                    'salutation' => '',
                    'first' => 'FirstName',
                    'last' => 'LastName',
                    'suffix' => ''
                ),
            array(
                    'salutation' => 'Mr.',
                    'first' => 'First',
                    'last' => 'Last',
                    'suffix' => ''
                ),
            array(
                    'salutation' => '',
                    'first' => 'First',
                    'last' => 'Last',
                    'suffix' => 'Jr.'
                ),
            array(
                    'salutation' => '',
                    'first' => 'Shaqueal',
                    'last' => 'O\'neal',
                    'suffix' => ''
                ),
            array(
                    'salutation' => '',
                    'first' => 'D\'angelo',
                    'last' => 'Hall',
                    'suffix' => ''
                ),
            array(
                    'salutation' => '',
                    'first' => 'Václav',
                    'last' => 'Havel',
                    'suffix' => ''
                ),
            array(
                    'salutation' => '',
                    'first' => 'Oscar',
                    'last' => 'De La Hoya',
                    'suffix' => ''
                ),
            array(
                    'salutation' => '',
                    'first' => 'АБВГ?Д',
                    'last' => '??Е?Ё?ЖЗ',
                    'suffix' => ''
                ),
            array(
                    'salutation' => '',
                    'first' => '?????',
                    'last' => '????????',
                    'suffix' => ''
                ),
        );

    foreach ($array as $key => $name)
    {
        $result = Customer::doSplitName($name);

        $this->assertEquals($assertions[$key], $result);
    }
}

Answered by martinstoeckli

You won't find a safe way to solve this problem. Not even a human can always tell which parts belong to the first name and which belong to the last name, especially when one of them contains several words, as in: Andrea Frank Gutenberg. The middle part, Frank, can be a second first name or the last name, with Gutenberg as a maiden name.

The best you can do is to provide separate input fields for the first name and the last name, and save them separately in the database. You can avoid a lot of problems this way.

Answered by Graham T

Great library here that so far has parsed names flawlessly: https://github.com/joshfraser/PHP-Name-Parser

Answered by Míla Mrvík

There is another solution:

// First, for safety, make sure every '.' is followed by a space
$both = str_replace('.', '. ', $both);

// Now delete titles (any word that ends with a period)
$both = preg_replace('/[^ ]+\./', '', $both);

// Collapse the runs of spaces left behind by the two steps above
$both = trim(preg_replace('/ {2,}/', ' ', $both));

// Explode
$split = explode(" ", $both, 2);
if( count($split) > 1 ) {
    list($name, $surname) = $split;
} else {
    $name = $split[0];
    $surname = '';
}

Answered by Murray McDonald

Not a simple problem, and to a large extent your ability to get a workable solution depends on cultural "norms".

  1. First hive off any "honorifics" using preg_replace, e.g. (extend the list as needed):

     $normalized_name = preg_replace('/^(Mr\.*\sJustice|Mr\.*\s+|Mrs\.*\s+|Ms\.\s+|Dr\.*\s+|Justice)*(.*)$/is', '$2', trim($input_name));

  2. Next hive off any trailing suffixes, keeping only the part before them:

     $normalized_name = trim(preg_replace('/^(.*)(Jr\.*|III|Phd\.*|Md\.)$/is', '$1', $normalized_name));

  3. Finally split at the first blank to get a first name and last name.
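
The three steps above can be sketched as one function. This is an illustration under stated assumptions rather than the answer's exact code: the function name splitNormalizedName is hypothetical, the honorific and suffix lists are deliberately short, and the patterns delete the matched prefix or suffix directly instead of using a back-reference.

```php
<?php
// Hypothetical helper illustrating the three steps; the honorific and suffix
// lists are short examples, not a complete set.
function splitNormalizedName(string $inputName): array
{
    // Step 1: strip leading honorifics such as "Mr.", "Dr." or "Mr. Justice".
    $name = preg_replace('/^(?:(?:Mr|Mrs|Ms|Dr|Justice)\.?\s+)+/i', '', trim($inputName));

    // Step 2: strip one trailing suffix such as "Jr.", "III" or ", PhD".
    $name = preg_replace('/[\s,]+(?:Jr|Sr|III|PhD|MD)\.?$/i', '', $name);

    // Step 3: split at the first blank; everything after it is the last name.
    $parts = explode(' ', $name, 2);

    return array(
        'first' => $parts[0],
        'last'  => isset($parts[1]) ? trim($parts[1]) : '',
    );
}

print_r(splitNormalizedName('Mr. Justice John Smith Jr.')); // first: John, last: Smith
print_r(splitNormalizedName('Dr. Maria de la Cruz, PhD'));  // first: Maria, last: de la Cruz
```

Requiring whitespace after each honorific keeps a name such as "Drake" from being mistaken for "Dr" plus a remainder.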

Obviously, in English alone there are many possible honorifics. I couldn't think of many suffixes, but there are probably more than I listed.

Answered by Ramon Saraiva

First you explode the FIRST/LAST, then you concatenate the prefix.

For example:

Vicent van Gogh

The first name is the first element of the array. Whatever comes after the first name is the last name, so you just need to take the rest of the array elements.

After that, you concatenate the prefix/suffix.

Mr. Vicent van Gogh
Vicent van Gogh jr.
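
The approach described in this answer could look roughly like the following sketch. The function name splitFirstRest is an assumption made for illustration, and the prefix and suffix are simply hard-coded when they are concatenated back.

```php
<?php
// Hypothetical helper: the first word is the first name, the rest is the last name.
function splitFirstRest(string $fullName): array
{
    // Limit 2 keeps everything after the first space together in $parts[1].
    $parts = explode(' ', trim($fullName), 2);

    return array(
        'first' => $parts[0],
        'last'  => isset($parts[1]) ? trim($parts[1]) : '',
    );
}

$name = splitFirstRest('Vicent van Gogh');
echo $name['first'], ' | ', $name['last'], PHP_EOL; // Vicent | van Gogh

// Concatenating a prefix or suffix afterwards, as the answer describes:
echo 'Mr. ' . $name['first'] . ' ' . $name['last'], PHP_EOL; // Mr. Vicent van Gogh
echo $name['first'] . ' ' . $name['last'] . ' jr.', PHP_EOL; // Vicent van Gogh jr.
```

This only works once any prefix or suffix has already been separated from the name; otherwise "Mr." would be taken as the first name.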

Answered by Mathieu Dumoulin

If you have a database, I'd create two columns called prefix and suffix. Then run a query to extract that portion from the text.

UPDATE names SET prefix = 'mr.' WHERE name LIKE 'mr. %';
-- 'mr. ' is four characters long, so the rest of the name starts at position 5
UPDATE names SET name = substring(name, 5) WHERE name LIKE 'mr. %';

This way you can keep the different prefixes in the database. It works like a charm because it's a batch statement: you can add as many suffixes or prefixes to your scan as you like, and it doesn't take long to build.

Then you can split on the first space after removing all prefixes and suffixes this way.

Answered by blake305

Assuming you don't care about the Mr. or Jr. part and that $text contains the name:

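$textarray = explode(" ", $text);

foreach ($textarray as $key => $value)
{
    // Drop any word that contains a period, such as "Mr." or "Jr."
    if (preg_match("/\./", $value))
    {
        unset($textarray[$key]);
    }
}

// Re-index the remaining words so they start at 0 again
$first_last = array_values($textarray);

$firstname = $first_last[0];
$lastname = $first_last[1];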

$firstname will be the first name and $lastname will be the last name. Not the cleanest way to do it, but it's a possibility.

Answered by wickeed

Another Solution:

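function getFirstLastName($fullName) {
    $fullName = $firstLast = trim($fullName);
    if (mb_strpos($fullName, " ") !== false) {
        // First word: everything before the first space
        $first = mb_substr($fullName, 0, mb_strpos($fullName, " "));
        // Last word: everything after the last space
        // (mb_strrpos is multibyte-safe, unlike reversing the string with strrev)
        $last = mb_substr($fullName, mb_strrpos($fullName, " ") + 1);
        $firstLast = $first . " " . $last;
    }
    return $firstLast;
}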

Hope that is useful!
