Find the last word in each string in a list (Pandas, Python 3)

Question

Asked by user3682157

I have a DataFrame named 'Stories' that looks like this:

Story
The Man
The Man Child
The Boy of Egypt
The Legend of Zelda

Is there a way to extract the last word in each of those strings?

Something like:

Stories['Prefix'] = final['Story'].str.extract(r'([^ ]*)')

finds the prefix, but I am not sure how to adapt it to find the last word instead.

I was hoping to end up with something like:

Story                  Suffix
The Word Of Man         Man
The Man of Legend       Legend
The Boy of Egypt        Egypt
The Legend of Zelda     Zelda

Any help would be much appreciated!

Answer 1

Answered by DSM

You can use .str twice, as .str[-1] will pick up the last element:

>>> df["Suffix"] = df["Story"].str.split().str[-1]
>>> df
                 Story Suffix
0              The Man    Man
1        The Man Child  Child
2     The Boy of Egypt  Egypt
3  The Legend of Zelda  Zelda
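
To see why .str is needed twice, it can help to look at the intermediate result. The following is a minimal, self-contained sketch: the DataFrame is rebuilt from the question's sample data, and the variable name `words` is introduced here for illustration only.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({"Story": ["The Man", "The Man Child",
                             "The Boy of Egypt", "The Legend of Zelda"]})

# First .str: split each string on whitespace, giving a Series of lists.
words = df["Story"].str.split()

# Second .str: index into each list; -1 selects its last element.
df["Suffix"] = words.str[-1]

print(words.tolist())
print(df["Suffix"].tolist())
```

The first .str applies a string method to every element, and the second applies indexing to every resulting list, so no explicit loop is needed.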

Answer 2

Answered by rhaskett

I think split is a little clearer than a regex, but you can apply any function you choose to a Series.

final['Suffix'] = final['Story'].apply(lambda x: x.split()[-1])
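
One caveat with the lambda is that it raises an error on missing values (a NaN or None has no .split method) and on empty strings (indexing an empty list raises IndexError). A named function can guard against both. This is only a sketch: `last_word` is a hypothetical helper, and the sample Series is made up for illustration rather than taken from the question.

```python
import pandas as pd

def last_word(text):
    """Return the last whitespace-separated word, or None if there is none."""
    if not isinstance(text, str):   # e.g. NaN or None for a missing value
        return None
    words = text.split()
    return words[-1] if words else None  # empty or whitespace-only string

# Illustrative data (not from the question): one missing and one blank entry.
stories = pd.Series(["The Man", "The Boy of Egypt", None, "   "])
suffixes = stories.apply(last_word)
print(suffixes)
```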

Answer 3

Answered by EdChum

You can use a regexp pattern to extract the last word:

In [10]:

df['suffix'] = df.Story.str.extract(r'((\b\w+)[\.?!\s]*$)')[0]
df
Out[10]:
                  Story  suffix
0               The Man     Man
1         The Man Child   Child
2      The Boy of Egypt   Egypt
3  The Legend of Zeldar  Zeldar

The pattern is a modified version of the answer I found here: regex match first and last word or any word
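
One detail worth checking in this pattern: the outer group also captures any trailing punctuation or whitespace, so selecting column 0 keeps it, while the inner group holds only the word. The sketch below uses Python's re module to show the difference. The single-group variant `LAST_WORD` is an assumption added for illustration, not part of the answer, and the sample strings with punctuation are invented.

```python
import re

# Pattern from the answer: outer group = word plus trailing punctuation,
# inner group = the word alone.
ORIGINAL = re.compile(r'((\b\w+)[\.?!\s]*$)')
# Simplified variant (an assumption, not the answer's pattern): one group.
LAST_WORD = re.compile(r'(\b\w+)[.?!\s]*$')

for title in ["The Legend of Zelda", "Who is the Man?", "The Boy of Egypt. "]:
    m = ORIGINAL.search(title)
    # Show outer group, inner group, and the single-group result side by side.
    print(repr(m.group(1)), repr(m.group(2)),
          repr(LAST_WORD.search(title).group(1)))
```

With pandas, the single-group pattern can be passed as df['Story'].str.extract(r'(\b\w+)[.?!\s]*$', expand=False), which returns a Series rather than a DataFrame.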

Answer 4

Answered by A.J. Uppal

To get the last word, you can make a list with each title being an entry in the list, and call this list comprehension to get all the suffixes:

suffixes = [item.split()[-1] for item in mylist]

This splits each string into words and uses [-1] to get the last entry.

Then you can write it back whichever way you want.

The above list comprehension is equivalent to:

suffixes = []
for item in mylist:
    suffixes.append(item.split()[-1])  # item.split() gives a list of the words in the string; [-1] takes the last one

Here is an example:

mylist = ['The Man', 'The Man Child', 'The Boy of Egypt', 'The Legend of Zelda']
suffixes = [item.split()[-1] for item in mylist]
print(suffixes)  # ['Man', 'Child', 'Egypt', 'Zelda']

Answer 5

Answered by Inox

I'm not sure whether there is a built-in function to do this directly. You can iterate through the strings like this:

df['Suffix'] = ''  # create the column before filling it row by row
for i in range(len(df)):  # xrange does not exist in Python 3
    df.at[df.index[i], 'Suffix'] = df['Story'].iat[i].split(' ')[-1]
