Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/25789445/

Date: 2020-08-18 23:39:20  Source: igfitidea

Pandas make new column from string slice of another column

python pandas

Asked by BML91

I want to create a new column in pandas using a string slice of another column in the dataframe.

For example:

Sample  Value  New_sample
AAB     23     A
BAB     25     B

Where New_sample is a new column formed from a simple [:1] slice of Sample.

I've tried a number of things to no avail; I feel I'm missing something simple.

What's the most efficient way of doing this?

Accepted answer by EdChum

You can use the str accessor on the column and apply a slice. This will be much quicker than the apply-based alternative because it is vectorised (thanks @unutbu):

df['New_Sample'] = df.Sample.str[:1]
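
As a minimal, self-contained sketch of this approach, the sample frame from the question can be rebuilt by hand. The values come from the question; building the frame from a dict is only an assumption made for illustration.

```python
import pandas as pd

# Rebuild the sample data from the question (constructed by hand for illustration).
df = pd.DataFrame({"Sample": ["AAB", "BAB"], "Value": [23, 25]})

# .str[:1] slices every string in the column, keeping only the first character.
df["New_Sample"] = df["Sample"].str[:1]

print(df["New_Sample"].tolist())  # ['A', 'B']
```

Bracket access (df["Sample"]) is used here instead of attribute access (df.Sample) because it also works when a column name is not a valid Python identifier.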

You can also apply a lambda function to the column, but this will be slower on larger dataframes:

In [187]:

df['New_Sample'] = df.Sample.apply(lambda x: x[:1])
df
Out[187]:
  Sample  Value New_Sample
0    AAB     23          A
1    BAB     25          B
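
Besides speed, the two approaches can differ when the column contains missing values. The sketch below uses a hypothetical Series with a NaN in it (not part of the original question) to show that the str accessor passes missing values through, while a lambda needs an explicit guard because slicing a float NaN raises TypeError.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value; not taken from the question.
s = pd.Series(["AAB", np.nan, "BAB"])

# The str accessor propagates the missing value instead of failing.
sliced = s.str[:1]

# A bare `lambda x: x[:1]` would raise TypeError on the float NaN,
# so the lambda version has to check the type first.
guarded = s.apply(lambda x: x[:1] if isinstance(x, str) else x)

print(sliced.tolist())
print(guarded.tolist())
```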

Answer by student

You can also use slice() to slice the strings of a Series, as follows:

df['New_sample'] = df['Sample'].str.slice(0,1)

From the pandas documentation:

Series.str.slice(start=None, stop=None, step=None)

Slice substrings from each element in the Series/Index
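
A short sketch of how the start, stop, and step parameters map onto ordinary Python slice notation, using the sample strings from the question. The variable names are chosen here for illustration only.

```python
import pandas as pd

s = pd.Series(["AAB", "BAB"])  # sample strings from the question

first = s.str.slice(0, 1)          # same as s.str[:1]
every_other = s.str.slice(step=2)  # same as s.str[::2]
last_two = s.str.slice(start=-2)   # same as s.str[-2:]

print(first.tolist())        # ['A', 'B']
print(every_other.tolist())  # ['AB', 'BB']
print(last_two.tolist())     # ['AB', 'AB']
```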

To slice the index (if the index is of string type), you can try:

df.index = df.index.str.slice(0,1)
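
A minimal sketch of the index case, assuming a hypothetical frame whose index holds the sample strings from the question:

```python
import pandas as pd

# Hypothetical frame: the question's sample strings are used as the index here.
df = pd.DataFrame({"Value": [23, 25]}, index=["AAB", "BAB"])

# Index.str exposes the same string methods as Series.str.
df.index = df.index.str.slice(0, 1)

print(df.index.tolist())  # ['A', 'B']
```

Note that slicing can make index labels collide (for example, "AAB" and "ABB" would both become "A"), so it is worth checking df.index.is_unique afterwards if later code relies on unique labels.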