Python: how to do a left, right and mid of a string in a pandas DataFrame
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/20970279/
Asked by IcemanBerlin
In a pandas DataFrame, how can I apply something like Excel's LEFT(state, 2) to take only the first two letters? Ideally I would also like to learn how to use left, right and mid on a DataFrame, so I need a general equivalent rather than a "trick" for this specific example.
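import pandas as pd

data = {'state': ['Auckland', 'Otago', 'Wellington', 'Dunedin', 'Hamilton'],
        'year': [2000, 2001, 2002, 2001, 2002],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
# Fix the column order explicitly so the output is the same on every pandas version
df = pd.DataFrame(data, columns=['pop', 'state', 'year'])
print(df)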
   pop       state  year
0  1.5    Auckland  2000
1  1.7       Otago  2001
2  3.6  Wellington  2002
3  2.4     Dunedin  2001
4  2.9    Hamilton  2002
I want to get this:
   pop       state  year StateInitial
0  1.5    Auckland  2000           Au
1  1.7       Otago  2001           Ot
2  3.6  Wellington  2002           We
3  2.4     Dunedin  2001           Du
4  2.9    Hamilton  2002           Ha
Accepted answer by alko
The first two letters of each value in a column:
>>> df['StateInitial'] = df['state'].str[:2]
>>> df
   pop       state  year StateInitial
0  1.5    Auckland  2000           Au
1  1.7       Otago  2001           Ot
2  3.6  Wellington  2002           We
3  2.4     Dunedin  2001           Du
4  2.9    Hamilton  2002           Ha
For the last two characters, use df['state'].str[-2:]. I don't know exactly what you want for the middle, but you can apply an arbitrary function to a column with the apply method:
>>> df['state'].apply(lambda x: x[len(x)//2-1:len(x)//2+1])  # // keeps the slice bounds integers on Python 3
0    kl
1    ta
2    in
3    ne
4    il
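The .str slicing used in this answer can be wrapped in small helpers that mirror Excel's LEFT and RIGHT. This is only a minimal sketch: the function names left and right are hypothetical and not part of pandas, and the data is the state column from the question.

```python
import pandas as pd

def left(series, n):
    """First n characters of each string (like Excel LEFT)."""
    return series.str[:n]

def right(series, n):
    """Last n characters of each string (like Excel RIGHT)."""
    # str[-0:] would return the whole string, so handle n == 0 explicitly
    return series.str[-n:] if n > 0 else series.str[:0]

states = pd.Series(['Auckland', 'Otago', 'Wellington', 'Dunedin', 'Hamilton'])
left_two = left(states, 2).tolist()
right_two = right(states, 2).tolist()
print(left_two)   # ['Au', 'Ot', 'We', 'Du', 'Ha']
print(right_two)  # ['nd', 'go', 'on', 'in', 'on']
```

Both helpers return a new Series, so the result can be assigned straight to a column, for example df['StateInitial'] = left(df['state'], 2).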
Answered by r col
With regard to mid, a shortcut would probably be df['state'].str[3:5] (a slice needs a colon; str[3,5] with a comma is not a slice).
Slices are zero-based and exclude the stop index, so this returns the characters at indices 3 and 4, that is, the 4th and 5th characters of the string. To get the 3rd and 4th characters instead, use df['state'].str[2:4].
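For an Excel-style MID, which takes a 1-based start position and a length, one option is Series.str.slice. The sketch below is only an illustration: the helper name mid is hypothetical, and it assumes start is at least 1.

```python
import pandas as pd

def mid(series, start, n):
    """n characters beginning at 1-based position start (like Excel MID)."""
    begin = start - 1  # convert the 1-based position to a zero-based index
    return series.str.slice(begin, begin + n)

states = pd.Series(['Auckland', 'Otago', 'Wellington', 'Dunedin', 'Hamilton'])
mid_chars = mid(states, 3, 2).tolist()
print(mid_chars)  # ['ck', 'ag', 'll', 'ne', 'mi']
```

Converting the position inside the helper keeps the call site close to the spreadsheet formula MID(state, 3, 2).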

