Original question: http://stackoverflow.com/questions/29869559/
Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow (not to this page) and cite the original URL.
Adding a DataFrame column with len() of another column's values
Asked by halycos
I'm having a problem trying to get a character count column of the string values in another column, and haven't figured out how to do it efficiently.
for index in range(len(df)):
    df['char_length'][index] = len(df['string'][index])
This apparently involves first creating a column of nulls and then rewriting it, and it takes a really long time on my data set. So what's the most effective way of getting something like
'string'  'char_length'
abcd      4
abcde     5
I've checked around quite a bit, but I haven't been able to figure it out.
Answered by Alex Riley
Pandas has a vectorised string method for this: str.len(). To create the new column you can write:
df['char_length'] = df['string'].str.len()
For example:
>>> df
  string
0   abcd
1  abcde
>>> df['char_length'] = df['string'].str.len()
>>> df
  string  char_length
0   abcd            4
1  abcde            5
This should be considerably faster than looping over the DataFrame with a Python for loop.
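To check the speed claim on your own data, a rough comparison along these lines may help. This is only a sketch: the sample data and the helper names with_loop and with_str_len are made up for illustration, and the loop collects lengths into a list rather than assigning into the DataFrame cell by cell as in the question. Timings will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical sample data; the column name follows the question.
df = pd.DataFrame({"string": ["abcd", "abcde"] * 5_000})


def with_loop(frame):
    # Element-by-element Python loop over the column (illustrative helper).
    return [len(frame["string"].iloc[i]) for i in range(len(frame))]


def with_str_len(frame):
    # Vectorised string method from the answer.
    return frame["string"].str.len()


# Both approaches should produce the same lengths.
assert with_loop(df) == with_str_len(df).tolist()

loop_seconds = timeit.timeit(lambda: with_loop(df), number=1)
vectorised_seconds = timeit.timeit(lambda: with_str_len(df), number=1)
print(f"loop: {loop_seconds:.4f}s, str.len(): {vectorised_seconds:.4f}s")
```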
Many other familiar string methods from Python have been introduced to Pandas. For example, lower (for converting to lowercase letters), count (for counting occurrences of a particular substring), and replace (for swapping one substring with another).
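As a small illustration of those methods, here is a sketch using the same .str accessor. The sample values and the variable names are invented for the example. Note that str.count treats its argument as a regular expression, and regex=False is passed to str.replace so the pattern is matched literally.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"string": ["Abcd", "abcde"]})

# Convert every value to lowercase.
lowered = df["string"].str.lower()

# Count occurrences of the pattern "b" in each value.
b_counts = df["string"].str.count("b")

# Swap the literal substring "cd" for "XY" in each value.
replaced = df["string"].str.replace("cd", "XY", regex=False)

print(lowered.tolist())
print(b_counts.tolist())
print(replaced.tolist())
```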
Answered by Zero
Here's one way to do it.
In [3]: df
Out[3]:
  string
0   abcd
1  abcde
In [4]: df['len'] = df['string'].str.len()
In [5]: df
Out[5]:
  string  len
0   abcd    4
1  abcde    5

