Python Pandas: Add column based on other column
Original question: http://stackoverflow.com/questions/35424567/
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Asked by beginner_
I'm new to pandas and pretty confused about it especially compared to lists and using list comprehensions.
I have a dataframe with 4 columns. I want to create a 5th column "c" based on 4th column "m". I can get the value for "c" by applying my function for each row in column "m".
If "m" were a list, I could use a list comprehension:
c = [myfunction(x) for x in m]
How do I apply this "logic" to a dataframe?
Answered by jezrael
You can use assign (sample from the docs):
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': range(1, 11), 'B': np.random.randn(10)})
print(df)
A B
0 1 0.769028
1 2 -0.392471
2 3 0.153051
3 4 -0.379848
4 5 -0.665426
5 6 0.880684
6 7 1.126381
7 8 -0.559828
8 9 0.862935
9 10 -0.909402
df = df.assign(ln_A = lambda x: np.log(x.A))
print(df)
A B ln_A
0 1 0.769028 0.000000
1 2 -0.392471 0.693147
2 3 0.153051 1.098612
3 4 -0.379848 1.386294
4 5 -0.665426 1.609438
5 6 0.880684 1.791759
6 7 1.126381 1.945910
7 8 -0.559828 2.079442
8 9 0.862935 2.197225
9 10 -0.909402 2.302585
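Note that assign returns a new DataFrame rather than modifying the one it is called on, which is why the result is assigned back to df in the sample. A minimal sketch with fixed values instead of random ones (the names base and with_log are made up for illustration):

```python
import numpy as np
import pandas as pd

# Fixed values instead of random ones so the result is reproducible.
base = pd.DataFrame({'A': [1, 2, 3]})

# assign builds and returns a new DataFrame; base itself is left untouched.
with_log = base.assign(ln_A=lambda x: np.log(x.A))

print(list(base.columns))      # ['A']
print(list(with_log.columns))  # ['A', 'ln_A']
```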
Sometimes a lambda function is helpful:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': range(1, 11), 'B': np.random.randn(10)})
df['ln_A'] = df['A'].apply(np.log)
df['round'] = df['B'].apply(lambda x: np.round(x, 2))
print(df)
A B ln_A round
0 1 -0.982828 0.000000 -0.98
1 2 2.306111 0.693147 2.31
2 3 0.967858 1.098612 0.97
3 4 -0.286280 1.386294 -0.29
4 5 -2.026937 1.609438 -2.03
5 6 0.061735 1.791759 0.06
6 7 -0.506620 1.945910 -0.51
7 8 -0.309438 2.079442 -0.31
8 9 -1.261842 2.197225 -1.26
9 10 1.079921 2.302585 1.08
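Mapped onto the question's names, the list comprehension c = [myfunction(x) for x in m] becomes a single apply call on column "m". A rough sketch, where myfunction and the sample values are placeholders and not the asker's actual code:

```python
import pandas as pd

def myfunction(x):
    # Placeholder for whatever per-value logic the question has in mind.
    return x * 10

df = pd.DataFrame({'m': [1, 2, 3]})

# Series.apply calls myfunction once per value in column "m".
df['c'] = df['m'].apply(myfunction)

# The list comprehension from the question produces the same values.
expected = [myfunction(x) for x in [1, 2, 3]]
print(df['c'].tolist() == expected)
```

Assigning a plain list, as in df['c'] = [myfunction(x) for x in df['m']], should also work, since the list has one entry per row.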
Answered by MaxGu
Since pandas is built on top of numpy, you can easily apply a function to a numpy.array. The following example might help. You can convert a list (or a column) to a numpy.array and then do a vectorized computation.
import numpy as np
import pandas as pd
data = pd.DataFrame([[1,2],[3,4]],columns=['a','b'])
def square(x):
    return x ** 2
data['c'] = square(np.array(data.a))
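The explicit np.array conversion is optional here: arithmetic operators such as ** also work element-wise on a pandas column directly. A small sketch reusing the same data and square function (the column name c_np is only for comparison):

```python
import numpy as np
import pandas as pd

data = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])

def square(x):
    return x ** 2

# Passing the column (a Series) directly; ** is applied element-wise.
data['c'] = square(data['a'])

# Same computation via an explicit numpy array, as in the answer.
data['c_np'] = square(np.array(data['a']))

print(data['c'].tolist())     # [1, 9]
print(data['c_np'].tolist())  # [1, 9]
```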
Answered by Bjorks number one fan
The following is analogous to the generic list comprehension case:
import pandas as pd

def some_fn(x):
    # return some_other_fn(x.Colname1, x.Colname2, ...)
    return x.a + x.b

df = pd.DataFrame({'a' : [1, 2], 'b' : [3, 4]})
# iterrows yields (index, row) pairs; each row is a Series
df['c'] = [some_fn(row) for ind, row in df.iterrows()]
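A similar row-wise result can be had without an explicit loop by using DataFrame.apply with axis=1, which passes each row to the function as a Series. A sketch with the same some_fn and data; for large frames, plain column arithmetic such as df['a'] + df['b'] is usually faster than either row-wise approach:

```python
import pandas as pd

def some_fn(x):
    return x.a + x.b

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# axis=1 passes each row to some_fn as a Series.
df['c'] = df.apply(some_fn, axis=1)

print(df['c'].tolist())  # [4, 6]
```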