Python: How to map a function using multiple columns in pandas?
原文地址: http://stackoverflow.com/questions/28457149/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to map a function using multiple columns in pandas?
Asked by ashishsingal
I've checked out map, apply, mapapply, and combine, but can't seem to find a simple way of doing the following:
I have a dataframe with 10 columns. I need to pass three of them into a function that takes scalars and returns a scalar ...
some_func(int a, int b, int c) returns int d
I want to apply this and create a new column in the dataframe with the result.
df['d'] = some_func(a = df['a'], b = df['b'], c = df['c'])
All the solutions that I've found seem to suggest rewriting some_func to work with Series instead of scalars, but this is not possible as it is part of another package. How do I elegantly do the above?
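To illustrate why the direct call above tends to fail, here is a small sketch. It assumes some_func contains scalar-only logic such as a branch; the function body and the sample data are hypothetical, since the real function lives in another package.

```python
import pandas as pd

def some_func(a, b, c):
    # Hypothetical stand-in: the branch needs a single True/False,
    # so it only works when a and b are scalars.
    if a > b:
        return a + c
    return b + c

df = pd.DataFrame({"a": [1, 5], "b": [4, 2], "c": [10, 20]})

try:
    # Passing whole columns makes `a > b` an element-wise comparison.
    df["d"] = some_func(a=df["a"], b=df["b"], c=df["c"])
    failed = False
except ValueError:
    # pandas refuses to reduce a boolean Series to a single bool.
    failed = True
```

A function built purely from arithmetic would not hit this error, which is why the right fix depends on what some_func actually does.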
Accepted answer by tsherwen
Use pd.DataFrame.apply(), as below:
df['d'] = df.apply(lambda x: some_func(a = x['a'], b = x['b'], c = x['c']), axis=1)
NOTE: As @ashishsingal asked about columns, the axis argument should be provided with a value of 1, as the default is 0 (as in the documentation and copied below).
axis : {0 or 'index', 1 or 'columns'}, default 0
- 0 or 'index': apply function to each column
- 1 or 'columns': apply function to each row
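To make the accepted answer concrete, here is a minimal runnable sketch. The body of some_func and the sample data are placeholders invented for illustration; the real some_func is whatever the external package provides, and it only needs to accept scalars.

```python
import pandas as pd

def some_func(a, b, c):
    # Hypothetical stand-in for the third-party function: scalars in, scalar out.
    return a * b + c

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# axis=1 hands each row to the lambda as a Series indexed by column name,
# so x["a"], x["b"] and x["c"] are the scalar values for that row.
df["d"] = df.apply(lambda x: some_func(a=x["a"], b=x["b"], c=x["c"]), axis=1)
```

With the default axis=0 the lambda would receive whole columns instead, and x["a"] would look for a row labelled "a" rather than a column.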
Answer by ashishsingal
I'm using the following:
df['d'] = df.apply(lambda x: some_func(a = x['a'], b = x['b'], c = x['c']), axis=1)
Seems to be working well, but if anyone else has a better solution, please let me know.
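If apply turns out to be slow on a large frame, one alternative that is not from the original answers is a plain list comprehension over the zipped columns. It still calls some_func once per row with scalars, but avoids building a Series for every row. The some_func body below is again a hypothetical stand-in.

```python
import pandas as pd

def some_func(a, b, c):
    # Hypothetical stand-in for the scalar-only third-party function.
    return max(a, b) - c

df = pd.DataFrame({"a": [1, 5, 3], "b": [4, 2, 6], "c": [1, 1, 1]})

# zip walks the three columns in lockstep, yielding one scalar triple per row.
df["d"] = [some_func(a=a, b=b, c=c) for a, b, c in zip(df["a"], df["b"], df["c"])]
```

Whether this is actually faster depends on the data and on how expensive some_func is, so it is worth timing both versions before switching.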
Answer by Elias Hasle
If it is a really simple function, such as one based on simple arithmetic, chances are it can be vectorized. For instance, a linear combination can be made directly from the columns:
df["d"] = w1*df["a"] + w2*df["b"] + w3*df["c"]
where w1,w2,w3 are scalar weights.
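As a quick check of the vectorized form, the sketch below computes the linear combination directly from the columns and compares it with a row-wise apply. The weights and data are made-up values for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]})
w1, w2, w3 = 0.5, 2.0, -1.0  # hypothetical scalar weights

# Vectorized: the arithmetic is applied to whole columns at once.
df["d"] = w1 * df["a"] + w2 * df["b"] + w3 * df["c"]

# Row-wise equivalent, shown only for comparison.
row_wise = df.apply(lambda x: w1 * x["a"] + w2 * x["b"] + w3 * x["c"], axis=1)
```

Both produce the same values; the vectorized line simply avoids calling a Python function once per row.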