Python: How to apply a custom function to each row of a pandas DataFrame

Question

Asked by zorny

I want to apply a custom function and create a derived column called population2050 that is based on two columns already present in my data frame.

import pandas as pd
import sqlite3
conn = sqlite3.connect('factbook.db')
query = "select * from facts where area_land =0;"
facts = pd.read_sql_query(query,conn)
print(list(facts.columns.values))

def final_pop(initial_pop,growth_rate):
    final = initial_pop*math.e**(growth_rate*35)
    return(final)

facts['pop2050'] = facts['population','population_growth'].apply(final_pop,axis=1)

When I run the above code, I get an error. Am I not using the 'apply' function correctly?

Answer 1

Accepted answer by Boud

With axis=1, apply passes the entire row to your function as a single argument. Adjust the function like this, assuming your two columns are called initial_pop and growth_rate:

import math

def final_pop(row):
    return row.initial_pop * math.e**(row.growth_rate * 35)
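As a minimal sketch of how the row-based function is wired into apply, the example below uses the column names from the question (population and population_growth) rather than initial_pop and growth_rate. The small DataFrame is hypothetical sample data standing in for the factbook table, and treating growth rates as fractions per year is an assumption.

```python
import math

import pandas as pd

# Hypothetical sample data standing in for the factbook table.
# Assumption: growth rates are fractions per year (0.01 means 1%).
facts = pd.DataFrame({
    "population": [1000, 2000],
    "population_growth": [0.01, 0.0],
})

def final_pop(row):
    # With axis=1, `row` is a Series indexed by column name.
    return row["population"] * math.e ** (row["population_growth"] * 35)

# apply calls final_pop once per row and collects the results in a Series.
facts["pop2050"] = facts.apply(final_pop, axis=1)
print(facts)
```

Note that apply is called on the DataFrame itself, so the function sees every column of the row and picks out the ones it needs.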

Answer 2

Answered by Karnage

You were almost there:

facts['pop2050'] = facts.apply(lambda row: final_pop(row['population'],row['population_growth']),axis=1)

Using lambda allows you to keep the specific (interesting) parameters listed in your function, rather than bundling them in a 'row'.
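For context, two things appear to go wrong in the original call: facts['population','population_growth'] looks up a single column labelled by a tuple instead of selecting two columns, and apply with axis=1 hands the function one row rather than two separate values. The sketch below shows the lambda form on hypothetical sample data, along with a variant that selects the two columns with a list and unpacks each row. It also imports math, which the question's snippet uses without importing.

```python
import math

import pandas as pd

# Hypothetical sample data standing in for the factbook table.
facts = pd.DataFrame({
    "population": [1000, 2000],
    "population_growth": [0.01, 0.0],
    "area_land": [0, 0],
})

def final_pop(initial_pop, growth_rate):
    return initial_pop * math.e ** (growth_rate * 35)

# The lambda receives the whole row and passes on only the two values needed.
via_lambda = facts.apply(
    lambda row: final_pop(row["population"], row["population_growth"]),
    axis=1,
)

# Variant: select the two columns with a list (double brackets), then unpack
# each row's values positionally into the two-argument function.
via_unpack = facts[["population", "population_growth"]].apply(
    lambda row: final_pop(*row),
    axis=1,
)

facts["pop2050"] = via_lambda
print(facts)
```

The unpacking variant depends on the column order in the selection list matching the parameter order of final_pop, so the explicit lambda is the less fragile of the two.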

Answer 3

Answered by Mr. Duhart

You can achieve the same result without the need for DataFrame.apply(). Pandas series (or dataframe columns) can be used as direct arguments for NumPy functions and even built-in Python operators, which are applied element-wise. In your case, it is as simple as the following:

import numpy as np

facts['pop2050'] = facts['population'] * np.exp(35 * facts['population_growth'])

This multiplies each element in the column population_growth by 35, applies numpy's exp() function to that new column (35 * population_growth), and then multiplies the result by population, element by element.
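To check that the vectorized expression gives the same numbers as a row-wise apply, a small comparison can be sketched as follows. The DataFrame is hypothetical sample data, and the fractional growth rates are an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the factbook table.
facts = pd.DataFrame({
    "population": [1000, 2000],
    "population_growth": [0.01, 0.0],
})

# Vectorized: the arithmetic runs over whole columns at once.
vectorized = facts["population"] * np.exp(35 * facts["population_growth"])

# Row-wise equivalent, shown only for comparison.
row_wise = facts.apply(
    lambda row: row["population"] * np.exp(35 * row["population_growth"]),
    axis=1,
)

# Both approaches should agree up to floating-point tolerance.
assert np.allclose(vectorized, row_wise)

facts["pop2050"] = vectorized
print(facts)
```

On large frames the vectorized form is generally much faster, because it avoids calling a Python function once per row.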

Answer 4

Answered by syed irfan

Define your function:

def function(x):
    # your operation
    return x

Then apply it to the column:

df['column'] = df['column'].apply(function)
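Note that this pattern uses Series.apply, so the function receives one value from a single column at a time rather than a whole row. It fits transformations of one column, not calculations that combine two columns. A minimal sketch, where the sample data and the percent_to_fraction helper are hypothetical:

```python
import pandas as pd

# Hypothetical sample data: growth rates stored as percentages.
df = pd.DataFrame({"population_growth": [1.0, 2.5, 0.0]})

def percent_to_fraction(x):
    # x is a single scalar value from the column, not a row.
    return x / 100

# Series.apply calls the function once per element of the column.
df["growth_fraction"] = df["population_growth"].apply(percent_to_fraction)
print(df)
```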
