Apply function on each column in a pandas dataframe
Original question: http://stackoverflow.com/questions/38848411/
Note: this content comes from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Night Walker
How can I write the following function in a more pandas-like way?
import numpy as np

def calculate_df_columns_mean(self, df):
    means = {}
    # Loop over every column name, strip outliers, then average what is left
    for column in df.columns.tolist():
        cleaned_data = self.remove_outliers(df[column].tolist())
        means[column] = np.mean(cleaned_data)
    return means
Thanks for your help.
Accepted answer by EdChum
It seems to me that the iteration over the columns is unnecessary:
def calculate_df_columns_mean(self, df):
    # Pass the whole DataFrame at once; no per-column loop is needed
    cleaned_data = self.remove_outliers(df)
    return cleaned_data.mean()
The above should be enough, assuming that remove_outliers still returns a df.
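As a rough illustration of the whole-DataFrame idea, here is a minimal sketch. The `remove_outliers` below is a hypothetical stand-in (the real one is not shown in the question): it assumes an outlier is any value more than `k` sample standard deviations from its column mean, and it returns a DataFrame with those values replaced by NaN so that `mean()` skips them.

```python
import pandas as pd

def remove_outliers(df, k=2.0):
    # Hypothetical helper: the cutoff rule and the name k are assumptions.
    # Values further than k standard deviations from the column mean become NaN.
    z = (df - df.mean()) / df.std()
    return df.where(z.abs() <= k)

def calculate_df_columns_mean(df):
    # DataFrame.mean() works column-wise and ignores NaN by default
    return remove_outliers(df).mean()

df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 2.0, 1.0, 3.0, 2.0, 100.0],  # 100.0 is the outlier
    "b": [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 10.0, 12.0],
})
means = calculate_df_columns_mean(df)
print(means)
```

The result is a Series indexed by column name; `means.to_dict()` gives the same shape as the dictionary built in the question.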
EDIT
I think the following should work:
def calculate_df_columns_mean(self, df):
    # Assumes remove_outliers returns something with a .mean() method, e.g. a NumPy array
    return df.apply(lambda x: self.remove_outliers(x.tolist()).mean())
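To try the per-column `apply` approach end to end, something like the following should work. It is only a sketch: `remove_outliers` here is a hypothetical free function (not the asker's method) that takes a list and returns a NumPy array, keeping values within `k` population standard deviations of the mean.

```python
import numpy as np
import pandas as pd

def remove_outliers(values, k=2.0):
    # Hypothetical stand-in: the filtering rule and the name k are assumptions.
    arr = np.asarray(values, dtype=float)
    keep = np.abs(arr - arr.mean()) <= k * arr.std()
    return arr[keep]

def calculate_df_columns_mean(df):
    # apply() passes each column as a Series; the lambda returns one scalar per column
    return df.apply(lambda col: remove_outliers(col.tolist()).mean())

df = pd.DataFrame({
    "x": [5, 6, 5, 7, 6, 5, 6, 60],  # 60 is the outlier
    "y": [1, 2, 3, 4, 5, 6, 7, 8],
})
means = calculate_df_columns_mean(df)
print(means.to_dict())
```

Because the function returns a scalar for each column, `apply` collapses the result into a Series keyed by column name.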
Answer by Nick Bull
Use dataFrame.apply(func, axis=0):
# axis=0 means apply to columns; axis=1 to rows
df.apply(numpy.sum, axis=0) # equiv to df.sum(0)
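A small self-contained check of the `axis` behaviour, using made-up data, might look like this:

```python
import numpy as np
import pandas as pd

# Illustrative data only
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0: the function receives each column in turn, giving one result per column
col_sums = df.apply(np.sum, axis=0)

# axis=1: the function receives each row in turn, giving one result per row
row_sums = df.apply(np.sum, axis=1)

print(col_sums.tolist())  # [6, 60]
print(row_sums.tolist())  # [11, 22, 33]
```

For simple reductions like this one, the built-in `df.sum(axis=0)` gives the same values and is usually the more direct choice.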