Apply a function to each column in a Pandas DataFrame

Question

Asked by Night Walker

How can I write the following function in a more idiomatic pandas way?

    def calculate_df_columns_mean(self, df):
        means = {}
        for column in df.columns.tolist():
            cleaned_data = self.remove_outliers(df[column].tolist())
            means[column] = np.mean(cleaned_data)
        return means

Thanks for the help.

Answer 1

Accepted answer by EdChum

It seems to me that iterating over the columns is unnecessary:

def calculate_df_columns_mean(self, df):
    # Pass the whole DataFrame; assumes remove_outliers returns a DataFrame
    cleaned_data = self.remove_outliers(df)
    return cleaned_data.mean()

The above should be enough, assuming that remove_outliers still returns a DataFrame.
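
To make that assumption concrete, here is a minimal sketch of a DataFrame-level version. The question does not show `remove_outliers`, so the function below is a hypothetical stand-in that masks values outside 1.5 times the interquartile range; the `k` parameter and the sample data are likewise made up for illustration.

```python
import pandas as pd


def remove_outliers(df: pd.DataFrame, k: float = 1.5) -> pd.DataFrame:
    """Hypothetical stand-in: mask values outside k * IQR with NaN, per column."""
    q1 = df.quantile(0.25)
    q3 = df.quantile(0.75)
    iqr = q3 - q1
    # Comparing a DataFrame with a Series aligns the Series index on the columns
    in_range = (df >= q1 - k * iqr) & (df <= q3 + k * iqr)
    return df.where(in_range)  # out-of-range cells become NaN


def calculate_df_columns_mean(df: pd.DataFrame) -> pd.Series:
    # DataFrame.mean() skips NaN by default, so masked outliers are ignored
    return remove_outliers(df).mean()


df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 100.0],
    "b": [10.0, 11.0, 12.0, 13.0, 14.0],
})
means = calculate_df_columns_mean(df)
```

Masking with NaN rather than dropping rows keeps the columns independent: an outlier in one column does not discard valid values in the others.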

EDIT

I think the following should work:

def calculate_df_columns_mean(self, df):
    # np.mean works whether remove_outliers returns a list or an array
    return df.apply(lambda x: np.mean(self.remove_outliers(x.tolist())))
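
As a runnable illustration of the column-wise `apply` approach, the sketch below uses a list-based `remove_outliers`, matching how the question calls it. Its body is an assumption (a simple interquartile-range filter), since the original implementation is not shown, and the function is written at module level rather than as a method to keep the example self-contained.

```python
import numpy as np
import pandas as pd


def remove_outliers(values, k=1.5):
    """Hypothetical stand-in: keep values within k * IQR of the quartiles."""
    q1, q3 = np.percentile(values, [25, 75])
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if low <= v <= high]


def calculate_df_columns_mean(df):
    # apply() passes each column as a Series; the result is a Series of means
    return df.apply(lambda col: np.mean(remove_outliers(col.tolist())))


df = pd.DataFrame({"a": [1, 2, 3, 4, 100], "b": [10, 11, 12, 13, 14]})
means = calculate_df_columns_mean(df)
means_dict = means.to_dict()  # same shape as the dict built in the question
```

Note that `apply` returns a Series indexed by column name; calling `to_dict()` recovers the dictionary the original loop produced.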

Answer 2

Answered by Nick Bull

Use DataFrame.apply(func, axis=0):

import numpy

# axis=0 applies func to each column; axis=1 applies it to each row
df.apply(numpy.sum, axis=0)  # equivalent to df.sum(0)
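
A small self-contained example may help show what the `axis` argument does. The DataFrame and the range lambda here are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

col_sums = df.apply(np.sum, axis=0)  # one value per column
row_sums = df.apply(np.sum, axis=1)  # one value per row

# A custom function works the same way: it receives each column as a Series
col_ranges = df.apply(lambda col: col.max() - col.min(), axis=0)
```

For built-in reductions such as sum or mean, calling the DataFrame method directly (for example `df.sum(axis=0)`) is usually simpler; `apply` is mainly useful when the per-column logic is custom.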
