Python pandas: mean calculation excluding zeros

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/33217636/

Date: 2020-08-19 13:02:01  Source: igfitidea

mean calculation in pandas excluding zeros

python, pandas

Asked by Gabriel

Is there a direct way to calculate the mean of a DataFrame column in pandas without taking into account entries whose value is zero, such as a parameter to the .mean() function? I am currently doing it like this:

x = df[df['A'] != 0]  # keep only the rows where column 'A' is nonzero
x.mean()

Accepted answer by tibi3000

It also depends on the meaning of 0 in your data.

  • If these are genuine '0' values, then your approach is fine.
  • If '0' is a placeholder for a value that was not measured (i.e. 'NaN'), then it might make more sense to replace all '0' occurrences with 'NaN' first. The mean calculation then excludes NaN values by default.

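    import numpy as np
    import pandas as pd

    df = pd.DataFrame([1, 0, 2, 3, 0], columns=['a'])
    df = df.replace(0, np.nan)  # treat 0 as missing; np.NaN was removed in NumPy 2.0
    df.mean()  # NaN is skipped by default (skipna=True)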
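
The two approaches can give different results once the frame has more than one column. The following is a minimal sketch, assuming a hypothetical two-column frame (the column names and values are made up for illustration): filtering rows on one column also drops those rows from every other column, whereas masking zeros treats each column independently and leaves the original frame unchanged.

```python
import pandas as pd

# Hypothetical two-column frame; names and values are illustrative only.
df = pd.DataFrame({"a": [1, 0, 2, 3, 0], "b": [4, 5, 0, 6, 7]})

# Row filter: keeps rows where "a" is nonzero, so "b" loses those rows too
# and still counts its own zero.
row_filtered = df[df["a"] != 0].mean()

# Per-column mask: zeros become NaN in each column independently,
# and mean() skips NaN by default (skipna=True).
masked = df.mask(df == 0).mean()

print(row_filtered)
print(masked)
```

Which one is appropriate depends, as above, on what a zero means in the data.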
    