pandas: mean of all the columns of a DataFrame?

Question

Asked by Michael

I'm trying to calculate the mean of all the columns of a DataFrame, but it looks like having a value in column B of row 6 prevents the mean of column C from being calculated. Why?

import pandas as pd
from decimal import Decimal
d = [
    {'A': 2, 'B': None, 'C': Decimal('628.00')},
    {'A': 1, 'B': None, 'C': Decimal('383.00')},
    {'A': 3, 'B': None, 'C': Decimal('651.00')},
    {'A': 2, 'B': None, 'C': Decimal('575.00')},
    {'A': 4, 'B': None, 'C': Decimal('1114.00')},
    {'A': 1, 'B': 'TEST', 'C': Decimal('241.00')},
    {'A': 2, 'B': None, 'C': Decimal('572.00')},
    {'A': 4, 'B': None, 'C': Decimal('609.00')},
    {'A': 3, 'B': None, 'C': Decimal('820.00')},
    {'A': 5, 'B': None, 'C': Decimal('1223.00')}
]

df = pd.DataFrame(d)

In : df
Out:
   A     B        C
0  2  None   628.00
1  1  None   383.00
2  3  None   651.00
3  2  None   575.00
4  4  None  1114.00
5  1  TEST   241.00
6  2  None   572.00
7  4  None   609.00
8  3  None   820.00
9  5  None  1223.00

Tests:

# no mean for C column
In : df.mean()
Out:
A    2.7
dtype: float64

# mean for C column when row 6 is left out of the DF
In : df.head(5).mean()
Out:
A      2.4
B      NaN
C    670.2
dtype: float64

# no mean for C column when row 6 is part of the DF
In : df.head(6).mean()
Out:
A    2.166667
dtype: float64

dtypes:

In : df.dtypes
Out:
A     int64
B    object
C    object
dtype: object

In : df.head(5).dtypes
Out:
A     int64
B    object
C    object
dtype: object

Answer 1

Accepted answer by Anton Protopopov

You can select specific columns if you only need the ones that hold numbers:

In [90]: df[['A','C']].mean()
Out[90]: 
A      2.7
C    681.6
dtype: float64

or change the column's type, as @jezrael advises in the comments:

df['C'] = df['C'].astype(float)
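
As a minimal sketch of that conversion on the question's data: once C is cast to float, it has a numeric dtype and can be averaged alongside A. Passing `numeric_only=True` is an addition here, not part of the original answer; it makes the exclusion of the object column B explicit rather than relying on version-dependent defaults.

```python
import pandas as pd
from decimal import Decimal

# Same data as in the question, written column by column.
amounts = ['628.00', '383.00', '651.00', '575.00', '1114.00',
           '241.00', '572.00', '609.00', '820.00', '1223.00']
df = pd.DataFrame({
    'A': [2, 1, 3, 2, 4, 1, 2, 4, 3, 5],
    'B': [None] * 5 + ['TEST'] + [None] * 4,
    'C': [Decimal(a) for a in amounts],
})

# Decimal objects are stored with dtype object; cast them to float64.
df['C'] = df['C'].astype(float)

# Average only the columns with a numeric dtype (A and C); B is skipped.
means = df.mean(numeric_only=True)
print(means)
```

Note that casting to float gives up Decimal's exact arithmetic, which may or may not matter for the data at hand.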

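Probably df.mean tries to convert every object column to numeric and, if that fails, falls back to calculating the mean only for the columns that are actually numeric. That would explain why df.head(5), where B holds only None, still yields a mean for C, while any slice that includes 'TEST' does not.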
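
The convert-or-skip idea can be imitated explicitly. The sketch below is not how pandas implements `mean`; `mean_of_convertible_columns` is a hypothetical helper written for illustration. It tries to cast each column to float on its own, so a string in B no longer affects C.

```python
import pandas as pd
from decimal import Decimal


def mean_of_convertible_columns(frame):
    """Hypothetical helper: average each column that can be cast to float."""
    means = {}
    for name in frame.columns:
        try:
            means[name] = frame[name].astype(float).mean()
        except (TypeError, ValueError):
            # The column holds a value that cannot be cast to float
            # (here the string 'TEST'), so leave it out of the result.
            continue
    return pd.Series(means)


df = pd.DataFrame({
    'A': [2, 1, 3],
    'B': [None, 'TEST', None],
    'C': [Decimal('628.00'), Decimal('241.00'), Decimal('651.00')],
})

result = mean_of_convertible_columns(df)
print(result)
```

Handling columns one at a time is a design choice of this sketch: a failure in one column stays local instead of changing the outcome for the others.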
