pandas: standard deviation of a column over a subset of rows

Question

Asked by Thomas

I'm new to working with Python and pandas. I'm currently trying to create a report that extracts data from an SQL database and uses that data in a pandas DataFrame. Each row holds a server name and the date of the sample, followed by one column per sampled metric.

I have been able to filter by hostname using df[df['hostname'] == uniquehost], where df is the DataFrame and uniquehost holds each unique host name in turn.

What I am trying to do next is obtain the standard deviation of the other columns, but I haven't been able to figure this part out. I attempted to use df[df['hostname'] == uniquehost].std()

However, this wasn't correct.

Can anyone point me in the right direction to get this figured out? I suspect I'm barking up the wrong tree and there's likely a very easy way to handle this that I haven't encountered yet.

Hostname | Sample Date | CPU Peak | Memory Peak
server1  | 08/08/17    | 67.32    | 34.83
server1  | 08/09/17    | 34       | 62

Answer 1

Accepted answer by cs95

IIUC, you'll want to first do df.groupby on Hostname and then find the standard deviation. Something like this:

In [118]: df.groupby('Hostname')[['CPU Peak', 'Memory Peak']].std()
Out[118]: 
           CPU Peak  Memory Peak
Hostname                        
server1   23.560798    19.212091
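
For anyone who wants to reproduce this end to end, here is a minimal sketch. It assumes the DataFrame looks like the two sample rows in the question; the construction of df, the metric_cols list, and the variable names per_host_std and single_host_std are illustrative, not part of the original post. It also shows the boolean-mask approach from the question, restricted to the numeric columns, which should give the same numbers for a single host.

```python
import pandas as pd

# Sample rows copied from the question; building the frame by hand is an
# assumption made only so the example is self-contained.
df = pd.DataFrame({
    "Hostname": ["server1", "server1"],
    "Sample Date": ["08/08/17", "08/09/17"],
    "CPU Peak": [67.32, 34.0],
    "Memory Peak": [34.83, 62.0],
})

# Columns to aggregate (hypothetical helper list for this sketch).
metric_cols = ["CPU Peak", "Memory Peak"]

# Per-host sample standard deviation (pandas default, ddof=1) in one call.
per_host_std = df.groupby("Hostname")[metric_cols].std()
print(per_host_std)

# Equivalent for one host: filter rows with a boolean mask, keep only the
# numeric columns, then call .std() on the result.
uniquehost = "server1"
single_host_std = df.loc[df["Hostname"] == uniquehost, metric_cols].std()
print(single_host_std)
```

The groupby version returns one row per host, so it avoids looping over unique host names. Note that the column labels must match the DataFrame exactly, including case ('Hostname' here versus 'hostname' in the question's filter).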
