pandas: finding target values in a pandas DataFrame

Question

Asked by Nic

I have a multi-level DataFrame df. The columns are the different "objects" I analyze. The row index has two levels: a case ID lc and a time t.

For each case lc, I need to find the time t (ideally interpolated, but the closest value is good enough) at which each object reached a target value.

This target value is a function of the given object's value at time t == 0.

import pandas as pd
print(pd.__version__)

0.16.2

Dummy data set example:

data = {1: {(1014, 0.0): 20.25,
     (1014, 0.0991): 19.08,
     (1014, 0.1991): 18.43,
     (1014, 0.2991): 19.03,
     (1014, 0.3991): 18.71,
     (1015, 0.0): 20.22,
     (1015, 0.0991): 19.3,
     (1015, 0.1991): 18.68,
     (1015, 0.2991): 18.22,
     (1015, 0.3991): 17.84,
     (1016, 0.0): 21.75,
     (1016, 0.0991): 19.97,
     (1016, 0.1991): 19.65,
     (1016, 0.2991): 19.29,
     (1016, 0.3991): 18.94
    },
 2: {(1014, 0.0): 29.11,
     (1014, 0.0991): 28.68,
     (1014, 0.1991): 28.27,
     (1014, 0.2991): 27.46,
     (1014, 0.3991): 26.96,
     (1015, 0.0): 29.22,
     (1015, 0.0991): 28.64,
     (1015, 0.1991): 28.18,
     (1015, 0.2991): 27.74,
     (1015, 0.3991): 27.25,
     (1016, 0.0): 29.17,
     (1016, 0.0991): 28.68,
     (1016, 0.1991): 28.17,
     (1016, 0.2991): 27.68,
     (1016, 0.3991): 27.18
    },
 3: {(1014, 0.0): 22.01,
     (1014, 0.0991): 21.5,
     (1014, 0.1991): 21.18,
     (1014, 0.2991): 20.58,
     (1014, 0.3991): 20.21,
     (1015, 0.0): 21.81,
     (1015, 0.0991): 21.46,
     (1015, 0.1991): 21.11,
     (1015, 0.2991): 20.78,
     (1015, 0.3991): 20.42,
     (1016, 0.0): 21.82,
     (1016, 0.0991): 21.49,
     (1016, 0.1991): 21.11,
     (1016, 0.2991): 20.75,
     (1016, 0.3991): 20.37
    }}

df = pd.DataFrame(data).sort_index()
df.index.names=['case', 't']

The DataFrame then looks like this:

                 1      2      3
case t                          
1014 0.0000  20.25  29.11  22.01
     0.0991  19.08  28.68  21.50
     0.1991  18.43  28.27  21.18
     0.2991  19.03  27.46  20.58
     0.3991  18.71  26.96  20.21
1015 0.0000  20.22  29.22  21.81
     0.0991  19.30  28.64  21.46
     0.1991  18.68  28.18  21.11
     0.2991  18.22  27.74  20.78
     0.3991  17.84  27.25  20.42
1016 0.0000  21.75  29.17  21.82
     0.0991  19.97  28.68  21.49
     0.1991  19.65  28.17  21.11
     0.2991  19.29  27.68  20.75
     0.3991  18.94  27.18  20.37

Target values are a function of the values at time t == 0: the value at t == 0 multiplied by a factor k. Typically, this would be k = 0.5 for the half-time period. For the current sample, we will take k = 0.926.

Since the rows are sorted, it is fine to take the first row of each case.

targets = df.groupby(level='case').first() * 0.926
print(targets)

             1         2         3
case                              
1014  18.75150  26.95586  20.38126
1015  18.72372  27.05772  20.19606
1016  20.14050  27.01142  20.20532
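As a minimal sketch of the same step that does not depend on row order, the t == 0 row of each case can be selected explicitly with xs. The frame sample and the name targets_at_zero are illustrative only and are not part of the original post; the values are a subset of the dummy data above.

```python
import pandas as pd

# Small illustrative frame with the same (case, t) index layout as df
idx = pd.MultiIndex.from_tuples(
    [(1014, 0.0), (1014, 0.0991), (1015, 0.0), (1015, 0.0991)],
    names=["case", "t"],
)
sample = pd.DataFrame(
    {1: [20.25, 19.08, 20.22, 19.30],
     2: [29.11, 28.68, 29.22, 28.64]},
    index=idx,
)

k = 0.926

# Cross-section at t == 0: one row per case, indexed by 'case'
targets_at_zero = sample.xs(0.0, level="t") * k
print(targets_at_zero)
```

This only works if every case really has a row at exactly t == 0.0; otherwise the first-row approach above is the safer choice.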

Now, how could I simply build the following DataFrame, which shows the time t at which each object reaches the target value calculated above?

             1         2         3
case                              
1014    0.3991    0.3991    0.2991
1015    0.1991    0.3991    0.3991
1016    0.0991    0.3991    0.3991
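For the interpolated variant mentioned above, one possible sketch is to take the first sample at or below the target and interpolate linearly between it and the previous sample. The helper first_crossing_time is hypothetical (it is not a pandas function), and it assumes the series starts above the target and decays toward it.

```python
import numpy as np

def first_crossing_time(t, y, target):
    """Linearly interpolated time at which y first falls to target (NaN if it never does)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    below = np.nonzero(y <= target)[0]
    if below.size == 0:
        return np.nan          # target never reached within the samples
    i = below[0]
    if i == 0:
        return t[0]            # already at or below the target at the first sample
    # Interpolate between the last sample above the target and the first at or below it
    frac = (y[i - 1] - target) / (y[i - 1] - y[i])
    return t[i - 1] + frac * (t[i] - t[i - 1])

# Case 1014, object 1 from the dummy data
t = [0.0, 0.0991, 0.1991, 0.2991, 0.3991]
y = [20.25, 19.08, 18.43, 19.03, 18.71]
target = 0.926 * y[0]
t_hit = first_crossing_time(t, y, target)
print(round(t_hit, 4))
```

For this series the first crossing lies between t = 0.0991 and t = 0.1991 (about 0.15), which is not the same as the closest-value time, because the series is not monotonic. Which definition is wanted depends on the analysis.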

Answer 1

Accepted answer by CT Zhu

This is somewhat of a hack; let's see if there are better solutions:

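import numpy as np

# Add a zero 't' column so that the subtraction below leaves the time column unchanged
targets['t'] = 0

# Difference between every value and the target of its case, aligned on 'case'
df2 = df.reset_index().set_index('case') - targets

# Per case and per column, flag the row whose absolute difference is smallest
df3 = df2.groupby(df2.index).transform(lambda x: x.abs() == np.min(x.abs()))

# Pick the time t at each flagged row
df4 = pd.DataFrame({'1': df2.t[df3[1]],
                    '2': df2.t[df3[2]],
                    '3': df2.t[df3[3]]})

print(df4)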

           1       2       3
case                        
1014  0.3991  0.3991  0.3991
1015  0.1991  0.3991  0.3991
1016  0.0991  0.3991  0.3991
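A possible alternative to the approach above, sketched for recent pandas versions (droplevel is not available in 0.16.2): compute the absolute distance to the per-case target and take idxmin within each case. The names target_rows, dist and closest are illustrative only.

```python
import pandas as pd

cases = [1014, 1015, 1016]
times = [0.0, 0.0991, 0.1991, 0.2991, 0.3991]
idx = pd.MultiIndex.from_product([cases, times], names=["case", "t"])
df = pd.DataFrame(
    {1: [20.25, 19.08, 18.43, 19.03, 18.71,
         20.22, 19.30, 18.68, 18.22, 17.84,
         21.75, 19.97, 19.65, 19.29, 18.94],
     2: [29.11, 28.68, 28.27, 27.46, 26.96,
         29.22, 28.64, 28.18, 27.74, 27.25,
         29.17, 28.68, 28.17, 27.68, 27.18],
     3: [22.01, 21.50, 21.18, 20.58, 20.21,
         21.81, 21.46, 21.11, 20.78, 20.42,
         21.82, 21.49, 21.11, 20.75, 20.37]},
    index=idx,
)

k = 0.926

# Target of each case, repeated on every row of that case (same index as df)
target_rows = df.groupby(level="case").transform("first") * k

# Absolute distance of every sample to its target
dist = (df - target_rows).abs()

# Within each case, the t label of the smallest distance for every column
closest = dist.groupby(level="case").apply(
    lambda g: g.droplevel("case").idxmin()
)
print(closest)
```

If two samples are equally close to the target, idxmin keeps the earlier one, whereas the boolean-mask approach would return both rows.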
