KDB+ like asof join for timeseries data in pandas?

Question

Asked by signalseeker

kdb+ has an aj function that is usually used to join tables along time columns.

Here is an example where I have trade and quote tables and I get the prevailing quote for every trade.

q)5# t
time         sym  price size 
-----------------------------
09:30:00.439 NVDA 13.42 60511
09:30:00.439 NVDA 13.42 60511
09:30:02.332 NVDA 13.42 100  
09:30:02.332 NVDA 13.42 100  
09:30:02.333 NVDA 13.41 100  

q)5# q
time         sym  bid   ask   bsize asize
-----------------------------------------
09:30:00.026 NVDA 13.34 13.44 3     16   
09:30:00.043 NVDA 13.34 13.44 3     17   
09:30:00.121 NVDA 13.36 13.65 1     10   
09:30:00.386 NVDA 13.36 13.52 21    1    
09:30:00.440 NVDA 13.4  13.44 15    17

q)5# aj[`time; t; q]
time         sym  price size  bid   ask   bsize asize
-----------------------------------------------------
09:30:00.439 NVDA 13.42 60511 13.36 13.52 21    1    
09:30:00.439 NVDA 13.42 60511 13.36 13.52 21    1    
09:30:02.332 NVDA 13.42 100   13.34 13.61 1     1    
09:30:02.332 NVDA 13.42 100   13.34 13.61 1     1    
09:30:02.333 NVDA 13.41 100   13.34 13.51 1     1

How can I do the same operation using pandas? I am working with trade and quote dataframes where the index is datetime64.

In [55]: quotes.head()
Out[55]: 
                              bid    ask  bsize  asize
2012-09-06 09:30:00.026000  13.34  13.44      3     16
2012-09-06 09:30:00.043000  13.34  13.44      3     17
2012-09-06 09:30:00.121000  13.36  13.65      1     10
2012-09-06 09:30:00.386000  13.36  13.52     21      1
2012-09-06 09:30:00.440000  13.40  13.44     15     17

In [56]: trades.head()
Out[56]: 
                            price   size
2012-09-06 09:30:00.439000  13.42  60511
2012-09-06 09:30:00.439000  13.42  60511
2012-09-06 09:30:02.332000  13.42    100
2012-09-06 09:30:02.332000  13.42    100
2012-09-06 09:30:02.333000  13.41    100

I see that pandas has an asof function, but it is defined only on the Series object, not on the DataFrame. I guess one could loop through each of the Series and align them one by one, but I am wondering if there is a better way?

Answer 1

Accepted answer by Chang She

As you mentioned in the question, looping through each column should work for you:

df1.apply(lambda x: x.asof(df2.index))

We could potentially create a faster NaN-naive version of DataFrame.asof to do all the columns in one shot. But for now, I think this is the most straightforward way.
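
A minimal sketch of the per-column approach, using a few rows modeled on the question's data. The frame names `quotes` and `trades` are taken from the question; the trade timestamps are kept unique here, which is an assumption of this sketch so that the final join is one-to-one. `Series.asof` requires a sorted index.

```python
import pandas as pd

# Quotes indexed by time (rows modeled on the question's sample).
quotes = pd.DataFrame(
    {"bid": [13.34, 13.36, 13.36, 13.40], "ask": [13.44, 13.65, 13.52, 13.44]},
    index=pd.to_datetime([
        "2012-09-06 09:30:00.026",
        "2012-09-06 09:30:00.121",
        "2012-09-06 09:30:00.386",
        "2012-09-06 09:30:00.440",
    ]),
)

# Trades indexed by time; timestamps are unique (assumption of this sketch).
trades = pd.DataFrame(
    {"price": [13.42, 13.42, 13.41], "size": [60511, 100, 100]},
    index=pd.to_datetime([
        "2012-09-06 09:30:00.439",
        "2012-09-06 09:30:02.332",
        "2012-09-06 09:30:02.333",
    ]),
)

# For each quote column, look up the last value at or before each trade time.
prevailing = quotes.apply(lambda col: col.asof(trades.index))

# Attach the prevailing quote to each trade.
result = trades.join(prevailing)
print(result)
```

With duplicate trade timestamps, as in the question's data, an index-based join would multiply rows, so the looked-up values would need to be attached positionally instead.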

Answer 2

Answered by Wes McKinney

I wrote an under-advertised ordered_merge function some time ago:

In [27]: quotes
Out[27]: 
                        time    bid    ask  bsize  asize
0 2012-09-06 09:30:00.026000  13.34  13.44      3     16
1 2012-09-06 09:30:00.043000  13.34  13.44      3     17
2 2012-09-06 09:30:00.121000  13.36  13.65      1     10
3 2012-09-06 09:30:00.386000  13.36  13.52     21      1
4 2012-09-06 09:30:00.440000  13.40  13.44     15     17

In [28]: trades
Out[28]: 
                        time  price   size
0 2012-09-06 09:30:00.439000  13.42  60511
1 2012-09-06 09:30:00.439000  13.42  60511
2 2012-09-06 09:30:02.332000  13.42    100
3 2012-09-06 09:30:02.332000  13.42    100
4 2012-09-06 09:30:02.333000  13.41    100

In [29]: ordered_merge(quotes, trades)
Out[29]: 
                        time    bid    ask  bsize  asize  price   size
0 2012-09-06 09:30:00.026000  13.34  13.44      3     16    NaN    NaN
1 2012-09-06 09:30:00.043000  13.34  13.44      3     17    NaN    NaN
2 2012-09-06 09:30:00.121000  13.36  13.65      1     10    NaN    NaN
3 2012-09-06 09:30:00.386000  13.36  13.52     21      1    NaN    NaN
4 2012-09-06 09:30:00.439000    NaN    NaN    NaN    NaN  13.42  60511
5 2012-09-06 09:30:00.439000    NaN    NaN    NaN    NaN  13.42  60511
6 2012-09-06 09:30:00.440000  13.40  13.44     15     17    NaN    NaN
7 2012-09-06 09:30:02.332000    NaN    NaN    NaN    NaN  13.42    100
8 2012-09-06 09:30:02.332000    NaN    NaN    NaN    NaN  13.42    100
9 2012-09-06 09:30:02.333000    NaN    NaN    NaN    NaN  13.41    100

In [32]: ordered_merge(quotes, trades, fill_method='ffill')
Out[32]: 
                        time    bid    ask  bsize  asize  price   size
0 2012-09-06 09:30:00.026000  13.34  13.44      3     16    NaN    NaN
1 2012-09-06 09:30:00.043000  13.34  13.44      3     17    NaN    NaN
2 2012-09-06 09:30:00.121000  13.36  13.65      1     10    NaN    NaN
3 2012-09-06 09:30:00.386000  13.36  13.52     21      1    NaN    NaN
4 2012-09-06 09:30:00.439000  13.36  13.52     21      1  13.42  60511
5 2012-09-06 09:30:00.439000  13.36  13.52     21      1  13.42  60511
6 2012-09-06 09:30:00.440000  13.40  13.44     15     17  13.42  60511
7 2012-09-06 09:30:02.332000  13.40  13.44     15     17  13.42    100
8 2012-09-06 09:30:02.332000  13.40  13.44     15     17  13.42    100
9 2012-09-06 09:30:02.333000  13.40  13.44     15     17  13.41    100

It could be easily (well, for someone who is familiar with the code) extended to be a "left join" mimicking KDB. I realize in this case that forward-filling the trade data is not appropriate; just illustrating the function.
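
One way to approximate that "left join" behaviour is sketched below. It is not the extension described above, only a workaround: it uses `pd.merge_ordered` (the name under which this function is available in later pandas releases), forward-fills only the quote columns, and then keeps only the trade rows. The `quote_cols` list and the sample rows are assumptions for illustration, and the sketch assumes no timestamp is duplicated on both sides.

```python
import pandas as pd

quotes = pd.DataFrame({
    "time": pd.to_datetime([
        "2012-09-06 09:30:00.121",
        "2012-09-06 09:30:00.386",
        "2012-09-06 09:30:00.440",
    ]),
    "bid": [13.36, 13.36, 13.40],
    "ask": [13.65, 13.52, 13.44],
})

trades = pd.DataFrame({
    "time": pd.to_datetime([
        "2012-09-06 09:30:00.439",
        "2012-09-06 09:30:00.439",
        "2012-09-06 09:30:02.332",
    ]),
    "price": [13.42, 13.42, 13.42],
    "size": [60511, 60511, 100],
})

# Interleave both tables in time order (outer join on the time key).
merged = pd.merge_ordered(quotes, trades, on="time")

# Forward-fill only the quote columns, so trade data is never carried forward.
quote_cols = ["bid", "ask"]
merged[quote_cols] = merged[quote_cols].ffill()

# Keep only the rows that came from the trades table.
result = merged[merged["price"].notna()].reset_index(drop=True)
print(result)
```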

Answer 3

Answered by chrisaycock

pandas 0.19 has introduced an asof join:

pd.merge_asof(trades, quotes, on='time')

The semantics are very similar to the functionality in q/kdb+.
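
As a rough illustration, here is a self-contained sketch built from the first five rows of each table in the question. Both frames must be sorted by the `on` key. Because only five quotes are included, the later trades pick up the 09:30:00.440 quote rather than the values shown in the kdb+ output, which came from the full quote table.

```python
import pandas as pd

quotes = pd.DataFrame({
    "time": pd.to_datetime([
        "2012-09-06 09:30:00.026",
        "2012-09-06 09:30:00.043",
        "2012-09-06 09:30:00.121",
        "2012-09-06 09:30:00.386",
        "2012-09-06 09:30:00.440",
    ]),
    "bid": [13.34, 13.34, 13.36, 13.36, 13.40],
    "ask": [13.44, 13.44, 13.65, 13.52, 13.44],
    "bsize": [3, 3, 1, 21, 15],
    "asize": [16, 17, 10, 1, 17],
})

trades = pd.DataFrame({
    "time": pd.to_datetime([
        "2012-09-06 09:30:00.439",
        "2012-09-06 09:30:00.439",
        "2012-09-06 09:30:02.332",
        "2012-09-06 09:30:02.332",
        "2012-09-06 09:30:02.333",
    ]),
    "price": [13.42, 13.42, 13.42, 13.42, 13.41],
    "size": [60511, 60511, 100, 100, 100],
})

# Each trade is matched with the last quote at or before its timestamp
# (the default direction is backward); every trade row is kept.
result = pd.merge_asof(trades, quotes, on="time")
print(result)
```

When the timestamps live in the index, as in the question, `left_index=True, right_index=True` can be passed instead of `on`. With several symbols in one table, the `by` parameter restricts matches to rows with the same symbol.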
