Diff on pandas dataframe with more than one column
Note: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/19939896/
Asked by Fra
I have a pandas dataframe with two columns:
>>> ddf.head()
      a      b
0  3136  13280
1  3072  13312
2  3152  13296
3  3120  13248
4  3120  13200
I would like to calculate the difference between consecutive elements in the same column. If I do it one column at a time (ddf['a'].diff()), it works as I expect, but if I try ddf.diff() it raises:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-68-6ff864856571> in <module>()
----> 1 ddf.diff()
/home/app/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in diff(self, periods)
4285 diffed : DataFrame
4286 """
-> 4287 new_data = self._data.diff(periods)
4288 return self._constructor(new_data)
4289
/home/app/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in diff(self, *args, **kwargs)
1287
1288 def diff(self, *args, **kwargs):
-> 1289 return self.apply('diff', *args, **kwargs)
1290
1291 def interpolate(self, *args, **kwargs):
/home/app/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in apply(self, f, *args, **kwargs)
1267 applied = f(blk, *args, **kwargs)
1268 else:
-> 1269 applied = getattr(blk,f)(*args, **kwargs)
1270
1271 if isinstance(applied,list):
/home/app/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in diff(self, n)
423 def diff(self, n):
424 """ return block for the diff of the values """
--> 425 new_values = com.diff(self.values, n, axis=1)
426 return make_block(new_values, self.items, self.ref_items, fastpath=True)
427
/home/app/anaconda/lib/python2.7/site-packages/pandas/core/common.pyc in diff(arr, n, axis)
643 if arr.ndim == 2 and arr.dtype.name in _diff_special:
644 f = _diff_special[arr.dtype.name]
--> 645 f(arr, out_arr, n, axis)
646 else:
647 res_indexer = [slice(None)] * arr.ndim
/home/app/anaconda/lib/python2.7/site-packages/pandas/algos.so in pandas.algos.diff_2d_int16 (pandas/algos.c:91446)()
ValueError: Buffer dtype mismatch, expected 'float32_t' but got 'double'
Accepted answer by Roman Pekar
You can use this:
>>> df - df.shift(1)
    a   b
0 NaN NaN
1 -64  32
2  80 -16
3 -32 -48
4   0 -48
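A self-contained sketch of the same idea, for anyone who wants to run it: the frame below is rebuilt from the five rows shown in the question, and the name shifted_diff is only illustrative.

```python
import pandas as pd

# Sample data copied from the ddf.head() output in the question.
df = pd.DataFrame({
    "a": [3136, 3072, 3152, 3120, 3120],
    "b": [13280, 13312, 13296, 13248, 13200],
})

# shift(1) moves every column down by one row, so the subtraction
# gives row i minus row i-1 per column; the first row becomes NaN.
shifted_diff = df - df.shift(1)
print(shifted_diff)
```

Because shift(1) leaves a NaN in the first row, the result columns are floating point rather than integer.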
But actually, on my machine, df.diff() works fine:
>>> df.diff()
    a   b
0 NaN NaN
1 -64  32
2  80 -16
3 -32 -48
4   0 -48
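The traceback in the question ends in pandas.algos.diff_2d_int16, which suggests (but does not prove) that the asker's columns were stored as int16. Under that assumption, a possible workaround on an affected version is to widen the dtype before calling diff(), or to diff each column separately, which the question reports as working. This is a sketch only; whether the cast avoids the error on that old pandas build has not been verified here.

```python
import numpy as np
import pandas as pd

# Assumption: int16 columns, inferred from "diff_2d_int16" in the traceback.
ddf = pd.DataFrame(
    {
        "a": [3136, 3072, 3152, 3120, 3120],
        "b": [13280, 13312, 13296, 13248, 13200],
    },
    dtype=np.int16,
)

# Option 1: cast to a wider integer type first, then diff the whole frame.
widened_diff = ddf.astype("int64").diff()

# Option 2: diff one column at a time, as the question does with ddf['a'].diff().
per_column_diff = ddf.apply(lambda col: col.diff())

print(widened_diff)
print(per_column_diff)
```

On a current pandas release both options, and plain ddf.diff(), should give the same values as the output above.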

