Disclaimer: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/37982170/

Time: 2020-09-14 01:27:11  Source: igfitidea

Pandas reindex and fill missing values: "Index must be monotonic"

Tags: python, pandas, reindex

Asked by michael_j_ward

In answering this stackoverflow question, I found some interesting behavior when using a fill method while reindexing a dataframe.

This old bug report in pandas says that df.reindex(newIndex, method='ffill') should be equivalent to df.reindex(newIndex).ffill(), but that is NOT the behavior I'm witnessing.

Here's a code snippet that illustrates the behavior

import pandas as pd

# df deliberately has a non-monotonic index
df = pd.DataFrame({'values': 2}, index=pd.DatetimeIndex(['2016-06-02', '2016-05-04', '2016-06-03']))
newIndex = pd.DatetimeIndex(['2016-05-04', '2016-06-01', '2016-06-02', '2016-06-03', '2016-06-05'])
print(df.reindex(newIndex).ffill())  # works
print(df.reindex(newIndex, method='ffill'))  # raises ValueError

The first print statement works as expected. The second raises:

ValueError: index must be monotonic increasing or decreasing

What's going on here?

EDIT: Note that the sample df intentionally has a non-monotonic index. The question pertains to the order of operations in df.reindex(newIndex, method='ffill'). My expectation is that it should work as the bug report says: first reindex with the new index, then fill.

As you can see, newIndex.is_monotonic is True, and the fill works when called separately but fails when passed as a parameter to reindex.
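
A quick way to see which index trips the check is to inspect both indexes directly. The sketch below reuses the question's df and newIndex. It uses is_monotonic_increasing instead of is_monotonic, on the assumption that newer pandas releases no longer expose the latter. Treat it as an illustration of the symptom, not a statement about pandas internals.

```python
import pandas as pd

df = pd.DataFrame(
    {'values': 2},
    index=pd.DatetimeIndex(['2016-06-02', '2016-05-04', '2016-06-03']),
)
newIndex = pd.DatetimeIndex(
    ['2016-05-04', '2016-06-01', '2016-06-02', '2016-06-03', '2016-06-05']
)

# The target index is sorted, but the frame's own index is not.
target_sorted = newIndex.is_monotonic_increasing
source_sorted = df.index.is_monotonic_increasing
print(target_sorted, source_sorted)

# With method= set, the fill is resolved against df.index, which is
# where the monotonicity requirement appears to come from.
try:
    df.reindex(newIndex, method='ffill')
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

If the two flags differ as expected, the error points at the frame's existing index, not at newIndex.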

Answered by piRSquared

Some element of reindex requires the incoming index to be sorted. I'm deducing that when method is passed, reindex fails to presort the incoming index and subsequently fails. I'm drawing this conclusion from the fact that this works:

print(df.sort_index().reindex(newIndex.sort_values(), method='ffill'))
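
As a rough check on this workaround, the sketch below compares it with the two-step reindex-then-ffill form from the question. It assumes the question's sample data. Because newIndex is already sorted there, only df's own index is sorted here. The two results may differ in dtype, since the two-step form passes through NaN, so only the values are compared.

```python
import pandas as pd

df = pd.DataFrame(
    {'values': 2},
    index=pd.DatetimeIndex(['2016-06-02', '2016-05-04', '2016-06-03']),
)
newIndex = pd.DatetimeIndex(
    ['2016-05-04', '2016-06-01', '2016-06-02', '2016-06-03', '2016-06-05']
)

# Workaround: sort the frame's index first, then reindex with a fill method.
sorted_then_filled = df.sort_index().reindex(newIndex, method='ffill')

# Two-step form from the question: reindex first, forward-fill afterwards.
reindexed_then_filled = df.reindex(newIndex).ffill()

# Compare values only; dtypes are allowed to differ.
same_values = bool(
    (sorted_then_filled['values'] == reindexed_then_filled['values']).all()
)
print(sorted_then_filled)
print(same_values)
```

The two forms agree on this sample, but they are not guaranteed to agree in general. For instance, a later ffill() would also fill NaN values already present in the original data.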

Answered by Eric

It seems that this needs to be done on the columns as well.

In[76]: frame = DataFrame(np.arange(9).reshape((3, 3)), index=['a', 'c', 'd'],columns=['Ohio', 'Texas', 'California'])

In[77]: frame.reindex(index=['a','b','c','d'],method='ffill',columns=states)
---> ValueError: index must be monotonic increasing or decreasing

In[78]: frame.reindex(index=['a','b','c','d'],method='ffill',columns=states.sort())

Out[78]:
   Ohio  Texas  California
a     0      1           2
b     0      1           2
c     3      4           5
d     6      7           8
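
One caveat about the session above: states is not defined in the excerpt. If it is a plain Python list, states.sort() sorts in place and returns None, so In[78] would effectively pass columns=None and leave the columns untouched, which matches the unchanged column labels in Out[78]. The sketch below uses a hypothetical states list (the values are an assumption, not taken from the answer) and sorts both the frame's columns and the target columns explicitly.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(9).reshape((3, 3)),
    index=['a', 'c', 'd'],
    columns=['Ohio', 'Texas', 'California'],
)

# Hypothetical target columns; the original answer does not show them.
states = ['Texas', 'Utah', 'California']

# list.sort() works in place and returns None.
returned = list(states).sort()
print(returned)

# Sort the existing columns and the target columns before filling,
# so both axes are monotonic when method='ffill' is applied.
result = frame.sort_index(axis=1).reindex(
    index=['a', 'b', 'c', 'd'],
    columns=sorted(states),
    method='ffill',
)
print(result)
```

Here sorted(states) returns a new list, so the column reindex actually takes place.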