Slicing a Pandas DataFrame with negative indexing using the ix() method

Question

Asked by Julia He

DataFrame.ix() does not seem to slice the DataFrame the way I want when negative indexing is used.

I have a DataFrame object and want to slice the last 2 rows.

    In [90]: df = pd.DataFrame(np.random.randn(10, 4))

    In [91]: df
    Out[91]: 
            0         1         2         3
    0  1.985922  0.664665 -2.800102  1.695480
    1  0.580509  0.782473  1.032970  1.559917
    2  0.584387  1.798743  0.095950  0.071999
    3  1.956221  0.075530 -0.391008  1.692585
    4 -0.644979 -1.959265  0.749394 -0.437995
    5 -1.204964  0.653912 -1.426602  2.409855
    6  1.178886  2.177259 -0.165106  1.145952
    7  1.410595 -0.761426 -1.280866  0.609122
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

One way to do it:

    In [92]: df[-2:]
    Out[92]: 
              0         1         2         3
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

Another way to do it:

    In [93]: df.ix[len(df)-2:, :]
    Out[93]: 
              0         1         2         3
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

Now I want to use negative indexing, but I run into a problem:

    In [94]: df.ix[-2:, :]
    Out[94]: 
              0         1         2         3
    0  1.985922  0.664665 -2.800102  1.695480
    1  0.580509  0.782473  1.032970  1.559917
    2  0.584387  1.798743  0.095950  0.071999
    3  1.956221  0.075530 -0.391008  1.692585
    4 -0.644979 -1.959265  0.749394 -0.437995
    5 -1.204964  0.653912 -1.426602  2.409855
    6  1.178886  2.177259 -0.165106  1.145952
    7  1.410595 -0.761426 -1.280866  0.609122
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

How do I use negative indexing with DataFrame.ix() correctly? Thanks.

Answer 1

Answered by Wes McKinney

This is a bug:

    In [1]: df = pd.DataFrame(np.random.randn(10, 4))

    In [2]: df
    Out[2]: 
              0         1         2         3
    0 -3.100926 -0.580586 -1.216032  0.425951
    1 -0.264271 -1.091915 -0.602675  0.099971
    2 -0.846290  1.363663 -0.382874  0.065783
    3 -0.099879 -0.679027 -0.708940  0.138728
    4 -0.302597  0.753350 -0.112674 -1.253316
    5 -0.213237 -0.467802  0.037350  0.369167
    6  0.754915 -0.569134 -0.297824 -0.600527
    7  0.644742  0.038862  0.216869  0.294149
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439

    In [3]: df.ix[-2:]
    Out[3]: 
              0         1         2         3
    0 -3.100926 -0.580586 -1.216032  0.425951
    1 -0.264271 -1.091915 -0.602675  0.099971
    2 -0.846290  1.363663 -0.382874  0.065783
    3 -0.099879 -0.679027 -0.708940  0.138728
    4 -0.302597  0.753350 -0.112674 -1.253316
    5 -0.213237 -0.467802  0.037350  0.369167
    6  0.754915 -0.569134 -0.297824 -0.600527
    7  0.644742  0.038862  0.216869  0.294149
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439

https://github.com/pydata/pandas/issues/2600

Note that df[-2:] will work:

    In [4]: df[-2:]
    Out[4]: 
              0         1         2         3
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439
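
For readers on current pandas: `.ix` was deprecated in 0.20 and removed in 1.0, so the call from the question no longer runs there. Below is a minimal sketch of the position-based alternatives, assuming a pandas version that provides `.iloc`. The integer-valued frame is made up for illustration and is not the data from the question.

```python
import numpy as np
import pandas as pd

# Illustrative data (not the frame from the question): 10 rows, 4 columns.
df = pd.DataFrame(np.arange(40).reshape(10, 4))

# .iloc is purely position-based, so negative positions count from the end.
last_two = df.iloc[-2:, :]

# tail(n) returns the last n rows and gives the same result here.
last_two_tail = df.tail(2)

print(last_two)
```

Because `.iloc` never consults the labels, this works the same way whatever the index contains.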

Answer 2

Answered by Zelazny7

ix's main purpose is to allow NumPy-like indexing with support for row and column labels, so I'm not sure your use case is what it was intended for. Here are a couple of ways I can think of, mostly trivial:

    In [142]: df.ix[:][-2:]
    Out[142]:
              0         1         2         3
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720

    In [161]: df.ix[df.index[-2:],:]
    Out[161]:
              0         1         2         3
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720
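
The second workaround carries over to `.loc` on pandas versions where `.ix` is no longer available. This is a sketch under that assumption, using a made-up frame: `df.index[-2:]` slices the index by position, and the resulting labels are then looked up by label.

```python
import numpy as np
import pandas as pd

# Illustrative data: 10 rows with the default integer index 0..9.
df = pd.DataFrame(np.arange(40).reshape(10, 4))

# Index objects slice by position, so this yields the last two labels.
last_labels = df.index[-2:]

# .loc then selects the rows carrying those labels.
last_two = df.loc[last_labels, :]

print(last_two)
```

This relies on the index labels being unique. With duplicate labels, `.loc` would return every row that matches one of them.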

I don't think ix supports negative indexing at all. It seems to just ignore it altogether:

    In [181]: df.ix[-100:,:]
    Out[181]:
              0         1         2         3
    0 -1.144137 -1.042034 -2.158838  0.674055
    1 -0.424184  1.237318 -1.846130  0.575357
    2 -0.844974 -0.541060  2.197364 -0.031898
    3  0.846263  1.244450 -1.570566 -0.477919
    4 -0.193445  0.171045 -0.235587 -1.185583
    5  1.361539 -1.107389 -1.321081 -0.776407
    6  0.505907 -1.364414 -2.093770  0.144016
    7 -0.888465 -0.329153  0.491264 -0.363472
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720

Edit: From the pandas documentation we have:

Label-based indexing with integer axis labels is a thorny topic. It has been discussed heavily on mailing lists and among various members of the scientific Python community. In pandas, our general viewpoint is that labels matter more than integer locations. Therefore, with an integer axis index only label-based indexing is possible with the standard tools like .ix. The following code will generate exceptions:

    s = Series(range(5))
    s[-1]
    df = DataFrame(np.random.randn(5, 4))
    df
    df.ix[-2:]

This deliberate decision was made to prevent ambiguities and subtle bugs (many users reported finding bugs when the API change was made to stop “falling back” on position-based indexing).
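
The label-versus-position distinction in the quoted passage is easier to see with an index that actually contains negative labels. This sketch uses `.loc` and `.iloc` rather than `.ix`, which assumes a pandas version that has them. The index values and the column name `value` are invented for illustration.

```python
import pandas as pd

# Hypothetical frame whose integer index includes negative labels.
df = pd.DataFrame({"value": [10, 20, 30, 40, 50]}, index=[-3, -2, -1, 0, 1])

# Label-based: -2 is looked up as a label, so the slice starts at the row labelled -2.
by_label = df.loc[-2:]

# Position-based: -2 means "second row from the end".
by_position = df.iloc[-2:]

print(by_label)
print(by_position)
```

Here the same expression `-2:` selects four rows by label but only two by position, which is the ambiguity the documentation describes.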
