Pandas: iterating over the unique values of an already-sorted column

Question

Asked by Setjmp

I have constructed a pandas data frame in sorted order and would like to iterate over groups of rows that share the same value in a particular column. The groupby functionality seems useful for this, but as far as I can tell, performing a groupby gives no guarantee about the order of the keys. How can I extract the unique column values in the order they appear in the sorted frame?

Here is an example data frame:

Foo,1
Foo,2
Bar,2
Bar,1

I would like a list ["Foo","Bar"] whose order is guaranteed to match the order of the original data frame. I can then use this list to extract the appropriate rows. In my case the sort is actually defined by columns that are also in the data frame (not included in the example above), so a solution that re-sorts is acceptable if the information cannot be pulled out directly.

Answer 1

Answered by Andy Hayden

As mentioned in the comments, you can use unique on the column, which preserves the order of first appearance (unlike numpy's unique, it doesn't sort):

In [11]: df
Out[11]: 
     0  1
0  Foo  1
1  Foo  2
2  Bar  2
3  Bar  1

In [12]: df[0].unique()
Out[12]: array(['Foo', 'Bar'], dtype=object)
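
To make the ordering difference concrete, here is a small sketch that rebuilds the example frame and compares the two functions. The integer column labels 0 and 1 follow the session above; the variable names pandas_order and numpy_order are made up for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; columns get the default labels 0 and 1.
df = pd.DataFrame([["Foo", 1], ["Foo", 2], ["Bar", 2], ["Bar", 1]])

# pandas' unique keeps values in order of first appearance.
pandas_order = list(df[0].unique())

# numpy's unique returns the values sorted.
numpy_order = list(np.unique(df[0].to_numpy()))

print(pandas_order)
print(numpy_order)
```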

Then you can access the relevant rows using groupby's get_group:

In [13]: g = df.groupby(0)

In [14]: g.get_group('Foo')
Out[14]: 
     0  1
0  Foo  1
1  Foo  2
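
Putting the two steps together, the loop the question asks for could be sketched roughly as follows. It assumes the same example frame; groups_in_order and keys_unsorted_groupby are hypothetical names. The sort=False variant at the end is an alternative that the answer does not mention: groupby sorts its keys by default, and sort=False keeps the groups in order of first appearance instead.

```python
import pandas as pd

# Same example frame as above; columns get the default labels 0 and 1.
df = pd.DataFrame([["Foo", 1], ["Foo", 2], ["Bar", 2], ["Bar", 1]])

g = df.groupby(0)

# Visit the groups in the order the keys first appear in the frame.
groups_in_order = [(key, g.get_group(key)) for key in df[0].unique()]

for key, rows in groups_in_order:
    print(key, rows[1].tolist())

# Alternative (not from the answer): let groupby itself preserve the order.
keys_unsorted_groupby = [key for key, _ in df.groupby(0, sort=False)]
print(keys_unsorted_groupby)
```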
