How to sort a dataframe in python pandas by two or more columns?

Question

Asked by Rakesh Adhikesavan

Suppose I have a dataframe with columns a, b and c. I want to sort the dataframe by column b in ascending order, and by column c in descending order. How do I do this?

Answer 1

Accepted answer by Andy Hayden

As of the 0.17.0 release, the sort method was deprecated in favor of sort_values. sort was completely removed in the 0.20.0 release. The arguments (and results) remain the same:

df.sort_values(['a', 'b'], ascending=[True, False])

In versions before 0.17.0, you can use the ascending argument of sort:

df.sort(['a', 'b'], ascending=[True, False])

For example:

In [11]: df1 = pd.DataFrame(np.random.randint(1, 5, (10,2)), columns=['a','b'])

In [12]: df1.sort_values(['a', 'b'], ascending=[True, False])
Out[12]:
   a  b
2  1  4
7  1  3
1  1  2
3  1  2
4  3  2
6  4  4
0  4  3
9  4  3
5  4  1
8  4  1

As commented by @renadeen

Sort isn't in place by default! So you should assign result of the sort method to a variable or add inplace=True to method call.

That is, if you want to reuse df1 as a sorted DataFrame:

df1 = df1.sort_values(['a', 'b'], ascending=[True, False])

or

df1.sort_values(['a', 'b'], ascending=[True, False], inplace=True)
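To make the difference between the two options concrete, here is a minimal sketch using sort_values (the current name of the method). The small frame and the variable names unchanged, sorted_copy and result are made up for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [4, 1, 3, 1], "b": [3, 2, 2, 4]})

# Calling sort_values without assigning the result leaves df1 untouched.
df1.sort_values(["a", "b"], ascending=[True, False])
unchanged = df1["a"].tolist()

# Option 1: keep the sorted copy that sort_values returns.
sorted_copy = df1.sort_values(["a", "b"], ascending=[True, False])

# Option 2: sort the existing object; the call itself returns None.
result = df1.sort_values(["a", "b"], ascending=[True, False], inplace=True)

print(unchanged)
print(sorted_copy)
print(result)
```

Because the in-place call returns None, writing df1 = df1.sort_values(..., inplace=True) would replace df1 with None, so the two options should not be combined.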

Answer 2

Answer by Kyle Heuton

As of pandas 0.17.0, DataFrame.sort() is deprecated (it was later removed in 0.20.0). The way to sort a dataframe by its values is now DataFrame.sort_values.

As such, the answer to your question would now be

df.sort_values(['b', 'c'], ascending=[True, False], inplace=True)
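As a small worked example of that call, the sketch below builds a three-column frame (the values are invented for illustration) and sorts it by b ascending, then c descending. It assigns the result instead of using inplace=True so the original frame stays available for comparison:

```python
import pandas as pd

# Hypothetical data: column a is only a label so the row order is easy to follow.
df = pd.DataFrame({
    "a": ["w", "x", "y", "z"],
    "b": [2, 1, 2, 1],
    "c": [10, 30, 20, 40],
})

# b ascending first; within equal b, c descending.
result = df.sort_values(["b", "c"], ascending=[True, False])

print(result)
```

The ascending list is matched to the column list position by position, so the two lists need the same length.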

Answer 3

Answer by jpp

For large dataframes of numeric data, you may see a significant performance improvement via numpy.lexsort, which performs an indirect sort using a sequence of keys:

import pandas as pd
import numpy as np

np.random.seed(0)

df1 = pd.DataFrame(np.random.randint(1, 5, (10,2)), columns=['a','b'])
df1 = pd.concat([df1]*100000)

def pdsort(df1):
    return df1.sort_values(['a', 'b'], ascending=[True, False])

def lex(df1):
    arr = df1.values
    return pd.DataFrame(arr[np.lexsort((-arr[:, 1], arr[:, 0]))])

assert (pdsort(df1).values == lex(df1).values).all()

%timeit pdsort(df1)  # 193 ms per loop
%timeit lex(df1)     # 143 ms per loop

One peculiarity is that the sorting order defined with numpy.lexsort is reversed: (-'b', 'a') sorts by series a first. We negate series b to reflect that we want this series in descending order.
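The reversed key order is easier to see on a tiny input. In this sketch the arrays a and b are made-up stand-ins for the two columns; the last key passed to numpy.lexsort is the primary one:

```python
import numpy as np

a = np.array([2, 1, 1, 2])
b = np.array([5, 3, 7, 9])

# The last key (a) is the primary sort key; -b breaks ties in descending b.
order = np.lexsort((-b, a))

print(order)     # positions of the rows in sorted order
print(a[order])  # a ascending
print(b[order])  # b descending within each value of a
```

numpy.lexsort returns positions, not sorted values, which is why the result is used to index the original arrays.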

Be aware that np.lexsort with a negated key only sorts numeric values, while pd.DataFrame.sort_values works with either string or numeric values. Negating a string column for np.lexsort will give: TypeError: bad operand type for unary -: 'str'.
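If a string column does need to be sorted in descending order this way, one possible workaround (not part of the original answer) is to replace the strings with integer codes that follow their sorted order, for example with pd.factorize(..., sort=True), and negate the codes instead. This sketch assumes the column has no missing values, and whether it is actually faster than sort_values has not been measured here:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [2, 1, 1, 2], "b": ["x", "y", "z", "w"]})

# sort=True makes the integer codes follow the sorted order of the unique strings,
# so negating the codes reverses the string order.
codes, uniques = pd.factorize(df["b"], sort=True)

# a ascending (primary key, passed last), b descending via the negated codes.
order = np.lexsort((-codes, df["a"].to_numpy()))
via_lexsort = df.iloc[order]

# Reference result from pandas itself.
via_pandas = df.sort_values(["a", "b"], ascending=[True, False])

print(via_lexsort)
```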
