Python: How to sort a pandas DataFrame using values from multiple columns?

Question

Asked by Roman

I have the following data frame:

df = pandas.DataFrame([{'c1':3,'c2':10},{'c1':2, 'c2':30},{'c1':1,'c2':20},{'c1':2,'c2':15},{'c1':2,'c2':100}])

Or, in human-readable form:

   c1   c2
0   3   10
1   2   30
2   1   20
3   2   15
4   2  100

The following sorting-command works as expected:

df.sort(['c1','c2'], ascending=False)

Output:

   c1   c2
0   3   10
4   2  100
1   2   30
3   2   15
2   1   20

But the following command:

df.sort(['c1','c2'], ascending=[False,True])

does not give what I expect (the actual output is pasted under ADDED below). I expect the values in the first column to be ordered from largest to smallest and, where the first column has identical values, the rows to be ordered by ascending values of the second column.

Does anybody know why it does not work as expected?

ADDED

This is copy-paste:

>>> df.sort(['c1','c2'], ascending=[False,True])
   c1   c2
2   1   20
3   2   15
1   2   30
4   2  100
0   3   10

Answer 1

Answered by falsetru

DataFrame.sort is deprecated, and in later pandas versions it has been removed entirely, as the traceback below shows. Use DataFrame.sort_values instead.

>>> df.sort_values(['c1','c2'], ascending=[False,True])
   c1   c2
0   3   10
3   2   15
1   2   30
4   2  100
2   1   20
>>> df.sort(['c1','c2'], ascending=[False,True])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ampawake/anaconda/envs/pseudo/lib/python2.7/site-packages/pandas/core/generic.py", line 3614, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'sort'
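For reference, here is a minimal self-contained sketch of the same call, assuming a pandas version that provides sort_values. It rebuilds the question's frame and shows that c1 comes out descending while c2 is ascending within each group of equal c1 values.

```python
import pandas as pd

# Rebuild the data frame from the question.
df = pd.DataFrame([
    {'c1': 3, 'c2': 10},
    {'c1': 2, 'c2': 30},
    {'c1': 1, 'c2': 20},
    {'c1': 2, 'c2': 15},
    {'c1': 2, 'c2': 100},
])

# One ascending flag per column in `by`: c1 descending, c2 ascending.
result = df.sort_values(by=['c1', 'c2'], ascending=[False, True])

# The original row labels travel with the rows.
print(result.index.tolist())   # [0, 3, 1, 4, 2]
print(result['c2'].tolist())   # [10, 15, 30, 100, 20]
```

The ascending list must have the same length as the by list; a single boolean applies to every column.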

Answer 2

Answered by Akash

If you are running this code from a script file, you have to assign the result back, because the call returns a new sorted DataFrame rather than changing df:

df = df.sort(['c1','c2'], ascending=[False,True])

Answer 3

Answered by HonzaB

Using sort can produce a warning message (see the GitHub discussion), so you may want to use sort_values instead; see the docs.

Then your code can look like this:

df = df.sort_values(by=['c1','c2'], ascending=[False,True])

Answer 4

Answered by miraculixx

I have found this to be really useful:

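import numpy as np
import pandas as pd

# list(...) keeps the repetition working on Python 3, where range is not a list
df = pd.DataFrame({'A' : list(range(0,10)) * 2, 'B' : np.random.randint(20,30,20)})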

# A ascending, B descending
df.sort(**skw(columns=['A','-B']))

# A descending, B ascending
df.sort(**skw(columns=['-A','+B']))

Note that unlike the standard columns= and ascending= arguments, here the column names and their sort order are in the same place. As a result, your code gets a lot easier to read and maintain.

Note that the actual call to .sort is unchanged; skw (sort kwargs) is just a small helper function that parses the columns and returns the usual columns= and ascending= parameters for you. Pass it any other sort kwargs as you usually would. Copy and paste the following code into, e.g., your local utils.py, then forget about it and just use it as above.

# utils.py (or anywhere else convenient to import)
def skw(columns=None, **kwargs):
    """ get sort kwargs by parsing sort order given in column name """
    # set default order as ascending (+)
    sort_cols = ['+' + col if col[0] != '-' else col for col in columns]
    # get sort kwargs
    columns, ascending = zip(*[(col.replace('+', '').replace('-', ''), 
                                False if col[0] == '-' else True) 
                               for col in sort_cols])
    kwargs.update(dict(columns=list(columns), ascending=ascending))
    return kwargs
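On pandas versions where DataFrame.sort is no longer available, the same idea can be adapted to sort_values, which takes by= instead of columns=. The sketch below is not part of the original answer: skw_values is a hypothetical name, and it strips only a leading sign so that a + or - inside a column name is left alone.

```python
import pandas as pd

def skw_values(columns, **kwargs):
    """Build sort_values kwargs from names like 'A', '+A' or '-B' (hypothetical helper)."""
    by, ascending = [], []
    for col in columns:
        if col[:1] in ('+', '-'):
            # Strip only the leading sign so signs inside a name survive.
            by.append(col[1:])
            ascending.append(col[0] == '+')
        else:
            # No prefix means ascending, as in the original helper.
            by.append(col)
            ascending.append(True)
    kwargs.update(by=by, ascending=ascending)
    return kwargs

df = pd.DataFrame({'A': [1, 2, 1, 2], 'B': [20, 21, 22, 23]})

# A ascending, B descending
print(df.sort_values(**skw_values(['A', '-B'])).index.tolist())  # [2, 0, 3, 1]
```

One limitation of this convention is that a column whose real name starts with + or - cannot be addressed through the helper; pass by= and ascending= directly in that case.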

Answer 5

Answered by fotis j

As I understand it, the DataFrame.sort() method is deprecated in pandas > 0.18. To solve your problem, use DataFrame.sort_values() instead:

df.sort_values(by=["c1","c2"], ascending=[False, True])

The output looks like this:

   c1   c2
0   3   10
3   2   15
1   2   30
4   2  100
2   1   20

Answer 6

Answered by CONvid19

In my case, the accepted answer didn't work as written, because sort_values returns a new sorted DataFrame and leaves the original unchanged:

df.sort_values(by=["c1","c2"], ascending=[False, True])

Only the following, which assigns the result back, worked as expected:

df = df.sort_values(by=["c1","c2"], ascending=[False, True])
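The difference between calling sort_values and keeping its result can be checked with a short sketch. It assumes the c1/c2 data from the question; the variable names are only illustrative.

```python
import pandas as pd

data = {'c1': [3, 2, 1, 2, 2], 'c2': [10, 30, 20, 15, 100]}
df = pd.DataFrame(data)

# The sorted copy is discarded here, so df keeps its original order.
df.sort_values(by=['c1', 'c2'], ascending=[False, True])
order_without_assignment = df.index.tolist()

# Rebinding the name keeps the sorted result.
df = df.sort_values(by=['c1', 'c2'], ascending=[False, True])
order_with_assignment = df.index.tolist()

# inplace=True sorts the existing object and returns None.
other = pd.DataFrame(data)
returned = other.sort_values(by=['c1', 'c2'], ascending=[False, True], inplace=True)
order_inplace = other.index.tolist()

print(order_without_assignment)  # [0, 1, 2, 3, 4]
print(order_with_assignment)     # [0, 3, 1, 4, 2]
print(returned, order_inplace)   # None [0, 3, 1, 4, 2]
```

In an interactive session the first form appears to work because the returned frame is echoed to the console, which is probably why the bare call looks fine in the answers above.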

Answer 7

Answered by siddesh chavan

Note: everything above is correct; just replace sort with sort_values(). So it becomes:

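import pandas as pd
df = pd.read_csv('data.csv')  # assumes data.csv has columns c1 and c2
# by= is required when sorting a DataFrame
df.sort_values(by=['c1', 'c2'], ascending=False, inplace=True)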

Refer to the official pandas documentation for details.
