Sort pandas DataFrame with function over column values

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/38662826/

Date: 2020-09-14 01:42:06  Source: igfitidea

python sorting pandas dataframe

Asked by Ohumeronen

Based on "python, sort descending dataframe with pandas":

Given:

from pandas import DataFrame
import pandas as pd

d = {'x':[2,3,1,4,5],
     'y':[5,4,3,2,1],
     'letter':['a','a','b','b','c']}

df = DataFrame(d)

df then looks like this:

df:
      letter    x    y
    0      a    2    5
    1      a    3    4
    2      b    1    3
    3      b    4    2
    4      c    5    1

I would like to have something like:

f = lambda x,y: x**2 + y**2
test = df.sort(f('x', 'y'))

This should order the complete dataframe by the sum of the squared values of columns 'x' and 'y' and give me:

test:
      letter    x    y
    2      b    1    3
    3      b    4    2
    1      a    3    4
    4      c    5    1
    0      a    2    5

Ascending or descending order does not matter. Is there a nice and simple way to do that? I could not yet find a solution.

Accepted answer by andrewkittredge

Answer by ayhan

You can create a temporary column to use in sort and then drop it:

df.assign(f = df['one']**2 + df['two']**2).sort_values('f').drop('f', axis=1)
Out: 
  letter  one  two
2      b    1    3
3      b    4    2
1      a    3    4
4      c    5    1
0      a    2    5
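
The snippet above uses columns named 'one' and 'two', while the question's DataFrame uses 'x' and 'y'. A minimal self-contained sketch of the same temporary-column idea with the question's column names follows. It assumes the frame has no existing column called 'f' (the temporary name) and uses the variable name test from the question.

```python
import pandas as pd

# DataFrame from the question
df = pd.DataFrame({'x': [2, 3, 1, 4, 5],
                   'y': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

# Add a temporary column 'f', sort by it, then drop it again.
# assign() returns a new frame, so df itself is left unchanged.
test = (df.assign(f=df['x']**2 + df['y']**2)
          .sort_values('f')
          .drop('f', axis=1))

print(test)
```

Pass ascending=False to sort_values for descending order.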

Answer by Sandeep

Have you tried creating a new column and then sorting on that? I cannot comment on the original post, so I am just posting my solution.

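# Uses the question's column names x and y (the original answer used placeholder names a and b)
df['c'] = df['x']**2 + df['y']**2
df = df.sort_values('c')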
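
Note that this approach leaves the helper column 'c' in the frame. If that is not wanted, one possible variation (a sketch, not part of the original answer) is to sort the computed values as a separate Series and use its index to reorder the rows. The variable names key and test are illustrative.

```python
import pandas as pd

# DataFrame from the question
df = pd.DataFrame({'x': [2, 3, 1, 4, 5],
                   'y': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

# Compute the sort key as a standalone Series; df gets no extra column.
key = df['x']**2 + df['y']**2

# Sort the key, then select the rows of df in the order of the sorted index.
test = df.loc[key.sort_values().index]

print(test)
```

This relies on the index labels of df being unique; with duplicate labels, df.loc would return repeated rows.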

Answer by Adam Warner

import pandas as pd

d = {'one':[2,3,1,4,5],
     'two':[5,4,3,2,1],
     'letter':['a','a','b','b','c']}

df = pd.DataFrame(d)

#f = lambda x,y: x**2 + y**2
# Compute the sum of squares row by row.
# df.ix was removed in pandas 1.0; df.loc with column labels replaces it here.
array = []
for i in range(len(df)):
    array.append(df.loc[i, 'one']**2 + df.loc[i, 'two']**2)
array = pd.DataFrame(array, columns = ['Sum of Squares'])
test = pd.concat([df,array],axis = 1, join = 'inner')
# sort_index(by=...) is no longer available; sort_values is its replacement.
test = test.sort_values(by = "Sum of Squares", ascending = True).drop('Sum of Squares',axis =1)

Just realized that you wanted this:

    letter  one  two
2      b    1    3
3      b    4    2
1      a    3    4
4      c    5    1
0      a    2    5
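
For something closer to the f('x', 'y') call form the question asks for, the idea can be wrapped in a small helper. The function below is a hypothetical sketch, not a pandas API: the name sort_by_function and its parameters are made up for illustration. It applies the function to the named columns, computes the ordering with NumPy's argsort, and reorders the rows by position.

```python
import numpy as np
import pandas as pd

def sort_by_function(df, func, *columns, ascending=True):
    """Hypothetical helper: sort df by func applied to the named columns."""
    # Evaluate func on the selected columns (vectorized over whole columns).
    values = func(*(df[c] for c in columns)).to_numpy()
    # Positions that would sort the values; 'stable' keeps ties in input order.
    order = np.argsort(values, kind="stable")
    if not ascending:
        order = order[::-1]
    return df.iloc[order]

# DataFrame and lambda from the question
df = pd.DataFrame({'x': [2, 3, 1, 4, 5],
                   'y': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})
f = lambda x, y: x**2 + y**2

test = sort_by_function(df, f, 'x', 'y')
print(test)
```

Because the rows are reordered by position with iloc, this version does not depend on the index labels being unique.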