Sorting a Pandas DataFrame using a function of column values

Question

Asked by Ohumeronen

Based on the question "python, sort descending dataframe with pandas":

Given:

from pandas import DataFrame
import pandas as pd

d = {'x':[2,3,1,4,5],
     'y':[5,4,3,2,1],
     'letter':['a','a','b','b','c']}

df = DataFrame(d)

df then looks like this:

df:
      letter    x    y
    0      a    2    5
    1      a    3    4
    2      b    1    3
    3      b    4    2
    4      c    5    1

I would like to have something like:

f = lambda x,y: x**2 + y**2
test = df.sort(f('x', 'y'))

This should order the complete dataframe with respect to the sum of the squared values of column 'x' and 'y' and give me:

test:
      letter    x    y
    2      b    1    3
    3      b    4    2
    1      a    3    4
    4      c    5    1
    0      a    2    5

Ascending or descending order does not matter. Is there a nice and simple way to do that? I could not yet find a solution.

Answer 1

Accepted answer by andrewkittredge

df.iloc[(df.x ** 2 + df.y **2).sort_values().index]

Adapted from the question "How to sort pandas dataframe by custom order on string index".
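
One caveat worth sketching: `.sort_values().index` returns index labels, whereas `.iloc` expects integer positions. The two coincide for the default 0..n-1 index used in the question, but not in general. The example below is a minimal sketch under the assumption of a non-default index (the labels 10..50 are made up for illustration); it pairs labels with `.loc` and positions with `.iloc`.

```python
import pandas as pd

# Same data as in the question, but with a hypothetical non-default index
df = pd.DataFrame({'x': [2, 3, 1, 4, 5],
                   'y': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']},
                  index=[10, 20, 30, 40, 50])

# One sort key per row: x squared plus y squared
key = df.x ** 2 + df.y ** 2

# sort_values().index yields labels, so the label-based .loc is the matching indexer
by_label = df.loc[key.sort_values().index]

# argsort yields integer positions, so the position-based .iloc is the matching indexer
by_position = df.iloc[key.to_numpy().argsort(kind='stable')]
```

Both results list the rows in the same order; which form to prefer is mostly a matter of taste, as long as labels and positions are not mixed.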

Answer 2

Answer by ayhan

You can create a temporary column to use in sort and then drop it:

df.assign(f = df['one']**2 + df['two']**2).sort_values('f').drop('f', axis=1)
Out: 
  letter  one  two
2      b    1    3
3      b    4    2
1      a    3    4
4      c    5    1
0      a    2    5
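
The temporary-column idea can also be wrapped in a small helper that takes an arbitrary function of the frame. This is only a sketch: the function name `sort_by_func` and the column name `_sort_key` are hypothetical and not part of pandas, and the helper assumes the frame has no column of that name already. It uses the question's column names `x` and `y`.

```python
import pandas as pd

def sort_by_func(frame, func, ascending=True):
    """Return a copy of `frame` sorted by the Series that func(frame) computes."""
    tmp = '_sort_key'  # hypothetical temporary column name, assumed not to clash
    return (frame.assign(**{tmp: func(frame)})      # add the key as a temporary column
                 .sort_values(tmp, ascending=ascending)
                 .drop(columns=tmp))                # remove the key again

df = pd.DataFrame({'x': [2, 3, 1, 4, 5],
                   'y': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

test = sort_by_func(df, lambda d: d['x'] ** 2 + d['y'] ** 2)
```

Because `assign` returns a new frame, the original `df` is left without the helper column.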

Answer 3

Answer by Sandeep

Have you tried creating a new column and then sorting on that? I cannot comment on the original post, so I am just posting my solution.

df['c'] = df.x**2 + df.y**2  # helper column holding the sort key
df = df.sort_values('c')

Answer 4

Answer by Adam Warner

import pandas as pd

d = {'one': [2, 3, 1, 4, 5],
     'two': [5, 4, 3, 2, 1],
     'letter': ['a', 'a', 'b', 'b', 'c']}

df = pd.DataFrame(d)

# f = lambda x, y: x**2 + y**2
# Compute the sum of squares row by row (.ix was removed from pandas, so use .loc with column names)
array = []
for i in range(len(df)):
    array.append(df.loc[i, 'one']**2 + df.loc[i, 'two']**2)
array = pd.DataFrame(array, columns=['Sum of Squares'])
# Attach the key column, sort by it, then drop it again
test = pd.concat([df, array], axis=1, join='inner')
test = test.sort_values(by='Sum of Squares', ascending=True).drop('Sum of Squares', axis=1)

Just realized that you wanted this:

    letter  one  two
2      b    1    3
3      b    4    2
1      a    3    4
4      c    5    1
0      a    2    5
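
As a possible shortcut to the row-by-row loop: arithmetic on pandas Series is element-wise, so the asker's two-argument lambda can be called on whole columns directly. The sketch below assumes the column names `one` and `two` from this answer and a unique index; the variable name `sum_of_squares` is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'one': [2, 3, 1, 4, 5],
                   'two': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

f = lambda x, y: x**2 + y**2

# Calling f on two columns returns a Series with one key per row
sum_of_squares = f(df['one'], df['two'])

# reindex rearranges the rows to follow the labels of the sorted key
# (assumes the index labels are unique)
test = df.reindex(sum_of_squares.sort_values().index)
```

This avoids both the explicit loop and the temporary column, at the cost of relying on unique index labels.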
