pandas: sorting a DataFrame index that contains strings and numbers

Question

Asked by Plug4

My df DataFrame index looks like this:

Com_Lag_01
Com_Lag_02
Com_Lag_03
Com_Lag_04
Com_Lag_05
Com_Lag_06
Com_Lag_07
Com_Lag_08
Com_Lag_09
Com_Lag_10
Com_Lag_101
Com_Lag_102
Com_Lag_103
...
Com_Lag_11
Com_Lag_111
Com_Lag_112
Com_Lag_113
Com_Lag_114
...
Com_Lag_12
Com_Lag_120
...
Com_Lag_13
Com_Lag_14
Com_Lag_15

I want to sort this index so that the numbers go from Com_Lag_1 to Com_Lag_120. If I use df.sort_index(), I get the same order as above. Any suggestions on how to sort this index properly?
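
For context, here is a minimal sketch of why df.sort_index() leaves the order unchanged: the labels are strings, so they are compared character by character rather than by numeric value. The name df_demo and its five labels are hypothetical stand-ins for the real index.

```python
import pandas as pd

# Hypothetical miniature of the index from the question
labels = ['Com_Lag_2', 'Com_Lag_10', 'Com_Lag_1', 'Com_Lag_101', 'Com_Lag_11']
df_demo = pd.DataFrame({'value': range(len(labels))}, index=labels)

# String labels are compared character by character,
# so '10' sorts before '101', which sorts before '11' and '2'
lexicographic = df_demo.sort_index().index.tolist()
print(lexicographic)
# ['Com_Lag_1', 'Com_Lag_10', 'Com_Lag_101', 'Com_Lag_11', 'Com_Lag_2']
```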

Answer 1

Answered by Guillaume Jacquenot

One option is to sort on a numeric version of the index:

import pandas as pd

# Create an example DataFrame
df = pd.DataFrame(
    {'Year': [1991, 2004, 2001, 2009, 1997],
     'Age': [27, 25, 22, 34, 31]},
    index=['Com_Lag_1', 'Com_Lag_12', 'Com_Lag_3', 'Com_Lag_24', 'Com_Lag_5'])

# Add a column holding the numeric suffix of each index label
df['indexNumber'] = [int(i.split('_')[-1]) for i in df.index]
# Sort the rows by that number (sort_values replaces the older DataFrame.sort)
df.sort_values(['indexNumber'], ascending=[True], inplace=True)
# Drop the helper column
df.drop('indexNumber', axis=1, inplace=True)

Edit 2017 - V1:

To avoid SettingWithCopyWarning:

df = df.assign(indexNumber=[int(i.split('_')[-1]) for i in df.index])
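
Put together, the assign-based variant might look like the sketch below. It reuses the example DataFrame from above. The name sorted_df is only illustrative, and chaining the three steps is one possible arrangement, not necessarily how the author combined them.

```python
import pandas as pd

df = pd.DataFrame(
    {'Year': [1991, 2004, 2001, 2009, 1997],
     'Age': [27, 25, 22, 34, 31]},
    index=['Com_Lag_1', 'Com_Lag_12', 'Com_Lag_3', 'Com_Lag_24', 'Com_Lag_5'])

# assign returns a new DataFrame with the helper column, leaving df untouched
sorted_df = (
    df.assign(indexNumber=[int(i.split('_')[-1]) for i in df.index])
      .sort_values('indexNumber')      # order rows by the numeric suffix
      .drop('indexNumber', axis=1)     # remove the helper column again
)
print(sorted_df.index.tolist())
# ['Com_Lag_1', 'Com_Lag_3', 'Com_Lag_5', 'Com_Lag_12', 'Com_Lag_24']
```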

Edit 2017 - V2 for Pandas Version 0.21.0

import pandas as pd
print(pd.__version__)

# Create an example DataFrame
df = pd.DataFrame(
    {'Year': [1991, 2004, 2001, 2009, 1997],
     'Age': [27, 25, 22, 34, 31]},
    index=['Com_Lag_1', 'Com_Lag_12', 'Com_Lag_3', 'Com_Lag_24', 'Com_Lag_5'])

# reindex returns a new DataFrame, so assign the result back to df
df = df.reindex(index=df.index.to_series().str.rsplit('_').str[-1].astype(int).sort_values().index)

Answer 2

Answered by jezrael

A solution without a new column: use DataFrame.reindex with the index of a sorted Series:

a = df.index.to_series().str.rsplit('_').str[-1].astype(int).sort_values()
print (a)
Com_Lag_1      1
Com_Lag_3      3
Com_Lag_5      5
Com_Lag_12    12
Com_Lag_24    24
dtype: int32

df = df.reindex(index=a.index)
print (df)
            Age  Year
Com_Lag_1    27  1991
Com_Lag_3    22  2001
Com_Lag_5    31  1997
Com_Lag_12   25  2004
Com_Lag_24   34  2009

But if the index contains duplicate values, reindex cannot be used (it raises an error on a duplicated index), so add a new column instead:

df = pd.DataFrame(\
    {'Year': [1991 ,2004 ,2001 ,2009 ,1997],\
    'Age': [27 ,25 ,22 ,34 ,31],\
    },\
    index = ['Com_Lag_1' ,'Com_Lag_12' ,'Com_Lag_3' ,'Com_Lag_24' ,'Com_Lag_12'])

print (df)
            Age  Year
Com_Lag_1    27  1991
Com_Lag_12   25  2004
Com_Lag_3    22  2001
Com_Lag_24   34  2009
Com_Lag_12   31  1997

df['indexNumber'] = df.index.str.rsplit('_').str[-1].astype(int)
df = df.sort_values(['indexNumber']).drop('indexNumber', axis=1)
print (df)
            Age  Year
Com_Lag_1    27  1991
Com_Lag_3    22  2001
Com_Lag_12   25  2004
Com_Lag_12   31  1997
Com_Lag_24   34  2009
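
As a possible alternative on newer pandas releases, which this answer does not cover, sort_index accepts a key argument (pandas 1.1 or later), so the numeric suffix can be used for sorting without a helper column, and duplicate labels are allowed. The sketch below assumes every label ends in an underscore followed by digits. The helper numeric_suffix and the name sorted_df are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {'Year': [1991, 2004, 2001, 2009, 1997],
     'Age': [27, 25, 22, 34, 31]},
    index=['Com_Lag_1', 'Com_Lag_12', 'Com_Lag_3', 'Com_Lag_24', 'Com_Lag_12'])

def numeric_suffix(index):
    """Map each label such as 'Com_Lag_12' to its trailing integer (12)."""
    return index.str.rsplit('_').str[-1].astype(int)

# key= needs pandas 1.1 or later; mergesort is stable, so rows that share
# a label keep their original relative order
sorted_df = df.sort_index(key=numeric_suffix, kind='mergesort')
print(sorted_df)
```

The index labels themselves are left as strings. Only the values returned by the key function decide the order.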
