Sort Index by list - Python Pandas
Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/45389126/
Asked by ScoutEU
I have a dataframe that I have pivoted:
FinancialYear  2014/2015  2015/2016  2016/2017  2017/2018
Month
April                 42         32         29         27
August                34         28         32          0
December              45         51         28          0
February              28         20         28          0
January               32         28         33          0
July                  40         66         31         30
June                  32         67         37         35
March                 43         36         39          0
May                   34         30         24         29
November              39         32         31          0
October               38         39         28          0
September             29         19         34          0
This is the code that I used:
new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']]
hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count')
df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x))
The Months are not in the order I want, so I used the following code to reindex it according to a list:
vals = ['April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December', 'January', 'February', 'March']
df_hm = df_hm.reindex(vals)
The reindex ran without errors, but most of the values in my table now show as NaN:
FinancialYear  2014/2015  2015/2016  2016/2017  2017/2018
Month
April                nan        nan        nan        nan
May                  nan        nan        nan        nan
June                 nan        nan        nan        nan
July                 nan        nan        nan        nan
August               nan        nan        nan        nan
September             29         19         34          0
October              nan        nan        nan        nan
November             nan        nan        nan        nan
December             nan        nan        nan        nan
January              nan        nan        nan        nan
February             nan        nan        nan        nan
March                nan        nan        nan        nan
Any idea what is happening and how to fix it? Is there a better alternative method?
Answered by unutbu
Unexpected NaNs after reindexing are often caused by the new index labels not exactly matching the old index labels. For example, if the original index labels contain trailing whitespace but the new labels don't, then you'll get NaNs:
import numpy as np
import pandas as pd
# Note the trailing space in each index label.
df = pd.DataFrame({'col':[1,2,3]}, index=['April ', 'June ', 'May ', ])
print(df)
#         col
# April     1
# June      2
# May       3
# None of the new labels match the old ones exactly, so every row is NaN.
df2 = df.reindex(['April', 'May', 'June'])
print(df2)
#        col
# April  NaN
# May    NaN
# June   NaN
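To check whether mismatched labels are really the cause, it can help to look at the raw labels and compare them against the target list. The following is a minimal sketch that reuses the toy frame from the example; the names `wanted` and `missing` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Toy frame with trailing spaces in the labels, as in the example above.
df = pd.DataFrame({'col': [1, 2, 3]}, index=['April ', 'June ', 'May '])

# tolist() prints each label with quotes, so stray whitespace becomes visible.
print(df.index.tolist())

# Labels that reindex would ask for but that do not exist in the current index.
wanted = ['April', 'May', 'June']
missing = [label for label in wanted if label not in df.index]
print(missing)
```

If `missing` is not empty, those are the rows that reindex would fill with NaN.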
This can be fixed by removing the whitespace to make the labels match:
df.index = df.index.str.strip()
df3 = df.reindex(['April', 'May', 'June'])
print(df3)
#        col
# April    1
# May      3
# June     2
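As for a possible alternative to reindexing, one option is to clean the month labels before grouping and store them as an ordered categorical, so that the grouped result comes out in financial-year order on its own. This is only a sketch under assumptions: the small `hmdf` frame below is a hypothetical stand-in for the asker's data (only the column names and `vals` come from the question), and the stray spaces in it are invented to mimic the suspected problem.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data; only the column names and
# the month list come from the question.
hmdf = pd.DataFrame({
    'FinancialYear': ['2014/2015', '2014/2015', '2015/2016', '2015/2016'],
    'Month': ['May ', 'April', 'March', ' April'],
})

vals = ['April', 'May', 'June', 'July', 'August', 'September',
        'October', 'November', 'December', 'January', 'February', 'March']

# Strip stray whitespace first, then make Month an ordered categorical so
# that grouping sorts by position in vals rather than alphabetically.
month = hmdf['Month'].str.strip()
hmdf['Month'] = pd.Categorical(month, categories=vals, ordered=True)

df_hm = (hmdf.groupby(['Month', 'FinancialYear'], observed=False)
             .size()
             .unstack(fill_value=0))
print(df_hm)
```

Because the order lives in the categorical itself, no separate `reindex` call is needed, and a label that is not in `vals` becomes a missing value in the `Month` column rather than silently producing a row of NaNs later.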