Sort index by list - Python Pandas

Question

Asked by ScoutEU

I have a dataframe that I have pivoted:

FinancialYear   2014/2015   2015/2016   2016/2017   2017/2018
Month               
April             42           32          29          27
August            34           28          32           0
December          45           51          28           0
February          28           20          28           0
January           32           28          33           0
July              40           66          31          30
June              32           67          37          35
March             43           36          39           0
May               34           30          24          29
November          39           32          31           0
October           38           39          28           0
September         29           19          34           0

This is the code that I used:

new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']]

hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count')

df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x))

The months are not in the order I want, so I used the following code to reindex the table according to a list:

vals = ['April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December', 'January', 'February', 'March']

df_hm = df_hm.reindex(vals)

This put the months in the right order, but most of the values in my table now show as NaN.

FinancialYear   2014/2015   2015/2016   2016/2017   2017/2018
Month               
April              nan          nan         nan         nan
May                nan          nan         nan         nan
June               nan          nan         nan         nan
July               nan          nan         nan         nan
August             nan          nan         nan         nan
September           29           19          34           0
October            nan          nan         nan         nan
November           nan          nan         nan         nan
December           nan          nan         nan         nan
January            nan          nan         nan         nan
February           nan          nan         nan         nan
March              nan          nan         nan         nan

Any idea what is happening and how to fix it? And is there a better alternative method?

Answer 1

Answered by unutbu

Unexpected NaNs after reindexing are often due to the new index labels not exactly matching the old index labels. For example, if the original index labels contain whitespace but the new labels don't, then you'll get NaNs:

import pandas as pd

df = pd.DataFrame({'col':[1,2,3]}, index=['April ', 'June ', 'May ', ])
print(df)
#         col
# April     1
# June      2
# May       3

df2 = df.reindex(['April', 'May', 'June'])
print(df2)
#        col
# April  NaN
# May    NaN
# June   NaN
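
Before changing anything, it can help to confirm that a label mismatch really is the cause. The sketch below reuses the toy frame from above; the names `wanted` and `missing` are only illustrative and are not from the question. Printing the index as a list shows the quotes around each label, and `Index.difference` lists the requested labels that the current index does not contain.

```python
import pandas as pd

# Same toy frame as above: every index label carries a trailing space.
df = pd.DataFrame({'col': [1, 2, 3]}, index=['April ', 'June ', 'May '])

# Hypothetical name for the labels passed to reindex.
wanted = ['April', 'May', 'June']

# tolist() prints each label with quotes, so stray whitespace becomes visible.
print(df.index.tolist())

# Requested labels that are absent from the current index.
# Any label listed here produces an all-NaN row after reindex.
missing = pd.Index(wanted).difference(df.index)
print(missing.tolist())
```

If `missing` comes back empty, the labels match and the NaNs have some other cause.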

This can be fixed by stripping the whitespace from the index so that the labels match:

df.index = df.index.str.strip()
df3 = df.reindex(['April', 'May', 'June'])
print(df3)
#        col
# April    1
# May      3
# June     2
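
As for an alternative method, one option (a sketch, not something the question's code already does) is to turn the cleaned index into an ordered `CategoricalIndex` and call `sort_index()`. The name `month_order` is illustrative; it holds the same financial-year ordering as `vals` in the question. Note that any label not present in `month_order` would become NaN in the index, so the whitespace still needs stripping first.

```python
import pandas as pd

# Same toy frame as above, with trailing spaces in the labels.
df = pd.DataFrame({'col': [1, 2, 3]}, index=['April ', 'June ', 'May '])

# Financial-year month order, April first (illustrative name).
month_order = ['April', 'May', 'June', 'July', 'August', 'September',
               'October', 'November', 'December', 'January', 'February', 'March']

# Strip whitespace, then attach the custom ordering to the index.
df.index = pd.CategoricalIndex(df.index.str.strip(),
                               categories=month_order, ordered=True)

# sort_index() now follows month_order instead of alphabetical order.
df_sorted = df.sort_index()
print(df_sorted)
```

The rows come out as April, May, June. Unlike `reindex`, this does not add rows for months that are missing from the data, so which one fits depends on whether empty months should appear in the table.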
