Original source: http://stackoverflow.com/questions/35193808/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Re-order Pandas Series on weekday
Asked by Simon
Using Pandas, I have pulled in a CSV file and then created a series of the data to find out which days of the week have the most crashes:
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
I have then plotted this out, but of course it plots them in the same ranked order as the series.
crashes_by_day.plot(kind='bar')
What is the most efficient way to re-rank these to Mon, Tue, Wed, Thur, Fri, Sat, Sun?
Do I have to break it out into a list? Thanks.
Answered by jezrael
You can use an ordered Categorical and then sort_index:
print(bc)
DAY_OF_WEEK a b
0 Sunday 0.7 0.5
1 Monday 0.4 0.1
2 Tuesday 0.3 0.2
3 Wednesday 0.4 0.1
4 Thursday 0.3 0.6
5 Friday 0.4 0.9
6 Saturday 0.3 0.2
7 Sunday 0.7 0.5
8 Monday 0.4 0.1
9 Tuesday 0.3 0.2
10 Wednesday 0.4 0.1
11 Thursday 0.3 0.6
12 Friday 0.4 0.9
13 Saturday 0.3 0.2
14 Sunday 0.7 0.5
15 Monday 0.4 0.1
16 Tuesday 0.3 0.2
17 Wednesday 0.4 0.1
18 Thursday 0.3 0.6
19 Friday 0.4 0.9
20 Saturday 0.3 0.2
# Declare the weekday order explicitly so that sorting follows the calendar
bc['DAY_OF_WEEK'] = pd.Categorical(
    bc['DAY_OF_WEEK'],
    categories=['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'],
    ordered=True)
print(bc['DAY_OF_WEEK'])
0 Sunday
1 Monday
2 Tuesday
3 Wednesday
4 Thursday
5 Friday
6 Saturday
7 Sunday
8 Monday
9 Tuesday
10 Wednesday
11 Thursday
12 Friday
13 Saturday
14 Sunday
15 Monday
16 Tuesday
17 Wednesday
18 Thursday
19 Friday
20 Saturday
Name: DAY_OF_WEEK, dtype: category
Categories (7, object): [Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday]
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
crashes_by_day = crashes_by_day.sort_index()
print(crashes_by_day)
Monday 3
Tuesday 3
Wednesday 3
Thursday 3
Friday 3
Saturday 3
Sunday 3
dtype: int64
crashes_by_day.plot(kind='bar')
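The Categorical approach above can be sketched as one self-contained script. The small DataFrame here is hypothetical sample data standing in for the CSV from the question, and the `days` list is a name introduced only for this illustration.

```python
import pandas as pd

# Calendar order we want the result to follow
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

# Hypothetical sample data (assumption): the real bc comes from a CSV file
bc = pd.DataFrame({'DAY_OF_WEEK': ['Sunday', 'Monday', 'Friday', 'Monday',
                                   'Sunday', 'Monday', 'Wednesday']})

# Convert the column to an ordered Categorical with an explicit weekday order
bc['DAY_OF_WEEK'] = pd.Categorical(bc['DAY_OF_WEEK'], categories=days, ordered=True)

# value_counts ranks by frequency; sort_index restores the category order
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().sort_index()
print(crashes_by_day)
```

Because the column is categorical, days that never occur in the data (Tuesday, Thursday and Saturday in this sample) still appear with a count of 0, so the bar chart from `crashes_by_day.plot(kind='bar')` would show all seven days.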
Another possible solution, without Categorical, is to sort by a mapping:
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().reset_index()
crashes_by_day.columns = ['DAY_OF_WEEK', 'count']
print(crashes_by_day)
DAY_OF_WEEK count
0 Thursday 3
1 Wednesday 3
2 Friday 3
3 Tuesday 3
4 Monday 3
5 Saturday 3
6 Sunday 3
days = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday']
mapping = {day: i for i, day in enumerate(days)}
key = crashes_by_day['DAY_OF_WEEK'].map(mapping)
print(key)
0 3
1 2
2 4
3 1
4 0
5 5
6 6
Name: DAY_OF_WEEK, dtype: int64
# Reorder the rows by their mapped weekday number, then index by day name
crashes_by_day = crashes_by_day.iloc[key.argsort()].set_index('DAY_OF_WEEK')
print(crashes_by_day)
count
DAY_OF_WEEK
Monday 3
Tuesday 3
Wednesday 3
Thursday 3
Friday 3
Saturday 3
Sunday 3
crashes_by_day.plot(kind='bar')
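On the question of whether the days have to be broken out into a list: a shorter alternative, which is not part of the answer above, may be to pass a plain list of day names to `reindex`. The following is a minimal sketch under that assumption, again using hypothetical sample data in place of the CSV.

```python
import pandas as pd

# Calendar order we want the result to follow
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

# Hypothetical sample data (assumption): the real bc comes from a CSV file
bc = pd.DataFrame({'DAY_OF_WEEK': ['Sunday', 'Monday', 'Friday', 'Monday',
                                   'Sunday', 'Monday', 'Wednesday']})

# reindex puts the counts in the order given by the list;
# fill_value=0 covers days that never occur in the data
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().reindex(days, fill_value=0)
print(crashes_by_day)
```

This leaves the original column as plain strings, which may be preferable when the weekday order is only needed for a single plot.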