Pandas - change the order of levels of factor-type object

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/38023881/

Time: 2020-09-14 01:27:41  Source: igfitidea

python pandas

Asked by Square9627

I have a Pandas dataframe df with the column school as a factor:

Name    school
A       An
B       Bn
C       Bn

How can I change the levels of the school column from ('An', 'Bn') to ('Bn', 'An') in Python?

The R equivalent is:

levels(df$school) = c('Bn','An')

Answered by Andy Hayden

You can use reorder_categories (you pass in the categories in the order you want):

In [11]: df
Out[11]:
  Name school
0    A     An
1    B     Bn
2    C     Bn

In [12]: df['school'] = df['school'].astype('category')

In [13]: df['school']
Out[13]:
0    An
1    Bn
2    Bn
Name: school, dtype: category
Categories (2, object): [An, Bn]

In [14]: df['school'].cat.reorder_categories(['Bn', 'An'])
Out[14]:
0    An
1    Bn
2    Bn
dtype: category
Categories (2, object): [Bn, An]

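On older pandas versions you can do this in place (the inplace argument was deprecated in pandas 1.3 and removed in 2.0; on current versions, assign the result back to the column instead):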

In [21]: df['school'].cat.reorder_categories(['Bn', 'An'], inplace=True)

In [22]: df['school']
Out[22]:
0    An
1    Bn
2    Bn
Name: school, dtype: category
Categories (2, object): [Bn, An]
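
On pandas versions that no longer accept inplace, the same result can be had by assigning the returned Series back to the column. A minimal sketch, assuming the same df as in the question; the cat.codes line is only there to show that the values stay put while the underlying integer codes follow the new order:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B", "C"],
                   "school": ["An", "Bn", "Bn"]})
df["school"] = df["school"].astype("category")

# reorder_categories returns a new Series, so assign it back to the column
df["school"] = df["school"].cat.reorder_categories(["Bn", "An"])

print(df["school"].cat.categories.tolist())  # ['Bn', 'An']
print(df["school"].tolist())                 # ['An', 'Bn', 'Bn'] (values unchanged)
print(df["school"].cat.codes.tolist())       # [1, 0, 0]
```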

See the "Reordering categories" section of the pandas docs.

Answered by HYRY

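You can set cat.categories. Note that this relabels the categories, so the values change too (every 'An' becomes 'Bn' and vice versa), as the R snippet in the question does. Direct assignment was removed in pandas 2.0; cat.rename_categories does the same thing and works on both old and new versions:

import pandas as pd

school = pd.Series(["An", "Bn", "Bn"])
school = school.astype("category")

# older pandas also allowed: school.cat.categories = ["Bn", "An"]
school = school.cat.rename_categories(["Bn", "An"])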
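
Relabelling the categories and reordering them are easy to confuse, so here is a small sketch that puts the two side by side. The variable names renamed and reordered are only for illustration:

```python
import pandas as pd

school = pd.Series(["An", "Bn", "Bn"]).astype("category")

# rename_categories relabels positionally: 'An' -> 'Bn', 'Bn' -> 'An'
renamed = school.cat.rename_categories(["Bn", "An"])
# reorder_categories keeps every value and only changes the category order
reordered = school.cat.reorder_categories(["Bn", "An"])

print(renamed.tolist())    # ['Bn', 'An', 'An']
print(reordered.tolist())  # ['An', 'Bn', 'Bn']
print(renamed.cat.categories.tolist())    # ['Bn', 'An']
print(reordered.cat.categories.tolist())  # ['Bn', 'An']
```

Both results report the categories as [Bn, An]; only the renamed one has different data.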

Answered by Alexander

As a general solution, you can remap the values using a dictionary (this swaps the values themselves rather than the order of the levels):

import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C'],
                   'school': ['An', 'Bn', 'Bn']})
d = {'An': 'Bn', 'Bn': 'An'}  # old value -> new value
df['school'] = df.school.map(d)
>>> df
  Name school
0    A     Bn
1    B     An
2    C     An
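
One thing to watch with the dictionary approach: map turns any value that is missing from the dictionary into NaN. A short sketch of that behaviour and one way around it, assuming a hypothetical third school 'Cn' that is not in the original data:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B", "C"],
                   "school": ["An", "Bn", "Cn"]})  # 'Cn' is a made-up extra value
d = {"An": "Bn", "Bn": "An"}                       # 'Cn' is deliberately left out

# Values without a dictionary entry come back as missing
mapped = df["school"].map(d)
print(mapped.isna().tolist())  # [False, False, True]

# Fall back to the original value wherever the mapping has no entry
kept = mapped.fillna(df["school"])
print(kept.tolist())           # ['Bn', 'An', 'Cn']
```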