Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/30991532/
Converting multiple columns to categories in Pandas. apply?
Asked by Amelio Vazquez-Reina
Consider a DataFrame. I want to convert a set of columns, to_convert, to categories.
I can certainly do the following:
for col in to_convert:
    df[col] = df[col].astype('category')
but I was surprised that the following does not return a dataframe:
df[to_convert].apply(lambda x: x.astype('category'), axis=0)
which of course makes the following not work:
df[to_convert] = df[to_convert].apply(lambda x: x.astype('category'), axis=0)
Why does apply(axis=0) return a Series even though it is supposed to act on the columns one by one?
Accepted answer by Jeff
This was just fixed in master, and so will be in 0.17.0, see the issue here
In [7]: df = DataFrame({'A' : list('aabbcd'), 'B' : list('ffghhe')})

In [8]: df
Out[8]:
   A  B
0  a  f
1  a  f
2  b  g
3  b  h
4  c  h
5  d  e

In [9]: df.dtypes
Out[9]:
A    object
B    object
dtype: object

In [10]: df.apply(lambda x: x.astype('category'))
Out[10]:
   A  B
0  a  f
1  a  f
2  b  g
3  b  h
4  c  h
5  d  e

In [11]: df.apply(lambda x: x.astype('category')).dtypes
Out[11]:
A    category
B    category
dtype: object
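To tie this back to the question, here is a minimal sketch of the assign-back form on a column subset, assuming a pandas version that includes the fix (0.17.0 or later). The extra column C and the name converted are hypothetical and only there to show that untouched columns keep their dtype.

```python
import pandas as pd

df = pd.DataFrame({
    "A": list("aabbcd"),
    "B": list("ffghhe"),
    "C": [1, 2, 3, 4, 5, 6],  # hypothetical extra column, not converted
})
to_convert = ["A", "B"]

# apply runs the lambda once per column; with the fix it returns a DataFrame
converted = df[to_convert].apply(lambda x: x.astype("category"))

# Assign the converted columns back onto the original frame
df[to_convert] = converted

print(df.dtypes)
```

Only the columns listed in to_convert should end up categorical; C stays an integer column.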
Answer by joelostblom
Note that since pandas 0.23.0 you no longer need apply to convert multiple columns to categorical data types. Now you can simply do df[to_convert].astype('category') instead (where to_convert is a set of columns as defined in the question).
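A short sketch of the astype approach, assuming pandas 0.23.0 or later. The frame contents, the extra column C, and the names df_assigned and df_mapped are hypothetical. The second variant passes a column-to-dtype dict to DataFrame.astype, which is an alternative to the answer's subset form and returns a new frame rather than modifying the original.

```python
import pandas as pd

raw = pd.DataFrame({
    "A": list("aabbcd"),
    "B": list("ffghhe"),
    "C": [1, 2, 3, 4, 5, 6],  # hypothetical extra column, not converted
})
to_convert = ["A", "B"]

# Variant 1: convert the subset and assign it back onto a copy
df_assigned = raw.copy()
df_assigned[to_convert] = df_assigned[to_convert].astype("category")

# Variant 2: map each column name to its target dtype; returns a new frame
df_mapped = raw.astype({col: "category" for col in to_convert})

print(df_assigned.dtypes)
print(df_mapped.dtypes)
```

Either way, raw itself is left unchanged, which can be convenient when the original dtypes are still needed elsewhere.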

