pandas: printing a list of categories as a column

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/35332177/

Date: 2020-09-14 00:40:53  Source: igfitidea

printing list of categories as a column

python, list, pandas, categories, series

Asked by PagMax

I am taking an example from the pandas documentation. Let us say I have a series after reading an Excel file:

import pandas as pd
s = pd.Series(["a","b","c","a"], dtype="category")

I know I can get the distinct categories with:

scat = s.cat.categories
print(scat)

For which I get

Index([u'a', u'b', u'c'], dtype='object')

I was wondering what a good way is to make this list appear as a column, something like:

a
b
c

I could get rid of the u' by using np.asarray, but I still do not get the format I need.

Answered by Alexander

I'm not sure what you mean when you say 'appear' as a column.

You can create a list instead of an index via:

>>> s.cat.categories.tolist()
['a', 'b', 'c']

Or you can simply print them out in a column structure using a for loop:

for c in s.cat.categories:
    print(c)

a
b
c
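
As a loop-free variant of the same idea, the categories can be joined with newlines and printed in a single call. This is only a minimal sketch written for Python 3; the variable name column_text is illustrative and not part of the original answer.

```python
import pandas as pd

s = pd.Series(["a", "b", "c", "a"], dtype="category")

# Convert each category to str and join with newlines,
# so that each category ends up on its own line.
column_text = "\n".join(str(c) for c in s.cat.categories)
print(column_text)
```

Joining first and printing once keeps the formatting logic in one expression, which can be handy when the text needs to go to a file or a log rather than the console.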

Or you could create a series (or dataframe):

>>> pd.Series(s.cat.categories)
0    a
1    b
2    c
dtype: object

>>> pd.DataFrame(s.cat.categories)
   0
0  a
1  b
2  c

Answered by jezrael

I think this is not a problem: the 'u' prefix just means a unicode string:

s = pd.Series(["a","b","c","a"], dtype="category")
print(s)
0    a
1    b
2    c
3    a
dtype: category
Categories (3, object): [a, b, c]

scat = s.cat.categories
print(scat)
Index([u'a', u'b', u'c'], dtype='object')

print(scat[0])
a

print(type(scat[0]))
<type 'str'>

If you want to print a column without a loop, use numpy reshape:

print(len(scat))
3
print(scat.values.reshape(len(scat), 1))
[['a']
 ['b']
 ['c']]
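
The same reshape can be written without computing the length by passing -1 for the row count, which lets numpy infer it. This is a sketch under assumptions rather than the answerer's code: it targets Python 3 and uses Index.to_numpy() in place of .values, on the assumption that a plain numpy array is wanted regardless of how newer pandas versions store strings. The name column is illustrative.

```python
import pandas as pd

s = pd.Series(["a", "b", "c", "a"], dtype="category")
scat = s.cat.categories

# to_numpy() returns a numpy array of the categories;
# reshape(-1, 1) turns it into one column, with numpy inferring the row count.
column = scat.to_numpy().reshape(-1, 1)
print(column)
```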