Original question: http://stackoverflow.com/questions/25180694/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Pandas, convert column of unicodes to column of list of strings
Asked by foebu
One of my pandas dataframe columns has unicodes of this kind u'asd,abc,tre,der34,whatever'. The final results should be a column of lists of strings: ['asd','abc','tre','der34','whatever']. A list of unicodes might do, too: [u'asd',u'abc',u'tre',u'der34',u'whatever'].
By the way, it can happen that the column of unicodes contains a nan or a u''.
Any suggestions? I know I can do str(df['column'].iloc[0]).split(',') and manually add a new column, or do something trickier, but I was looking for something more Pythonic.
Answered by foebu
This solution seems to work:
df['Column'] = df['Column'].astype(str).str.split(',')
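One caveat with the line above: astype(str) has typically turned a nan into the literal string 'nan' before the split, although the exact behavior may differ between pandas versions. A minimal sketch of a variant that skips astype(str), assuming Python 3 (where every string is already unicode) and a hypothetical sample frame that includes the nan and empty-string cases from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data covering the three cases mentioned in the question.
df = pd.DataFrame({'Column': ['asd,abc,tre,der34,whatever', np.nan, '']})

# Splitting through the .str accessor alone leaves missing values as NaN
# instead of converting them to text first.
split_column = df['Column'].str.split(',')

print(split_column.tolist())
```

Here the nan row stays missing and the empty string becomes a list holding one empty string, so both cases may still need explicit handling afterwards.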
Answered by rick debbout
This should work. If the column contains nan or an empty string, you'd have to handle those cases however you see fit.
In [1]: [str(col) for col in u'asd,abc,tre,der34,whatever'.split(',')]
Out[1]: ['asd', 'abc', 'tre', 'der34', 'whatever']
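The list comprehension above handles a single value. One possible way to cover the nan and empty-string cases is to wrap it in a small helper. This is only a sketch: split_cell is a hypothetical name, and returning an empty list for missing or empty cells is an assumption, not something the answer specifies.

```python
import math


def split_cell(value, sep=','):
    """Hypothetical helper: split one cell into a list of strings."""
    # Treat None and float NaN as missing and return an empty list (assumed policy).
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return []
    # An empty string would otherwise split into [''], so map it to [] as well.
    if value == '':
        return []
    return [str(part) for part in value.split(sep)]


# Sample cells mirroring the cases in the question.
cells = [u'asd,abc,tre,der34,whatever', float('nan'), u'']
result = [split_cell(cell) for cell in cells]
print(result)
```

With a dataframe, the same helper could be passed to the column's apply method, for example df['Column'].apply(split_cell).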

