pandas: convert pandas dataframe to utf8
Original question: http://stackoverflow.com/questions/42456867/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by ADITYA KUMAR
How to convert pandas dataframe to unicode?
`messages = pandas.read_csv('data/SMSSpamCollection', sep='\t', quoting=csv.QUOTE_NONE, names=["label", "message"])

def split_into_tokens(message):
    message = unicode(message, 'utf8')  # convert bytes into proper unicode
    return TextBlob(message).words

messages.head().apply(split_into_tokens(messages))`
It gives this error:
Traceback (most recent call last):
  File "minor.py", line 46, in <module>
    messages.head().apply(split_into_tokens(messages))
  File "minor.py", line 42, in split_into_tokens
    message = unicode(message, 'utf8') # convert bytes into proper unicode
TypeError: coercing to Unicode: need string or buffer, DataFrame found
Accepted answer by sandepp
Change the code
messages.head().apply(split_into_tokens(messages))
to
messages.head().apply(split_into_tokens)
When you use 'apply', pass the function itself; you do not need to call it or pass it arguments. Your code calls split_into_tokens(messages) before 'apply' runs, so the whole dataframe is handed to unicode(), which raises the error.
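A minimal sketch of the fix, under a few assumptions: it targets Python 3 (where `str` is already Unicode and the `unicode` builtin no longer exists), it uses `str.split` as a stand-in for `TextBlob(message).words`, and the two-row frame is made-up sample data. It also selects the `message` column before calling `apply`, since `DataFrame.apply` would otherwise hand the function a whole column rather than a single string.

```python
import pandas as pd


def split_into_tokens(message):
    # In Python 3 only bytes need decoding; str is already Unicode.
    if isinstance(message, bytes):
        message = message.decode("utf-8")
    # Stand-in tokenizer (assumption); the question uses TextBlob(message).words.
    return message.split()


# Hypothetical sample data shaped like the question's label/message frame.
messages = pd.DataFrame({
    "label": ["ham", "spam"],
    "message": ["Ok lar, joking", b"Free entry now"],
})

# Pass the function object itself; apply calls it once per value in the column.
tokens = messages["message"].head().apply(split_into_tokens)
print(tokens)
```

Writing `split_into_tokens(messages)` inside `apply(...)` would instead run the function once on the whole frame before `apply` is ever called, which is what the traceback shows.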
Answer by jason m
df.x.str.encode('utf-8')
Will fix your problems.
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.encode.html
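For illustration, here is a small sketch of what `Series.str.encode` does, assuming Python 3 and a made-up frame with a text column named `x` (the column name follows the answer; the values are hypothetical). Encoding turns each string into UTF-8 bytes, and `Series.str.decode` goes the other way, from bytes back to text.

```python
import pandas as pd

# Hypothetical sample data with non-ASCII characters.
df = pd.DataFrame({"x": ["café", "naïve"]})

# str -> bytes: each value becomes its UTF-8 byte representation.
encoded = df["x"].str.encode("utf-8")

# bytes -> str: decode the bytes back into text.
decoded = encoded.str.decode("utf-8")

print(encoded.tolist())
print(decoded.tolist())
```

If the column already holds text and the goal is a Unicode string, decoding (or no conversion at all) may be what is needed rather than encoding.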