pandas: Convert a pandas DataFrame to UTF-8

Question

Asked by ADITYA KUMAR

How do I convert a pandas DataFrame to Unicode?

`messages=pandas.read_csv('data/SMSSpamCollection',sep='\t',quoting=csv.QUOTE_NONE,names=["label", "message"])
def split_into_tokens(message):
  message = unicode(message, 'utf8')  # convert bytes into proper unicode
  return TextBlob(message).words


messages.head().apply(split_into_tokens(messages))`

It gives this error:

Traceback (most recent call last):
File "minor.py", line 46, in <module>
messages.head().apply(split_into_tokens(messages))
File "minor.py", line 42, in split_into_tokens
message = unicode(message, 'utf8')  # convert bytes into proper unicode
TypeError: coercing to Unicode: need string or buffer, DataFrame found

Answer 1

Accepted answer by sandepp

Change the code

messages.head().apply(split_into_tokens(messages))

to

messages.head().apply(split_into_tokens)

When you use apply, pass the function itself rather than calling it. Your code calls split_into_tokens(messages) first, which hands the whole DataFrame to unicode() and raises the error at execution.
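To make the difference concrete, here is a minimal sketch in Python 3. It rests on assumptions that are not in the original post: the two-row DataFrame is made-up sample data standing in for the SMSSpamCollection file, and a plain whitespace split stands in for TextBlob(message).words. It also applies the function to the message column only, because DataFrame.apply passes whole columns (Series) to the function rather than single values.

```python
import pandas as pd

# Hypothetical sample data standing in for the SMSSpamCollection file.
messages = pd.DataFrame({
    "label": ["ham", "spam"],
    "message": [b"Hello there", b"WIN a prize now"],
})


def split_into_tokens(message):
    # Decode bytes to text if needed; str values pass through unchanged.
    if isinstance(message, bytes):
        message = message.decode("utf8")
    # Whitespace split is a stand-in for TextBlob(message).words.
    return message.split()


# Pass the function object; apply calls it once per value in the column.
tokens = messages["message"].head().apply(split_into_tokens)
print(tokens)
```

Writing split_into_tokens(messages) inside apply would instead run the function once on the whole DataFrame before apply is ever called, which is what the traceback in the question shows.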

Answer 2

Answer by jason m

Encoding the column with the string accessor should fix your problem:

df.x.str.encode('utf-8')

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.encode.html
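As a rough illustration of what this accessor does, the sketch below uses a made-up DataFrame with a column named x (both are assumptions for the example). Note that Series.str.encode turns text into bytes; if the column already holds bytes and you want text, which seems to be what the question is after, Series.str.decode goes the other way.

```python
import pandas as pd

# Hypothetical DataFrame with a text column named x.
df = pd.DataFrame({"x": ["café", "naïve"]})

# str -> bytes: each value is encoded as UTF-8.
encoded = df["x"].str.encode("utf-8")

# bytes -> str: decoding reverses the step above.
decoded = encoded.str.decode("utf-8")

print(encoded.tolist())
print(decoded.tolist())
```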
