Unable to remove the unicode char from column names in Pandas under Python 2.x

Question

I have read a CSV file into a pandas DataFrame and am trying to remove the unicode prefix u from the column names, but with no luck.

fl.columns
Index([ u'time', u'contact', u'address'], dtype='object')

headers=[ 'time', 'contact', 'address']
fl=pandas.read_csv('file.csv',header=None,names=headers)

It still doesn't work:

fl.columns
Index([ u'time', u'contact', u'address'], dtype='object')

The rename doesn't work either:

fl.rename(columns=lambda x:x.replace(x,x.value.encode('ascii','ignore')),inplace=True)
fl.columns
Index([ u'time', u'contact', u'address'], dtype='object')

Can anybody please tell me why this is happening and how to fix it? Thanks.

Answer 1

Answered by paulo.filip3

If you really need to remove the u (it is only a display issue), you can use the following very dirty trick:

from pandas import compat

compat.PY3 = True

df.columns
Index(['time', 'contact', 'address'], dtype='object')
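
To illustrate why this is only a display issue, here is a minimal sketch in plain Python (no pandas; the variable names are illustrative assumptions). The u prefix is part of how Python 2 writes the repr of a unicode string, not a character stored in the name, so an ASCII-only unicode name compares equal to the plain string.

```python
# The u prefix is literal syntax / repr notation, not part of the string data.
name = u'time'   # unicode literal (also valid syntax on Python 3.3+)
plain = 'time'

same_value = (name == plain)   # True: the text content is identical
length = len(name)             # 4: no extra 'u' character is stored

print(name)                    # prints the text itself, without a prefix
```

Because the values are equal, lookups such as df['time'] should behave the same whichever way the name is displayed.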

Answer 2

Answered by elPastor

I had an issue with this today and used: df['var'] = df['var'].astype(str)

Answer 3

Answered by ashok suthar

I faced a similar issue while building an ML pipeline. The names in my features list carried the Unicode u prefix.

features

[u'Customer_id', u'Age',.....]

One way around it is the str() function: create a new list by applying str to each value.

features_new= [str(x) for x in features]

Now the features_new list will not have any Unicode chars. Let me know how it works.
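
One caveat, assuming Python 2 with its default ASCII codec: str(x) raises UnicodeEncodeError when a name contains non-ASCII characters. A hedged sketch of a more tolerant conversion follows; to_native_str is a hypothetical helper (not part of pandas), the sample names are made up for illustration, and the helper silently drops any character it cannot encode.

```python
def to_native_str(name):
    """Convert text to the interpreter's native str, dropping non-ASCII characters."""
    ascii_bytes = name.encode('ascii', 'ignore')  # unencodable characters are skipped
    if str is bytes:
        # Python 2: the native str type is a byte string
        return ascii_bytes
    # Python 3: the native str type is text, so decode back
    return ascii_bytes.decode('ascii')


features = [u'Customer_id', u'Age', u'Caf\xe9']  # sample data for illustration
features_new = [to_native_str(x) for x in features]
```

Dropping characters changes the names, so this only suits cases where the non-ASCII characters are not needed to tell columns apart.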

Answer 4

Answered by Dean Hu

Here is one way to remove Unicode from column names:

df.columns = [strip_non_ascii(x) for x in df.columns]

The following is the strip_non_ascii function used to remove Unicode:

def strip_non_ascii(string):
    ''' Returns the string without non ASCII characters'''
    stripped = (c for c in string if 0 < ord(c) < 127)
    return ''.join(stripped)
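
As a usage sketch, the function can be applied to a list of names; the function is repeated here only so the snippet runs on its own, and the sample column names are assumptions made up for illustration. Note that under Python 2, joining unicode characters yields a unicode object again, so wrapping the result in str() may still be needed if a byte string is required.

```python
def strip_non_ascii(string):
    """Return the string without non-ASCII characters."""
    return ''.join(c for c in string if 0 < ord(c) < 127)


# Hypothetical column names containing non-ASCII characters
columns = [u'time', u'cont\xe1ct', u'address\u2019']

# Same list-comprehension pattern as df.columns = [...] above
cleaned = [strip_non_ascii(name) for name in columns]
print(cleaned)
```

Characters outside the 1 to 126 range are dropped rather than transliterated, so an accented letter disappears instead of becoming its unaccented form.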
