Python pandas.read_csv() with special characters (accents) in column names?

Question

Asked by farhawa

I have a CSV file that contains some data with these column names:

"PERIODE"
"IAS_brut"
"IAS_lissé"
"Incidence_Sentinelles"


I have a problem with the third one, "IAS_lissé", which is misinterpreted by the pd.read_csv() method and returned as ?.

What is that character?


Because it's causing a bug in my Flask application, is there a way to read that column another way without modifying the file?

In [1]: import pandas as pd

In [2]: pd.read_csv("Openhealth_S-Grippal.csv",delimiter=";").columns

Out[2]: Index([u'PERIODE', u'IAS_brut', u'IAS_liss?', u'Incidence_Sentinelles'], dtype='object')

Answer 1

Answered by shawnheide

You can change the encoding parameter of read_csv; see the pandas documentation. The available encodings are listed under "Standard Encodings" in the Python documentation.

I believe for your example you can use the utf-8 encoding (assuming that your language is French).

df = pd.read_csv("Openhealth_S-Grippal.csv", delimiter=";", encoding='utf-8')

Here's an example showing some sample output. All I did was make a csv file with one column, using the problem characters.


df = pd.read_csv('sample.csv', encoding='utf-8')

Output:


    IAS_lissé
0   1
1   2
2   3
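As a rough illustration of where a stray ? can come from, here is a sketch using only the standard library. It assumes, purely for illustration, that the file was saved in Latin-1; the question does not say which encoding the file actually uses.

```python
# Assumption: the header was written to disk in Latin-1 (not confirmed by the question).
raw = "IAS_lissé".encode("latin-1")

# In Latin-1, é is the single byte 0xE9, which is not valid UTF-8 on its own.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# errors="replace" substitutes U+FFFD, which many consoles display as "?".
replaced = raw.decode("utf-8", errors="replace")

# Decoding with the encoding the bytes were written in recovers the name.
recovered = raw.decode("latin-1")

print(utf8_ok, repr(replaced), recovered)
```

If the bytes decode cleanly with one encoding and not the other, that is a reasonable hint about which value to pass as encoding to read_csv.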

Answer 2

Answered by Francisco del Valle Bas

I ran into the same problem with Spanish and solved it with the "latin1" encoding:

import pandas as pd

pd.read_csv("Openhealth_S-Grippal.csv", delimiter=";", encoding='latin1')

Hope it helps!
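Since utf-8 works for some files and latin1 for others, one option is to try them in order. This is only a sketch: read_csv_with_fallback is a hypothetical helper, not part of pandas, and the default list of encodings is an assumption. Note that latin1 can decode any byte sequence, so it should come last and may silently produce wrong characters if the file is really in some other encoding.

```python
import pandas as pd

def read_csv_with_fallback(path, encodings=("utf-8", "latin1"), **kwargs):
    """Hypothetical helper: return the DataFrame from the first encoding that decodes."""
    if not encodings:
        raise ValueError("at least one encoding is required")
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc, **kwargs)
        except UnicodeDecodeError as err:
            # This encoding cannot decode the file; remember the error and try the next one.
            last_error = err
    raise last_error
```

With the file from the question, the call would look like read_csv_with_fallback("Openhealth_S-Grippal.csv", delimiter=";").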


Answer 3

Answered by pantherentheitroade

Using utf-8 didn't work for me. E.g. this piece of code:


    bla = pd.DataFrame(data = [1, 2])
    bla.to_csv('funkyNamé , things.csv')
    blabla = pd.read_csv('funkyNamé , things.csv', delimiter=";", encoding='utf-8')
    blabla

This ultimately raised: OSError: Initializing from file failed

I know you said you didn't want to modify the file. If you meant the file content rather than the filename, I would rename the file to something without an accent, read the CSV file under its new name, then rename it back to its original name.

    import os
    import pandas as pd
    originalfilepath = r'C:\Users\myself\funkyNamé , things.csv'
    originalfolder = r'C:\Users\myself'
    # os.path.join avoids the "\t" in "\tempName.csv" being read as a tab character
    temppath = os.path.join(originalfolder, 'tempName.csv')
    os.rename(originalfilepath, temppath)
    df = pd.read_csv(temppath, encoding='ISO-8859-1')
    os.rename(temppath, originalfilepath)

If you did mean "without modifying the filename", my apologies for not being helpful to you, and I hope this helps someone else.
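If renaming is not an option, an alternative that may work is to open the file with Python's built-in open() and hand the file object to read_csv, so pandas never has to resolve the accented path itself. This is a sketch under assumptions: read_csv_via_handle is a hypothetical helper, and the latin1 default is only a guess at the file's encoding.

```python
import pandas as pd

def read_csv_via_handle(path, encoding="latin1", **kwargs):
    """Hypothetical helper: let Python open the file, then pass the handle to pandas."""
    # newline="" leaves line-ending handling to the CSV parser.
    with open(path, encoding=encoding, newline="") as handle:
        return pd.read_csv(handle, **kwargs)
```

The file on disk is left untouched, including its name.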
