pandas: How to display Chinese characters in a pandas DataFrame?
Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/39308065/
How to display Chinese characters inside a pandas dataframe?
Asked by Daniel
I can read a CSV file in which one column contains Chinese characters (the other columns contain English text and numbers). However, the Chinese characters don't display correctly; see the photo below.
I loaded the CSV file with pd.read_csv().
Neither display(data06_16) nor data06_16.head() displays the Chinese characters correctly.
I tried adding the following lines to my .bash_profile:
export LC_ALL=zh_CN.UTF-8
export LANG=zh_CN.UTF-8
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
but it doesn't help.
I have also tried adding the encoding argument to pd.read_csv():
pd.read_csv('data.csv', encoding='utf_8')
pd.read_csv('data.csv', encoding='utf_16')
pd.read_csv('data.csv', encoding='utf_32')
None of these work.
How can I display the Chinese characters properly?
Answered by Daniel
I just remembered that the source dataset was created using encoding='GBK', so I tried again using:
data06_16 = pd.read_csv("../data/stocks1542monthly.csv", encoding="GBK")
Now, I can see all the Chinese characters.
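A minimal sketch of why this fix works, using made-up sample data in place of the asker's file (the column names and values are assumptions for illustration): bytes written as GBK are not valid UTF-8, so decoding fails until read_csv is told the real encoding.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the asker's CSV file.
text = "name,price\n中国银行,3.5\n"
raw = text.encode("gbk")  # the bytes a GBK-writing tool would save

# Decoding GBK bytes as UTF-8 raises an error for this data...
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ...while passing the real encoding to read_csv recovers the text.
df = pd.read_csv(io.BytesIO(raw), encoding="gbk")

print("decodes as UTF-8:", utf8_ok)
print(df)
```

The same idea applies to a file on disk: pass the path instead of the in-memory buffer and keep encoding set to whatever the file was actually written with.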
Thanks guys!
Answered by vlad.rad
I see three possibilities here:
1) You can try this:
import codecs
x = codecs.open("testdata.csv", "r", "utf-8")
2) Another possibility, in theory, is this:
import pandas as pd
df = pd.read_csv('testdata.csv', encoding='utf-8')  # read_csv already returns a DataFrame
3) Maybe you should convert your CSV file to UTF-8 before importing it with Python (for example in Notepad++)? Of course, that is a solution for a one-time import, not for an automated process.
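For an automated version of option 3, a small script can re-encode the file instead of Notepad++. This is only a sketch: the helper name convert_to_utf8 and the file names are hypothetical, and it assumes the source encoding (GBK here) is already known.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(src, dst, src_encoding="gbk"):
    """Rewrite a text file as UTF-8, assuming it is currently in src_encoding."""
    text = Path(src).read_text(encoding=src_encoding)
    Path(dst).write_text(text, encoding="utf-8")


# Demo on a temporary GBK file with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "testdata_gbk.csv"
    dst = Path(tmp) / "testdata_utf8.csv"
    src.write_text("name,price\n中国银行,3.5\n", encoding="gbk")

    convert_to_utf8(src, dst)
    converted = dst.read_text(encoding="utf-8")

print(converted)
```

After conversion, the file can be read with encoding='utf-8' as in options 1 and 2.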
Answered by blacksheep
Try this
df = pd.read_csv(path, engine='python', encoding='utf-8-sig')
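A note on when utf-8-sig is likely to matter: it is Python's UTF-8 codec that also skips a leading byte order mark (BOM), which some tools write at the start of UTF-8 files. It would not decode a GBK file. The sketch below uses a made-up header line to show the difference on raw bytes.

```python
import codecs

# Hypothetical CSV header saved as UTF-8 with a leading BOM.
raw = codecs.BOM_UTF8 + "名称,价格\n".encode("utf-8")

plain = raw.decode("utf-8")         # BOM survives as an invisible U+FEFF
stripped = raw.decode("utf-8-sig")  # BOM is removed during decoding

print(repr(plain[0]))     # '\ufeff'
print(repr(stripped[0]))  # '名'
```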