Write or log print output of pandas Dataframe

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/42515493/

Time: 2020-09-14 03:05:59  Source: igfitidea

python pandas dataframe unicode

Asked by Chris

I have a Dataframe, and I wish to write a few rows of it to a file and to a logger in Python 2.7. print(dataframe.iloc[0:4]) outputs a nice grid of the column headers and top 4 rows in the dataframe. However, logging.info(dataframe.iloc[0:4]) throws:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 87: ordinal not in range(128)

Here is the console output, which works either by evaluating the expression directly or via print() (note the 2 in the Area(km.2) header):

In[89]: d.iloc[0:4]    OR   print(d.iloc[0:4])
Out[89]: 
   ISO  ID_0     NAME_0  ID_1                           NAME_1    ID_2    NAME_2  Area(km.2)  Pop2001_Cen  Pop2010_Cen  HHold2010  Hhold_Size
0  ARG    12  Argentina     2  Ciudad Autónoma de Buenos Aires     NaN       NaN       203.0    2776138.0      2890151  1150134.0    2.512882
1  ARG    12  Argentina     2  Ciudad Autónoma de Buenos Aires  2001.0  Comuna 1         NaN     171975.0       205886    84468.0    2.437444
2  ARG    12  Argentina     2  Ciudad Autónoma de Buenos Aires  2002.0  Comuna 2         NaN     165494.0       157932    73156.0    2.158839
3  ARG    12  Argentina     2  Ciudad Autónoma de Buenos Aires  2003.0  Comuna 3         NaN     184015.0       187537    80489.0    2.329971

file.write(dataframe.iloc[0:4]) and similar calls raise the same error, because one of the column headers includes a non-ASCII character. I have tried all sorts of variations of decode(), encode(), etc., but cannot avoid this error.

print(d.iloc[0:4]) works, so another approach was to use print(d.iloc[0:4], file=f), but even with from __future__ import print_function I get the above ASCII encoding error.

Other ways to replicate this problem are logging.info('Area(km.2)') or 'Area(km.2)'.decode().

How can I render this dataframe?

[Edit:]

I also want to understand fundamentally how to deal with string encoding/decoding in Python 2.7. I've been hacking away at this for more time than it deserves, because this isn't the only time I've had this UnicodeDecodeError. I don't know when it will occur, and I am still just throwing fixes at the console to see what sticks, without any underlying understanding of what's going on.
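
As a rough illustration of where this kind of error comes from, here is a minimal Python 3 sketch. It assumes the problematic header contains a superscript two ('²'), which is only a guess based on the reported byte 0xc2; the string below is illustrative, not taken from the actual data. In Python 2, byte strings are implicitly decoded with the ASCII codec when they are mixed with unicode objects, and the sketch makes that decoding step explicit:

```python
# Assumption: the header contains '²' (U+00B2); this is a guess, not the real data.
text = 'Area(km²)'

# Encoding to UTF-8 turns '²' into the two bytes 0xc2 0xb2.
raw = text.encode('utf-8')
print(raw)

# Decoding those bytes as ASCII fails, because 0xc2 is outside range(128).
try:
    raw.decode('ascii')
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# Decoding with the codec that was used for encoding round-trips cleanly.
decoded = raw.decode('utf-8')
print(decoded)
```

The general habit this suggests is to decode bytes to text once at the input boundary, work with text inside the program, and encode explicitly when writing out.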

Accepted answer by Fabio Lamanna

IIUC, you can try to pass encoding='utf-8' when writing out the first n rows of the dataframe with:

df.head(n).to_csv('yourfileout.csv', encoding='utf-8')
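
A minimal sketch of this approach, assuming Python 3 and a small made-up dataframe. The column names, the data, and the helper name write_head_csv are illustrative and not from the question. It writes the first n rows with an explicit UTF-8 encoding, then reads the file back to check that the non-ASCII text survives:

```python
import os
import tempfile

import pandas as pd


def write_head_csv(df, path, n=4):
    """Write the first n rows of df to path as UTF-8 encoded CSV (hypothetical helper)."""
    df.head(n).to_csv(path, encoding='utf-8')


# Hypothetical data with non-ASCII characters in a header and in the values.
df = pd.DataFrame({
    'NAME_1': ['Ciudad Autónoma de Buenos Aires'] * 5,
    'Area(km²)': [203.0, 1.5, 2.5, 3.5, 4.5],
})

out_path = os.path.join(tempfile.mkdtemp(), 'yourfileout.csv')
write_head_csv(df, out_path, n=4)

# Read back with the same encoding; index_col=0 restores the row index.
round_trip = pd.read_csv(out_path, encoding='utf-8', index_col=0)
print(round_trip)
```

Passing the same encoding on both the write and the read side is what keeps the round trip lossless.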

Answer by gageorge

With Python 3 and the latest pandas, this worked for me:

logging.info('dataframe head - {}'.format(df.head()))
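
If the log goes to a file, one option is to give the file handler an explicit encoding so that non-ASCII headers are written predictably. The following is only a sketch under assumptions: the logger name, the log path, and the sample data are made up for illustration.

```python
import logging
import os
import tempfile

import pandas as pd

# Hypothetical data with a non-ASCII column header.
df = pd.DataFrame({
    'NAME_0': ['Argentina', 'Argentina'],
    'Area(km²)': [203.0, 171.9],
})

log_path = os.path.join(tempfile.mkdtemp(), 'frame.log')

logger = logging.getLogger('frame_demo')  # hypothetical logger name
logger.setLevel(logging.INFO)

# FileHandler accepts an encoding argument; UTF-8 can represent any header text.
handler = logging.FileHandler(log_path, encoding='utf-8')
logger.addHandler(handler)

logger.info('dataframe head - {}'.format(df.head()))

# Close the handler so the message is flushed before reading the file back.
handler.close()
logger.removeHandler(handler)

with open(log_path, encoding='utf-8') as f:
    logged = f.read()
print(logged)
```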

Answer by Supun De Silva

Improving on gageorge's answer, the following rendered better when there are more than 5 rows:

logging.info('dataframe head - {}'.format(df.to_string()))

Reference: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_string.html
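
A small sketch of the to_string() variant, assuming Python 3 and made-up data. df.head() returns only the first 5 rows by default, while to_string() renders every row. Starting the table on its own line keeps the header aligned with the rows. The logger name and the in-memory stream are illustrative choices so the result can be inspected; note that to_string() is still computed even if the INFO level is disabled, and only the %s interpolation is deferred.

```python
import io
import logging

import pandas as pd

# Hypothetical frame with more than 5 rows.
df = pd.DataFrame({
    'ID_2': range(2001, 2009),
    'NAME_2': ['Comuna {}'.format(i) for i in range(1, 9)],
})

# Log to an in-memory stream so the output can be checked afterwards.
stream = io.StringIO()
logger = logging.getLogger('to_string_demo')  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
handler = logging.StreamHandler(stream)
logger.addHandler(handler)

# The newline puts the table on its own lines, below the message prefix.
logger.info('dataframe:\n%s', df.to_string())

logger.removeHandler(handler)
captured = stream.getvalue()
print(captured)
```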
