Python: pretty-printing a pandas dataframe

Notice: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/18528533/

Date: 2020-08-19 10:57:15  Source: igfitidea

Pretty Printing a pandas dataframe

Tags: python, pandas, dataframe, printing

Asked by Ofer

How can I print a pandas dataframe as a nice text-based table, like the following?

+------------+---------+-------------+
| column_one | col_two |   column_3  |
+------------+---------+-------------+
|          0 |  0.0001 | ABCD        |
|          1 |  1e-005 | ABCD        |
|          2 |  1e-006 | long string |
|          3 |  1e-007 | ABCD        |
+------------+---------+-------------+

Answered by Ofer

You can use prettytable to render the table as text. The trick is to convert the data_frame to an in-memory CSV file and have prettytable read it. Here's the code:

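from io import StringIO  # Python 3; on Python 2 this was `from StringIO import StringIO`
import prettytable

# Write the dataframe to an in-memory CSV buffer
output = StringIO()
data_frame.to_csv(output)
# Rewind so prettytable reads from the start of the buffer
output.seek(0)
pt = prettytable.from_csv(output)
print(pt)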

Answered by ejrb

I used Ofer's answer for a while and found it great in most cases. Unfortunately, due to inconsistencies between pandas's to_csv and prettytable's from_csv, I had to use prettytable in a different way.

One failure case is a dataframe containing commas:

pd.DataFrame({'A': [1, 2], 'B': ['a,', 'b']})

Prettytable raises an error of the form:

Error: Could not determine delimiter

The following function handles this case:

from prettytable import PrettyTable

def format_for_print(df):
    # The leading empty header is for the index column
    table = PrettyTable([''] + list(df.columns))
    for row in df.itertuples():
        table.add_row(row)
    return str(table)

If you don't care about the index, use:

def format_for_print2(df):    
    table = PrettyTable(list(df.columns))
    for row in df.itertuples():
        table.add_row(row[1:])
    return str(table)

Answered by Romain

I've just found a great tool for that need, it is called tabulate.

It prints tabular data and works with DataFrame.

from tabulate import tabulate
import pandas as pd

df = pd.DataFrame({'col_two' : [0.0001, 1e-005 , 1e-006, 1e-007],
                   'column_3' : ['ABCD', 'ABCD', 'long string', 'ABCD']})
print(tabulate(df, headers='keys', tablefmt='psql'))

+----+-----------+-------------+
|    |   col_two | column_3    |
|----+-----------+-------------|
|  0 |    0.0001 | ABCD        |
|  1 |    1e-05  | ABCD        |
|  2 |    1e-06  | long string |
|  3 |    1e-07  | ABCD        |
+----+-----------+-------------+

Note:

To suppress row indices for all types of data, pass showindex="never" or showindex=False.

Answered by ErichBSchulz

A simple approach is to output as html, which pandas does out of the box:

df.to_html('temp.html')
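
If a file on disk is not wanted, a possible variation is to call to_html without a path, in which case it returns the markup as a string. The sketch below uses a made-up two-row dataframe; index=False is an optional argument that drops the row labels.

```python
import pandas as pd

# Placeholder data, purely for illustration
df = pd.DataFrame({"A": [1, 2], "B": ["x", "y"]})

# With no path argument, to_html returns the HTML table as a string.
html = df.to_html(index=False)
print(html)
```

The string can then be embedded in a larger page or written out manually.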

Answered by jon

I wanted a paper printout of a dataframe, but I also wanted to add some results and comments on the same page. I worked through the answers above and could not get what I wanted. I ended up using file.write(df1.to_csv()) and file.write(",,,blah,,,,,,blah") statements to get my extras on the page. When I opened the CSV file it went straight to a spreadsheet, which printed everything in the right place and format.
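
A rough sketch of that approach follows. It writes to an in-memory buffer instead of a real file so that it can be run anywhere; `df1` holds placeholder data, and the "blah" row is the same free-form comment row as in the answer.

```python
import io

import pandas as pd

# Placeholder data standing in for the real dataframe
df1 = pd.DataFrame({"A": [1, 2], "B": [3.5, 4.5]})

buf = io.StringIO()               # stands in for an open file object
buf.write(df1.to_csv())           # the dataframe as CSV text, index included
buf.write(",,,blah,,,,,,blah\n")  # an extra row carrying comments

report = buf.getvalue()
print(report)
```

With a real file, replacing the buffer with `open("report.csv", "w")` (a hypothetical filename) should give a file that a spreadsheet program can open directly.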

Answered by Mark Andersen

If you are in Jupyter notebook, you could run the following code to interactively display the dataframe in a well formatted table.

This answer builds on the to_html('temp.html') answer above, but instead of creating a file, it displays the well formatted table directly in the notebook:

from IPython.display import display, HTML

display(HTML(df.to_html()))

Credit for this code due to example at: Show DataFrame as table in iPython Notebook

Answered by sigint

Following up on Mark's answer, if you're not using Jupyter for some reason, e.g. you want to do some quick testing on the console, you can use the DataFrame.to_string method, which works from at least pandas 0.12 (2013) onwards.

import pandas as pd

matrix = [(1, 23, 45), (789, 1, 23), (45, 678, 90)]
df = pd.DataFrame(matrix, columns=list('abc'))
print(df.to_string())

#  outputs:
#       a    b   c
#  0    1   23  45
#  1  789    1  23
#  2   45  678  90
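
to_string also takes a few formatting arguments. The sketch below, with made-up data, shows two of them: index=False to omit the row labels and float_format to control how floats are rendered. The format string is just one possible choice.

```python
import pandas as pd

# Placeholder data, purely for illustration
df = pd.DataFrame({"name": ["ABCD", "long string"],
                   "value": [0.0001, 1e-07]})

# index=False omits the row labels; float_format is a one-argument
# callable applied to every float cell.
text = df.to_string(index=False, float_format="{:.1e}".format)
print(text)
```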

Answered by cs95

pandas >= 1.0

If you want a built-in function to dump your data as GitHub-flavored markdown, you now have one. Take a look at to_markdown:

df = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=['a', 'a', 'b'])  
print(df.to_markdown()) 

|    |   A |   B |
|:---|----:|----:|
| a  |   1 |   1 |
| a  |   2 |   2 |
| b  |   3 |   3 |

Here's what that looks like on GitHub:

[image: the markdown table above as rendered on GitHub]

Note that you will still need to have the tabulate package installed.

Answered by Pafkone

Maybe you're looking for something like this:

import pandas as pd

def tableize(df):
    if not isinstance(df, pd.DataFrame):
        return
    df_columns = df.columns.tolist() 
    max_len_in_lst = lambda lst: len(sorted(lst, reverse=True, key=len)[0])
    align_center = lambda st, sz: "{0}{1}{0}".format(" "*(1+(sz-len(st))//2), st)[:sz] if len(st) < sz else st
    align_right = lambda st, sz: "{0}{1} ".format(" "*(sz-len(st)-1), st) if len(st) < sz else st
    max_col_len = max_len_in_lst(df_columns)
    max_val_len_for_col = dict([(col, max_len_in_lst(df.iloc[:,idx].astype('str'))) for idx, col in enumerate(df_columns)])
    col_sizes = dict([(col, 2 + max(max_val_len_for_col.get(col, 0), max_col_len)) for col in df_columns])
    build_hline = lambda row: '+'.join(['-' * col_sizes[col] for col in row]).join(['+', '+'])
    build_data = lambda row, align: "|".join([align(str(val), col_sizes[df_columns[idx]]) for idx, val in enumerate(row)]).join(['|', '|'])
    hline = build_hline(df_columns)
    out = [hline, build_data(df_columns, align_center), hline]
    for _, row in df.iterrows():
        out.append(build_data(row.tolist(), align_right))
    out.append(hline)
    return "\n".join(out)


df = pd.DataFrame([[1, 2, 3], [11111, 22, 333]], columns=['a', 'b', 'c'])
print(tableize(df))

Output:

+-------+----+-----+
|    a  |  b |   c |
+-------+----+-----+
|     1 |  2 |   3 |
| 11111 | 22 | 333 |
+-------+----+-----+
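
The same idea can be sketched more compactly with the standard library's str.center and str.rjust. This is a simplified variant, not the function above: `format_table` is a hypothetical helper that takes plain headers and rows rather than a dataframe, and it sizes each column by its own widest cell.

```python
def format_table(headers, rows):
    """Render headers and rows as a bordered text table."""
    # Convert every cell to a string so that widths can be measured.
    str_headers = [str(h) for h in headers]
    str_rows = [[str(cell) for cell in row] for row in rows]

    # Each column is as wide as its widest cell, header included.
    columns = zip(str_headers, *str_rows)
    widths = [max(len(cell) for cell in column) for column in columns]

    hline = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def build(cells, align):
        # Pad each cell to its column width, with one space on either side.
        padded = (" " + align(c, w) + " " for c, w in zip(cells, widths))
        return "|" + "|".join(padded) + "|"

    out = [hline, build(str_headers, str.center), hline]
    out.extend(build(row, str.rjust) for row in str_rows)  # right-align data
    out.append(hline)
    return "\n".join(out)


result = format_table(["a", "b", "c"], [[1, 2, 3], [11111, 22, 333]])
print(result)
```

For a dataframe, calling `format_table(df.columns, df.itertuples(index=False))` should work in the same way.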