Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/25351968/

Time: 2020-08-18 20:06:21  Source: igfitidea

How to display full (non-truncated) dataframe information in html when converting from pandas dataframe to html?

python html pandas

Asked by Amy

I converted a pandas dataframe to HTML output using the DataFrame.to_html function. When I save this to a separate HTML file, the file shows truncated output.

For example, in my TEXT column,

df.head(1) will show

The film was an excellent effort...

instead of

The film was an excellent effort in deconstructing the complex social sentiments that prevailed during this period.

This rendition is fine in the case of a screen-friendly format of a massive pandas dataframe, but I need an html file that will show complete tabular data contained in the dataframe, that is, something that will show the latter text element rather than the former text snippet.

How can I show the complete, non-truncated text for each element of my TEXT column in the HTML version of the data? I imagine the HTML table would have to display long cells to show the complete data, but as far as I understand, only column-width parameters can be passed to the DataFrame.to_html function.

Accepted answer by behzad.nouri

Set the display.max_colwidth option to -1 (since pandas 1.0, negative values are deprecated; pass None instead):

pd.set_option('display.max_colwidth', -1)  # pandas >= 1.0: use None instead of -1

set_option docs

For example, in IPython, we see that the information is truncated to 50 characters. Anything in excess is ellipsized.

If you set the display.max_colwidth option as above, the information is displayed in full.

Answer by user7579768

pd.set_option('display.max_columns', None)

Passing None as the second argument shows all of the columns.

Answer by Karl Adler

While pd.set_option('display.max_columns', None) sets the maximum number of columns shown, pd.set_option('display.max_colwidth', -1) sets the maximum width of each individual field (in pandas 1.0 and later, pass None instead of -1).
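The difference between the two options can be seen on a small frame. The sketch below is only an illustration: the frame, its column names (col_0 to col_9 plus TEXT), and the limits of 4 columns and 50 characters are assumptions chosen for the example, and None for max_colwidth assumes pandas 1.0 or later.

```python
import pandas as pd

long_text = (
    "The film was an excellent effort in deconstructing the complex social "
    "sentiments that prevailed during this period."
)

# Hypothetical frame: ten numeric columns plus one long text column.
data = {f"col_{i}": [i] for i in range(10)}
data["TEXT"] = [long_text]
df = pd.DataFrame(data)

# max_columns limits how many columns appear; the middle ones are elided.
with pd.option_context("display.max_columns", 4, "display.max_colwidth", None):
    few_columns = repr(df)

# max_colwidth limits how many characters of each cell appear.
with pd.option_context("display.max_columns", None, "display.max_colwidth", 50):
    short_cells = repr(df)

# Lifting both limits shows every column and every character.
with pd.option_context("display.max_columns", None, "display.max_colwidth", None):
    everything = repr(df)

print(few_columns)
print(short_cells)
print(everything)
```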

For my purposes I wrote a small helper function that prints huge data frames in full without affecting the rest of the code. It also reformats float numbers and sets the virtual display width. You may adapt it for your use cases.

import pandas as pd

def print_full(x):
    # Temporarily lift the display limits so the whole frame is printed.
    pd.set_option('display.max_rows', len(x))
    pd.set_option('display.max_columns', None)
    pd.set_option('display.width', 2000)
    pd.set_option('display.float_format', '{:20,.2f}'.format)
    pd.set_option('display.max_colwidth', None)
    try:
        print(x)
    finally:
        # Reset every option to its default, even if printing fails.
        pd.reset_option('display.max_rows')
        pd.reset_option('display.max_columns')
        pd.reset_option('display.width')
        pd.reset_option('display.float_format')
        pd.reset_option('display.max_colwidth')
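Note that the helper above resets each option to the pandas default, not to whatever value was in effect before the call. A possible variant, sketched here under the assumption that restoring the previous values is what you want, uses pd.option_context, which restores the earlier settings when the with block exits. The name print_full_ctx and the sample frame are hypothetical, and None for max_colwidth assumes pandas 1.0 or later.

```python
import pandas as pd

def print_full_ctx(x):
    """Print x without truncation; previous option values return on exit."""
    with pd.option_context(
        "display.max_rows", None,
        "display.max_columns", None,
        "display.width", 2000,
        "display.float_format", "{:20,.2f}".format,
        "display.max_colwidth", None,
    ):
        print(x)

# Hypothetical sample frame with one long cell and one float.
df = pd.DataFrame({"TEXT": ["x" * 200], "VALUE": [1234.5678]})

before = pd.get_option("display.max_colwidth")
print_full_ctx(df)
after = pd.get_option("display.max_colwidth")
```

Because the options are only changed inside the with block, code that runs after print_full_ctx sees the same display settings as before the call.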

Answer by Prabhat

For those looking to do this in Dask: I could not find a similar option in Dask itself, but if I simply set the pandas option in the same notebook, it works for Dask too.

import pandas as pd
import dask.dataframe as dd

# Setting the pandas option stops truncation for pandas and, in the same
# notebook, for Dask as well. I am not sure how Dask picks it up, but it works;
# note that head() returns a regular pandas DataFrame.
pd.set_option('display.max_colwidth', -1)  # pandas >= 1.0: use None instead of -1

train_data = dd.read_csv('./data/train.csv')
train_data.head(5)