Pandas DataFrames: how to wrap text with no whitespace

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/34376896/

Time: 2020-09-14 00:24:27. Source: igfitidea.

Tags: python, pandas, ipython

Asked by user1956609

I'm viewing a Pandas DataFrame in a Jupyter Notebook, and my DataFrame contains URL request strings that can be hundreds of characters long without any whitespace separating characters.

Pandas seems to wrap text in a cell only when it contains whitespace, as shown in the attached picture:

[image]

If there isn't whitespace, the string is displayed on a single line, and if there isn't enough space, my options are either to see a '...' or to set display.max_colwidth to a huge number, which leaves me with a hard-to-read table that needs a lot of scrolling.

Is there a way to force Pandas to wrap text, say, every 100 characters, regardless of whether there is whitespace?

Answer by paulo.filip3

You can set

import pandas as pd
pd.set_option('display.max_colwidth', 0)

and then each column will be just as wide as it needs to be in order to fully display its content. It will not wrap the text content of the cells, though (unless they contain spaces).

Answer by mr_snuffles

If you only need this for ad hoc, temporary display purposes in Jupyter, you can simply insert whitespace every 100 characters:

chunk_size = 100

data['new_column'] = [' '.join(val[i:i + chunk_size] for i in range(0, len(val), chunk_size)) for val in data['old_column']]
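
The same idea can be written as a small helper, which makes it easier to check on sample data. This is only a sketch: the function name insert_spaces and the sample values are hypothetical, while the column names follow the answer above.

```python
import pandas as pd


def insert_spaces(text, chunk_size=100):
    """Return text with a space inserted after every chunk_size characters."""
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return " ".join(chunks)


# Sample data, for illustration only
data = pd.DataFrame({"old_column": ["a" * 250, "short"]})
data["new_column"] = data["old_column"].map(insert_spaces)

first = data["new_column"].iloc[0]
print([len(part) for part in first.split(" ")])  # [100, 100, 50]
```

Strings shorter than chunk_size pass through unchanged, so the helper can be applied to the whole column.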

That said, this seems to be a problem in the first place because multiple features are collapsed into a single column. It's hard to say without seeing your larger dataset, but if the values all follow the same pattern, I'd strongly suggest splitting the column into multiple features (browser, browser version, OS, OS version, etc.), which will make any further work with this dataset easier.
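
As a rough sketch of what such a split could look like, the snippet below uses Series.str.extract on the user-agent string quoted elsewhere in this thread. The regular expression and the column names (product, version, platform) are assumptions chosen for this one example; real data would need a pattern matched to its actual format.

```python
import pandas as pd

df = pd.DataFrame({
    "user_agent": [
        "Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0; GomezAgent 3.0) like Gecko",
    ]
})

# Hypothetical pattern: product name, "/", version, then a parenthesised block
pattern = r"^([^/\s]+)/(\S+)\s+\(([^)]*)\)"
parts = df["user_agent"].str.extract(pattern)
parts.columns = ["product", "version", "platform"]

# Break the parenthesised block into its individual tokens
parts["platform_tokens"] = parts["platform"].str.split("; ")

print(parts.loc[0, "product"], parts.loc[0, "version"])  # Mozilla 5.0
print(parts.loc[0, "platform_tokens"])
```

Each extracted column is short enough to display without wrapping, which sidesteps the original problem.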

Answer by O.Suleiman

You can use the str.wrap method:

df['user_agent'] = df['user_agent'].str.wrap(100)  # set a maximum line width of 100
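
To see what str.wrap does with a string that has no whitespace at all, here is a small sketch on made-up data. It relies on the default behavior of the underlying textwrap module, which breaks words that are longer than the requested width.

```python
import pandas as pd

# A 250-character value with no whitespace, for illustration only
df = pd.DataFrame({"user_agent": ["a" * 250]})

wrapped = df["user_agent"].str.wrap(100)

# str.wrap separates the pieces with newline characters
lines = wrapped.iloc[0].split("\n")
print([len(line) for line in lines])  # [100, 100, 50]
```

Note that str.wrap marks the break points with newline characters; how those are displayed depends on the front end rendering the table.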

Answer by vestland

If you don't mind solving this before putting everything into a DataFrame, you can do it as described here. In your particular case, if you'd like each line to be 10 characters long, you would have:

import pandas as pd

# Input
line = 'Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0; GomezAgent 3.0) like Gecko'
n = 10

# Split
line = [line[i:i+n] for i in range(0, len(line), n)]

# The rest is easy
df = pd.DataFrame(line)
print(df)

[image]

Without the white spaces, you'll get:

[image]

By the way, the white space at the beginning of the last row appears because there are not 10 characters to fill the row, as there are in the preceding rows. In Jupyter you could remedy this by using df.style.set_properties(**{'text-align': 'left'}):

[image]

Answer by Pato Navarro

You can create a new column with the first 100 characters of the data:

data['new_column'] = [i[:100] for i in data['old_column']]
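
A vectorized variant of the same truncation is sketched below, using the .str accessor on made-up sample data. Unlike the list comprehension, it passes missing values through instead of raising an error. Keep in mind that this shortens the text rather than wrapping it.

```python
import pandas as pd

# Sample data, for illustration only; the last value is missing
data = pd.DataFrame({"old_column": ["a" * 250, "short", None]})

# Slice every string to its first 100 characters
data["new_column"] = data["old_column"].str[:100]

print(len(data["new_column"].iloc[0]))  # 100
```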