Pandas DataFrames: How to wrap text with no whitespace

Question

Asked by user1956609

I'm viewing a Pandas DataFrame in a Jupyter Notebook, and my DataFrame contains URL request strings that can be hundreds of characters long without any whitespace separating characters.

Pandas seems to only wrap text in a cell when there's whitespace, as shown on the attached picture:

If there isn't whitespace, the string is displayed on a single line. If there isn't enough space, my options are either to see a '...' or to set display.max_colwidth to a huge number, which leaves me with a hard-to-read table and a lot of scrolling.

Is there a way to force Pandas to wrap text, say, every 100 characters, regardless of whether there is whitespace?

Answer 1

Answered by paulo.filip3

You can set

import pandas as pd
pd.set_option('display.max_colwidth', 0)

and then each column will be just as wide as it needs to be in order to fully display its content. It will not wrap the text content of the cells, though (unless they contain spaces).
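As a hedged sketch of the same idea, the option can also be scoped to a single block with pd.option_context, so the rest of the notebook keeps its default width. This assumes pandas 1.0 or later, where None means "no limit"; the sample DataFrame and the column name url are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: one 200-character string with no whitespace
long_value = "a" * 200
df = pd.DataFrame({"url": [long_value]})

# With default settings, long cells are truncated in the text repr
default_repr = repr(df)

# Inside the context manager the column width limit is lifted
# (None = no limit, assuming pandas >= 1.0); the old value is restored on exit
with pd.option_context("display.max_colwidth", None):
    full_repr = repr(df)

print(long_value in default_repr)
print(long_value in full_repr)
```

This only controls truncation. As noted above, it does not make the notebook wrap a string that has no whitespace.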

Answer 2

Answered by mr_snuffles

If you're only in this for ad-hoc, temporary display purposes in Jupyter, you can simply insert whitespace every 100 characters:

chunk_size = 100

# Insert a space after every chunk_size characters so the notebook can wrap the text
data['new_column'] = [' '.join(val[i:i + chunk_size] for i in range(0, len(val), chunk_size)) for val in data['old_column']]
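A self-contained sketch of this chunking approach is shown below. The helper name insert_breaks and the sample data are hypothetical; only the slicing and joining logic comes from the answer.

```python
import pandas as pd

def insert_breaks(text, chunk_size=100):
    """Return text with a space inserted after every chunk_size characters."""
    return " ".join(text[i:i + chunk_size] for i in range(0, len(text), chunk_size))

# Hypothetical sample data: one long string without whitespace, one short string
data = pd.DataFrame({"old_column": ["x" * 250, "short"]})
data["new_column"] = data["old_column"].map(insert_breaks)

# Lengths of the space-separated pieces of the first value
chunk_lengths = [len(part) for part in data["new_column"].iloc[0].split(" ")]
print(chunk_lengths)
```

Strings shorter than chunk_size pass through unchanged, so the helper can be applied to a whole column without special-casing short values.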

That said, it looks like this is a problem in the first place because multiple features are collapsed into a single column. It's hard to say without seeing your larger dataset, but if the values all follow the same pattern, I'd strongly suggest splitting them out into multiple features (browser, browser version, OS, OS version, etc.), which will make any additional work with this dataset easier.

Answer 3

Answered by O.Suleiman

You can use the str.wrap method:

df['user_agent'] = df['user_agent'].str.wrap(100) #to set max line width of 100
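To illustrate what str.wrap does to a string with no whitespace, here is a minimal sketch on made-up data. It relies on str.wrap delegating to Python's textwrap, which breaks over-long words by default and joins the pieces with newline characters.

```python
import pandas as pd

# Hypothetical sample data: a 250-character value with no whitespace
df = pd.DataFrame({"user_agent": ["x" * 250]})

wrapped = df["user_agent"].str.wrap(100)

# The wrapped value is one string with newline characters between the pieces
pieces = wrapped.iloc[0].split("\n")
piece_lengths = [len(piece) for piece in pieces]
print(piece_lengths)
```

In an HTML-rendered notebook table the inserted newline behaves like other whitespace, so it gives the browser a place to break the line rather than forcing a break at that exact position.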

Answer 4

Answered by vestland

If you don't mind solving this before you put the whole thing into a DataFrame, you can do it as described here. In your particular case, if you'd like each line to be 10 characters long, you would have:

import pandas as pd

# Input
line = 'Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0; GomezAgent 3.0) like Gecko'
n = 10

# Split into chunks of n characters
line = [line[i:i+n] for i in range(0, len(line), n)]

# The rest is easy: one chunk per row
df = pd.DataFrame(line)
print(df)

Without the white spaces, you'll get:

By the way, the white space at the beginning of the last row appears because that row has fewer than 10 characters, unlike the preceding rows. In Jupyter you can remedy this with df.style.set_properties(**{'text-align': 'left'}).

Answer 5

Answered by Pato Navarro

You can create a new column with the first 100 characters of the data:

data['new_column'] = [i[:100] for i in data['old_column']]
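The same truncation can be sketched with the vectorized .str accessor. This is an alternative spelling rather than what the answer used; the sample data is made up. One practical difference is that missing values stay missing, whereas the list comprehension would raise on None.

```python
import pandas as pd

# Hypothetical sample data, including a missing value
data = pd.DataFrame({"old_column": ["x" * 250, "short", None]})

# Slice every string to at most 100 characters; missing values stay missing
data["new_column"] = data["old_column"].str[:100]

print(data["new_column"].str.len().tolist())
```

Note that truncation discards everything past the first 100 characters, so it suits a quick preview rather than a full view of each value.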
