Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original URL: http://stackoverflow.com/questions/34322448/

Date: 2020-09-14 00:23:16  Source: igfitidea

Pretty printing newlines inside a string in a Pandas DataFrame

Tags: python, string, python-3.x, pandas, printing

Asked by shadowtalker

I have a Pandas DataFrame in which one of the columns contains string elements, and those string elements contain newlines that I would like to print as actual line breaks. But they just appear as \n in the output.

That is, I want to print this:

  pos     bidder
0   1
1   2
2   3  <- alice
       <- bob
3   4

but this is what I get:

  pos            bidder
0   1
1   2
2   3  <- alice\n<- bob
3   4

How can I accomplish what I want? Can I use a DataFrame, or will I have to revert to manually printing padded columns one row at a time?

Here's what I have so far:

import pandas as pd

n = 4
output = pd.DataFrame({
    'pos': range(1, n+1),
    'bidder': [''] * n
})
bids = {'alice': 3, 'bob': 3}
used_pos = []
for bidder, pos in bids.items():
    row = pos - 1  # row labels start at 0, positions start at 1
    if pos in used_pos:
        # .loc replaces .ix, which was removed in pandas 1.0
        arrow = output.loc[row, 'bidder']
        output.loc[row, 'bidder'] = arrow + "\n<- %s" % bidder
    else:
        output.loc[row, 'bidder'] = "<- %s" % bidder
        used_pos.append(pos)
print(output)

Accepted answer by oystein-hr

From the pandas.DataFrame documentation:

Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

So you can't have a row without an index. A newline "\n" inside a cell will not start a new line when the DataFrame is printed.
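
The escaping can be checked directly. The following is a minimal sketch, not part of the original answer; it assumes an object-dtype string column (forced here with dtype=object) and only inspects the text that to_string() produces:

```python
import pandas as pd

# dtype=object is an assumption: it keeps the cells as plain Python strings.
df = pd.DataFrame({"bidder": ["<- alice\n<- bob"]}, dtype=object)

text = df.to_string()
# The rendered text holds a backslash followed by 'n' inside the cell,
# so the whole table is still one header line plus one data line.
print(text)
print("\\n" in text)
print(len(text.splitlines()))
```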

You could overwrite 'pos' with an empty value, and output the next 'bidder' on the next row. But then the index and 'pos' would drift further apart every time you do that. For example:

  pos    bidder
0   1          
1   2          
2   3  <- alice
3        <- bob
4   5   

So if a bidder called 'frank' had 4 as value, it would overwrite 'bob'. This would cause more problems as you add more bidders. It is probably possible to use a DataFrame and write code to work around this issue, but it is probably worth looking into other solutions.

Here is the code to produce the output structure above.

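import pandas as pd

n = 5
output = pd.DataFrame({'pos': range(1, n + 1),
                      'bidder': [''] * n},
                      columns=['pos', 'bidder'])
# Use object dtype so that a 'pos' cell can later hold an empty string
output['pos'] = output['pos'].astype(object)
bids = {'alice': 3, 'bob': 3}
used_pos = []
for bidder, pos in bids.items():
    if pos in used_pos:
        # Position already taken: write this bidder on the next row and blank its 'pos'
        # (.loc replaces .ix, which was removed in pandas 1.0)
        output.loc[pos, 'bidder'] = "<- %s" % bidder
        output.loc[pos, 'pos'] = ''
    else:
        # Row labels start at 0, so position `pos` lives on row `pos - 1`
        output.loc[pos - 1, 'bidder'] = "<- %s" % bidder
        used_pos.append(pos)
print(output)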
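
One possible way around the overwriting problem described above is to build the rows first and only then create the DataFrame, so that an extra bidder adds a row instead of taking over the next position. This is a rough sketch under that assumption, not the answer's own method; build_bid_table is a hypothetical helper name, and 'pos' is stored as text so that continuation rows can be left blank:

```python
import pandas as pd


def build_bid_table(bids, n):
    """Return one row per position, plus an extra row for each additional bidder."""
    # Group bidders by the position they bid on, keeping insertion order.
    by_pos = {}
    for bidder, pos in bids.items():
        by_pos.setdefault(pos, []).append(bidder)

    rows = []
    for pos in range(1, n + 1):
        bidders = by_pos.get(pos, [])
        # The first row for a position carries the position number.
        first = "<- %s" % bidders[0] if bidders else ""
        rows.append({"pos": str(pos), "bidder": first})
        # Continuation rows leave 'pos' blank rather than reusing the next position.
        for extra in bidders[1:]:
            rows.append({"pos": "", "bidder": "<- %s" % extra})
    return pd.DataFrame(rows, columns=["pos", "bidder"])


table = build_bid_table({"alice": 3, "bob": 3, "frank": 4}, n=4)
print(table.to_string(index=False))
```

With this layout 'frank' keeps his own row for position 4, at the cost of the row index no longer matching 'pos'.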

Edit:

Another option is to restructure the data and output. You could have pos as columns, and create a new row for each key/person in the data. The code example below prints the DataFrame with NaN values replaced with an empty string.

import pandas as pd

data = {'johnny\nnewline': 2, 'alice': 3, 'bob': 3,
        'frank': 4, 'lisa': 1, 'tom': 8}
n = range(1, max(data.values()) + 1)

# Create DataFrame with columns = pos
output = pd.DataFrame(columns=n, index=[])

# Populate DataFrame with rows
for index, (bidder, pos) in enumerate(data.items()):
    output.loc[index, pos] = bidder

# Print the DataFrame and remove NaN to make it easier to read.
print(output.fillna(''))

# Fetch and print every element in column 2 (rows without a value there print as nan)
for index in output.index:
    print(output.loc[index, 2])

It depends what you want to do with the data though. Good luck :)

Answer by unsorted

If you're trying to do this in an IPython notebook, you can do:

from IPython.display import display, HTML

def pretty_print(df):
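    # to_html() renders a newline in a cell as the two characters backslash + n,
    # so match the escaped form "\\n" rather than real line breaks in the markup
    return display(HTML(df.to_html().replace("\\n", "<br>")))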
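
To see what the replacement does without a notebook, the HTML string can be inspected on its own. This is a small sketch under stated assumptions: newlines_to_br is a hypothetical helper name, and the column is forced to object dtype so the cells are plain Python strings:

```python
import pandas as pd


def newlines_to_br(df):
    """Hypothetical helper: return the table HTML with escaped newlines turned into <br>."""
    # to_html() writes a cell newline as backslash + n, hence the "\\n" pattern.
    return df.to_html().replace("\\n", "<br>")


# dtype=object is an assumption: it keeps the cells as plain Python strings.
df = pd.DataFrame({"pos": [3], "bidder": ["<- alice\n<- bob"]}, dtype=object)
html = newlines_to_br(df)
print(html)
```

Because only the escaped two-character sequence is replaced, the real line breaks between the HTML tags are left alone.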

Answer by yongjieyongjie

Using pandas .set_properties() and the CSS white-space property

[For use in IPython notebooks]

Another way is to use pandas's pandas.io.formats.style.Styler.set_properties() method and the CSS "white-space": "pre-wrap" property:

from IPython.display import display

# Assuming the variable df contains the relevant DataFrame
display(df.style.set_properties(**{
    'white-space': 'pre-wrap',
}))

To keep the text left-aligned, you might want to add 'text-align': 'left' as below:

from IPython.display import display

# Assuming the variable df contains the relevant DataFrame
display(df.style.set_properties(**{
    'text-align': 'left',
    'white-space': 'pre-wrap',
}))

Answer by Roger d'Amiens

Somewhat in line with unsorted's answer:

import pandas as pd

# Save the original `to_html` function to call it later
pd.DataFrame.base_to_html = pd.DataFrame.to_html
# Call it here in a controlled way
pd.DataFrame.to_html = (
    lambda df, *args, **kwargs: 
        (df.base_to_html(*args, **kwargs)
           .replace(r"\n", "<br/>"))
)

This way, you don't need to call any explicit function in Jupyter notebooks, as to_html is called internally. If you want the original function, call base_to_html (or whatever you named it).
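
One caveat with patching like this: if the cell is run twice, base_to_html ends up pointing at the wrapper itself and the call recurses. The sketch below is a guarded variant under stated assumptions, not the answer's own code; to_html_with_breaks and restore_to_html are hypothetical names, and it does not check whether the notebook's rich display goes through to_html in your pandas version:

```python
import pandas as pd

# Save the original only once, so re-running this cell does not store the wrapper itself.
if not hasattr(pd.DataFrame, "base_to_html"):
    pd.DataFrame.base_to_html = pd.DataFrame.to_html


def to_html_with_breaks(df, *args, **kwargs):
    """Hypothetical wrapper: call the saved original, then swap escaped newlines."""
    html = df.base_to_html(*args, **kwargs)
    # When a buffer is passed, to_html writes to it and returns None.
    if html is None:
        return None
    return html.replace(r"\n", "<br/>")


pd.DataFrame.to_html = to_html_with_breaks


def restore_to_html():
    """Hypothetical helper: undo the patch and drop the saved reference."""
    pd.DataFrame.to_html = pd.DataFrame.base_to_html
    del pd.DataFrame.base_to_html
```

Calling restore_to_html() puts the original method back, which can be handy when the patched output is only wanted for part of a session.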

I'm using jupyter 1.0.0, notebook 5.7.6.
