Python: converting a pandas DataFrame column of lists to strings

Question

Asked by Rusty Coder

I have a pandas DataFrame. One of the columns contains a list in each row. I want that column to hold a single string instead.

For example, my list ['one','two','three'] should simply become 'one, two, three'.

df['col'] = df['col'].astype(str).apply(lambda x: ', '.join(df['col'].astype(str)))

gives me ['one, two, three],['four','five','six'], where the second list is from the next row. Needless to say, with millions of rows this concatenation across rows is not only incorrect, it also exhausts my memory.

Answer 1

Answered by IanS

You should not convert to string before you transform the list. Try:

df['col'].apply(', '.join)

Also note that apply applies the function to each element of the series, so using df['col'] inside the lambda function is probably not what you want.
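
To make the difference concrete, here is a minimal sketch contrasting the per-row join with the expression from the question. The two-row frame is made up for illustration; only the column name `col` comes from the question.

```python
import pandas as pd

# Illustrative data: one list of strings per row
df = pd.DataFrame({'col': [['one', 'two', 'three'], ['four', 'five', 'six']]})

# apply hands each element (here, each list) to the function, one row at a time
per_row = df['col'].apply(', '.join)

# The question's version ignores x and joins the whole column for every row
whole_column = df['col'].astype(str).apply(
    lambda x: ', '.join(df['col'].astype(str))
)

print(per_row.tolist())
print(whole_column.tolist())
```

In the second case every row ends up holding the same string built from all rows, which is why memory use grows so quickly on large frames.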

Edit: thanks to Yakym for pointing out that there is no need for a lambda function.

Edit: as noted by Anton Protopopov, there is a native .str.join method, but it is (surprisingly) a bit slower than apply.

Answer 2

Answered by hilberts_drinking_problem

When you cast col to str with astype, you get the string representation of a Python list, brackets and all. You do not need to do that; just apply join directly:

import pandas as pd

df = pd.DataFrame({
    'A': [['a', 'b', 'c'], ['A', 'B', 'C']]
    })

# Out[8]: 
#            A
# 0  [a, b, c]
# 1  [A, B, C]

df['Joined'] = df.A.apply(', '.join)

#            A   Joined
# 0  [a, b, c]  a, b, c
# 1  [A, B, C]  A, B, C

Answer 3

Answered by Anton Protopopov

You could convert your list to str with astype(str) and then remove the ', [ and ] characters. Using @Yakym's example:

In [114]: df
Out[114]:
           A
0  [a, b, c]
1  [A, B, C]

In [115]: df.A.astype(str).str.replace(r"\[|\]|'", '', regex=True)
Out[115]:
0    a, b, c
1    A, B, C
Name: A, dtype: object
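
One caveat with this string-cleaning approach, sketched below on a made-up series: it works on the list's printed representation, so any quotes or brackets that are part of the data are stripped as well. `regex=True` is passed explicitly here on the assumption that the installed pandas may treat the pattern as a literal string by default, as newer releases do.

```python
import pandas as pd

# Illustrative data whose elements themselves contain ' and [ ]
s = pd.Series([["it's", 'a[0]']])

# Joining the list elements keeps the data untouched
joined = s.apply(', '.join)

# Cleaning the printed representation also removes characters from the data
stripped = s.astype(str).str.replace(r"\[|\]|'", '', regex=True)

print(joined[0])
print(stripped[0])
```

If the list elements can contain such characters, joining the elements directly is the safer choice.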

Timing

import pandas as pd
df = pd.DataFrame({'A': [['a', 'b', 'c'], ['A', 'B', 'C']]})
df = pd.concat([df]*1000)


In [2]: timeit df['A'].apply(', '.join)
292 μs ± 10.8 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [3]: timeit df['A'].str.join(', ')
368 μs ± 24.6 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [4]: timeit df['A'].apply(lambda x: ', '.join(x))
505 μs ± 5.74 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [5]: timeit df['A'].str.replace('\[|\]|\'', '')
2.43 ms ± 62.7 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
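
The timings above come from IPython's timeit magic. Outside IPython, a rough equivalent can be sketched with the standard-library `timeit` module, as below. The dictionary of candidates and the repeat counts are arbitrary choices for illustration, and the absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'A': [['a', 'b', 'c'], ['A', 'B', 'C']]})
df = pd.concat([df] * 1000, ignore_index=True)

# Each candidate is a zero-argument callable so timeit can call it repeatedly
candidates = {
    "apply(', '.join)": lambda: df['A'].apply(', '.join),
    "str.join(', ')": lambda: df['A'].str.join(', '),
    "apply(lambda)": lambda: df['A'].apply(lambda x: ', '.join(x)),
}

# Check that all candidates agree before comparing their speed
results = {name: fn().tolist() for name, fn in candidates.items()}

# Best of 3 repeats, 100 calls each, reported as seconds per call
timings = {
    name: min(timeit.repeat(fn, number=100, repeat=3)) / 100
    for name, fn in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.0f} microseconds per call")
```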

Answer 4

Answered by AMC

Pandas offers a method for this: Series.str.join.
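
A short sketch of how this looks in practice, using a made-up series. As far as the pandas documentation describes it, `Series.str.join` yields NaN for a row whose list contains non-string items, so the second variant below converts each item with `str` first. That variant is a suggestion for mixed data, not something the question requires.

```python
import pandas as pd

# Illustrative data: one list of strings and one list of integers
s = pd.Series([['one', 'two', 'three'], [1, 2, 3]])

# Native method: joins lists of strings, gives NaN for the integer list
native = s.str.join(', ')

# Converting each item to str first handles non-string elements too
as_text = s.apply(lambda items: ', '.join(map(str, items)))

print(native.tolist())
print(as_text.tolist())
```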
