Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/45306988/

Date: 2020-08-19 16:54:04  Source: igfitidea

Column of lists, convert list to string as a new column

Tags: python, string, pandas, list

Asked by clg4

I have a dataframe with a column of lists which can be created with:

import pandas as pd

lists = {1: [[1, 2, 12, 6, 'ABC']], 2: [[1000, 4, 'z', 'a']]}
# create test dataframe
df = pd.DataFrame.from_dict(lists, orient='index')
df = df.rename(columns={0: 'lists'})

The dataframe df looks like:

                lists
1  [1, 2, 12, 6, ABC]
2     [1000, 4, z, a]

I need to create a new column called 'liststring' which takes every element of each list in lists and creates a string with the elements separated by commas. The elements of each list can be int, float, or string. So the result would be:

                lists    liststring
1  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
2     [1000, 4, z, a]    1000,4,z,a

I have tried various things, including the approach from "Converting a Panda DF List into a string":

df['liststring']=df.lists.apply(lambda x: ', '.join(str(x)))

but unfortunately the result takes every character and separates them with commas:

                lists                                         liststring
1  [1, 2, 12, 6, ABC]  [, 1, ,,  , 2, ,,  , 1, 2, ,,  , 6, ,,  , ', A...
2     [1000, 4, z, a]  [, 1, 0, 0, 0, ,,  , 4, ,,  , ', z, ', ,,  , '...
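The reason for this output is that str(x) turns the whole list into a single string such as "[1, 2, 12, 6, 'ABC']", and join then iterates over that string one character at a time. A minimal plain-Python sketch of the difference (the variable names are illustrative only):

```python
values = [1, 2, 12, 6, 'ABC']

# str(values) is ONE string, so join() walks over its characters.
per_char = ', '.join(str(values))

# Converting each element separately gives join() one string per item.
per_item = ','.join(str(v) for v in values)

print(per_char)
print(per_item)  # 1,2,12,6,ABC
```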

Thanks in advance for the help!

Answered by cs95

List Comprehension

If performance is important, I strongly recommend this solution and I can explain why.

df['liststring'] = [','.join(map(str, l)) for l in df['lists']]
df

                lists    liststring
0  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
1     [1000, 4, z, a]    1000,4,z,a

You can extend this to more complicated use cases using a function.

import numpy as np

def try_join(l):
    try:
        return ','.join(map(str, l))
    except TypeError:
        # l is not iterable (e.g. a missing value)
        return np.nan

df['liststring'] = [try_join(l) for l in df['lists']]
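As a rough illustration of what try_join buys you, the sketch below runs it on a column that also contains missing values. The sample data is made up for this example; non-iterable entries such as NaN or None raise TypeError inside join and come back as NaN.

```python
import numpy as np
import pandas as pd

def try_join(l):
    try:
        return ','.join(map(str, l))
    except TypeError:
        # Raised when l is not iterable (e.g. NaN or None).
        return np.nan

# Hypothetical data: one proper list and two missing entries.
df = pd.DataFrame({'lists': [[1, 2, 'ABC'], np.nan, None]})
df['liststring'] = [try_join(l) for l in df['lists']]
print(df)
```

Note that a plain string is itself iterable, so a stray string entry would be joined character by character rather than caught by the except clause.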


Series.apply / Series.agg with ','.join

You need to convert your list items to strings first; that's where map comes in handy.

df['liststring'] = df['lists'].apply(lambda x: ','.join(map(str, x)))

Or,

df['liststring'] = df['lists'].agg(lambda x: ','.join(map(str, x)))

df
                lists    liststring
0  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
1     [1000, 4, z, a]    1000,4,z,a
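To check the earlier performance claim for the list comprehension against Series.apply on your own machine, something like the following could be used. This is only a sketch: the data is synthetic, the helper names are invented for the example, and timings will vary with hardware and pandas version, so no numbers are claimed.

```python
import timeit

import pandas as pd

# Synthetic data, assumed for illustration only: 10,000 short mixed lists.
df = pd.DataFrame({'lists': [[1, 2, 12, 6, 'ABC'], [1000, 4, 'z', 'a']] * 5_000})

def with_comprehension():
    return [','.join(map(str, l)) for l in df['lists']]

def with_apply():
    return df['lists'].apply(lambda x: ','.join(map(str, x)))

# Both approaches should produce identical strings.
assert with_comprehension() == with_apply().tolist()

print('comprehension:', timeit.timeit(with_comprehension, number=5))
print('apply:        ', timeit.timeit(with_apply, number=5))
```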


pd.DataFrame constructor with DataFrame.agg

A non-loopy/non-lambda solution.

df['liststring'] = (
    pd.DataFrame(df.lists.tolist())
      .fillna('')
      .astype(str)
      .agg(','.join, axis=1)
      .str.strip(',')
)

df
                lists    liststring
0  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
1     [1000, 4, z, a]    1000,4,z,a
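It may help to see the chained calls one step at a time. The sketch below rebuilds the question's data and names each intermediate result; the names wide and as_text are introduced here for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'lists': [[1, 2, 12, 6, 'ABC'], [1000, 4, 'z', 'a']]})

# Step 1: one column per list position; shorter lists are padded
# with missing values.
wide = pd.DataFrame(df['lists'].tolist())

# Step 2: replace the padding with '' and make every cell a string.
as_text = wide.fillna('').astype(str)

# Step 3: join each row, then strip the trailing commas left by the padding.
df['liststring'] = as_text.agg(','.join, axis=1).str.strip(',')
print(df)
```

One thing to watch for: when a purely numeric column gets padded with NaN, pandas upcasts it to float, so an integer may come out as, for example, 2.0. Also, strip(',') removes commas that belong to genuinely empty leading or trailing elements, so it is worth checking the result on your own data.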

Answered by Scott Boston

One way you could do it is to use list comprehension, str, and join:

df['liststring'] = df.lists.apply(lambda x: ', '.join([str(i) for i in x]))

Output:

                lists        liststring
1  [1, 2, 12, 6, ABC]  1, 2, 12, 6, ABC
2     [1000, 4, z, a]     1000, 4, z, a

Answered by Souha

None of these worked for me (I was dealing with text data). What worked for me is this:

    df['liststring'] = df['lists'].apply(lambda x: x[1:-1])
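This slicing only makes sense if each cell is already a string such as "[1, 2, 12]" rather than a Python list. That is an assumption about the data in this answer (for example, lists read back from a CSV file). A sketch under that assumption follows; parsing with ast.literal_eval is an alternative suggested here, not part of the original answer.

```python
import ast

import pandas as pd

# Assumed input: string representations of lists, not list objects.
df = pd.DataFrame({'lists': ["[1, 2, 12, 6, 'ABC']", "[1000, 4, 'z', 'a']"]})

# Dropping the first and last characters removes the brackets,
# but the quotes and spaces inside the string remain.
df['sliced'] = df['lists'].apply(lambda x: x[1:-1])

# Alternative: parse each string into a real list first, then join.
df['liststring'] = df['lists'].apply(
    lambda x: ','.join(map(str, ast.literal_eval(x)))
)
print(df[['sliced', 'liststring']])
```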

Answered by Memin

The previous explanations are good and quite straightforward. But suppose you want to convert multiple columns to a delimiter-separated string format. Without handling each column individually, you can apply the following function to the dataframe; any value that is a list will be converted to a string.

def list2Str(lst):
    if isinstance(lst, list):  # apply conversion to list values only
        # map(str, ...) lets join handle non-string elements such as ints
        return ";".join(map(str, lst))
    else:
        return lst

df.apply(lambda x: [list2Str(i) for i in x])

Of course, if you want to apply it only to certain columns, you can select a subset of columns as follows:

df[['col1',...,'col2']].apply(lambda x: [list2Str(i) for i in x])
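For a concrete, runnable version of the column-subset idea, here is a small sketch. The frame, the column names a, b and c, and the helper name list_to_str are all hypothetical; it uses Series.map per column instead of a list comprehension so that the original index is kept.

```python
import pandas as pd

def list_to_str(value, sep=';'):
    # Join list values; str() guards against non-string elements.
    if isinstance(value, list):
        return sep.join(map(str, value))
    return value

# Hypothetical frame: two list columns and one ordinary column.
df = pd.DataFrame({
    'a': [[1, 2], ['x', 'y', 'z']],
    'b': [['p'], [3.5, 'q']],
    'c': [10, 20],
})

cols = ['a', 'b']  # only convert the selected columns
df[cols] = df[cols].apply(lambda col: col.map(list_to_str))
print(df)
```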