Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/46275765/

Date: 2020-09-14 04:28:52  Source: igfitidea

Pandas Merge row data with multiple values to Python list for a column

python list pandas dataframe merge

Asked by Kalpish Singhal

I have a DataFrame that looks like this:

DATA

*id*,             *name*,                      *URL*,                 *Type*  
    2,             birth_france_by_region,    http://abc. com,       T1 
    2,             birth_france_by_region,    http://pt. python,     T2 
    3,             long_lat,                  http://abc. com,       T3 
    3,             long_lat,                  http://pqur. com,      T1 
    4,             random_time_series,        http://sadsdc. com,    T2 
    4,             random_time_series,        http://sadcadf. com,   T3
    5,             birth_names,               http://google. com,    T1 
    5,             birth_names,               http://helloworld. com,T2 
    5,             birth_names,               http://hu. com,        T3

I want to merge the rows that share the same id, collecting the URL values into one list and the corresponding Type values into another, so the final output should look like this:

*id*, *name*,                 *URL*,                                                        *Type*
2,    birth_france_by_region, [http://abc. com, http://pt. python],                         [T1, T2]
3,    long_lat,               [http://abc. com, http://pqur. com],                          [T3, T1]
4,    random_time_series,     [http://sadsdc. com, http://sadcadf. com],                    [T2, T3]
5,    birth_names,            [http://google. com, http://helloworld. com, http://hu. com], [T1, T2, T3]

Answered by jezrael

I think you need to groupby, aggregate each group into a tuple, and then convert the tuples to lists:

df = df.groupby(['id','name']).agg(lambda x: tuple(x)).applymap(list).reset_index()

print (df)
   id                    name  \
0   2  birth_france_by_region   
1   3                long_lat   
2   4      random_time_series   
3   5             birth_names   

                                                 URL          Type  
0                 [http://abc.cm, http://pt.python]      [T1, T2]  
1                  [http://abc.cm, http://pqur.com]      [T3, T1]  
2            [http://sadsdc.com, http://sadcadf.com]      [T2, T3]  
3  [http://google.;com, http://helloworld.com, ht...  [T1, T2, T3] 

The tuple step is needed because in pandas 0.20.3 aggregating directly to lists raises an error:

df = df.groupby(['id','name']).agg(lambda x: x.tolist())

ValueError: Function does not reduce
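For comparison, here is a minimal sketch of the same grouping on a more recent pandas release. It rests on an assumption that the answer above does not state: that passing `list` directly to `agg` is accepted there rather than raising the 0.20.3 error, which is the behaviour to expect from pandas 1.x onward. The sample frame is retyped from the question, with the URLs kept exactly as posted.

```python
import pandas as pd

# Sample data retyped from the question (URLs kept as posted, spaces included).
df = pd.DataFrame({
    "id": [2, 2, 3, 3, 4, 4, 5, 5, 5],
    "name": ["birth_france_by_region"] * 2 + ["long_lat"] * 2
            + ["random_time_series"] * 2 + ["birth_names"] * 3,
    "URL": ["http://abc. com", "http://pt. python",
            "http://abc. com", "http://pqur. com",
            "http://sadsdc. com", "http://sadcadf. com",
            "http://google. com", "http://helloworld. com", "http://hu. com"],
    "Type": ["T1", "T2", "T3", "T1", "T2", "T3", "T1", "T2", "T3"],
})

# Assumption: a recent pandas where agg(list) is allowed, so the
# tuple -> list detour is not required.
merged = df.groupby(["id", "name"]).agg(list).reset_index()

print(merged)
```

If the tuple-based one-liner is kept instead, note that newer pandas releases (2.1 onward, to the best of my knowledge) deprecate `applymap` in favour of `DataFrame.map`, so that call may emit a warning there.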

Answered by Laurent

This will give you the expected result for the "URL" column:

test.groupby(["id", "name"])['URL'].apply(list)

id  name                  
2   birth_france_by_region                 [http://abc. com, http://pt. python]
3   long_lat                                [http://abc. com, http://pqur. com]
4   random_time_series                [http://sadsdc. com, http://sadcadf. com]
5   birth_names               [http://google. com, http://helloworld. com, h...

However, I couldn't find a single-step solution that handles both the URL and Type columns.

I would suggest doing it in two steps: build one grouped Series per column, then merge them:

  • temp_table1 = test.groupby(["id", "name"])['URL'].apply(list)
  • temp_table2 = test.groupby(["id", "name"])['Type'].apply(list)
  • Merge temp_table1 and temp_table2
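The steps above could be sketched as follows. The answer does not say how the merge should be done, so the use of `reset_index()` followed by `pd.merge` on the `id` and `name` keys is an assumption, as is the name `result`; the `test` frame is retyped from the question.

```python
import pandas as pd

# Sample data retyped from the question (URLs kept as posted, spaces included).
test = pd.DataFrame({
    "id": [2, 2, 3, 3, 4, 4, 5, 5, 5],
    "name": ["birth_france_by_region"] * 2 + ["long_lat"] * 2
            + ["random_time_series"] * 2 + ["birth_names"] * 3,
    "URL": ["http://abc. com", "http://pt. python",
            "http://abc. com", "http://pqur. com",
            "http://sadsdc. com", "http://sadcadf. com",
            "http://google. com", "http://helloworld. com", "http://hu. com"],
    "Type": ["T1", "T2", "T3", "T1", "T2", "T3", "T1", "T2", "T3"],
})

# Step 1: one Series of lists per column, both indexed by (id, name).
temp_table1 = test.groupby(["id", "name"])["URL"].apply(list)
temp_table2 = test.groupby(["id", "name"])["Type"].apply(list)

# Step 2 (assumed merge strategy): turn the index levels back into
# columns and join the two frames on the shared keys.
result = pd.merge(
    temp_table1.reset_index(),
    temp_table2.reset_index(),
    on=["id", "name"],
)

print(result)
```

Because both Series come from the same grouping, they share an identical index, so `pd.concat([temp_table1, temp_table2], axis=1).reset_index()` would likely be an equivalent way to combine them.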