Pandas Merge row data with multiple values to Python list for a column
Original question: http://stackoverflow.com/questions/46275765/
Content from StackOverflow, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Kalpish Singhal
I have a DataFrame that looks like this:
DATA
*id*, *name*, *URL*, *Type*
2, birth_france_by_region, http://abc. com, T1
2, birth_france_by_region, http://pt. python, T2
3, long_lat, http://abc. com, T3
3, long_lat, http://pqur. com, T1
4, random_time_series, http://sadsdc. com, T2
4, random_time_series, http://sadcadf. com, T3
5, birth_names, http://google. com, T1
5, birth_names, http://helloworld. com, T2
5, birth_names, http://hu. com, T3
I want to merge the rows that share the same id, so that URL and Type each become a list of the corresponding values. The final output should look like this:
*id*, *name*, *URL*, *Type*
2, birth_france_by_region, [http://abc. com, http://pt. python], [T1, T2]
3, long_lat, [http://abc. com, http://pqur. com], [T3, T1]
4, random_time_series, [http://sadsdc. com, http://sadcadf. com], [T2, T3]
5, birth_names, [http://google. com, http://helloworld. com, http://hu. com], [T1, T2, T3]
Answered by jezrael
I think you need groupby, aggregating each group into a tuple and then converting it to a list:
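# Aggregate each group into a tuple first, then turn every cell into a list.
# Note: pandas 2.1+ deprecates applymap in favor of DataFrame.map.
df = df.groupby(['id','name']).agg(lambda x: tuple(x)).applymap(list).reset_index()
print(df)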
id name \
0 2 birth_france_by_region
1 3 long_lat
2 4 random_time_series
3 5 birth_names
URL Type
0 [http://abc.cm, http://pt.python] [T1, T2]
1 [http://abc.cm, http://pqur.com] [T3, T1]
2 [http://sadsdc.com, http://sadcadf.com] [T2, T3]
3 [http://google.;com, http://helloworld.com, ht... [T1, T2, T3]
The intermediate tuple is needed because, in pandas version 0.20.3, aggregating directly to lists raises an error:
df = df.groupby(['id','name']).agg(lambda x: x.tolist())
ValueError: Function does not reduce
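For comparison, more recent pandas releases generally accept a direct list aggregation without this error. The following is a minimal sketch, assuming a reasonably current pandas version; the frame name `sample` is hypothetical and holds only a subset of the question's data.

```python
import pandas as pd

# Hypothetical sample frame mirroring part of the question's data
sample = pd.DataFrame({
    "id": [2, 2, 3, 3],
    "name": ["birth_france_by_region", "birth_france_by_region",
             "long_lat", "long_lat"],
    "URL": ["http://abc. com", "http://pt. python",
            "http://abc. com", "http://pqur. com"],
    "Type": ["T1", "T2", "T3", "T1"],
})

# Collect every non-key column into a Python list per (id, name) group,
# then move the group keys back into ordinary columns
merged = sample.groupby(["id", "name"]).agg(list).reset_index()
print(merged)
```

If this raises "Function does not reduce" on an older installation, the tuple-then-list approach above remains the fallback.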
Answered by Laurent
This gives the expected result for the "URL" column:
test.groupby(["id", "name"])['URL'].apply(list)
id name
2 birth_france_by_region [http://abc. com, http://pt. python]
3 long_lat [http://abc. com, http://pqur. com]
4 random_time_series [http://sadsdc. com, http://sadcadf. com]
5 birth_names [http://google. com, http://helloworld. com, h...
However, I couldn't find a single-step solution that handles both the URL and Type columns.
I would suggest doing it in two steps:
temp_table1 = test.groupby(["id", "name"])['URL'].apply(list)
temp_table2 = test.groupby(["id", "name"])['Type'].apply(list)
- Merge temp_table1 and temp_table2
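The merge step is not spelled out in the answer. One possible way to do it is sketched below, assuming `pd.concat` along the columns is an acceptable way to combine the two grouped Series (the answer does not say which method was intended). The `test` frame here is a hypothetical subset of the question's data.

```python
import pandas as pd

# Hypothetical subset of the question's data, named `test` as in the answer
test = pd.DataFrame({
    "id": [2, 2, 3, 3],
    "name": ["birth_france_by_region", "birth_france_by_region",
             "long_lat", "long_lat"],
    "URL": ["http://abc. com", "http://pt. python",
            "http://abc. com", "http://pqur. com"],
    "Type": ["T1", "T2", "T3", "T1"],
})

# Step 1: build one Series of lists per column
temp_table1 = test.groupby(["id", "name"])["URL"].apply(list)
temp_table2 = test.groupby(["id", "name"])["Type"].apply(list)

# Step 2: both Series share the same (id, name) index, so concatenating
# along axis=1 lines them up side by side
result = pd.concat([temp_table1, temp_table2], axis=1).reset_index()
print(result)
```

Because both Series come from the same groupby keys, the alignment is on the index, so no explicit join key is needed in this sketch.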