Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/40784200/

Time: 2020-09-14 02:30:00  Source: igfitidea

Pandas: convert column to list

python pandas

Asked by Petr Petrov

I have a dataframe:

date        member_id  val
2016-06-01  2377264    14
2016-06-01  289719     6
2016-06-02  289719     12
2016-06-02  2377264    1
2016-06-03  289719     0
2016-06-04  289719     0
2016-06-05  289719     3

I need to get

member_id   val
2377264     [14, 1]
289719      [6, 12, 0, 0, 3]

Next, I want to sum the elements in each list, and if there is a 0 in the list, write it out as its own element. I mean:

member_id   val
2377264   [15]
289719    [18, 0, 0, 3]

I tried:

vals = []
print(df.groupby('member_id')['val'].apply(lambda x: vals.append(x)))

but it returns None for every value in the column. How can I fix that?

Answered by Mr. A

Try this:

1. To get the list of val for each member_id

df.groupby('member_id')['val'].apply(lambda x: list(x))

Output:

member_id
289719     [6, 12, 0, 0, 3]
2377264             [14, 1]
Name: val, dtype: object
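
For a self-contained check, the question's frame can be rebuilt by hand. The sketch below assumes the column values shown in the question and uses `apply(list)`, which does the same thing as the lambda. The original attempt produced None because `list.append` always returns None, and `apply` collects whatever the function returns.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from its table).
df = pd.DataFrame({
    "date": ["2016-06-01", "2016-06-01", "2016-06-02", "2016-06-02",
             "2016-06-03", "2016-06-04", "2016-06-05"],
    "member_id": [2377264, 289719, 289719, 2377264, 289719, 289719, 289719],
    "val": [14, 6, 12, 1, 0, 0, 3],
})

# apply(list) turns each group's values into a plain Python list.
lists = df.groupby("member_id")["val"].apply(list)
print(lists)
```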

2. To get a list of lists

df.groupby('member_id')['val'].apply(lambda x: list(x)).tolist()

Output:

[[6, 12, 0, 0, 3], [14, 1]]

3. To get a dict

df.groupby('member_id')['val'].apply(lambda x: list(x)).to_dict()

Output:

{2377264: [14, 1], 289719: [6, 12, 0, 0, 3]}
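
If only the dict is needed, iterating over the groups directly is a possible alternative. This is a sketch on a small hand-built frame with the question's member_id and val columns; the name `val_by_member` is made up for illustration.

```python
import pandas as pd

# Small frame with the question's member_id and val columns.
df = pd.DataFrame({
    "member_id": [2377264, 289719, 289719, 2377264, 289719, 289719, 289719],
    "val": [14, 6, 12, 1, 0, 0, 3],
})

# Iterating a grouped column yields (key, Series) pairs, one per member_id.
val_by_member = {key: group.tolist()
                 for key, group in df.groupby("member_id")["val"]}
print(val_by_member)
```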

4. To get the sum

df.groupby('member_id')['val'].apply(lambda x: sum(x))

Output:

member_id
289719     21
2377264    15
Name: val, dtype: int64
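
The built-in group aggregation should give the same totals without a Python-level lambda. A minimal sketch, again on a hand-built copy of the question's data:

```python
import pandas as pd

# Hand-built copy of the question's member_id and val columns.
df = pd.DataFrame({
    "member_id": [2377264, 289719, 289719, 2377264, 289719, 289719, 289719],
    "val": [14, 6, 12, 1, 0, 0, 3],
})

# GroupBy.sum() adds up val within each member_id.
totals = df.groupby("member_id")["val"].sum()
print(totals)
```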

5. To get the sum of the numbers between zeros

As per your comment, you need a list of vals in which the elements between 0s are summed. To do that, use the code below:

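def sumNumberBetweenZero(values):
    # The last element of valsum is the running sum of the current run.
    valsum = [0]
    for i in values:
        if i == 0:
            # A nonzero running sum is finished: record the zero after it.
            if valsum[-1] != 0:
                valsum.append(0)
            # Open a fresh slot for the next running sum.
            valsum.append(0)
        valsum[-1] += i
    return valsum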

5.A. To get the sums over all elements of the column

sumNumberBetweenZero(df["val"].tolist())

Output:

[33, 0, 0, 3]

5.B. To get the sums grouped by member_id

df.groupby('member_id')['val'].apply(lambda x: sumNumberBetweenZero(x))

Output:

member_id
289719     [18, 0, 0, 3]
2377264             [15]
Name: val, dtype: object

5.C. For the list given as an example

sumNumberBetweenZero([1, 2, 5, 0, 3,2, 6, 7, 45, 0, 23, 0, 0, 0, 34])

Output:

[8, 0, 63, 0, 23, 0, 0, 0, 34]
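
One caveat: with sumNumberBetweenZero as written, an input that ends in 0 (for example [5, 0]) comes back with an extra trailing 0, because a fresh running-sum slot is always opened after a zero. A possible alternative, sketched here with `itertools.groupby` under the hypothetical name `sum_runs_between_zeros`, collapses each run of nonzero values into its sum and copies zeros through unchanged. It is not part of the original answer.

```python
from itertools import groupby


def sum_runs_between_zeros(values):
    """Sum each run of nonzero values; keep every zero as its own element."""
    result = []
    # groupby splits the sequence into consecutive runs of zero / nonzero items.
    for is_zero, run in groupby(values, key=lambda v: v == 0):
        run = list(run)
        if is_zero:
            result.extend(run)        # copy the zeros through one by one
        else:
            result.append(sum(run))   # collapse the nonzero run into one total
    return result


print(sum_runs_between_zeros([6, 12, 0, 0, 3]))  # [18, 0, 0, 3]
print(sum_runs_between_zeros([5, 0]))            # [5, 0]
```

Note that this variant returns an empty list for an empty input, whereas the function above returns [0].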