Source: Stack Overflow, http://stackoverflow.com/questions/50512188/. The content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.

Unpack dictionary from Pandas Column

Tags: python, python-3.x, pandas, dictionary

Asked by DBa

I have a dataframe in which one column holds dictionaries. I want to unpack it into multiple columns (i.e. code and amount become separate columns, given the raw column format below). The following code worked with pandas v0.22, but with 0.23 it now raises an error:

pd.DataFrame.from_records(df.col_name.fillna(pd.Series([{'code':'not applicable'}], index=df.index)).values.tolist())

ValueError: Length of passed values is 1, index implies x
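
The message appears to come from the pd.Series constructor rather than from from_records: a one-element list is being paired with an index of a different length. The sketch below reproduces that on a small stand-in frame and shows one possible workaround that substitutes the default dictionary row by row. The frame, its index labels, and the names default, filled, and unpacked are assumptions made for illustration; they are not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame, with a non-contiguous index.
df = pd.DataFrame(
    {"dict_codes": [{"code": "xx", "amount": "10.00"}, np.nan, np.nan]},
    index=[0, 10, 11],
)
default = {"code": "not applicable"}

# A one-element list cannot be stretched across a three-label index.
try:
    pd.Series([default], index=df.index)
    constructor_failed = False
except ValueError:
    constructor_failed = True

# One possible workaround: replace non-dict entries with the default, then unpack.
filled = df["dict_codes"].apply(lambda v: v if isinstance(v, dict) else default)
unpacked = pd.DataFrame(filled.tolist(), index=filled.index)
```

Passing index=filled.index keeps the original row labels, so the unpacked columns can be joined back to df without rows shifting.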

I searched Google and Stack Overflow for hours, and none of the previously posted solutions work anymore.

Raw column format:

     dict_codes
0   {'code': 'xx', 'amount': '10.00',...
1   {'code': 'yy', 'amount': '20.00'...
2   {'code': 'bb', 'amount': '30.00'...
3   {'code': 'aa', 'amount': '40.00'...
10  {'code': 'zz', 'amount': '50.00'...
11                            NaN
12                            NaN
13                            NaN

Does anyone have any suggestions?

Thanks

Accepted answer by piRSquared

Setup

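import numpy as np
import pandas as pd

# Three rows hold dictionaries; the remaining rows are missing.
df = pd.DataFrame(dict(
    codes=[
        {'amount': 12, 'code': 'a'},
        {'amount': 19, 'code': 'x'},
        {'amount': 37, 'code': 'm'},
        np.nan,
        np.nan,
        np.nan,
    ]
))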

df

                         codes
0  {'amount': 12, 'code': 'a'}
1  {'amount': 19, 'code': 'x'}
2  {'amount': 37, 'code': 'm'}
3                          NaN
4                          NaN
5                          NaN


apply with pd.Series

Make sure to call dropna first so that the NaN rows are not passed to pd.Series.

df.codes.dropna().apply(pd.Series)

   amount code
0      12    a
1      19    x
2      37    m


df.drop(columns='codes').assign(**df.codes.dropna().apply(pd.Series))

   amount code
0    12.0    a
1    19.0    x
2    37.0    m
3     NaN  NaN
4     NaN  NaN
5     NaN  NaN
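
In the combined output, amount prints as 12.0 rather than 12. The likely reason is that the rows without a dictionary come back as NaN, which has no integer representation, so the column is upcast to float. The sketch below uses a shortened version of the setup frame and, as an optional extra that is not part of the original answer, converts the column to pandas' nullable Int64 dtype. The names unpacked and combined are assumptions.

```python
import numpy as np
import pandas as pd

# Shortened version of the setup frame: two dictionaries and one missing row.
df = pd.DataFrame(
    {"codes": [{"amount": 12, "code": "a"}, {"amount": 19, "code": "x"}, np.nan]}
)

# Unpack only the rows that hold a dictionary, then align back on the index.
unpacked = df["codes"].dropna().apply(pd.Series)
combined = df.drop(columns="codes").assign(**unpacked)

# Row 2 has no dictionary, so its amount and code are filled with NaN.
# Optional: the nullable integer dtype keeps whole numbers next to missing values.
combined["amount"] = combined["amount"].astype("Int64")
```

Note that assign(**unpacked) relies on the column names being strings, since they are passed as keyword arguments.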


tolist and from_records

Same idea but skip the apply

pd.DataFrame.from_records(df.codes.dropna().tolist())

   amount code
0      12    a
1      19    x
2      37    m


df.drop(columns='codes').assign(**pd.DataFrame.from_records(df.codes.dropna().tolist()))

   amount code
0    12.0    a
1    19.0    x
2    37.0    m
3     NaN  NaN
4     NaN  NaN
5     NaN  NaN
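
One caveat with the list-based route: the frame built from tolist() gets a fresh 0..n-1 index, and assign aligns on index labels. That lines up here because the dictionaries occupy rows 0 to 2, but it may not when the valid rows are scattered or the index is non-contiguous, as in the question's data. A minimal sketch of one way to keep the labels, using a hypothetical frame and the assumed names valid, unpacked, and result:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: a missing row sits between two dictionaries,
# and the index labels are not contiguous.
df = pd.DataFrame(
    {"codes": [{"amount": 12, "code": "a"}, np.nan, {"amount": 37, "code": "m"}]},
    index=[0, 5, 10],
)

valid = df["codes"].dropna()

# Reuse the surviving labels so each unpacked row lines up with its source row.
unpacked = pd.DataFrame(valid.tolist(), index=valid.index)

# join aligns on the index; the row labelled 5 has no dictionary and gets NaN.
result = df.drop(columns="codes").join(unpacked)
```

Here the plain DataFrame constructor is used in place of from_records; it accepts the same list of dictionaries together with an explicit index.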

Answer by user3483203

Setup

                        codes
0  {'amount': 12, 'code': 10}
1    {'amount': 3, 'code': 3}

apply with pd.Series

df.codes.apply(pd.Series)

   amount  code
0      12    10
1       3     3