Pandas: split column of lists of unequal length into multiple columns
Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use or share it, but you must keep the same license and attribute it to the original authors on Stack Overflow (not to the republisher).
Original: http://stackoverflow.com/questions/44663903/
Asked by user139014
I have a Pandas dataframe that looks like the below:
codes
1 [71020]
2 [77085]
3 [36415]
4 [99213, 99287]
5 [99233, 99233, 99233]
I'm trying to split the lists in df['codes'] into columns, like the below:
code_1 code_2 code_3
1 71020
2 77085
3 36415
4 99213 99287
5 99233 99233 99233
where columns that don't have a value (because the list was not that long) are filled with blanks or NaNs or something.
I've seen answers like this one and others similar to it, and while they work on lists of equal length, they all throw errors when I try to use the methods on lists of unequal length. Is there a good way to do this?
Answer by piRSquared
Try:
pd.DataFrame(df.codes.values.tolist()).add_prefix('code_')
code_0 code_1 code_2
0 71020 NaN NaN
1 77085 NaN NaN
2 36415 NaN NaN
3 99213 99287.0 NaN
4 99233 99233.0 99233.0
Include the index
pd.DataFrame(df.codes.values.tolist(), df.index).add_prefix('code_')
code_0 code_1 code_2
1 71020 NaN NaN
2 77085 NaN NaN
3 36415 NaN NaN
4 99213 99287.0 NaN
5 99233 99233.0 99233.0
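To try the two snippets above end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the listing in the question, so the exact construction (including the 1-based index) is an assumption rather than the asker's actual data; the variable name `wide` is only illustrative.

```python
import pandas as pd

# Rebuild the question's frame by hand (assumed from the listing in the question).
df = pd.DataFrame(
    {"codes": [[71020], [77085], [36415], [99213, 99287], [99233, 99233, 99233]]},
    index=[1, 2, 3, 4, 5],
)

# Each list becomes one row; shorter lists are padded on the right with NaN.
# Passing df.index keeps the original row labels instead of 0..n-1.
wide = pd.DataFrame(df["codes"].tolist(), index=df.index).add_prefix("code_")

print(wide)
```

Columns that contain NaN are upcast to float, which is why values such as 99287.0 show up in the output above.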
We can nail down all the formatting with this:
f = lambda x: 'code_{}'.format(x + 1)
pd.DataFrame(
    df.codes.values.tolist(),
    df.index, dtype=object
).fillna('').rename(columns=f)
code_1 code_2 code_3
1 71020
2 77085
3 36415
4 99213 99287
5 99233 99233 99233
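If missing-value markers are acceptable in place of empty strings, one possible variant (not part of the original answer) is to cast the result to pandas' nullable `Int64` dtype, so the codes stay integers instead of turning into floats or objects. The sketch below assumes the same hand-built frame as in the question; `wide` is a hypothetical name.

```python
import pandas as pd

# Frame assumed from the listing in the question.
df = pd.DataFrame(
    {"codes": [[71020], [77085], [36415], [99213, 99287], [99233, 99233, 99233]]},
    index=[1, 2, 3, 4, 5],
)

wide = pd.DataFrame(df["codes"].tolist(), index=df.index)

# Rename 0, 1, 2 to code_1, code_2, code_3 to match the layout asked for.
wide.columns = [f"code_{i + 1}" for i in wide.columns]

# Nullable integer dtype: missing cells become <NA> and the rest stay integers.
wide = wide.astype("Int64")

print(wide)
```

This keeps the columns numeric, which helps if the codes are later compared or merged on; the `fillna('')` version above is better suited to display.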
Answer by MaxU
Another solution:
In [95]: df.codes.apply(pd.Series).add_prefix('code_')
Out[95]:
code_0 code_1 code_2
1 71020.0 NaN NaN
2 77085.0 NaN NaN
3 36415.0 NaN NaN
4 99213.0 99287.0 NaN
5 99233.0 99233.0 99233.0