pandas 将列表设置为熊猫数据框列中的值
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/38307489/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
set list as value in a column of a pandas dataframe
提问by ysearka
Let's say I have a dataframe df
and I would like to create a new column filled with 0, I use:
假设我有一个数据框df
,我想创建一个填充为 0 的新列,我使用:
df['new_col'] = 0
This far, no problem. But if the value I want to use is a list, it doesn't work:
到此为止,没问题。但是如果我想使用的值是一个列表,它就不起作用:
df['new_col'] = my_list
ValueError: Length of values does not match length of index
I understand why this doesn't work (pandas is trying to assign one value of the list per cell of the column), but how can we avoid this behavior? (if it isn't clear I would like every cell of my new column to contain the same predefined list)
我明白为什么这不起作用(Pandas试图为列的每个单元格分配一个列表值),但是我们如何避免这种行为?(如果不清楚,我希望新列的每个单元格都包含相同的预定义列表)
Note: I also tried: df.assign(new_col = my_list)
, same problem
注意:我也试过:df.assign(new_col = my_list)
,同样的问题
采纳答案by EdChum
You'd have to do:
你必须这样做:
df['new_col'] = [my_list] * len(df)
Example:
例子:
In [13]:
df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
df
Out[13]:
a b c
0 -0.010414 1.859791 0.184692
1 -0.818050 -0.287306 -1.390080
2 -0.054434 0.106212 1.542137
3 -0.226433 0.390355 0.437592
4 -0.204653 -2.388690 0.106218
In [17]:
df['b'] = [[234]] * len(df)
df
Out[17]:
a b c
0 -0.010414 [234] 0.184692
1 -0.818050 [234] -1.390080
2 -0.054434 [234] 1.542137
3 -0.226433 [234] 0.437592
4 -0.204653 [234] 0.106218
Note that dfs are optimised for scalar values, storing non scalar values defeats the point in my opinion as filtering, looking up, getting and setting become problematic to the point that it becomes a pain
请注意,dfs 针对标量值进行了优化,在我看来,存储非标量值违背了这一点,因为过滤、查找、获取和设置变得有问题,以至于变得痛苦