Python pandas insert list into a cell
Original question: http://stackoverflow.com/questions/26483254/
Warning: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by ragesz
I have a list 'abc' and a dataframe 'df':
abc = ['foo', 'bar']
df =
A B
0 12 NaN
1 23 NaN
I want to insert the list into cell 1B, so I want this result:
A B
0 12 NaN
1 23 ['foo', 'bar']
How can I do that?
1) If I use this:
df.ix[1,'B'] = abc
I get the following error message:
ValueError: Must have equal len keys and value when setting with an iterable
because it tries to insert the list (that has two elements) into a row / column but not into a cell.
2) If I use this:
df.ix[1,'B'] = [abc]
then it inserts a list with a single element, the 'abc' list itself ([['foo', 'bar']]).
3) If I use this:
df.ix[1,'B'] = ', '.join(abc)
then it inserts a string ('foo, bar'), not a list.
4) If I use this:
df.ix[1,'B'] = [', '.join(abc)]
then it inserts a list, but it has only one element (['foo, bar']) instead of the two I want (['foo', 'bar']).
Thanks for your help!
EDIT
My new dataframe and the old list:
abc = ['foo', 'bar']
df2 =
A B C
0 12 NaN 'bla'
1 23 NaN 'bla bla'
Another dataframe:
df3 =
A B C D
0 12 NaN 'bla' ['item1', 'item2']
1 23 NaN 'bla bla' [11, 12, 13]
I want to insert the 'abc' list into df2.loc[1,'B'] and/or df3.loc[1,'B'].
If the dataframe's columns contain only integer values and/or NaN values and/or list values, inserting a list into a cell works perfectly. The same is true if the columns contain only string values and/or NaN values and/or list values. But if the dataframe has both integer and string columns alongside other columns, the error message appears when I use df2.loc[1,'B'] = abc or df3.loc[1,'B'] = abc.
Another dataframe:
df4 =
A B
0 'bla' NaN
1 'bla bla' NaN
These inserts work perfectly: df.loc[1,'B'] = abc or df4.loc[1,'B'] = abc.
Accepted answer by ragesz
df3.set_value(1, 'B', abc) works for any dataframe. Take care with the data type of column 'B': for example, a list cannot be inserted into a float column. In that case df['B'] = df['B'].astype(object) can help.
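For the mixed-type df3 from the question's EDIT, the same idea might look like the sketch below. It assumes a pandas version where DataFrame.at is available and uses it in place of set_value; the frame is rebuilt by hand from the question, so the exact dtypes are an assumption.

```python
import numpy as np
import pandas as pd

abc = ['foo', 'bar']

# df3 rebuilt from the question: integer, float (NaN), string and list columns together.
df3 = pd.DataFrame({
    'A': [12, 23],
    'B': [np.nan, np.nan],
    'C': ['bla', 'bla bla'],
    'D': [['item1', 'item2'], [11, 12, 13]],
})

# Column 'B' holds only NaN, so it starts out as float64 and cannot hold a list.
df3['B'] = df3['B'].astype(object)

# DataFrame.at addresses exactly one cell, so the list is stored as a single value.
df3.at[1, 'B'] = abc

print(df3)
```

Only column 'B' is cast; the other columns keep whatever dtype they had.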
Answer by Michael Hays
Since set_value has been deprecated since version 0.21.0, you should now use at. It can insert a list into a cell without raising a ValueError as loc does. I think this is because at always refers to a single value, while loc can refer to values as well as rows and columns.
df = pd.DataFrame(data={'A': [1, 2, 3], 'B': ['x', 'y', 'z']})
df.at[1, 'B'] = ['m', 'n']
df =
A B
0 1 x
1 2 [m, n]
2 3 z
You also need to make sure the column you are inserting into has dtype=object. For example:
>>> df = pd.DataFrame(data={'A': [1, 2, 3], 'B': [1,2,3]})
>>> df.dtypes
A int64
B int64
dtype: object
>>> df.at[1, 'B'] = [1, 2, 3]
ValueError: setting an array element with a sequence
>>> df['B'] = df['B'].astype('object')
>>> df.at[1, 'B'] = [1, 2, 3]
>>> df
A B
0 1 1
1 2 [1, 2, 3]
2 3 3
Answer by Ando Jurai
As mentioned in the post "pandas: how to store a list in a dataframe?", the dtypes in the dataframe may influence the results, as may whether or not the dataframe is being assigned to.
Answer by cs95
Pandas >= 0.21
set_value has been deprecated. You can now use DataFrame.at to set by label, and DataFrame.iat to set by integer position.
Setting Cell Values with at/iat
# Setup
df = pd.DataFrame({'A': [12, 23], 'B': [['a', 'b'], ['c', 'd']]})
df
A B
0 12 [a, b]
1 23 [c, d]
df.dtypes
A int64
B object
dtype: object
If you want to set the value in the second row of column "B" to some new list, use DataFrame.at:
df.at[1, 'B'] = ['m', 'n']
df
A B
0 12 [a, b]
1 23 [m, n]
You can also set by integer position using DataFrame.iat:
df.iat[1, df.columns.get_loc('B')] = ['m', 'n']
df
A B
0 12 [a, b]
1 23 [m, n]
What if I get ValueError: setting an array element with a sequence?
I'll try to reproduce this with:
df
A B
0 12 NaN
1 23 NaN
df.dtypes
A int64
B float64
dtype: object
df.at[1, 'B'] = ['m', 'n']
# ValueError: setting an array element with a sequence.
This is because column "B" is of float64 dtype, whereas lists are objects, so there's a mismatch. In this situation, you have to convert the column to object first.
df['B'] = df['B'].astype(object)
df.dtypes
A int64
B object
dtype: object
Then, it works:
df.at[1, 'B'] = ['m', 'n']
df
A B
0 12 NaN
1 23 [m, n]
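The two steps above (cast the column, then set the cell) could be wrapped in a small helper. This is only a sketch: set_cell is a hypothetical name, not part of pandas, and it assumes that casting the whole column to object is acceptable for your data.

```python
import numpy as np
import pandas as pd


def set_cell(df, row, col, value):
    """Store `value` (e.g. a list) in one cell of `df`, modifying `df` in place.

    Hypothetical helper, not a pandas API: it casts the column to object
    first when the column has any other dtype.
    """
    if df[col].dtype != object:
        df[col] = df[col].astype(object)
    # .at always targets a single cell, so the list is not spread over rows.
    df.at[row, col] = value
    return df


df = pd.DataFrame({'A': [12, 23], 'B': [np.nan, np.nan]})
set_cell(df, 1, 'B', ['m', 'n'])

print(df.dtypes)
print(df)
```

The cast only happens when needed, so a column that is already object is left as is.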
Possible, But Hacky
Even more wacky, I've found you can hack through DataFrame.loc to achieve something similar if you pass nested lists.
df.loc[1, 'B'] = [['m'], ['n'], ['o'], ['p']]
df
A B
0 12 [a, b]
1 23 [m, n, o, p]
You can read more about why this works here.
Answer by Pallavi Jindal
Quick workaround
Simply enclose the list within a new list, as done for col2 in the data frame below. This works because pandas takes the outer list (of lists) and converts it into a column as if it contained ordinary scalar items; in our case the items are lists rather than scalars.
mydict={'col1':[1,2,3],'col2':[[1, 4], [2, 5], [3, 6]]}
data=pd.DataFrame(mydict)
data
col1 col2
0 1 [1, 4]
1 2 [2, 5]
2 3 [3, 6]
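Applied to the frame from the question, the same construction-time approach might look like this. It is a sketch that assumes you can build the dataframe from scratch rather than modify an existing one.

```python
import numpy as np
import pandas as pd

abc = ['foo', 'bar']

# The outer list has one entry per row; the second entry happens to be a list.
df = pd.DataFrame({'A': [12, 23], 'B': [np.nan, abc]})

print(df)
print(df['B'].dtype)
```

Because the column mixes NaN with a list, pandas stores it with object dtype from the start, so no later cast is needed.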
Answer by Maxime Beau
I was also getting
ValueError: Must have equal len keys and value when setting with an iterable,
and using .at rather than .loc did not make any difference in my case, but enforcing the datatype of the dataframe column did the trick:
df['B'] = df['B'].astype(object)
Then I could set lists, numpy arrays and all sorts of things as single cell values in my dataframes.
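A minimal sketch of what that might look like, assuming a small example frame (not from the original answer) and using .at on a column that has already been cast to object:

```python
import numpy as np
import pandas as pd

# Example frame made up for illustration; 'B' starts as float64 because it is all NaN.
df = pd.DataFrame({'A': [12, 23, 34], 'B': [np.nan, np.nan, np.nan]})
df['B'] = df['B'].astype(object)

# Once the column is object, each cell can hold an arbitrary Python object.
df.at[0, 'B'] = ['foo', 'bar']
df.at[1, 'B'] = np.array([1, 2, 3])
df.at[2, 'B'] = {'key': 'value'}

# Collect the type of each stored cell value.
types = [type(v).__name__ for v in df['B']]
print(types)
```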

