Create and set an element of a Pandas DataFrame to a list
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/25751453/
Asked by DrMisha
I have a Pandas DataFrame that I'm creating row-by-row (I know, I know, it's not Pandorable/Pythonic...). I'm creating elements using .loc like so:
output.loc[row_id, col_id]
and I'd like to set this value to an empty list, [].
output.loc[row_id, col_id] = []
Unfortunately, I get an error saying the sizes of my keys and values do not match (Pandas thinks I'm trying to set values with, not to, an iterable).
Is there a way to do this?
Thanks!
Answered by Andy Hayden
You need to make sure of two things:
- there is precisely one entry for that loc,
- the column has dtype object (actually, on testing this seems not to be an issue).
A hacky way to do this is to use a Series with []:
In [11]: df = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'])
In [12]: df.loc[[0], 'A'] = pd.Series([[]])
In [13]: df
Out[13]:
    A  B
0  []  2
1   3  4
pandas doesn't really want you to use [] as elements, because it's usually not very efficient and it makes aggregations more complicated (and un-cythonisable).
In general, you don't want to build up DataFrames cell-by-cell; there is (almost?) always a better way.
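As one possible illustration of such a "better way" (a minimal sketch, not something from the original answer): collect the rows first and construct the DataFrame once. The records, and the column names id and tags, are made up for the example.

```python
import pandas as pd

# Hypothetical records; each dict becomes one row of the frame.
records = [
    {"id": 1, "tags": []},          # an empty list as a cell value
    {"id": 2, "tags": ["a", "b"]},
]

# Build the frame in one call instead of assigning cell by cell.
df = pd.DataFrame(records)
print(df)
```

Because the lists are already in place when the frame is built, no per-cell assignment (and no key/value size mismatch) is involved; the tags column simply ends up with object dtype.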
Answered by Misha
The answer by MishaTeplitskiy works when the index label is 0. More generally, if you want to assign an array x to an element of a DataFrame df with row r and column c, you can use:
df.loc[[r], c] = pd.Series([x], index = [r])
Answered by Tan Dat
You can use df.at instead:
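import numpy as np
import pandas as pd

df = pd.DataFrame()
df['B'] = [1, 2, 3]
df['A'] = None  # object-dtype column, so a cell can hold any Python object
df.at[1, 'A'] = np.array([1, 2, 3])  # .at targets exactly one cell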
When you use df.loc, pandas thinks you are interacting with a set of rows. So if you try to assign an array using df.loc, pandas will try to match each element of the array with a corresponding element accessed by df.loc, hence the error.
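The same idea applied to the original question (an empty list, and index labels other than 0) might look like the sketch below. It assumes a reasonably recent pandas version; the labels x, y, z and the column names are invented for the example, and the target column is created with object dtype up front.

```python
import pandas as pd

# Hypothetical frame with string index labels.
df = pd.DataFrame({"B": [1, 2, 3]}, index=["x", "y", "z"])

# Create the target column explicitly as object dtype so cells can hold lists.
df["A"] = pd.Series([None] * len(df), index=df.index, dtype=object)

# .at addresses a single cell, so the list is stored as one value.
df.at["y", "A"] = []
df.at["z", "A"] = [1, 2, 3]
print(df)
```

Since .at always refers to exactly one (row, column) pair, pandas has no reason to spread the list across several cells.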

