Create and set an element of a Pandas DataFrame to a list

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/25751453/

Time: 2020-09-13 22:26:40  Source: igfitidea

python pandas dataframe

Asked by DrMisha

I have a Pandas DataFrame that I'm creating row-by-row (I know, I know, it's not Pandorable/Pythonic...). I'm creating elements using .loc like so:

output.loc[row_id, col_id]

and I'd like to set this value to an empty list, [].

output.loc[row_id, col_id] = []

Unfortunately, I get an error saying that the sizes of my keys and values do not match (Pandas thinks I'm trying to set values with, not to, an iterable).

Is there a way to do this?

Thanks!

Answered by Andy Hayden

You need to make sure of two things:

  1. there is precisely one entry for that loc,
  2. the column has dtype object (actually, on testing this seems not to be an issue).


A hacky way to do this is to use a Series with []:

In [10]: import pandas as pd

In [11]: df = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'])

In [12]: df.loc[[0], 'A'] = pd.Series([[]])

In [13]: df
Out[13]:
    A  B
0  []  2
1   3  4

pandas doesn't really want you to use [] as elements, because it's usually inefficient and makes aggregations more complicated (and un-cythonisable).

In general you don't want to build up DataFrames cell-by-cell; there is (almost?) always a better way.
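
As one illustration of such a "better way" (a minimal sketch, not taken from the original answer): collect each row as a plain dict, list-valued fields included, and build the DataFrame once at the end. The column names row_id and items and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical rows; in real code these would come from your own loop.
rows = []
for row_id in range(3):
    rows.append({"row_id": row_id, "items": []})  # a fresh list per row

rows[1]["items"].append("x")  # lists can still be filled before construction

# Build the frame in one step instead of cell-by-cell.
output = pd.DataFrame(rows).set_index("row_id")
print(output)
```

Because the lists are ordinary Python objects until the frame is built, no .loc assignment (and no key/value length check) is involved.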

Answered by Misha

The answer by MishaTeplitskiy works when the index label is 0. More generally, if you want to assign an array x to an element of a DataFrame df with row r and column c, you can use:

df.loc[[r], c] = pd.Series([x], index=[r])
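
A sketch of the general form above with a non-zero index label. The labels 'a' and 'b', the column names, and the value x are made up for illustration. Casting the column to object first is an extra precaution added here, not part of the original answer, so that recent pandas versions do not complain about changing the column's dtype.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3], "B": [2, 4]}, index=["a", "b"])
df["A"] = df["A"].astype(object)  # assumption: let the column hold lists

r, c = "b", "A"   # hypothetical row label and column name
x = [10, 20]      # the list to store in a single cell

# The Series index must match the row label so that alignment finds the value.
df.loc[[r], c] = pd.Series([x], index=[r])
print(df)
```

The one-element Series carries the row label with it, which is why the assignment no longer depends on the label being 0.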

Answered by Tan Dat

You can use df.at instead:

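import numpy as np
import pandas as pd

df = pd.DataFrame()
df['B'] = [1, 2, 3]
df['A'] = None  # creates an object-dtype column
df.at[1, 'A'] = np.array([1, 2, 3])  # .at addresses exactly one cell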

When you use df.loc, pandas assumes you are addressing a set of rows. So if you try to assign an array through df.loc, pandas tries to match each element of the array with a corresponding element selected by df.loc, hence the error.
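
A minimal sketch of the same idea applied to the original question, storing an empty list in one cell. The frame, the column names, and the row label are hypothetical; the column is created with None first so that it can hold arbitrary Python objects.

```python
import pandas as pd

output = pd.DataFrame({"B": [1, 2, 3]})
output["A"] = None               # column that can hold any Python object

row_id, col_id = 1, "A"          # hypothetical cell coordinates
output.at[row_id, col_id] = []   # .at targets exactly one cell

print(output)
```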