Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/48000225/

Must have equal len keys and value when setting with an iterable

python pandas

Asked by user3806649

I have two dataframes, as follows:

leader:
    0 11
    1 8
    2 5
    3 9
    4 8
    5 6
    [6065 rows x 2 columns]

DatasetLabel:    
    Unnamed: 0      0    1  ....    7     8    9  10  11  12  
    0               A    J  ....    1     2    5 NaN NaN NaN  
    1               B    K  ....    3     4   NaN  NaN NaN NaN  

    [4095 rows x 14 columns]

In the DatasetLabel dataset, columns 0 to 6 hold labels describing the data, and columns 7 to 12 hold indexes that refer to the first column of the leader DataFrame.

I want to create a dataset in which each index in DatasetLabel is replaced by the corresponding value from the leader dataset, that is, leader.iloc[index,1].

How can I do this using Python features?

The output should look like:

 DatasetLabel:    
        Unnamed: 0      0    1  ....    7     8    9  10  11  12  
        0               A    J  ....    8     5    6 NaN NaN NaN  
        1               B    K  ....    9     8   NaN  NaN NaN NaN  
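To illustrate the lookup being asked for, here is a minimal sketch. It assumes a small leader frame built from the six sample rows shown earlier, with integer column names 0 and 1; the variable name `indexes` is hypothetical.

```python
import pandas as pd

# Sample data only: the six leader rows shown in the question.
leader = pd.DataFrame({0: [0, 1, 2, 3, 4, 5], 1: [11, 8, 5, 9, 8, 6]})

# Indexes stored in columns 7, 8 and 9 of the first DatasetLabel row.
indexes = [1, 2, 5]

# Positional lookup: rows 1, 2 and 5 of the second column.
values = leader.iloc[indexes, 1]
print(values.tolist())  # [8, 5, 6]
```

These are the values that appear in columns 7, 8 and 9 of the first row of the desired output.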

I came up with the following, but I get an error:

    for column in DatasetLabel.ix[:,8:13]:
        DatasetLabel[DatasetLabel[column].notnull ()]=leader.iloc[DatasetLabel[DatasetLabel[column].notnull ()][column].values,1]

Error:

ValueError: Must have equal len keys and value when setting with an iterable

Answered by andrew_reece

You can use apply to index into leader and exchange values with DatasetLabel, although it's not very pretty.

One issue is that Pandas won't let us index with NaN. Converting to str provides a workaround. But that creates a second issue, namely, column 9 is of type float (because NaN is float), so 5 becomes 5.0. Once it's a string, that's "5.0", which will fail to match the index values in leader. We can remove the .0, and then this solution will work - but it's a bit of a hack.
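The float issue described above can be seen in a short sketch. The Series `col` is hypothetical sample data standing in for column 9.

```python
import numpy as np
import pandas as pd

# A column holding an integer index and a missing value becomes float.
col = pd.Series([5, np.nan])
print(col.dtype)  # float64

# Converting to str turns 5.0 into the text "5.0", not "5".
as_text = col.astype(str)
print(as_text.iloc[0])  # 5.0

# Splitting on "." and keeping the first piece drops the ".0" suffix.
trimmed = as_text.str.split(".").str[0]
print(trimmed.iloc[0])  # 5
```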

With DatasetLabel as:

   Unnamed:0  0  1  7  8    9  10  11  12
0          0  A  J  1  2  5.0 NaN NaN NaN
1          1  B  K  3  4  NaN NaN NaN NaN

And leader as:

   0   1
0  0  11
1  1   8
2  2   5
3  3   9
4  4   8
5  5   6

Then:

cols = ["7","8","9","10","11","12"]
updated = DatasetLabel[cols].apply(
    lambda x: leader.loc[x.astype(str).str.split(".").str[0], 1].values, axis=1)

updated
     7    8    9  10  11  12
0  8.0  5.0  6.0 NaN NaN NaN
1  9.0  8.0  NaN NaN NaN NaN
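An alternative that avoids the string conversion is to map each column through a lookup Series. This is only a sketch and not the method used above: it assumes leader has integer column names 0 and 1, that the sample frames below match the ones shown, and the name `lookup` is hypothetical.

```python
import numpy as np
import pandas as pd

# Sample data only, rebuilt from the frames shown in this answer.
leader = pd.DataFrame({0: list(range(6)), 1: [11, 8, 5, 9, 8, 6]})
DatasetLabel = pd.DataFrame({
    "Unnamed:0": [0, 1], "0": ["A", "B"], "1": ["J", "K"],
    "7": [1, 3], "8": [2, 4], "9": [5.0, np.nan],
    "10": [np.nan, np.nan], "11": [np.nan, np.nan], "12": [np.nan, np.nan],
})
cols = ["7", "8", "9", "10", "11", "12"]

# Series keyed by leader's first column, holding the second column's values.
lookup = leader.set_index(0)[1]

# Series.map replaces each index with its value; NaN stays NaN.
updated = DatasetLabel[cols].apply(lambda column: column.map(lookup))
print(updated)
```

Because the mapping works on numeric keys, no string conversion or ".0" trimming is needed in this variant.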

Now we can concat the unmodified columns (which we'll call original) with updated:

original_cols = DatasetLabel.columns[~DatasetLabel.columns.isin(cols)]
original = DatasetLabel[original_cols]
pd.concat([original, updated], axis=1)

Output:

   Unnamed:0  0  1    7    8    9  10  11  12
0          0  A  J  8.0  5.0  6.0 NaN NaN NaN
1          1  B  K  9.0  8.0  NaN NaN NaN NaN

Note: It may be clearer to use concat here, but here's another, cleaner way of merging original and updated, using assign:

DatasetLabel.assign(**updated)
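The two merging approaches can be compared side by side in a self-contained sketch. The sample frames are rebuilt from the ones shown above, and `updated` is produced with Series.map purely to keep the example short (an assumption, not the apply call from this answer); `via_concat` and `via_assign` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Sample data only, rebuilt from the frames shown in this answer.
leader = pd.DataFrame({0: list(range(6)), 1: [11, 8, 5, 9, 8, 6]})
DatasetLabel = pd.DataFrame({
    "Unnamed:0": [0, 1], "0": ["A", "B"], "1": ["J", "K"],
    "7": [1, 3], "8": [2, 4], "9": [5.0, np.nan],
    "10": [np.nan, np.nan], "11": [np.nan, np.nan], "12": [np.nan, np.nan],
})
cols = ["7", "8", "9", "10", "11", "12"]

# Assumption: build `updated` by mapping indexes through a lookup Series.
lookup = leader.set_index(0)[1]
updated = DatasetLabel[cols].apply(lambda column: column.map(lookup))

# Approach 1: concat the untouched columns with the updated ones.
original_cols = DatasetLabel.columns[~DatasetLabel.columns.isin(cols)]
original = DatasetLabel[original_cols]
via_concat = pd.concat([original, updated], axis=1)

# Approach 2: assign overwrites the same-named columns in a copy.
via_assign = DatasetLabel.assign(**updated)

print(via_concat)
print(via_assign)
```

Note that unpacking with `**updated` relies on the column names being strings, since they are passed to assign as keyword arguments.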