Python: Must have equal len keys and value when setting with an iterable

Question

Asked by user3806649

I have two DataFrames, as follows:

leader:
    0 11
    1 8
    2 5
    3 9
    4 8
    5 6
    [6065 rows x 2 columns]

DatasetLabel:    
    Unnamed: 0      0    1  ....    7     8    9  10  11  12  
    0               A    J  ....    1     2    5 NaN NaN NaN  
    1               B    K  ....    3     4   NaN  NaN NaN NaN  

    [4095 rows x 14 columns]

In DatasetLabel, the columns named 0 to 6 hold descriptive data, and the columns named 7 to 12 hold indexes that refer to the first column of the leader DataFrame.

I want to create a dataset in which each index in DatasetLabel is replaced by the corresponding value from the leader dataset, that is, leader.iloc[index,1].

How can I do this using Python features?

The output should look like:

 DatasetLabel:    
        Unnamed: 0      0    1  ....    7     8    9  10  11  12  
        0               A    J  ....    8     5    6 NaN NaN NaN  
        1               B    K  ....    9     8   NaN  NaN NaN NaN

I came up with the following, but I get an error:

    for column in DatasetLabel.ix[:,8:13]:
        DatasetLabel[DatasetLabel[column].notnull ()]=leader.iloc[DatasetLabel[DatasetLabel[column].notnull ()][column].values,1]

Error:

ValueError: Must have equal len keys and value when setting with an iterable
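For context, pandas generally raises this ValueError when the number of values being assigned does not match the number of cells selected. The sketch below reproduces that kind of mismatch on a small hypothetical frame (the column names a and b are made up for illustration and are not the asker's data); the exact wording of the message may differ between pandas versions.

```python
import pandas as pd

# Hypothetical mixed-dtype frame, not the data from the question.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

try:
    # Two cells are selected (row 0, columns a and b) but three values are given.
    df.loc[0, ["a", "b"]] = [10, "z", 99]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The loop in the question appears to hit the same kind of mismatch: it selects whole rows of DatasetLabel but supplies a one-dimensional array of looked-up values.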

Answer 1

Answered by andrew_reece

You can use apply to index into leader and exchange values with DatasetLabel, although it's not very pretty.

One issue is that Pandas won't let us index with NaN. Converting to str provides a workaround. But that creates a second issue: column 9 is of type float (because NaN is a float), so 5 becomes 5.0. Once it's a string, that's "5.0", which will fail to match the index values in leader. We can remove the .0, and then this solution will work, but it's a bit of a hack.
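A minimal sketch of the float-to-string issue described above, using a made-up two-element Series rather than the real column. How the missing value is rendered after astype(str) can depend on the pandas version, so only the first element is discussed here.

```python
import numpy as np
import pandas as pd

# A column holding one index value and one missing value is stored as float64.
col = pd.Series([5, np.nan])

# Converting to text turns the first element into "5.0" rather than "5".
as_text = col.astype(str)

# Splitting on "." and keeping the first piece strips the trailing ".0".
trimmed = as_text.str.split(".").str[0]

print(col.dtype)
print(as_text.iloc[0], trimmed.iloc[0])
```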

With DatasetLabel as:

       Unnamed:0  0  1  7  8    9  10  11  12
    0          0  A  J  1  2  5.0 NaN NaN NaN
    1          1  B  K  3  4  NaN NaN NaN NaN

And leader as:

Then:

    cols = ["7","8","9","10","11","12"]
    updated = DatasetLabel[cols].apply(
        lambda x: leader.loc[x.astype(str).str.split(".").str[0], 1].values, axis=1)

    updated
         7    8    9  10  11  12
    0  8.0  5.0  6.0 NaN NaN NaN
    1  9.0  8.0  NaN NaN NaN NaN
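As a possible alternative to the string workaround, Series.map can do the lookup directly, since it leaves NaN entries as NaN. This is a sketch, not the answer's method. It assumes that leader has integer column labels 0 and 1, that column 0 holds the index values, and that the DatasetLabel column labels are strings; the frames below are rebuilt by hand from the rows shown in the question.

```python
import numpy as np
import pandas as pd

# Assumed shape of leader: column 0 is the index value, column 1 is the value to fetch.
leader = pd.DataFrame({0: [0, 1, 2, 3, 4, 5], 1: [11, 8, 5, 9, 8, 6]})

# The two DatasetLabel rows shown in the question (string column labels assumed).
DatasetLabel = pd.DataFrame({
    "Unnamed:0": [0, 1], "0": ["A", "B"], "1": ["J", "K"],
    "7": [1, 3], "8": [2, 4], "9": [5.0, np.nan],
    "10": [np.nan, np.nan], "11": [np.nan, np.nan], "12": [np.nan, np.nan],
})

cols = ["7", "8", "9", "10", "11", "12"]

# Lookup Series: index value -> leader value.
lookup = leader.set_index(0)[1]

# Map each index column through the lookup; entries with no match stay NaN.
updated = DatasetLabel[cols].apply(lambda col: col.map(lookup))

print(updated)
```

Because map works column by column, no conversion to str is needed and the ".0" issue does not arise.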

Now we can concat the unmodified columns (which we'll call original) with updated:

    original_cols = DatasetLabel.columns[~DatasetLabel.columns.isin(cols)]
    original = DatasetLabel[original_cols]
    pd.concat([original, updated], axis=1)

Output:

       Unnamed:0  0  1    7    8    9  10  11  12
    0          0  A  J  8.0  5.0  6.0 NaN NaN NaN
    1          1  B  K  9.0  8.0  NaN NaN NaN NaN

Note: It may be clearer to use concat here, but here's another, cleaner way of merging original and updated, using assign:

    DatasetLabel.assign(**updated)
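To illustrate that the two merging approaches can give the same frame, here is a small sketch on hypothetical data (a cut-down DatasetLabel and a hand-written updated, not the full frames from the question). Unpacking with ** relies on the column labels being strings.

```python
import numpy as np
import pandas as pd

# Cut-down, hypothetical stand-ins for the frames used in the answer.
DatasetLabel = pd.DataFrame({"0": ["A", "B"], "7": [1, 3], "9": [5.0, np.nan]})
updated = pd.DataFrame({"7": [8, 9], "9": [6.0, np.nan]})
cols = list(updated.columns)

# Approach 1: concat the untouched columns with the updated ones.
original_cols = DatasetLabel.columns[~DatasetLabel.columns.isin(cols)]
original = DatasetLabel[original_cols]
via_concat = pd.concat([original, updated], axis=1)

# Approach 2: overwrite the matching columns in place with assign.
via_assign = DatasetLabel.assign(**updated)

print(via_concat)
print(via_assign)
```

One difference worth noting: assign keeps the original column order, while concat places the updated columns after the untouched ones. The results coincide here only because the updated columns already come last.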
