Python Pandas - Add a value at a specific iloc to a new DataFrame column

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/46113078/

Time: 2020-08-19 17:29:14  Source: igfitidea

Pandas - add value at specific iloc into new dataframe column

python pandas numpy

Asked by Rob

I have a large dataframe containing lots of columns.

For each row/index in the dataframe I do some operations, read in some ancillary data, etc., and get a new value. Is there a way to add that new value into a new column at the correct row/index?

I can use .assign to add a new column, but I'm looping over the rows and generating the data one value at a time (generating it is quite involved). Once a value is generated, I'd like to add it to the dataframe immediately rather than waiting until I've generated the entire series.

This doesn't work and raises a KeyError:

df['new_column_name'].iloc[this_row]=value

Do I need to initialise the column first or something?

Answered by RumbleFish

There are two steps to create and populate a new column using only a row number (iloc is not used in this approach):

First, get the row's index value from the row number:

rowIndex = df.index[someRowNumber]

Then use that row index with loc to reference the specific row and add the new column and value:

df.loc[rowIndex, 'New Column Title'] = "some value"

These two steps can be combined into one line as follows:

df.loc[df.index[someRowNumber], 'New Column Title'] = "some value"
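
To tie this back to the loop described in the question, here is a minimal sketch of filling the new column one row at a time. The sample data, the column name "new_column", and the helper compute_value are all hypothetical stand-ins for the real (more involved) per-row work.

```python
import pandas as pd

# Hypothetical data with a non-integer index, so that the row number
# and the index label are visibly different things.
df = pd.DataFrame({"a": [10, 20, 30]}, index=["x", "y", "z"])


def compute_value(row):
    # Placeholder for the involved per-row computation (an assumption).
    return row["a"] * 2


for row_number in range(len(df)):
    label = df.index[row_number]                # row number -> index label
    value = compute_value(df.iloc[row_number])  # read the row by position
    df.loc[label, "new_column"] = value         # column is created on first use

print(df)
```

The first assignment creates the column and leaves the not-yet-visited rows as NaN. Note that this relies on the index labels being unique: if a label appears more than once, loc assigns to every row carrying that label.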

Answered by Bharath

If you have a dataframe like this:

# pd.np is no longer available in current pandas, so import NumPy directly
import numpy as np
import pandas as pd
df = pd.DataFrame(data={'X': [1.5, 6.777, 2.444, np.nan], 'Y': [1.111, np.nan, 8.77, np.nan], 'Z': [5.0, 2.333, 10, 6.6666]})

Instead of iloc, you can use .loc with a row index and a column name, as in df.loc[row_indexer, column_indexer] = value:

df.loc[[0,3],'Z'] = 3

Output:

       X      Y       Z
0  1.500  1.111   3.000
1  6.777    NaN   2.333
2  2.444  8.770  10.000
3    NaN    NaN   3.000
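
If you would rather stay purely positional, one possible alternative is sketched below. The KeyError in the question arises because df['new_column_name'] is looked up before that column exists, so this sketch initialises the column first and then assigns with a single iloc call on the dataframe. The sample data and the doubling of X are placeholders, not part of the original answers.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the labels "a", "b", "c" are only for illustration.
df = pd.DataFrame({"X": [1.5, 6.777, 2.444]}, index=["a", "b", "c"])

df["new_column_name"] = np.nan                   # initialise the column first
col_pos = df.columns.get_loc("new_column_name")  # integer position of the column

for row_number in range(len(df)):
    value = df["X"].iloc[row_number] * 2         # stand-in for the real computation
    df.iloc[row_number, col_pos] = value         # set by row and column position

print(df)
```

Indexing the dataframe once with both a row and a column position avoids the chained form df[col].iloc[row] = value, which may assign to a temporary copy instead of the dataframe itself.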