Python: The proper way to use iloc in Pandas

Question

Asked by supernovaee

I have the following dataframe df:

print(df)

    Food         Taste
0   Apple        NaN
1   Banana       NaN
2   Candy        NaN
3   Milk         NaN
4   Bread        NaN
5   Strawberry   NaN

I am trying to replace values in a range of rows using iloc:

df.Taste.iloc[0:2] = 'good'
df.Taste.iloc[2:6] = 'bad'

But it returned the following SettingWithCopyWarning message:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

So I found this Stack Overflow page and tried this:

df.iloc[0:2, 'Taste'] = 'good'
df.iloc[2:6, 'Taste'] = 'bad'

Unfortunately, it returned the following error:

ValueError: Can only index by location with a [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array]

What would be the proper way to use iloc in this situation? Also, is there a way to combine these two lines above?

Answer 1

Answered by jezrael

You can use Index.get_loc to get the position of the column Taste, because DataFrame.iloc selects by position:

# Returns the position of the column: the second column, so 1 (Python counts from 0)
print (df.columns.get_loc('Taste'))
1

df.iloc[0:2, df.columns.get_loc('Taste')] = 'good'
df.iloc[2:6, df.columns.get_loc('Taste')] = 'bad'
print (df)
         Food Taste
0       Apple  good
1      Banana  good
2       Candy   bad
3        Milk   bad
4       Bread   bad
5  Strawberry   bad
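
For reference, the approach above can be reproduced end to end. This is a minimal sketch that rebuilds the question's frame by hand; the way df is constructed and the name taste_pos are assumptions, since the original post only shows the printed frame.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (assumed construction).
df = pd.DataFrame({
    "Food": ["Apple", "Banana", "Candy", "Milk", "Bread", "Strawberry"],
    "Taste": np.nan,
})

# Taste starts out as an all-NaN float column; cast it to object so that
# strings can be stored without dtype warnings on recent pandas versions.
df["Taste"] = df["Taste"].astype(object)

taste_pos = df.columns.get_loc("Taste")  # integer position of the column
df.iloc[0:2, taste_pos] = "good"         # rows at positions 0 and 1
df.iloc[2:6, taste_pos] = "bad"          # rows at positions 2 through 5
print(df)
```

Looking the position up once and reusing it keeps both assignments in pure .iloc form, with rows and column given as integers.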

A possible solution with ix is not recommended, because ix is deprecated in the next version of pandas (it was deprecated in pandas 0.20.0 and removed in pandas 1.0, so the code below only runs on older versions):

df.ix[0:2, 'Taste'] = 'good'
df.ix[2:6, 'Taste'] = 'bad'
print (df)
         Food Taste
0       Apple  good
1      Banana  good
2       Candy   bad
3        Milk   bad
4       Bread   bad
5  Strawberry   bad
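
Since .ix is no longer available in pandas 1.0 and later, the same label-based assignment can be written with .loc. The following is a sketch, assuming the default integer index from the question and a hand-built copy of the frame. Note that .loc slices by label and includes the end label, so 0:1 covers the first two rows.

```python
import pandas as pd

# Assumed reconstruction of the question's frame.
df = pd.DataFrame({
    "Food": ["Apple", "Banana", "Candy", "Milk", "Bread", "Strawberry"],
    "Taste": [None] * 6,  # object column, so strings can be assigned
})

# .loc slices are label-based and inclusive of the end label.
df.loc[0:1, "Taste"] = "good"
df.loc[2:5, "Taste"] = "bad"
print(df)
```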

Answer 2

Answered by Jared Stufft

.iloc uses integer location, whereas .loc uses labels. Both take row and column identifiers (for DataFrames). Your initial code triggered the warning because the column was selected first (df.Taste) and .iloc was applied to that result, instead of specifying the column within the .iloc call. The second attempt didn't work because it mixed integer location with a column name, and .iloc only accepts integer locations. If you don't know the column's integer location, you can use Index.get_loc in its place, as suggested above. Otherwise, use the integer position, in this case 1.

df.iloc[0:2, df.columns.get_loc('Taste')] = 'good'
df.iloc[2:6, df.columns.get_loc('Taste')] = 'bad'

is equivalent to:

df.iloc[0:2, 1] = 'good'
df.iloc[2:6, 1] = 'bad'

in this particular situation.
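
The "in this particular situation" caveat matters once the column order changes. The sketch below uses a shortened version of the frame and a hypothetical Price column (not part of the original question) to show that a hard-coded position can silently point at a different column, while get_loc keeps tracking Taste.

```python
import pandas as pd

df = pd.DataFrame({
    "Food": ["Apple", "Banana", "Candy"],
    "Taste": [None, None, None],
})

# Hypothetical change: a new column is inserted in front of Taste.
df.insert(1, "Price", [1.0, 0.5, 2.0])

# Position 1 now refers to Price, while get_loc still finds Taste.
print(df.columns[1])                # Price
print(df.columns.get_loc("Taste"))  # 2

df.iloc[0:2, df.columns.get_loc("Taste")] = "good"
print(df)
```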

Answer 3

Answered by HeadAndTail

.iloc is purely integer-location based indexing for selection by position, e.g.:

lang_sets = {}
lang_sets['en'] = train[train.lang == 'en'].iloc[:,:-1]
lang_sets['ja'] = train[train.lang == 'ja'].iloc[:,:-1]
lang_sets['de'] = train[train.lang == 'de'].iloc[:,:-1]
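
The snippet above depends on a train DataFrame that the answer does not show. A self-contained sketch, with a made-up train frame (the text and label columns are assumptions for illustration), might look like this:

```python
import pandas as pd

# Hypothetical stand-in for `train`; the original answer does not show it.
train = pd.DataFrame({
    "text": ["hello", "konnichiwa", "hallo", "world"],
    "lang": ["en", "ja", "de", "en"],
    "label": [1, 0, 1, 0],
})

# For each language, keep the matching rows and drop the last column.
# iloc[:, :-1] means "all rows, every column except the final one".
lang_sets = {
    lang: train[train.lang == lang].iloc[:, :-1]
    for lang in ["en", "ja", "de"]
}
print(lang_sets["en"])
```

Here the boolean mask selects rows by value, and .iloc then selects columns purely by position, so the result does not depend on the column names.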

Answer 4

Answered by Rob

I prefer to use .loc in such cases, and to explicitly use the index of the DataFrame when selecting by position:

df.loc[df.index[0:2], 'Taste'] = 'good'
df.loc[df.index[2:6], 'Taste'] = 'bad'
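
This form also works when the index is not the default 0..n-1 range, because df.index[0:2] translates positions into whatever labels the frame uses. The sketch below assumes a string index for illustration. It also shows one possible way to combine the two assignments into a single statement, which the question asked about, by assigning a full list to the column; the name combined is hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"Food": ["Apple", "Banana", "Candy", "Milk", "Bread", "Strawberry"],
     "Taste": [None] * 6},
    index=list("abcdef"),  # assumed non-default labels
)

# Positions 0..1 map to labels 'a' and 'b'; positions 2..5 to 'c'..'f'.
df.loc[df.index[0:2], "Taste"] = "good"
df.loc[df.index[2:6], "Taste"] = "bad"

# One possible single-statement alternative: assign the whole column.
combined = df[["Food"]].copy()
combined["Taste"] = ["good"] * 2 + ["bad"] * 4

print(df)
print(combined)
```

The list assignment only fits when every row gets a value; for partial updates the two sliced assignments are the clearer choice.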
