Update a dataframe using loc in python pandas

Question

Asked by Data Enthusiast

I have a pandas dataframe (df) with the following column structure:

month a b c d

This dataframe has data for, say, Jan, Feb, Mar and Apr. A, B, C and D are numeric columns. For the month of Feb, I want to recalculate column A and update it in the dataframe, i.e. for month = Feb, A = B + C + D.

Code I used:

 df[df['month']=='Feb']['A']=df[df['month']=='Feb']['B'] + df[df['month']=='Feb']['C'] + df[df['month']=='Feb']['D']

This ran without errors but did not change the values in column A for the month Feb. In the console, it gave this message:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

I tried to use .loc, but I had called .reset_index() on the dataframe I am working on, and I am not sure how to set the index and use .loc. I followed the documentation, but it is not clear to me. Could you please help me out here? This is an example dataframe:

 import pandas as pd
 import numpy as np
 dates = pd.date_range('1/1/2000', periods=8)
 df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])

I want to update, say, one date: 2000-01-03. I am unable to give a snippet of my data as it is real-time data.

Answer 1

Answered by Anton Protopopov

As you can see from the warning, you should use loc[row_index, col_index]. When you subset your data, you get index values. You just need to pass those as row_index and then, after a comma, col_name:

df.loc[df['month'] == 'Feb', 'A'] = df.loc[df['month'] == 'Feb', 'B'] + df.loc[df['month'] == 'Feb', 'C'] + df.loc[df['month'] == 'Feb', 'D']
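
To make the one-liner above easier to try, here is a minimal sketch on a small made-up dataframe (the sample values and the variable name mask are assumptions, not the asker's real data). It stores the boolean condition once and reuses it as the row indexer. Because loc accepts a boolean mask, this should also work on the default integer index left by .reset_index(), with no need to set an index first.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's dataframe
df = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar', 'Apr'],
    'A': [1.0, 2.0, 3.0, 4.0],
    'B': [10.0, 20.0, 30.0, 40.0],
    'C': [1.0, 2.0, 3.0, 4.0],
    'D': [0.5, 0.5, 0.5, 0.5],
})

# Boolean mask selecting the rows to update
mask = df['month'] == 'Feb'

# One .loc call with [rows, column] writes into df itself, not into a copy
df.loc[mask, 'A'] = df.loc[mask, 'B'] + df.loc[mask, 'C'] + df.loc[mask, 'D']

print(df)
```

Only the Feb row of column A changes (to 20.0 + 2.0 + 0.5 = 22.5); the other rows keep their original values.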

Answer 2

Answered by DeepSpace

While not being the most beautiful, the way I would achieve your goal (without explicitly iterating over the rows) is:

df.ix[df['month'] == 'Feb', 'a'] = df[df['month'] == 'Feb']['b'] + df[df['month'] == 'Feb']['c'] + df[df['month'] == 'Feb']['d']

Note: ix has been deprecated since Pandas v0.20.0 in favour of iloc / loc.
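
Since ix is no longer the recommended indexer (it was removed entirely in pandas 1.0), here is a rough sketch of the two replacements applied to the date-indexed example from the question. It assumes deterministic values from np.arange instead of the question's random data, and the name target is hypothetical. loc selects by index label, iloc by integer position.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
# Assumption: fixed values (0.0 .. 31.0) instead of np.random.randn, so results are repeatable
df = pd.DataFrame(np.arange(32, dtype=float).reshape(8, 4),
                  index=dates, columns=['A', 'B', 'C', 'D'])

# Label-based: pick the row by its date label, the column by its name
target = pd.Timestamp('2000-01-03')
df.loc[target, 'A'] = df.loc[target, 'B'] + df.loc[target, 'C'] + df.loc[target, 'D']

# Position-based: row 3 is 2000-01-04, column 0 is 'A', columns 1..3 are B, C, D
df.iloc[3, 0] = df.iloc[3, 1:4].sum()

print(df.head())
```

With a DatetimeIndex, loc takes the date itself as the row label, so the single-date update from the question needs no boolean mask.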
