Python: Replace the value of a selected cell in a pandas DataFrame without using the index

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/17729853/

Time: 2020-08-19 09:01:33  Source: igfitidea

Tags: python, pandas, dataframe

Asked by LondonRob

This is rather similar to this question, but with one key difference: I'm selecting the data I want to change not by its index but by some criteria.

If the criteria I apply return a single row, I'd expect to be able to set the value of a certain column in that row in an easy way, but my first attempt doesn't work:

>>> import pandas as pd
>>> d = pd.DataFrame({'year':[2008,2008,2008,2008,2009,2009,2009,2009],
...                   'flavour':['strawberry','strawberry','banana','banana',
...                   'strawberry','strawberry','banana','banana'],
...                   'day':['sat','sun','sat','sun','sat','sun','sat','sun'],
...                   'sales':[10,12,22,23,11,13,23,24]})

>>> d
   day     flavour  sales  year
0  sat  strawberry     10  2008
1  sun  strawberry     12  2008
2  sat      banana     22  2008
3  sun      banana     23  2008
4  sat  strawberry     11  2009
5  sun  strawberry     13  2009
6  sat      banana     23  2009
7  sun      banana     24  2009

>>> d[d.sales==24]
   day flavour  sales  year
7  sun  banana     24  2009

>>> d[d.sales==24].sales = 100
>>> d
   day     flavour  sales  year
0  sat  strawberry     10  2008
1  sun  strawberry     12  2008
2  sat      banana     22  2008
3  sun      banana     23  2008
4  sat  strawberry     11  2009
5  sun  strawberry     13  2009
6  sat      banana     23  2009
7  sun      banana     24  2009

So instead of 2009 Sunday's banana sales being set to 100, nothing happens! What's the nicest way to do this? Ideally the solution should not rely on the row number, as you normally don't know that in advance!

Accepted answer by waitingkuo

There are many ways to do that:

1

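In [7]: d.sales[d.sales==24] = 100  # chained assignment: relies on older pandas behaviour; with Copy-on-Write enabled it leaves d unchanged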

In [8]: d
Out[8]: 
   day     flavour  sales  year
0  sat  strawberry     10  2008
1  sun  strawberry     12  2008
2  sat      banana     22  2008
3  sun      banana     23  2008
4  sat  strawberry     11  2009
5  sun  strawberry     13  2009
6  sat      banana     23  2009
7  sun      banana    100  2009

2

In [26]: d.loc[d.sales == 12, 'sales'] = 99

In [27]: d
Out[27]: 
   day     flavour  sales  year
0  sat  strawberry     10  2008
1  sun  strawberry     99  2008
2  sat      banana     22  2008
3  sun      banana     23  2008
4  sat  strawberry     11  2009
5  sun  strawberry     13  2009
6  sat      banana     23  2009
7  sun      banana    100  2009
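
A minimal sketch of why the question's first attempt has no effect while the `.loc` form does, assuming the same data as the question. The names `selected`, `unchanged` and `updated` are illustrative only. Boolean indexing hands back a new object, so writing to it never reaches `d`; `.loc` selects the rows and the column in a single indexing step on `d` itself.

```python
import warnings

import pandas as pd

d = pd.DataFrame({
    "year": [2008, 2008, 2008, 2008, 2009, 2009, 2009, 2009],
    "flavour": ["strawberry", "strawberry", "banana", "banana",
                "strawberry", "strawberry", "banana", "banana"],
    "day": ["sat", "sun", "sat", "sun", "sat", "sun", "sat", "sun"],
    "sales": [10, 12, 22, 23, 11, 13, 23, 24],
})

# Boolean indexing returns a new object, so writing to it leaves `d` alone.
selected = d[d["sales"] == 24]
with warnings.catch_warnings():
    # Some pandas versions warn about assigning to a filtered copy.
    warnings.simplefilter("ignore")
    selected["sales"] = 100
unchanged = int(d.loc[7, "sales"])  # value in `d` after writing to the copy

# .loc takes the row mask and the column label in one indexing step,
# so the assignment is applied to `d` itself.
d.loc[d["sales"] == 24, "sales"] = 100
updated = int(d.loc[7, "sales"])
```

Because the row is chosen by the mask, the row number never has to be known in advance.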

3

In [28]: d.sales = d.sales.replace(23, 24)

In [29]: d
Out[29]: 
   day     flavour  sales  year
0  sat  strawberry     10  2008
1  sun  strawberry     99  2008
2  sat      banana     22  2008
3  sun      banana     24  2008
4  sat  strawberry     11  2009
5  sun  strawberry     13  2009
6  sat      banana     24  2009
7  sun      banana    100  2009
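
Note that `replace` matches on the value rather than on a row condition, so every cell holding that value changes. The sketch below illustrates this on a standalone Series built from the question's original sales figures; `replaced` and `changed_rows` are illustrative names.

```python
import pandas as pd

sales = pd.Series([10, 12, 22, 23, 11, 13, 23, 24], name="sales")

# replace() matches on value, so both rows that hold 23 are rewritten.
replaced = sales.replace(23, 24)

# Index labels of the rows whose value actually changed.
changed_rows = sales.index[sales != replaced].tolist()
```

If only one of several equal values should change, a `.loc` mask with an additional condition is probably the safer choice.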

Answer by ram

Not sure about older versions of pandas, but in 0.16 the value of a particular cell can be set based on multiple column values.

Extending the answer provided by @waitingkuo, the same operation can also be done based on values of multiple columns.

d.loc[(d.day == 'sun') & (d.flavour == 'banana') & (d.year == 2009), 'sales'] = 100
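
As a sketch of the same idea in a self-contained form, the mask can be built first and inspected before anything is overwritten. The names `mask` and `n_matches` are illustrative, and the data is the question's original frame.

```python
import pandas as pd

d = pd.DataFrame({
    "year": [2008, 2008, 2008, 2008, 2009, 2009, 2009, 2009],
    "flavour": ["strawberry", "strawberry", "banana", "banana",
                "strawberry", "strawberry", "banana", "banana"],
    "day": ["sat", "sun", "sat", "sun", "sat", "sun", "sat", "sun"],
    "sales": [10, 12, 22, 23, 11, 13, 23, 24],
})

# Each comparison needs its own parentheses: `&` binds tighter than `==`.
mask = (d["day"] == "sun") & (d["flavour"] == "banana") & (d["year"] == 2009)

# Count the matching rows before overwriting anything.
n_matches = int(mask.sum())

# Assign only to the 'sales' column of the matching rows.
d.loc[mask, "sales"] = 100
```

Checking `n_matches` first is a cheap guard against a condition that silently matches zero rows or more rows than intended.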

Answer by elPastor

Old question, but I'm surprised nobody mentioned numpy's .where() function (older pandas versions exposed it as pd.np; that alias was removed in pandas 2.0, so import numpy directly).

In this case the code would be:

import numpy as np
d.sales = np.where(d.sales == 24, 100, d.sales)

To my knowledge, this is one of the fastest ways to conditionally change data across a series.
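
For comparison, here is a small sketch that puts `np.where` next to the pandas-native `Series.mask`, which is not mentioned in the answers above and is shown only as an alternative. The names `via_numpy` and `via_mask` are illustrative. No timing claim is made here; which one is faster would need to be measured on the data at hand.

```python
import numpy as np
import pandas as pd

sales = pd.Series([10, 12, 22, 23, 11, 13, 23, 24], name="sales")

# np.where returns a plain NumPy array with no index attached.
via_numpy = np.where(sales == 24, 100, sales)

# Series.mask replaces values where the condition is True and keeps the index.
via_mask = sales.mask(sales == 24, 100)
```

Assigning the NumPy array back to a column works positionally, whereas the `Series.mask` result is aligned on the index.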
