How to replace elements in a column using pandas

Note: this page reproduces a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/19797216/

Time: 2020-09-13 21:18:36  Source: igfitidea

Tags: python, replace, pandas

Asked by HappyPy

Given this data frame:

>>> import numpy as np
>>> import pandas as pd
>>> a = pd.DataFrame(data={'words':['w1','w2','w3','w4','w5'],'value':np.random.rand(5)})
>>> a
     value   words
0  0.157876    w1
1  0.784586    w2
2  0.875567    w3
3  0.649377    w4
4  0.852453    w5

>>> b = pd.Series(data=['w3','w4'])
>>> b

0    w3
1    w4

I'd like to replace the elements of value with zero, but only for the words that match those in b. The resulting data frame should therefore look like this:

    value    words
0  0.157876    w1
1  0.784586    w2
2  0           w3
3  0           w4
4  0.852453    w5

I thought of something along these lines: a.value[a.words==b] = 0, but it's obviously wrong.

Answered by Roman Pekar

You're close, just use pandas.Series.isin() instead of ==:

>>> a.value[a['words'].isin(b)] = 0
>>> a
      value words
0  0.340138    w1
1  0.533770    w2
2  0.000000    w3
3  0.000000    w4
4  0.002314    w5
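
To see why == fails where isin() works, here is a minimal sketch. It uses fixed numbers in place of the random values from the question (those specific values are made up so the result is reproducible). Comparing two Series with different indexes using == raises a ValueError in current pandas, whereas isin() tests each element for membership. The sketch assigns through .loc rather than a.value[...] = 0, because chained assignment of that form may warn or have no effect in recent pandas versions.

```python
import pandas as pd

# Fixed values stand in for np.random.rand(5) so the result is reproducible.
a = pd.DataFrame({'words': ['w1', 'w2', 'w3', 'w4', 'w5'],
                  'value': [0.1, 0.2, 0.3, 0.4, 0.5]})
b = pd.Series(['w3', 'w4'])

# Element-wise == needs identically labeled Series; a.words has 5 rows, b has 2.
try:
    a.words == b
    eq_failed = False
except ValueError:
    eq_failed = True

# isin() returns a boolean mask: True where the word appears anywhere in b.
mask = a['words'].isin(b)

# Assign through .loc so the original frame is modified in a single step.
a.loc[mask, 'value'] = 0

print(eq_failed)
print(mask.tolist())
print(a)
```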

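Or you can use the ix selector (note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so on current versions use .loc with the same arguments):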

>>> a.ix[a['words'].isin(b), 'value'] = 0
>>> a
      value words
0  0.340138    w1
1  0.533770    w2
2  0.000000    w3
3  0.000000    w4
4  0.002314    w5

Update: you can see the documentation about the differences between .ix and .loc; some quotes:

.loc is strictly label based, will raise KeyError when the items are not found ...

.iloc is strictly integer position based (from 0 to length-1 of the axis), will raise IndexError when the requested indices are out of bounds ...

.ix supports mixed integer and label based access. It is primarily label based, but will fall back to integer positional access. .ix is the most general and will support any of the inputs to .loc and .iloc, as well as support for floating point label schemes. .ix is especially useful when dealing with mixed positional and label based hierarchical indexes ...
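
The difference between label-based and position-based access described in these quotes can be sketched with a small Series whose labels are not 0..n-1. The labels and values here are made up for illustration, and .ix is left out because it is not available in pandas 1.0 and later.

```python
import pandas as pd

# Hypothetical Series with integer labels that differ from positions.
s = pd.Series(['w1', 'w2', 'w3'], index=[10, 20, 30])

by_label = s.loc[20]      # selects the row labeled 20
by_position = s.iloc[0]   # selects the first row, whatever its label

# .loc raises KeyError for a label that is not in the index.
try:
    s.loc[0]
    loc_error = None
except KeyError:
    loc_error = 'KeyError'

# .iloc raises IndexError for a position that is out of bounds.
try:
    s.iloc[5]
    iloc_error = None
except IndexError:
    iloc_error = 'IndexError'

print(by_label, by_position, loc_error, iloc_error)
```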

Answered by EdChum

Use .loc to select the column values you want to assign to:

>>> a.loc[a.words.isin(b), 'value'] = 0
>>> a
      value words
0  0.065556    w1
1  0.776099    w2
2  0.000000    w3
3  0.000000    w4
4  0.331185    w5
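
If the original frame should stay untouched, the same .loc assignment can be applied to a copy. The following is only a sketch: the helper name zero_matching and its column parameter are hypothetical, and the numbers are fixed stand-ins for the random values in the question. It also shows that isin() accepts a plain list as well as a Series.

```python
import pandas as pd

def zero_matching(df, words, column='value'):
    """Return a copy of df with `column` set to 0 where df['words'] is in `words`."""
    out = df.copy()
    # Boolean mask selects the rows, the column label selects what to overwrite.
    out.loc[out['words'].isin(words), column] = 0
    return out

# Made-up values so the result is reproducible.
a = pd.DataFrame({'words': ['w1', 'w2', 'w3', 'w4', 'w5'],
                  'value': [0.5, 0.25, 0.75, 0.125, 1.0]})

result = zero_matching(a, ['w3', 'w4'])
print(result)
```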