Python: Replacing row values in pandas

Note: This page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/30459485/

Date: 2020-08-19 08:27:17  Source: igfitidea

Replacing row values in pandas

Tags: python, numpy, pandas

Asked by blaz

I would like to replace row values in pandas.

For example:

import pandas as pd
import numpy as np    

a = np.array(([100, 100, 101, 101, 102, 102],
                 np.arange(6)))
pd.DataFrame(a.T)

Result:

array([[100,   0],
       [100,   1],
       [101,   2],
       [101,   3],
       [102,   4],
       [102,   5]])

Here, I would like to replace the row with the values [101, 3] with [200, 10], so the result should be:

array([[100,   0],
       [100,   1],
       [101,   2],
       [200,  10],
       [102,   4],
       [102,   5]])

Update

In a more general case I would like to replace multiple rows.

The old and new row values are therefore given as n x 2 matrices, where n is the number of rows to replace. For example:

old_vals = np.array([[101, 3],
                     [100, 0],
                     [102, 5]])

new_vals = np.array([[200, 10],
                     [300, 20],
                     [400, 30]])

And the result is:

array([[300,  20],
       [100,   1],
       [101,   2],
       [200,  10],
       [102,   4],
       [400,  30]])

Accepted answer by EdChum

For the single row case:

In [35]:

df.loc[(df[0]==101) & (df[1]==3)] = [[200,10]]
df
Out[35]:
     0   1
0  100   0
1  100   1
2  101   2
3  200  10
4  102   4
5  102   5
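
The single-row case can also be run end to end. The snippet below is a minimal sketch, assuming the frame is rebuilt exactly as in the question (integer column labels 0 and 1); the variable name mask is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question: columns are labelled 0 and 1.
a = np.array(([100, 100, 101, 101, 102, 102],
              np.arange(6)))
df = pd.DataFrame(a.T)

# Boolean mask that is True only where both columns match the old row.
mask = (df[0] == 101) & (df[1] == 3)

# Assign the new values to every matching row, across both columns.
df.loc[mask, [0, 1]] = [200, 10]

print(df)
```

Naming the columns explicitly in .loc keeps the assignment limited to columns 0 and 1 if the frame later gains more columns.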

When several rows match the condition, the same assignment updates all of them:

In [60]:

a = np.array(([100, 100, 101, 101, 102, 102],
                 [0,1,3,3,3,4]))
df = pd.DataFrame(a.T)
df
Out[60]:
     0  1
0  100  0
1  100  1
2  101  3
3  101  3
4  102  3
5  102  4
In [61]:

df.loc[(df[0]==101) & (df[1]==3)] = 200,10
df
Out[61]:
     0   1
0  100   0
1  100   1
2  200  10
3  200  10
4  102   3
5  102   4

For a multi-row update like the one you propose, where each old row is replaced by a single new row, first construct a dict that maps each old row to search for to its replacement values:

In [78]:

# reshape(-1, 2) flattens each entry to a (col0, col1) pair, however the arrays are nested
old_keys = [(x[0], x[1]) for x in old_vals.reshape(-1, 2)]
replace_vals = dict(zip(old_keys, new_vals.reshape(-1, 2)))
replace_vals
Out[78]:
{(100, 0): array([300,  20]),
 (101, 3): array([200,  10]),
 (102, 5): array([400,  30])}

We can then iterate over the dict and set the matching rows using the same method as in the single-row case:

In [93]:

for k, v in replace_vals.items():
    df.loc[(df[0] == k[0]) & (df[1] == k[1])] = [[v[0], v[1]]]
df
Out[93]:
     0   1
0  300  20
1  100   1
2  101   2
3  200  10
4  102   4
5  400  30
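
The dict-and-loop approach can be wrapped in a small function. This is only a sketch under stated assumptions: replace_rows is a hypothetical helper name, old_vals and new_vals are assumed to be n x 2 arrays, and the frame is assumed to have columns labelled 0 and 1. Unlike the loop above, it computes all masks against the original frame first, so a row that has already been replaced cannot be matched again by a later pair.

```python
import numpy as np
import pandas as pd

def replace_rows(df, old_vals, new_vals):
    """Hypothetical helper: replace each row equal to old_vals[i] with new_vals[i]."""
    out = df.copy()
    # Build every mask against the original frame before changing anything.
    masks = [(df[0] == old[0]) & (df[1] == old[1]) for old in old_vals]
    for mask, new in zip(masks, new_vals):
        out.loc[mask, [0, 1]] = [new[0], new[1]]
    return out

a = np.array(([100, 100, 101, 101, 102, 102],
              np.arange(6)))
df = pd.DataFrame(a.T)

old_vals = np.array([[101, 3], [100, 0], [102, 5]])
new_vals = np.array([[200, 10], [300, 20], [400, 30]])

result = replace_rows(df, old_vals, new_vals)
print(result)
```

The function returns a new frame and leaves df untouched, which makes it easy to compare the two afterwards.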

Answer by alec_djinn

The simplest way should be this one:

df.loc[[3],0:1] = 200,10

Here, 3 is the index label of the row to change (the fourth row, since the labels start at 0), and 0:1 selects columns 0 and 1, because label slices in .loc include both endpoints.
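
To illustrate how the 0:1 slice behaves, here is a small sketch, assuming the frame from the question. It contrasts label-based .loc with position-based .iloc; the names by_label and by_position are illustrative only.

```python
import numpy as np
import pandas as pd

a = np.array(([100, 100, 101, 101, 102, 102],
              np.arange(6)))
df = pd.DataFrame(a.T)

# .loc slices by label and includes both endpoints: 0:1 covers columns 0 and 1.
by_label = df.loc[[3], 0:1]

# .iloc slices by position and excludes the stop: 0:1 covers only the first column.
by_position = df.iloc[[3], 0:1]

print(by_label.shape)
print(by_position.shape)

# Replace the row with index label 3 across both columns.
df.loc[[3], 0:1] = [200, 10]
print(df)
```

This form only works when the index label of the target row is already known; it does not search for the old values.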

The following code instead iterates over each row, checks its content, and replaces it with the values you want.

target = [101,3]
mod = [200,10]

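for index, row in df.iterrows():
    if row[0] == target[0] and row[1] == target[1]:
        # iterrows() may yield copies, so writing to `row` is not guaranteed
        # to change df; write back through df.loc instead
        df.loc[index] = mod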

print(df)

Answer by blaz

Another possibility is:

import io

a = np.array(([100, 100, 101, 101, 102, 102],
                 np.arange(6)))
df = pd.DataFrame(a.T)

string = df.to_string(header=False, index=False, index_names=False)

dictionary = {'100  0': '300  20',
              '101  3': '200  10',
              '102  5': '400  30'}

def replace_all(text, dic):
    for i, j in dic.items():
        text = text.replace(i, j)
    return text

string = replace_all(string, dictionary)
# header=None keeps the first data row from being read as column names
df = pd.read_csv(io.StringIO(string), sep=r'\s+', header=None)

I found this solution better: when there is a large amount of data to replace, it runs faster than EdChum's solution.