Python: replacing row values in pandas
Original question: http://stackoverflow.com/questions/30459485/
Note: this content comes from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Replacing row values in pandas
Asked by blaz
I would like to replace row values in pandas.
For example:
import pandas as pd
import numpy as np
a = np.array(([100, 100, 101, 101, 102, 102],
np.arange(6)))
pd.DataFrame(a.T)
Result:
array([[100, 0],
[100, 1],
[101, 2],
[101, 3],
[102, 4],
[102, 5]])
Here, I would like to replace the rows with the values [101, 3] with [200, 10], so the result should be:
array([[100, 0],
[100, 1],
[101, 2],
[200, 10],
[102, 4],
[102, 5]])
Update
In a more general case I would like to replace multiple rows.
The old and new row values are therefore represented by n x 2 matrices (n is the number of rows to replace). For example:
old_vals = np.array([[101, 3],
                     [100, 0],
                     [102, 5]])
new_vals = np.array([[200, 10],
                     [300, 20],
                     [400, 30]])
And the result is:
array([[300, 20],
[100, 1],
[101, 2],
[200, 10],
[102, 4],
[400, 30]])
Accepted answer by EdChum
For the single row case:
In [35]:
df.loc[(df[0]==101) & (df[1]==3)] = [[200,10]]
df
Out[35]:
0 1
0 100 0
1 100 1
2 101 2
3 200 10
4 102 4
5 102 5
For the multiple-row case, where several rows match the same values, the following would work:
In [60]:
a = np.array(([100, 100, 101, 101, 102, 102],
[0,1,3,3,3,4]))
df = pd.DataFrame(a.T)
df
Out[60]:
0 1
0 100 0
1 100 1
2 101 3
3 101 3
4 102 3
5 102 4
In [61]:
df.loc[(df[0]==101) & (df[1]==3)] = 200,10
df
Out[61]:
0 1
0 100 0
1 100 1
2 200 10
3 200 10
4 102 3
5 102 4
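The mask-based assignment can also be written as a standalone script. This is a minimal sketch that rebuilds the frame from the question; the variable name `mask` and the explicit column list `[0, 1]` are choices made here for readability, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question: columns are labelled 0 and 1
a = np.array(([100, 100, 101, 101, 102, 102],
              np.arange(6)))
df = pd.DataFrame(a.T)

# Boolean mask: True only where both columns match the target row
mask = (df[0] == 101) & (df[1] == 3)

# Assign the replacement to every matching row, naming the columns explicitly
df.loc[mask, [0, 1]] = [200, 10]

print(df)
```

Because the mask selects every matching row, the same two lines handle one match or many.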
For a multi-row update like the one you propose, the following works when each replacement site is a single row. First construct a dict that maps the old values to search for to the new values that replace them:
In [78]:
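# Turn each old row into a hashable tuple so it can serve as a dict key
old_keys = [(x[0],x[1]) for x in old_vals]
# Map each old row to its replacement row (a NumPy array taken from new_vals)
replace_vals = dict(zip(old_keys, new_vals))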
replace_vals
Out[78]:
{(100, 0): array([300, 20]),
(101, 3): array([200, 10]),
(102, 5): array([400, 30])}
We can then iterate over the dict and set the rows using the same method as in the single-row case:
In [93]:
# df is assumed to be the original frame from the question (second column 0..5)
for k,v in replace_vals.items():
    df.loc[(df[0] == k[0]) & (df[1] == k[1])] = [[v[0],v[1]]]
df
Out[93]:
0 1
0 300 20
1 100 1
2 101 2
3 200 10
4 102 4
5 400 30
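The loop above can be wrapped into a small function. The sketch below is one possible arrangement, not the answer's own code: `replace_rows` is a hypothetical helper name, and it differs from the loop above in one respect, namely that every mask is computed against the untouched input frame, so a row that has already been replaced cannot be matched again by a later pair.

```python
import numpy as np
import pandas as pd

def replace_rows(df, old_vals, new_vals):
    """Hypothetical helper: return a copy of df with each old row replaced by its new row."""
    out = df.copy()
    for old, new in zip(old_vals, new_vals):
        # Match against the original frame, write into the copy
        mask = (df[0] == old[0]) & (df[1] == old[1])
        out.loc[mask, [0, 1]] = [int(new[0]), int(new[1])]
    return out

df = pd.DataFrame({0: [100, 100, 101, 101, 102, 102], 1: range(6)})
old_vals = np.array([[101, 3], [100, 0], [102, 5]])
new_vals = np.array([[200, 10], [300, 20], [400, 30]])

result = replace_rows(df, old_vals, new_vals)
print(result)
```

Returning a copy leaves the caller's frame unchanged, which makes the function easier to test.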
Answer by alec_djinn
The simplest way should be this one:
df.loc[[3],0:1] = 200,10
In this case, 3 is the index label of the row to replace (the fourth row, since the default labels start at 0), while 0:1 selects columns 0 and 1. Label slices in .loc include both endpoints.
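To illustrate how the row and column selectors behave, here is a small sketch contrasting label-based `.loc` with position-based `.iloc`. The letter index `"a"` to `"f"` is made up for this example so that labels and positions cannot be confused; the frame in the question uses the default integer labels.

```python
import pandas as pd

# Same data as in the question, but with hypothetical letter labels for the rows
df = pd.DataFrame({0: [100, 100, 101, 101, 102, 102], 1: range(6)},
                  index=list("abcdef"))

# .loc works on labels: row "d", columns 0 through 1 (both ends included)
df.loc[["d"], 0:1] = [200, 10]

# .iloc works on positions: fourth row is position 3, columns 0 up to (not including) 2
df_pos = df.copy()
df_pos.iloc[3, 0:2] = [300, 20]

print(df)
print(df_pos)
```

With the default integer index the label and the position of a row coincide, which is why `df.loc[[3], 0:1]` addresses the fourth row in the answer above.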
The following code instead lets you iterate over each row, check its content, and replace it with what you want.
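target = [101,3]
mod = [200,10]
for index, row in df.iterrows():
    if row[0] == target[0] and row[1] == target[1]:
        # iterrows() may yield copies, so write back through df.loc rather than into row
        df.loc[index, 0] = mod[0]
        df.loc[index, 1] = mod[1]
print(df)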
Answer by blaz
Another possibility is:
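import io
a = np.array(([100, 100, 101, 101, 102, 102],
              np.arange(6)))
df = pd.DataFrame(a.T)
# Render the frame as plain text, one row per line
string = df.to_string(header=False, index=False, index_names=False)
# Map the text of each old row to the text of its replacement
# (the keys assume to_string separates the two columns with a single space)
dictionary = {'100 0': '300 20',
              '101 3': '200 10',
              '102 5': '400 30'}
def replace_all(text, dic):
    for i, j in dic.items():
        text = text.replace(i, j)
    return text
string = replace_all(string, dictionary)
# header=None keeps the first data row from being read as column names
df = pd.read_csv(io.StringIO(string), sep=r'\s+', header=None)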
I found this solution better: when there is a large amount of data to replace, it runs faster than EdChum's solution.
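When there are many rows to replace, another option is to express the replacement as a join. The sketch below is a swapped-in technique, not one proposed in the answers above: it uses `DataFrame.merge` with a lookup table, and the column names `new_0` and `new_1` are made up for this example. It assumes the rows in `old_vals` are unique, so the left merge keeps one output row per input row. It avoids both the Python-level loop and the text round trip; whether it is actually faster on a given dataset would need to be measured.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({0: [100, 100, 101, 101, 102, 102], 1: range(6)})
old_vals = np.array([[101, 3], [100, 0], [102, 5]])
new_vals = np.array([[200, 10], [300, 20], [400, 30]])

# Lookup table: old row values next to their replacements
# ("new_0" and "new_1" are hypothetical column names)
lookup = pd.DataFrame(old_vals, columns=[0, 1])
lookup["new_0"] = new_vals[:, 0]
lookup["new_1"] = new_vals[:, 1]

# A left merge keeps every row of df in order; unmatched rows get NaN replacements
merged = df.merge(lookup, on=[0, 1], how="left")

# Take the replacement where one exists, otherwise keep the original value
result = pd.DataFrame({
    0: merged["new_0"].fillna(merged[0]).astype(df[0].dtype),
    1: merged["new_1"].fillna(merged[1]).astype(df[1].dtype),
})

print(result)
```

The `astype` calls restore the integer dtype, since the NaN placeholders turn the replacement columns into floats during the merge.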