Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/39267372/

Replace rows in a Pandas df with rows from another df

python, pandas, dataframe

Asked by Chris Parry

I have 2 Pandas dfs, A and B. Both have 10 columns and the index 'ID'. Where the IDs of A and B match, I want to replace the rows of B with the rows of A. I have tried to use pd.update, but no success yet. Any help appreciated.

Accepted answer by Shijo

The code below should do the trick:

import pandas as pd

s1 = pd.Series([5, 1, 'a'])
s2 = pd.Series([6, 2, 'b'])
s3 = pd.Series([7, 3, 'd'])
s4 = pd.Series([8, 4, 'e'])
s5 = pd.Series([9, 5, 'f'])

df1 = pd.DataFrame([list(s1), list(s2), list(s3), list(s4), list(s5)], columns=["A", "B", "C"])

s1 = pd.Series([5, 6, 'p'])
s2 = pd.Series([6, 7, 'q'])
s3 = pd.Series([7, 8, 'r'])
s4 = pd.Series([8, 9, 's'])
s5 = pd.Series([9, 10, 't'])

df2 = pd.DataFrame([list(s1), list(s2), list(s3), list(s4), list(s5)], columns=["A", "B", "C"])

# Overwrite B and C in df1 for rows whose A value also appears in df2.
# The assignment aligns on the row index, not on A, so it relies on
# df1 and df2 sharing the same index.
df1.loc[df1.A.isin(df2.A), ['B', 'C']] = df2[['B', 'C']]
print(df1)

Output:

   A   B  C
0  5   6  p
1  6   7  q
2  7   8  r
3  8   9  s
4  9  10  t
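
The example above matches rows through the default row index. For the setup described in the question, where both frames use 'ID' as the index, a minimal sketch might look like the following. The frames, column names (`x`, `y`) and values here are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical frames, both indexed by 'ID' (illustrative data only).
A = pd.DataFrame({"x": [10, 20, 30], "y": ["a", "b", "c"]},
                 index=pd.Index([1, 2, 3], name="ID"))
B = pd.DataFrame({"x": [0, 0, 0], "y": ["p", "q", "r"]},
                 index=pd.Index([2, 3, 4], name="ID"))

# IDs present in both frames.
common = B.index.intersection(A.index)

# Replace the matching rows of B with the corresponding rows of A.
B.loc[common] = A.loc[common]
print(B)
```

Rows of B whose ID does not appear in A (ID 4 in this sketch) are left unchanged.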

Answer by fpersyn

You can empty your target cells in A (by setting them to NaN) and use the combine_first() method to fill them with B's values. Although it may sound counter-intuitive, this approach gives you the flexibility to target both rows and specific columns in 2 lines of code. Hope that helps.

An example replacing the full rows that have an index match:

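import numpy as np
import pandas as pd

# set-up
cols = ['c1','c2','c3']
A = pd.DataFrame(np.arange(9).reshape((3,3)), columns=cols)
B = pd.DataFrame(np.arange(10,16).reshape((2,3)), columns=cols)

# solution: blank out the rows of A whose index labels appear in B, then fill them from B
A.loc[B.index] = np.nan
A = A.combine_first(B)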
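
To see what the full-row replacement produces, here is a self-contained sketch of the same two steps. It uses float data (an assumption, not part of the original answer) so that assigning NaN does not change the column dtypes, and it works on a copy named `result` (a name chosen for illustration) to keep A intact.

```python
import numpy as np
import pandas as pd

cols = ["c1", "c2", "c3"]
# Float data is an assumption made here so NaN fits without a dtype change.
A = pd.DataFrame(np.arange(9, dtype=float).reshape((3, 3)), columns=cols)
B = pd.DataFrame(np.arange(10, 16, dtype=float).reshape((2, 3)), columns=cols)

result = A.copy()
result.loc[B.index] = np.nan      # empty the rows that B will supply
result = result.combine_first(B)  # fill the emptied cells from B
print(result)
```

Rows 0 and 1 end up holding B's values, while row 2, which has no match in B, keeps A's values.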

An example of replacing only certain target columns for rows that have an index match:

A.loc[B.index, ['c2','c3']] = np.nan
A = A.combine_first(B)
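
The question mentions trying update. As a possible alternative sketch, `DataFrame.update` (a DataFrame method rather than a top-level `pd.update` function) overwrites values in place wherever the index and column labels match. The float data below is an assumption made for this illustration.

```python
import numpy as np
import pandas as pd

cols = ["c1", "c2", "c3"]
# Float data is an assumption for this sketch.
A = pd.DataFrame(np.arange(9, dtype=float).reshape((3, 3)), columns=cols)
B = pd.DataFrame(np.arange(10, 16, dtype=float).reshape((2, 3)), columns=cols)

# Overwrite only c2 and c3 of A, for rows whose index also appears in B.
# update() works in place and returns None.
A.update(B[["c2", "c3"]])
print(A)
```

One caveat: `update` only copies non-NaN values, so any NaN in B would leave the corresponding cell of A untouched, whereas the combine_first() approach fills the emptied cells with whatever B holds.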