比较两个 Python Pandas 数据框的 2 列并获取公共行

Question

提问by Mohammad Saifullah

I have 2 Dataframe as follows:

我有 2 个数据框，如下所示：

DF1=
    A    B   C    D
0   AA   BA  KK   0
1   AD   BD  LL   0
2   AF   BF  MM   0

DF2=
    K    L
0   AA   BA
1   AD   BF
2   AF   BF

At the end what I want to get is:

最后我想得到的是：

DF1=
    A    B   C    D
0   AA   BA  KK   1
1   AD   BD  LL   0
2   AF   BF  MM   1

So, I want to compare two dataframe, I want to see which rows of first data frame (for column A and B) are in common of of second dataframe(Column K and L) and assign 1 on the coulmn D of first dataframe.

所以，我想比较两个数据帧，我想看看第一个数据帧（对于 A 列和 B 列）的哪些行与第二个数据帧（列 K 和 L）相同，并在第一个数据帧的 D 列上分配 1。

I can use for loop, but It will be very slow for large number of entries.

我可以使用 for 循环，但是对于大量条目，它会非常慢。

Any clue or suggestion will be appreciated.

任何线索或建议将不胜感激。

Answer 1

采纳答案by Mohammad Saifullah

This is how I solved it:

我是这样解决的：

df1 = pd.DataFrame({"A":['AA','AD','AD'], "B":['BA','BD','BF']})
df2 = pd.DataFrame({"A":['AA','AD'], 'B':['BA','BF']})
df1['compressed']=df1.apply(lambda x:'%s%s' % (x['A'],x['B']),axis=1)
df2['compressed']=df2.apply(lambda x:'%s%s' % (x['A'],x['B']),axis=1)
df1['Success'] = df1['compressed'].isin(df2['compressed']).astype(int)
print df1

    A   B     compressed   Success
0  AA  BA      AABA          1
1  AD  BD      ADBD          0
2  AD  BF      ADBF          1

Answer 2

回答by EdChum

This would be easier if you renamed the columns of df2and then you can compare row-wise:

如果您重命名了的列，df2然后您可以逐行比较，这会更容易：

In [35]:

df2.columns = ['A', 'B']
df2
Out[35]:
    A   B
0  AA  BA
1  AD  BF
2  AF  BF
In [38]:

df1['D'] = (df1[['A', 'B']] == df2).all(axis=1).astype(int)
df1
Out[38]:
    A   B   C  D
0  AA  BA  KK  1
1  AD  BD  LL  0
2  AF  BF  MM  1

Answer 3

回答by PiotrKu

DF1.merge(right=DF2,left_on=[DF1.A,DF1.B],right_on=[DF2.K,DF2.L], indicator=True, how='left')

gives:

给出：

A B C D K L _merge 0 AA BA KK 0 AA BA both 1 AD BD LL 0 NaN NaN left_only 2 AF BF MM 0 AF BF both

So, as above, indicator does the job.

因此，如上所述，指标可以完成这项工作。

Peter

彼得

Answer 4

回答by Vipul Saxena

df1['ColumnName'].isin(df2['ColumnName']).value_counts()

比较两个 Python Pandas 数据框的 2 列并获取公共行

提问by Mohammad Saifullah

采纳答案by Mohammad Saifullah

回答by EdChum

回答by PiotrKu

回答by Vipul Saxena

相关推荐

最近更新

标签

比较两个 Python Pandas 数据框的 2 列并获取公共行

提问by Mohammad Saifullah

采纳答案by Mohammad Saifullah

回答by EdChum

回答by PiotrKu

回答by Vipul Saxena

相关推荐

Python 为什么列表理解比附加到列表快得多？

Python Matplotlib 散点图图例

Python 使用 boto3 列出存储桶的内容

Python 从需要 stdin 的子进程实时打印 stdout

相关推荐

最近更新

标签