Python Pandas - Intersection of two DataFrames based on column entries

Question

Asked by Bib

Suppose I have two DataFrames like so:

>>> dfA
             S                      T            prob
0        ! ! !                ! ! ! !   8.1623999e-05
1        ! ! !                ! ! ! "   0.00354090007
2        ! ! !                ! ! ! .   0.00210241997
3        ! ! !                ! ! ! ?  6.55684998e-05
4        ! ! !                  ! ! !     0.203119993
5        ! ! !                ! ! ! ”  6.62070015e-05
6        ! ! !                    ! !   0.00481862016
7        ! ! !                      !    0.0274260994
8        ! ! !                " ! ! !  7.99940026e-05
9        ! ! !                    " !  1.51188997e-05
10       ! ! !                      "  8.50678989e-05

>>> dfB
             S                      T                                 knstats
0        ! ! !                ! ! ! !                 knstats=2,391,104,64,25
1        ! ! !                ! ! ! "                    knstats=4,391,6,64,2
2        ! ! !                ! ! ! .                    knstats=4,391,5,64,2
3        ! ! !                ! ! ! ?                    knstats=1,391,4,64,4
4        ! ! !                  ! ! !               knstats=220,391,303,64,55
5        ! ! !                    ! !               knstats=16,391,957,64,115
6        ! ! !                      !              knstats=28,391,5659,64,932
7        ! ! !                " ! ! !                    knstats=2,391,2,64,1
8        ! ! !                    " !                  knstats=1,391,37,64,13
9        ! ! !                      "     knstats=2,391,1.11721e+06,64,180642
10       ! ! !                    . "           knstats=2,391,120527,64,20368

I want to create a new DataFrame composed of the rows that have matching "S" and "T" entries in both DataFrames, along with the prob column from dfA and the knstats column from dfB. The result should look something like the following, and it is important that the row order is preserved:

             S                      T            prob                             knstats
0        ! ! !                ! ! ! !   8.1623999e-05             knstats=2,391,104,64,25
1        ! ! !                ! ! ! "   0.00354090007                knstats=4,391,6,64,2
2        ! ! !                ! ! ! .   0.00210241997                knstats=4,391,5,64,2
3        ! ! !                ! ! ! ?  6.55684998e-05                knstats=1,391,4,64,4
4        ! ! !                  ! ! !     0.203119993           knstats=220,391,303,64,55
5        ! ! !                    ! !   0.00481862016           knstats=16,391,957,64,115
6        ! ! !                      !    0.0274260994          knstats=28,391,5659,64,932
7        ! ! !                " ! ! !  7.99940026e-05                knstats=2,391,2,64,1
8        ! ! !                    " !  1.51188997e-05              knstats=1,391,37,64,13
9        ! ! !                      "  8.50678989e-05 knstats=2,391,1.11721e+06,64,180642

Answer 1

Accepted answer by user308827

You can merge them like so:

import pandas as pd
s1 = pd.merge(dfA, dfB, how='inner', on=['S', 'T'])
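
As a minimal sketch of this merge, the example below builds two small frames with made-up values (the rows, probabilities and knstats strings are illustrative, not the asker's data). It keeps only the (S, T) pairs present in both frames and carries over prob from the left frame and knstats from the right. The pandas documentation describes an inner merge as preserving the order of the left keys, which is what the asker's ordering requirement relies on.

```python
import pandas as pd

# Hypothetical toy data standing in for dfA and dfB.
dfA = pd.DataFrame({
    "S": ["! ! !", "! ! !", "! ! !"],
    "T": ["! ! ! !", "! ! ! .", "! ! ! ?"],
    "prob": [0.1, 0.2, 0.3],
})
dfB = pd.DataFrame({
    "S": ["! ! !", "! ! !", "! ! !"],
    "T": ["! ! ! !", "! ! ! ?", '. "'],
    "knstats": ["knstats=a", "knstats=b", "knstats=c"],
})

# Inner merge on both key columns: only (S, T) pairs found in both frames survive.
s1 = pd.merge(dfA, dfB, how="inner", on=["S", "T"])
print(s1)
```

The row with T equal to `! ! ! .` exists only in dfA and the row with T equal to `. "` exists only in dfB, so neither appears in s1.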

To drop rows that contain NA values:

s1.dropna(inplace=True)
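
An inner merge does not itself introduce missing values in the non-key columns, so this step only matters when dfA or dfB already contained NA entries. The sketch below assumes a hypothetical merged frame in which one prob value was missing in the input. It uses the non-inplace form of dropna and renumbers the index afterwards, which is an optional choice and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical merged result where one input row already had a missing prob.
s1 = pd.DataFrame({
    "S": ["! ! !", "! ! !", "! ! !"],
    "T": ["! ! ! !", "! ! ! ?", "! !"],
    "prob": [0.1, np.nan, 0.3],
    "knstats": ["knstats=a", "knstats=b", "knstats=c"],
})

# Drop rows with a missing value in any column, then renumber the index from 0.
cleaned = s1.dropna().reset_index(drop=True)
print(cleaned)
```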
