Finding the intersection between two Series in Pandas using the index

Question

Asked by Boss1295

I have two series of different lengths, and I am attempting to find the intersection of the two series based on the index, where the index is a string. The end result is, hopefully, a series that has the elements of the intersection based on the common string indexes.

Any ideas?

Answer 1

Answered by Alex Riley

Pandas indexes have an intersection method which you can use. If you have two Series, s1 and s2, then

s1.index.intersection(s2.index)

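or, in older pandas versions only (using & as a set operation on an Index was later deprecated, and in pandas 2.0 and later it performs an element-wise logical operation instead):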

s1.index & s2.index

gives you the index values that are in both s1 and s2.

You can then use this list of indexes to view the corresponding elements of a series. For example:

>>> ixs = s1.index.intersection(s2.index)
>>> s1.loc[ixs]
# subset of s1 with only the indexes also found in s2 appears here
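
As a concrete illustration of the steps above, here is a minimal sketch with two made-up Series. The labels, values, and the names s1_common and s2_common are assumptions for the example, not data from the question.

```python
import pandas as pd

# Two hypothetical Series of different lengths with string indexes.
s1 = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])
s2 = pd.Series([10, 20, 30], index=["b", "d", "e"])

# Labels present in both indexes.
common = s1.index.intersection(s2.index)

# Restrict each Series to the shared labels.
s1_common = s1.loc[common]
s2_common = s2.loc[common]

print(s1_common)
print(s2_common)
```

Selecting with .loc on both Series gives two results of equal length that share the same labels, so they can be compared or combined element by element.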

Answer 2

Answered by nurp

Both of my datasets are sorted in increasing order, so I wrote a function that finds the overlapping indices, then filtered each dataset with them.

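import numpy as np

def overlap(data1, data2):
    '''Return boolean masks marking the values common to data1 and data2.

    Both inputs are assumed to be sorted in increasing order.
    '''
    mask1 = np.zeros(len(data1), dtype=bool)
    mask2 = np.zeros(len(data2), dtype=bool)
    idx_1 = 0
    idx_2 = 0
    # Walk both sequences together, advancing whichever holds the smaller value.
    while idx_1 < len(data1) and idx_2 < len(data2):
        if data1[idx_1] < data2[idx_2]:
            idx_1 += 1
        elif data1[idx_1] > data2[idx_2]:
            idx_2 += 1
        else:
            # Equal values: mark both positions as part of the overlap.
            mask1[idx_1] = True
            mask2[idx_2] = True
            idx_1 += 1
            idx_2 += 1
    return mask1, mask2

# NOTE: overlap() compares elements with < and >, so each element must be a
# single comparable value. For 2-D arrays like the ones below, that means
# passing a 1-D sorted key (the original does not say which one was used).
np.shape(data1)  # (1330, 8)
np.shape(data2)  # (2490, 9)
index_1, index_2 = overlap(data1, data2)
data1 = data1[index_1]
data2 = data2[index_2]
np.shape(data1)  # (540, 8)
np.shape(data2)  # (540, 9)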
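
For comparison, NumPy's intersect1d can find the same overlap when called with return_indices=True. This is a different technique from the hand-written loop above, shown only as a sketch: the arrays keys1 and keys2 are made-up 1-D keys, and the call returns integer positions rather than boolean masks.

```python
import numpy as np

# Hypothetical 1-D keys, each sorted in increasing order with no repeats.
keys1 = np.array([1, 3, 5, 7, 9])
keys2 = np.array([3, 4, 5, 9, 10, 11])

# common: the shared values; idx1 / idx2: their positions in each input.
common, idx1, idx2 = np.intersect1d(keys1, keys2, return_indices=True)

print(common)       # shared values
print(keys1[idx1])  # same values, selected from keys1
print(keys2[idx2])  # same values, selected from keys2
```

The position arrays can index the rows of the full datasets in the same way the boolean masks do. If a key contains repeated values, intersect1d reports only the first occurrence of each, so the result can differ from the loop's.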
