How to generate multiple interaction terms in Pandas?

Question

Asked by pdevar

I would like to estimate an IV regression model that uses many interactions with year, demographic, and other dummy variables. I can't find an explicit way to do this in Pandas and am curious whether anyone has tips.

I'm thinking of trying scikit-learn and this function:

http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html

Answer 1

Accepted answer by Marcus V.

I recently faced a similar problem: I needed a flexible way to create specific interactions, so I searched StackOverflow. Following the tip in @user333700's comment above, I found patsy (http://patsy.readthedocs.io/en/latest/overview.html) and, after a Google search, its scikit-learn integration patsylearn (https://github.com/amueller/patsylearn).

Using @motam79's example, this is possible:

import numpy as np
import pandas as pd
from patsylearn import PatsyTransformer

x = np.array([[ 3, 20, 11],
              [ 6,  2,  7],
              [18,  2, 17],
              [11, 12, 19],
              [ 7, 20,  6]])
df = pd.DataFrame(x, columns=["a", "b", "c"])

# In a patsy formula, "a:b" is the interaction (product) of columns a and b
x_t = PatsyTransformer("a:b + a:c + b:c", return_type="dataframe").fit_transform(df)

This returns the following:

     a:b    a:c    b:c
0   60.0   33.0  220.0
1   12.0   42.0   14.0
2   36.0  306.0   34.0
3  132.0  209.0  228.0
4  140.0   42.0  120.0
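
If adding a dependency is not an option, the same three columns can probably be built with plain pandas. The following is only a minimal sketch, not part of patsy or patsylearn: the helper name pairwise_interactions is hypothetical, and it assumes every column is numeric.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame(
    [[3, 20, 11], [6, 2, 7], [18, 2, 17], [11, 12, 19], [7, 20, 6]],
    columns=["a", "b", "c"],
)


def pairwise_interactions(frame):
    """Return one product column per unordered pair of columns (hypothetical helper)."""
    return pd.DataFrame(
        {
            f"{left}:{right}": frame[left] * frame[right]
            for left, right in combinations(frame.columns, 2)
        },
        index=frame.index,
    )


# Columns are named "a:b", "a:c", "b:c" to mirror the patsy output above
interactions = pairwise_interactions(df)
```

Because the input here is integer-valued, the products stay integers rather than the floats patsy returns.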

I answered a similar question here, where I give another example with categorical variables: How can an interaction design matrix be created from categorical variables?
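
For the dummy-variable case raised in the question, one possible pandas-only approach is to expand the categorical column with pd.get_dummies and multiply each indicator by the continuous variable. This is a sketch under assumptions: the column names x and year and the data are made up for illustration.

```python
import pandas as pd

# Hypothetical data: one continuous regressor and one categorical year column
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "year": [2001, 2001, 2002, 2003],
})

# One indicator column per year; dtype=float keeps the products numeric
year_dummies = pd.get_dummies(df["year"], prefix="year", dtype=float)

# Multiply every dummy column by x row by row, then label the result
x_by_year = year_dummies.mul(df["x"], axis=0).add_prefix("x:")
```

Each resulting column equals x in rows belonging to that year and zero elsewhere. For a regression with an intercept, one year would typically be dropped (for example with drop_first=True) to avoid collinearity.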

Answer 2

Answer by motam79

You can use sklearn's PolynomialFeatures transformer. Here is an example:

Assume this is your design (i.e. feature) matrix:

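import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([[ 3, 20, 11],
              [ 6,  2,  7],
              [18,  2, 17],
              [11, 12, 19],
              [ 7, 20,  6]])

# Degree 2 with interaction_only=True keeps pairwise products but no squared terms
x_t = PolynomialFeatures(2, interaction_only=True, include_bias=False).fit_transform(x)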

Here is the result:

array([[   3.,   20.,   11.,   60.,   33.,  220.],
       [   6.,    2.,    7.,   12.,   42.,   14.],
       [  18.,    2.,   17.,   36.,  306.,   34.],
       [  11.,   12.,   19.,  132.,  209.,  228.],
       [   7.,   20.,    6.,  140.,   42.,  120.]])

The first three columns are the original features, and the next three are their pairwise products (column 1 × column 2, column 1 × column 3, and column 2 × column 3).
