Python: How to merge two dataframes side-by-side?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/23891575/

Time: 2020-08-19 03:37:01  Source: igfitidea

How to merge two dataframes side-by-side?

Tags: python, pandas

Asked by James Bond

Is there a convenient way to merge two data frames side by side?

Both data frames have 30 rows but different numbers of columns; say, df1 has 20 columns and df2 has 40 columns.

How can I easily get a new data frame of 30 rows and 60 columns?

df3 = pd.someSpecialMergeFunct(df1, df2)

Or maybe there is some special parameter in append:

df3 = pd.append(df1, df2, left_index=False, right_index=False, how='left')

P.S. If possible, I hope that duplicated column names can be resolved automatically.

Thanks!

Answered by joris

You can use the concat function for this (axis=1 concatenates the frames as columns):

pd.concat([df1, df2], axis=1)
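
As a rough illustration of the concat call above, here is a minimal sketch using made-up stand-ins for the question's frames (the column names a0..a19 and b0..b39 are assumptions). It also shows that concat aligns rows on index labels, so frames with different indexes need a reset first, and that the keys argument is one possible way to keep same-named columns apart.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's frames: 30 rows, 20 and 40 columns.
df1 = pd.DataFrame(np.zeros((30, 20)), columns=[f"a{i}" for i in range(20)])
df2 = pd.DataFrame(np.ones((30, 40)), columns=[f"b{i}" for i in range(40)])

# axis=1 places the frames next to each other, aligning rows on the index.
df3 = pd.concat([df1, df2], axis=1)
print(df3.shape)  # (30, 60)

# concat aligns on index labels, so differing indexes produce extra NaN rows.
df2_shifted = df2.copy()
df2_shifted.index = range(100, 130)
misaligned = pd.concat([df1, df2_shifted], axis=1)

# Resetting the index first pairs the rows purely by position.
aligned = pd.concat([df1, df2_shifted.reset_index(drop=True)], axis=1)

# keys adds an outer column level, which keeps duplicate names distinguishable.
labeled = pd.concat([df1, df2], axis=1, keys=["df1", "df2"])
```

Without keys, concat leaves duplicate column names unchanged, so the labeled variant is only needed when the two frames actually share names.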

See the pandas docs on merging/concatenating: http://pandas.pydata.org/pandas-docs/stable/merging.html


Answered by Hyman

I came across your question while I was trying to achieve something like the following:


[Figure: merging dataframes sideways]

So once I had sliced my dataframes, I first ensured that their indexes were the same. In your case, both dataframes need to be indexed from 0 to 29. Then I merged both dataframes by the index.

df1.reset_index(drop=True).merge(df2.reset_index(drop=True), left_index=True, right_index=True)
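
A small sketch of the same reset-then-merge approach, using two made-up frames (left and right, with invented values) whose indexes do not line up. Because merge appends suffixes to overlapping column names that are not join keys, this also touches on the question's request about duplicated names; the suffix strings chosen here are an assumption.

```python
import pandas as pd

# Hypothetical slices whose indexes no longer line up.
left = pd.DataFrame({"id": [1, 2, 3], "x": [10, 20, 30]}, index=[5, 6, 7])
right = pd.DataFrame({"id": [7, 8, 9], "y": [0.1, 0.2, 0.3]}, index=[40, 41, 42])

# Reset both indexes to 0..n-1, then join on the index.
merged = left.reset_index(drop=True).merge(
    right.reset_index(drop=True),
    left_index=True,
    right_index=True,
    suffixes=("_left", "_right"),  # applied only to overlapping column names
)
print(list(merged.columns))  # ['id_left', 'x', 'id_right', 'y']
```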

Answered by Rohit Madan

  • There is a way: you can do it via a Pipeline.

** Use a pipeline to transform your numerical data, for example:

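from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer

# DataFrameSelector is not a scikit-learn class; it is assumed to be a user-defined transformer.
num_pipeline = Pipeline([
    ("select_numeric", DataFrameSelector(numeric_columns)),  # placeholder: list of numeric column names
    ("imputer", SimpleImputer(strategy="median")),
])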

** And for categorical data:

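cat_pipeline = Pipeline([
    ("select_cat", DataFrameSelector(categorical_columns)),  # placeholder: list of categorical column names
    # sparse_output replaced the sparse argument in scikit-learn 1.2
    ("cat_encoder", OneHotEncoder(sparse_output=False)),
])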

** Then use a FeatureUnion to combine these transformations:

preprocess_pipeline = FeatureUnion(transformer_list=[
    ("num_pipeline", num_pipeline),
    ("cat_pipeline", cat_pipeline),
])
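
A minimal end-to-end sketch of the three steps above, assuming scikit-learn is installed. DataFrameSelector is not part of scikit-learn, so a hypothetical version is defined here; the toy frame and its column names are also made up for illustration. Note that the result is an array of features placed side by side, not a dataframe.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import OneHotEncoder


class DataFrameSelector(TransformerMixin, BaseEstimator):
    """Hypothetical helper (not part of scikit-learn): keeps the named columns."""

    def __init__(self, attribute_names):
        self.attribute_names = attribute_names

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X[self.attribute_names]


# Toy data: one numeric column with a gap, one categorical column.
df = pd.DataFrame({
    "age": [22.0, np.nan, 35.0],
    "city": ["Oslo", "Rome", "Oslo"],
})

num_pipeline = Pipeline([
    ("select_numeric", DataFrameSelector(["age"])),
    ("imputer", SimpleImputer(strategy="median")),
])
cat_pipeline = Pipeline([
    ("select_cat", DataFrameSelector(["city"])),
    ("cat_encoder", OneHotEncoder()),  # default output is a sparse matrix
])
preprocess_pipeline = FeatureUnion(transformer_list=[
    ("num_pipeline", num_pipeline),
    ("cat_pipeline", cat_pipeline),
])

features = preprocess_pipeline.fit_transform(df)
# FeatureUnion returns a sparse matrix when any branch is sparse; densify it.
features = features.toarray() if hasattr(features, "toarray") else np.asarray(features)
print(features.shape)  # (3, 3)
```

The encoder is left at its default sparse output here so the sketch does not depend on which name the dense-output argument has in the installed scikit-learn version.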