Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/43318515/

Time: 2020-09-14 03:21:51  Source: igfitidea

Pandas concat with different indices

python pandas dataframe concat

Asked by Hamperfait

I have three data frames that I want to concatenate, but they all have different indices. All three indices have the same length. My first df looks like this:

Index    Time_start    Time_end    duration    value
0        5             10          5           1.0
1        10            16          6           NaN
...
39       50            53          3           NaN

The second df looks like this:

Index    Time_start    Time_end    duration    value
40       5             10          5           2.0
41       10            16          6           NaN
...
79       50            53          3           NaN

The third looks exactly the same, but with Index = [80..119]. Time_start, Time_end and duration are exactly the same in all three; only value differs.

I want to concatenate the value columns so that the result looks like this:

Index    Time_start    Time_end    duration    value1    value2    value3
0        5             10          5           1.0       2.0       3.0
1        10            16          6           NaN       NaN       NaN
...
39       50            53          3           NaN       NaN       NaN

So far I have tried this:

pd.concat([df1, df2.value, df3.value], axis=1, join_axes=[df1.index])

but the indices are not the same, so it doesn't work. I know I can first try

df2.reset_index(drop=True)

and then do the concat, which works, but I'm sure there's a better way.
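
For reference, the reset_index workaround described in the question can be sketched as follows. The three small frames are hypothetical stand-ins for the real 40-row data (only the column names come from the question), and the value1..value3 names are taken from the desired output. Note that join_axes was removed from pd.concat in pandas 1.0, so the sketch does not use it.

```python
import numpy as np
import pandas as pd

# Hypothetical 3-row stand-ins for the question's 40-row frames.
base = {"Time_start": [5, 10, 50], "Time_end": [10, 16, 53], "duration": [5, 6, 3]}
df1 = pd.DataFrame({**base, "value": [1.0, np.nan, np.nan]}, index=[0, 1, 2])
df2 = pd.DataFrame({**base, "value": [2.0, np.nan, np.nan]}, index=[40, 41, 42])
df3 = pd.DataFrame({**base, "value": [3.0, np.nan, np.nan]}, index=[80, 81, 82])

# Discard each frame's own index so the value columns line up by position.
values = [
    df["value"].reset_index(drop=True).rename(f"value{i}")
    for i, df in enumerate([df1, df2, df3], start=1)
]

# Keep the shared columns from df1 and attach the renamed value columns.
shared = df1.drop(columns="value").reset_index(drop=True)
result = pd.concat([shared] + values, axis=1)
print(result)
```

This aligns rows purely by position, so it is only safe when all frames list the same intervals in the same order.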

Answered by piRSquared

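import pandas as pd

# df1 and df2 are the question's frames; add df3 and 'value3' for the third
dfs = [df1, df2]
cols = ['Time_start', 'Time_end', 'duration']
keys = ['value1', 'value2']
# Index each frame by the shared columns so concat aligns on them
pd.concat(
    [df.set_index(cols).value for df in dfs],
    axis=1, keys=keys)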

                              value1  value2
Time_start Time_end duration                
5          10       5            1.0     2.0
10         16       6            NaN     NaN
50         53       3            NaN     NaN
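
A self-contained sketch of the same idea extended to all three frames may help. The frames built here are hypothetical miniatures of the question's data (the make helper and the 3-row contents are assumptions for illustration), and the final reset_index is an addition that turns the shared columns back into ordinary columns, as in the desired output.

```python
import numpy as np
import pandas as pd

cols = ["Time_start", "Time_end", "duration"]
base = {"Time_start": [5, 10, 50], "Time_end": [10, 16, 53], "duration": [5, 6, 3]}


def make(value, start):
    """Hypothetical helper: build a 3-row frame whose index begins at `start`."""
    return pd.DataFrame(
        {**base, "value": [value, np.nan, np.nan]},
        index=range(start, start + 3),
    )


df1, df2, df3 = make(1.0, 0), make(2.0, 40), make(3.0, 80)

dfs = [df1, df2, df3]
keys = ["value1", "value2", "value3"]

# Re-index every frame by the shared columns, then place the value
# Series side by side; keys become the new column names.
wide = pd.concat([df.set_index(cols)["value"] for df in dfs], axis=1, keys=keys)

# Move the shared columns out of the index and back into regular columns.
flat = wide.reset_index()
print(flat)
```

Because alignment now happens on (Time_start, Time_end, duration) rather than on the original row labels, the differing indices 0.., 40.. and 80.. no longer matter.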

Answered by jezrael

Use:

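import pandas as pd

dfs = [df1, df2]
k = ['value1', 'value2']
# With keys, concat builds two-level column labels: (key, original name)
df = pd.concat([x.set_index(['Time_start', 'Time_end', 'duration']) for x in dfs],
               axis=1, keys=k)
# Drop the inner level ('value') so only value1, value2 remain
df.columns = df.columns.droplevel(-1)
print(df)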
                              value1  value2
Time_start Time_end duration                
5          10       5            1.0     2.0
10         16       6            NaN     NaN
50         53       3            NaN     NaN
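
To see why the droplevel call is needed, the following sketch prints the column labels before and after it. The two small frames are hypothetical stand-ins for df1 and df2 from the question.

```python
import numpy as np
import pandas as pd

cols = ["Time_start", "Time_end", "duration"]
base = {"Time_start": [5, 10, 50], "Time_end": [10, 16, 53], "duration": [5, 6, 3]}

# Hypothetical stand-ins for the question's df1 and df2.
df1 = pd.DataFrame({**base, "value": [1.0, np.nan, np.nan]}, index=[0, 1, 2])
df2 = pd.DataFrame({**base, "value": [2.0, np.nan, np.nan]}, index=[40, 41, 42])

df = pd.concat(
    [x.set_index(cols) for x in [df1, df2]],
    axis=1, keys=["value1", "value2"],
)

# Concatenating whole DataFrames with keys yields two-level column labels.
before = list(df.columns)
print(before)  # [('value1', 'value'), ('value2', 'value')]

# Dropping the last level leaves only the keys as column names.
df.columns = df.columns.droplevel(-1)
after = list(df.columns)
print(after)  # ['value1', 'value2']
```

This flattening only gives unique names when each frame has a single column left after set_index, which is the case here.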

Another solution:

dfs = [df1, df2]
df = pd.concat([x.set_index(['Time_start', 'Time_end', 'duration']) for x in dfs], axis=1)
# Both columns are named 'value'; append a 1-based position to tell them apart
df.columns = [x + str(i + 1) for i, x in enumerate(df.columns)]
print(df)
                              value1  value2
Time_start Time_end duration                
5          10       5            1.0     2.0
10         16       6            NaN     NaN
50         53       3            NaN     NaN