Note: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/49111093/

Date: 2020-09-14 05:16:17  Source: igfitidea

Pandas combine multiple csv files

Tags: python, pandas

Asked by warrenfitzhenry

I have multiple csv files that I would like to combine into one df.

They are all in this general format, with two index columns:

                                           1     2
CU0112-005287-7 Output Energy, (Wh/h)   0.064   0.066
CU0112-005287-7 Lights (Wh)                0     0

                                            1     2
CU0112-001885-L Output Energy, (Wh/h)   1.33    1.317
CU0112-001885-L Lights (Wh)             1.33    1.317

and so on...

The combined df would be:

                                           1     2
CU0112-005287-7 Output Energy, (Wh/h)   0.064   0.066
CU0112-005287-7 Lights (Wh)                0     0
CU0112-001885-L Output Energy, (Wh/h)   1.33    1.317
CU0112-001885-L Lights (Wh)             1.33    1.317

I am trying this code:

import os
import pandas as pd
import glob

files = glob.glob(r'2017-12-05\Aggregated\*.csv')   # folder which contains all the csv files

df = pd.merge([pd.read_csv(f, index_col=[0,1]) for f in files], how='outer')

df.to_csv(r'\merged.csv')

But I am getting this error:

TypeError: merge() takes at least 2 arguments (2 given)

Answered by jezrael

I think you need concat instead of merge:

df = pd.concat([pd.read_csv(f, index_col=[0,1]) for f in files])
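
The one-liner above can be tried end to end with the following minimal sketch. It assumes two small made-up CSV files written to a temporary directory; the file names, the helper name combine_csvs, and the sample values (modelled on the rows in the question) are placeholders, not part of the original answer. It only illustrates that pd.concat stacks the frames row-wise while keeping the two index columns.

```python
import glob
import os
import tempfile

import pandas as pd


def combine_csvs(pattern):
    """Read every CSV matching `pattern` with a two-column index and stack the rows."""
    files = sorted(glob.glob(pattern))  # sorted only to make the row order predictable
    frames = [pd.read_csv(f, index_col=[0, 1]) for f in files]
    return pd.concat(frames)


# Hypothetical sample data modelled on the question's rows.
sample_a = pd.DataFrame(
    {"1": [0.064, 0.0], "2": [0.066, 0.0]},
    index=pd.MultiIndex.from_tuples([
        ("CU0112-005287-7", "Output Energy, (Wh/h)"),
        ("CU0112-005287-7", "Lights (Wh)"),
    ]),
)
sample_b = pd.DataFrame(
    {"1": [1.33, 1.33], "2": [1.317, 1.317]},
    index=pd.MultiIndex.from_tuples([
        ("CU0112-001885-L", "Output Energy, (Wh/h)"),
        ("CU0112-001885-L", "Lights (Wh)"),
    ]),
)

# Write the samples to a temporary folder; to_csv quotes the label that contains a comma.
tmpdir = tempfile.mkdtemp()
sample_a.to_csv(os.path.join(tmpdir, "a.csv"))
sample_b.to_csv(os.path.join(tmpdir, "b.csv"))

combined = combine_csvs(os.path.join(tmpdir, "*.csv"))
print(combined)
```

Because every file is read with the same index_col setting, the combined frame keeps a two-level index and has one row per input row.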

Answered by Yayati Sule

You can try the following; I made some changes to the DataFrame combining logic:

import glob
from functools import reduce   # reduce is not a builtin in Python 3

import pandas as pd

files = glob.glob(r'2017-12-05\Aggregated\*.csv')   # folder which contains all the csv files

# Assumes every file has an 'id' column (or index level) to join on
df = reduce(lambda df1, df2: pd.merge(df1, df2, on='id', how='outer'),
            [pd.read_csv(f, index_col=[0, 1]) for f in files])

df.to_csv(r'\merged.csv')
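
To make the behaviour of the reduce/merge pattern concrete, here is a hedged sketch using hypothetical in-memory frames that share an 'id' column. That shared column is an assumption for illustration; the sample files in the question do not show one. Merging joins frames side by side on the key, so it suits inputs that hold different columns for the same ids, whereas stacking rows is what concat does.

```python
from functools import reduce

import pandas as pd

# Hypothetical frames: each one holds a different column for some of the same ids.
frames = [
    pd.DataFrame({"id": [1, 2], "a": [10, 20]}),
    pd.DataFrame({"id": [2, 3], "b": [0.5, 0.7]}),
    pd.DataFrame({"id": [1, 3], "c": ["x", "y"]}),
]

# Fold the list pairwise: merge the running result with the next frame on 'id'.
merged = reduce(
    lambda left, right: pd.merge(left, right, on="id", how="outer"),
    frames,
)
print(merged)
```

The result has one row per distinct id and one column per input column, with missing values where a frame had no row for that id.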

Answered by Billy Bonaros

A simple way:

Create a list with the names of the CSV files:

from os import listdir

# Collect the names of all CSV files in the current directory
files = listdir()
csvs = []
for file in files:
    if file.endswith(".csv"):
        csvs.append(file)

Concatenate the CSVs:

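import pandas as pd

# Read every file, then concatenate once instead of growing a DataFrame inside the loop
tables = []
for i in csvs:
    tables.append(pd.read_csv(i))
data = pd.concat(tables)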
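
The two steps above can also be wrapped in one small function. This is a sketch under stated assumptions: the function name concat_csvs_in_dir and the demo files are hypothetical, and ignore_index=True is an added choice (not in the original answer) that renumbers the rows, since each file read with the default index would otherwise restart at 0.

```python
import os
import tempfile

import pandas as pd


def concat_csvs_in_dir(directory="."):
    """Concatenate all .csv files found directly inside `directory`."""
    csvs = sorted(name for name in os.listdir(directory) if name.endswith(".csv"))
    tables = [pd.read_csv(os.path.join(directory, name)) for name in csvs]
    # ignore_index=True gives the combined frame a fresh 0..n-1 row index.
    return pd.concat(tables, ignore_index=True)


# Demo with made-up files in a temporary directory.
demo_dir = tempfile.mkdtemp()
pd.DataFrame({"x": [1, 2]}).to_csv(os.path.join(demo_dir, "one.csv"), index=False)
pd.DataFrame({"x": [3]}).to_csv(os.path.join(demo_dir, "two.csv"), index=False)
with open(os.path.join(demo_dir, "notes.txt"), "w") as fh:
    fh.write("not a csv, so it is skipped")

data = concat_csvs_in_dir(demo_dir)
print(data)
```

Note that pd.concat raises a ValueError when the list is empty, so a directory without any CSV files would need separate handling.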