Reading CSV files in a loop using pandas, then concatenating them
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/46502943/
Asked by Zheng
I have 10 CSV files, named data_run1_all.csv, data_run2_all.csv, ..., data_run10_all.csv. The CSV files have the same columns, but different rows.
Now I am importing them one by one to df_run1, df_run2, ..., df_run10.
Can I use a loop to import them? Something like: i=1 to 10, df_runi=pandas.read_csv('data_runi_all.csv').
I am asking because the data analysis, plotting, etc. for each data frame are the same, too. All the code for each data frame is repeated 10 times. If I could use a loop to run it 10 times, the code would be much shorter and more readable.
Answered by cs95
Read your CSVs in a loop and call pd.concat:
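import pandas as pd

file_name = 'data_run{}_all.csv'
df_list = []
for i in range(1, 11):
    # Read each run's CSV and collect the DataFrames
    df_list.append(pd.read_csv(file_name.format(i)))
# Stack all runs row-wise into a single DataFrame
df = pd.concat(df_list)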
Alternatively, you could build the list inside a comprehension:
file_name = 'data_run{}_all.csv'
df = pd.concat([pd.read_csv(file_name.format(i)) for i in range(1, 11)])
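Since the question mentions running the same analysis on each run, it may help to record which file each row came from before concatenating. The following is only a sketch under assumptions: the helper name load_runs, the extra run column, and the tiny sample files written to a temporary directory are all made up for illustration and are not part of the original answer.

```python
import os
import tempfile

import pandas as pd


def load_runs(directory, n_runs):
    """Read data_run{i}_all.csv for i = 1..n_runs and stack them row-wise.

    A 'run' column (an addition for this sketch) records the source file.
    """
    pattern = os.path.join(directory, 'data_run{}_all.csv')
    frames = [
        pd.read_csv(pattern.format(i)).assign(run=i)
        for i in range(1, n_runs + 1)
    ]
    # ignore_index=True gives the combined frame a fresh 0..N-1 index
    return pd.concat(frames, ignore_index=True)


# Demo on made-up data: three tiny files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(1, 4):
        sample = pd.DataFrame({'x': [i, i + 1], 'y': [10 * i, 20 * i]})
        sample.to_csv(os.path.join(tmp, 'data_run{}_all.csv'.format(i)), index=False)
    combined = load_runs(tmp, 3)

# Per-run summaries are still available after concatenation.
mean_y_per_run = combined.groupby('run')['y'].mean()
```

With the run column in place, a groupby on it can stand in for the ten separate df_run1 ... df_run10 variables, at least for analyses that fit the split-apply-combine pattern.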
Answered by Horia Coman
You need to make df_run a list. You could do something like this:
import pandas

df_run = []
for i in range(1, 11):  # runs 1 through 10 inclusive
    df_run.append(pandas.read_csv('data_run{0}_all.csv'.format(i)))
for df in df_run:
    pass  # Do your processing
Or do everything in a single loop, and avoid having the list.
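A rough sketch of the single-loop variant might look like the following. The process function is a hypothetical stand-in for whatever analysis or plotting you actually do, and the helper name process_runs and the sample files are likewise assumptions made for this example. Only the small per-run results are kept, not the data frames themselves.

```python
import os
import tempfile

import pandas as pd


def process(df):
    """Placeholder analysis (hypothetical): mean of every numeric column."""
    return df.mean(numeric_only=True)


def process_runs(directory, n_runs):
    """Read and process one file at a time; only small results are kept."""
    results = {}
    for i in range(1, n_runs + 1):
        path = os.path.join(directory, 'data_run{}_all.csv'.format(i))
        df = pd.read_csv(path)
        results[i] = process(df)  # df is rebound on the next iteration
    return results


# Demo on made-up data: two tiny files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(1, 3):
        sample = pd.DataFrame({'value': [i, i + 1]})
        sample.to_csv(os.path.join(tmp, 'data_run{}_all.csv'.format(i)), index=False)
    results = process_runs(tmp, 2)
```

This keeps at most one data frame in memory at a time, which can matter if the files are large; the trade-off is that you cannot go back to a run's raw data without reading its file again.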