Merging multiple CSV files into separate tabs of a spreadsheet in Python
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/51964001/
Asked by Soubhik Banerjee
I have code that generates multiple CSV files in a directory. I want to generate a report in Excel that contains each CSV file as a separate tab. I used the code below:
import pandas as pd
import os
import csv
import glob
path = "/MyScripts"
all_files = glob.glob(os.path.join(path, "*.csv"))
df_from_each_file = (pd.read_csv(f) for f in all_files)
df_from_each_file.to_excel(writer, sheet_name='ReturnData.csv')
writer.save()
But it gives the error below:
AttributeError: 'generator' object has no attribute 'to_excel'
I'm not sure where I am going wrong. Do I need to import a specific library to solve the issue?
The Python version is 2.7.
Answered by jpp
There are two issues here:
- Your generator expression lets you lazily iterate over dataframe objects, but the generator itself is not a dataframe and has no to_excel method. You can't export a generator expression to an Excel file.
- Your sheet_name parameter is a constant. To export to multiple worksheets, you need to specify a different name for each worksheet.
You can use a simple for loop for this purpose:
df_from_each_file = (pd.read_csv(f) for f in all_files)
for idx, df in enumerate(df_from_each_file):
    df.to_excel(writer, sheet_name='data{0}.csv'.format(idx))
Your worksheets will be named data0.csv, data1.csv, etc. If you need the filename as your sheet name, you can restructure your logic and use the os module to extract the filename from the path:
import os
for f in all_files:
    df = pd.read_csv(f)
    df.to_excel(writer, sheet_name=os.path.basename(f))
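One caveat when file names become sheet names: Excel limits a worksheet name to 31 characters and rejects the characters [ ] : * ? / \, and os.path.basename keeps the .csv extension. The following is a minimal sketch of a helper that derives a valid, unique name from each path. The name safe_sheet_name and the numeric-suffix scheme are my own assumptions for illustration; they are not part of pandas or of the answer above.

```python
import os
import re

MAX_SHEET_NAME_LEN = 31                      # Excel's limit on worksheet name length
INVALID_CHARS = re.compile(r'[\[\]:*?/\\]')  # characters Excel rejects in sheet names


def safe_sheet_name(path, used):
    """Return a sheet name derived from `path` that is valid and not yet in `used`.

    Hypothetical helper for illustration. `used` is a set of lowercased names
    that are already taken; it is updated in place.
    """
    # Drop the directory and the extension, e.g. '/MyScripts/ReturnData.csv' -> 'ReturnData'.
    stem = os.path.splitext(os.path.basename(path))[0]
    base = INVALID_CHARS.sub('_', stem)[:MAX_SHEET_NAME_LEN] or 'sheet'
    name = base
    counter = 1
    # Excel compares sheet names case-insensitively, so compare lowercased names.
    while name.lower() in used:
        suffix = '_{0}'.format(counter)
        name = base[:MAX_SHEET_NAME_LEN - len(suffix)] + suffix
        counter += 1
    used.add(name.lower())
    return name


used_names = set()
for f in ['/MyScripts/ReturnData.csv', '/other/returndata.csv', '/MyScripts/a:b.csv']:
    print(safe_sheet_name(f, used_names))
```

In the loop above, sheet_name=safe_sheet_name(f, used_names) could then stand in for sheet_name=os.path.basename(f).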
Answered by Dan.faudemer
Here is the complete source code based on jpp's solution:
import os
import pandas as pd
import glob
path = './'
all_files = glob.glob(os.path.join(path, "*.csv"))
writer = pd.ExcelWriter('out.xlsx', engine='xlsxwriter')
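for f in all_files:
    df = pd.read_csv(f)
    # Use the file name without its extension as the sheet name.
    df.to_excel(writer, sheet_name=os.path.splitext(os.path.basename(f))[0], index=False)
# close() writes the workbook to disk; writer.save() was removed in pandas 2.0.
writer.close()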
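The tab order in out.xlsx follows the order of all_files, and glob.glob does not guarantee any particular order (it depends on the file system). If a stable tab order matters, sorting the paths first is one simple option. The sketch below uses only the standard library; the helper name list_csv_files is an assumption for illustration.

```python
import glob
import os


def list_csv_files(directory):
    """Return the .csv paths in `directory`, sorted by file name.

    Hypothetical helper: sorting makes the resulting tab order reproducible.
    """
    pattern = os.path.join(directory, '*.csv')
    return sorted(glob.glob(pattern), key=os.path.basename)


if __name__ == '__main__':
    # List the CSV files in the current directory in the order they would be written.
    for f in list_csv_files('./'):
        print(f)
```

Replacing the glob.glob(...) line with all_files = list_csv_files(path) would leave the rest of the script unchanged.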
Answered by ASH
Although Python requires many fewer lines of code compared to VBA, I would probably use VBA for this kind of task.
' Merge data from multiple files into separate sheets
Sub R_AnalysisMerger2()
    Dim WSA As Worksheet
    Dim bookList As Workbook
    Dim SelectedFiles As Variant
    Dim NFile As Long
    Dim FileName As String
    Dim Ws As Worksheet, vDB As Variant, rngT As Range
    Dim vFn, myFn As String
    Application.ScreenUpdating = False
    SelectedFiles = Application.GetOpenFilename(filefilter:="Excel Files (*.csv*), *.csv*", MultiSelect:=True)
    ' GetOpenFilename returns False (not Empty) when the dialog is cancelled
    If Not IsArray(SelectedFiles) Then Exit Sub
    For NFile = LBound(SelectedFiles) To UBound(SelectedFiles)
        FileName = SelectedFiles(NFile)
        ' Derive the sheet name from the file name without its extension
        vFn = Split(FileName, "\")
        myFn = vFn(UBound(vFn))
        myFn = Replace(myFn, ".csv", "")
        Set bookList = Workbooks.Open(FileName, Format:=2)
        Set WSA = bookList.Sheets(1)
        vDB = WSA.UsedRange
        bookList.Close (0)
        ' Add a new sheet at the end and copy the CSV contents into it
        Set Ws = Sheets.Add(after:=Sheets(Sheets.Count))
        Ws.Name = myFn
        Ws.Range("a1").Resize(UBound(vDB, 1), UBound(vDB, 2)) = vDB
    Next
    Application.ScreenUpdating = True
End Sub
' Merge data from multiple files into one sheet.
Sub R_AnalysisMerger()
    Dim WSA As Worksheet
    Dim bookList As Workbook
    Dim SelectedFiles As Variant
    Dim NFile As Long
    Dim FileName As String
    Dim Ws As Worksheet, vDB As Variant, rngT As Range
    Application.ScreenUpdating = False
    Set Ws = ThisWorkbook.Sheets(1)
    Ws.UsedRange.Clear
    ' Pick the CSV files to merge in the file dialog
    SelectedFiles = Application.GetOpenFilename(filefilter:="Excel Files (*.csv*), *.csv*", MultiSelect:=True)
    ' GetOpenFilename returns False when the dialog is cancelled
    If Not IsArray(SelectedFiles) Then Exit Sub
    For NFile = LBound(SelectedFiles) To UBound(SelectedFiles)
        FileName = SelectedFiles(NFile)
        Set bookList = Workbooks.Open(FileName, Format:=2)
        Set WSA = bookList.Sheets(1)
        With WSA
            vDB = .UsedRange
            ' Find the first empty row below the data already copied
            Set rngT = Ws.Range("a" & Rows.Count).End(xlUp)(2)
            If rngT.Row = 2 Then Set rngT = Ws.Range("a1")
            rngT.Resize(UBound(vDB, 1), UBound(vDB, 2)) = vDB
            bookList.Close (0)
        End With
    Next
    Application.ScreenUpdating = True
    Ws.Range("A1").Select
End Sub
Answered by aman kumar
You can use the pandas concat function:
csv1 = pd.read_csv(csv1_file_path)
csv2 = pd.read_csv(csv2_file_path)
merge_csv = pd.concat((csv1, csv2), axis=0)
The axis argument sets the direction of the merge: axis=0 stacks the frames row-wise (one below the other), while axis=1 places them side by side.
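To make the effect of axis concrete, here is a small sketch with two in-memory frames standing in for csv1 and csv2 (the column names and values are made up for illustration). Note that this combines the data into a single table, which is a different result from the one-tab-per-file workbook the question asks for.

```python
import pandas as pd

# Two small frames standing in for the CSV contents (made-up data).
csv1 = pd.DataFrame({'id': [1, 2], 'value': [10, 20]})
csv2 = pd.DataFrame({'id': [3, 4], 'value': [30, 40]})

# axis=0 stacks the rows; ignore_index=True renumbers them from 0.
stacked = pd.concat((csv1, csv2), axis=0, ignore_index=True)

# axis=1 places the frames side by side, aligning on the row index.
side_by_side = pd.concat((csv1, csv2), axis=1)

print(stacked.shape)
print(side_by_side.shape)
```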