Merging multiple CSV files into separate tabs of a spreadsheet in Python

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/51964001/

Date: 2020-09-14 05:58:04  Source: igfitidea

Tags: python, excel, python-3.x, python-2.7, pandas

Asked by Soubhik Banerjee

I have code that generates multiple CSV files in a directory. I want to generate a report in Excel that contains the CSV files as separate tabs. I used the code below for this:

import pandas as pd
import os
import csv
import glob    
path = "/MyScripts"
all_files = glob.glob(os.path.join(path, "*.csv"))
df_from_each_file = (pd.read_csv(f) for f in all_files)
df_from_each_file.to_excel(writer, sheet_name='ReturnData.csv')
writer.save()

But it gives the error below: AttributeError: 'generator' object has no attribute 'to_excel'. Not sure where I am going wrong. Do I need to import any specific library to solve the issue?

Python Version is 2.7

Answered by jpp

There are two issues here:

  1. Your generator expression lets you lazily iterate over dataframe objects, but the generator itself is not a dataframe. You can't export a generator expression to an Excel file.
  2. Your sheet_name parameter is a constant. To export to multiple worksheets, you need to specify a different name for each worksheet.
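
The first point can be checked with a small sketch. It assumes Python 3 and uses hypothetical in-memory CSV text (csv_texts) in place of files on disk, so nothing here comes from the original question except the generator pattern itself:

```python
import io

import pandas as pd

# Hypothetical in-memory CSV contents standing in for files on disk.
csv_texts = ["x,y\n1,2\n", "x,y\n3,4\n"]

# Same pattern as in the question: a generator expression, not a DataFrame.
frames = (pd.read_csv(io.StringIO(text)) for text in csv_texts)

# The generator object itself has no DataFrame methods such as to_excel.
generator_has_to_excel = hasattr(frames, "to_excel")

# Iterating over the generator yields real DataFrames, which do have to_excel.
frame_list = list(frames)
frames_have_to_excel = all(hasattr(df, "to_excel") for df in frame_list)

print(generator_has_to_excel)
print(frames_have_to_excel)
```

This is why to_excel has to be called on each dataframe produced by the generator rather than on the generator.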

You can use a simple for loop for this purpose:

# writer is assumed to be a pd.ExcelWriter, e.g. pd.ExcelWriter('out.xlsx')
df_from_each_file = (pd.read_csv(f) for f in all_files)

for idx, df in enumerate(df_from_each_file):
    df.to_excel(writer, sheet_name='data{0}.csv'.format(idx))

Your worksheets will be named data0.csv, data1.csv, etc. If you need the filename as your sheet name, you can restructure your logic and use the os module to extract the filename from the path:

import os

for f in all_files:
    df = pd.read_csv(f)
    df.to_excel(writer, sheet_name=os.path.basename(f))
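
One caveat when using file names as sheet names: Excel limits worksheet names to 31 characters and does not allow the characters [ ] : * ? / \ in them. The helper below is a minimal sketch of one way to derive a safe name; sheet_name_from_path is a hypothetical name, not part of pandas, and it does not handle two files that clean up to the same name.

```python
import os
import re


def sheet_name_from_path(csv_path, max_len=31):
    """Derive a worksheet name from a CSV path (hypothetical helper)."""
    # Drop the directory and the extension: "/MyScripts/ReturnData.csv" -> "ReturnData".
    stem = os.path.splitext(os.path.basename(csv_path))[0]
    # Replace characters that Excel does not allow in worksheet names.
    cleaned = re.sub(r"[\[\]:*?/\\]", "_", stem)
    # Excel limits worksheet names to 31 characters; fall back if nothing is left.
    return cleaned[:max_len] or "Sheet"


print(sheet_name_from_path("/MyScripts/ReturnData.csv"))
```

The result could then be passed as sheet_name in the loop above in place of os.path.basename(f).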

Answered by Dan.faudemer

Here is the complete source code for jpp's solution:

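import glob
import os

import pandas as pd

path = './'
all_files = glob.glob(os.path.join(path, "*.csv"))

# The context manager saves and closes the workbook on exit
# (writer.save() was removed in pandas 2.0).
with pd.ExcelWriter('out.xlsx', engine='xlsxwriter') as writer:
    for f in all_files:
        df = pd.read_csv(f)
        # Use the file name without its extension as the sheet name.
        sheet_name = os.path.splitext(os.path.basename(f))[0]
        df.to_excel(writer, sheet_name=sheet_name, index=False)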

Answered by ASH

Although Python requires many fewer lines of code compared to VBA, I would probably use VBA for this kind of task.

' Merge data from multiple files into separate sheets
Sub R_AnalysisMerger2()
    Dim WSA As Worksheet
    Dim bookList As Workbook
    Dim SelectedFiles As Variant
    Dim NFile As Long
    Dim FileName As String
    Dim Ws As Worksheet, vDB As Variant, rngT As Range
    Dim vFn, myFn As String

    Application.ScreenUpdating = False

    SelectedFiles = Application.GetOpenFilename(filefilter:="Excel Files (*.csv*), *.csv*", MultiSelect:=True)
    ' GetOpenFilename returns False (not Empty) when the dialog is cancelled
    If Not IsArray(SelectedFiles) Then Exit Sub

    For NFile = LBound(SelectedFiles) To UBound(SelectedFiles)
        FileName = SelectedFiles(NFile)
        vFn = Split(FileName, "\")
        myFn = vFn(UBound(vFn))
        myFn = Replace(myFn, ".csv", "")
        Set bookList = Workbooks.Open(FileName, Format:=2)
        Set WSA = bookList.Sheets(1)
        vDB = WSA.UsedRange
        bookList.Close (0)
        Set Ws = Sheets.Add(after:=Sheets(Sheets.Count))
        ActiveSheet.Name = myFn
        Ws.Range("a1").Resize(UBound(vDB, 1), UBound(vDB, 2)) = vDB
    Next
    Application.ScreenUpdating = True

End Sub

' Merge data from multiple files into one sheet.
Sub R_AnalysisMerger()
    Dim WSA As Worksheet
    Dim bookList As Workbook
    Dim SelectedFiles() As Variant
    Dim NFile As Long
    Dim FileName As String
    Dim Ws As Worksheet, vDB As Variant, rngT As Range

    Application.ScreenUpdating = False


    Set Ws = ThisWorkbook.Sheets(1)
    Ws.UsedRange.Clear
    'change folder path of excel files here
    SelectedFiles = Application.GetOpenFilename(filefilter:="Excel Files (*.csv*), *.csv*", MultiSelect:=True)


    For NFile = LBound(SelectedFiles) To UBound(SelectedFiles)
        FileName = SelectedFiles(NFile)
        Set bookList = Workbooks.Open(FileName, Format:=2)
        Set WSA = bookList.Sheets(1)
        With WSA
            vDB = .UsedRange
            Set rngT = Ws.Range("a" & Rows.Count).End(xlUp)(2)
            If rngT.Row = 2 Then Set rngT = Ws.Range("a1")
            rngT.Resize(UBound(vDB, 1), UBound(vDB, 2)) = vDB

            bookList.Close (0)
        End With
    Next
    Application.ScreenUpdating = True
    Ws.Range("A1").Select

End Sub

Answered by aman kumar

You can use the pandas concat method:

csv1 = pd.read_csv(csv1_file_path)
csv2 = pd.read_csv(csv2_file_path)

merge_csv = pd.concat((csv1, csv2), axis=0)

The axis argument sets the direction of the merge: axis=0 stacks the rows of the frames on top of each other, and axis=1 places the frames side by side.
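
As a rough illustration of the two directions, the sketch below builds two small dataframes in memory instead of reading csv1_file_path and csv2_file_path; the column names and values are made up for the example.

```python
import pandas as pd

# Made-up frames standing in for the two CSV files.
csv1 = pd.DataFrame({"id": [1, 2], "value": [10, 20]})
csv2 = pd.DataFrame({"id": [3, 4], "value": [30, 40]})

# axis=0: rows of csv2 are appended below csv1 (ignore_index renumbers the rows).
stacked = pd.concat((csv1, csv2), axis=0, ignore_index=True)

# axis=1: columns of csv2 are placed to the right of csv1, aligned on the row index.
side_by_side = pd.concat((csv1, csv2), axis=1)

print(stacked.shape)       # (4, 2)
print(side_by_side.shape)  # (2, 4)
```

Note that this combines the files into a single table, which is different from the separate-tabs layout asked for in the question.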