pandas Python: Store multiple DataFrames in a list

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/48211358/

Date: 2020-09-14 05:02:50  Source: igfitidea

Python: Store multiple dataframe in list

Tags: python, list, pandas, dataframe, store

Asked by Dr.Will

I have a loop that reads the Excel sheets in a document. I want to store them all in a list:

  DF_list= list()

  for sheet in sheets:
     df= pd.read_excel(...)
     DF_list = DF_list.append(df)

If I type:

[df df df df]

it works.

Sorry, I have a Matlab background and am not very used to Python, but I like it. Thanks.

Answered by SuperStew

Try this

DF_list = list()

for sheet in sheets:
    df = pd.read_excel(...)
    DF_list.append(df)

Or, for more compact Python, something like this would probably do:

DF_list = [pd.read_excel(...) for sheet in sheets]

Answered by Mike Müller

.append() modifies a list in place and returns None. You overwrite DF_list with None in the first iteration of your loop, so the append fails in the second iteration.

Therefore:

DF_list = list()

for sheet in sheets:
    DF_list.append(pd.read_excel(...))

Or use a list comprehension:

DF_list = [pd.read_excel(...) for sheet in sheets] 
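
To see why the loop in the question fails, here is a minimal sketch that needs no Excel file. Plain strings stand in for the dataframes, and the names items, buggy_collect and collect are made up for illustration; only the behaviour of list.append is the point.

```python
# Stand-in values instead of real DataFrames, so no Excel file is needed.
items = ["sheet_a", "sheet_b", "sheet_c"]

# list.append mutates the list in place and returns None.
collected = []
result = collected.append(items[0])
print(result)     # None
print(collected)  # ['sheet_a']


def buggy_collect(values):
    """Mimic the question's loop: rebind the name to append's return value."""
    out = list()
    for value in values:
        out = out.append(value)  # out is None after the first pass
    return out


def collect(values):
    """Fixed version: call append for its side effect only."""
    out = []
    for value in values:
        out.append(value)
    return out


try:
    buggy_collect(items)
except AttributeError as err:
    # Raised on the second pass, when out is already None.
    error_message = str(err)
    print(error_message)

print(collect(items))  # ['sheet_a', 'sheet_b', 'sheet_c']
```

Note that with a single sheet the buggy loop raises nothing and simply leaves you with None instead of a list, which can make the problem harder to spot.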

Answered by MaxU

If you pass the parameter sheet_name=None:

dfs = pd.read_excel(..., sheet_name=None)

it will return a dictionary of DataFrames. From the read_excel documentation:

sheet_name : string, int, mixed list of strings/ints, or None, default 0

    Strings are used for sheet names, Integers are used in zero-indexed
    sheet positions.

    Lists of strings/integers are used to request multiple sheets.

    Specify None to get all sheets.

    str|int -> DataFrame is returned.
    list|None -> Dict of DataFrames is returned, with keys representing
    sheets.

    Available Cases

    * Defaults to 0 -> 1st sheet as a DataFrame
    * 1 -> 2nd sheet as a DataFrame
    * "Sheet1" -> 1st sheet as a DataFrame
    * [0,1,"Sheet5"] -> 1st, 2nd & 5th sheet as a dictionary of DataFrames
    * None -> All sheets as a dictionary of DataFrames
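
If you still want a list rather than a dictionary, something like the sketch below should work. The function names read_all_sheets and frames_as_list are hypothetical, and path stands for whatever workbook you have; the only pandas call used is pd.read_excel with sheet_name=None.

```python
import pandas as pd


def read_all_sheets(path):
    """Return {sheet name: DataFrame} for every sheet in the workbook at path."""
    # sheet_name=None asks read_excel for all sheets in one call.
    return pd.read_excel(path, sheet_name=None)


def frames_as_list(sheet_frames):
    """Drop the sheet names and keep the DataFrames in dictionary order."""
    return list(sheet_frames.values())
```

Usage would be along the lines of DF_list = frames_as_list(read_all_sheets(path)). Keeping the dictionary is often handier, though, because each dataframe stays labelled with its sheet name.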

Answered by flying_fluid_four

A complete solution is as follows:

import pandas as pd

# (0) Path to the Excel file that contains many sheets
#     (raw string, so backslashes such as \t are not treated as escapes)
excelfile_wt_many_sheets = r'C:\this\is\my\location\and\filename.xlsx'

# (1) Empty list to hold one dataframe per sheet
df_list = []

# (2) Open the workbook and get the list of its sheet names
xls = pd.ExcelFile(excelfile_wt_many_sheets)
sheets = xls.sheet_names

# (3) Iterate over (2), read every sheet in the workbook into a dataframe
#     and append that dataframe to (1)
for sheet in sheets:
    df = pd.read_excel(excelfile_wt_many_sheets, sheet_name=sheet)
    df_list.append(df)

Answered by Manideep Karthik

Actually, there is no need to define a new list to store a bunch of dataframes. Calling pandas.ExcelFile on an Excel file with multiple sheets returns an ExcelFile object, which gives access to all of the sheets; each one can be parsed into a dataframe when you need it. Hope the code below helps.

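import pandas as pd

# Raw string, so that \r in the path is not read as a carriage return
xls = pd.ExcelFile(r'C:\read_excel_file_with_multiple_sheets.xlsx')
Sheet_names_list = xls.sheet_names
for sheet in Sheet_names_list:
    # Parse one sheet of the open workbook into a dataframe
    df_to_print = xls.parse(sheet_name=sheet)
    print(df_to_print)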
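
If you do want the parsed sheets kept together after all, the same ExcelFile approach can be combined with a list comprehension. This is only a sketch: parse_all_sheets and load_sheets are made-up names, and path is whatever workbook you are reading. It relies on ExcelFile.sheet_names, ExcelFile.parse, and ExcelFile being usable in a with statement.

```python
import pandas as pd


def parse_all_sheets(workbook):
    """Parse every sheet of an already opened pd.ExcelFile into a DataFrame."""
    return [workbook.parse(sheet_name=name) for name in workbook.sheet_names]


def load_sheets(path):
    """Open the workbook once, parse all sheets, then close the file."""
    with pd.ExcelFile(path) as workbook:
        return parse_all_sheets(workbook)
```

Opening the workbook once and parsing from the open object avoids re-reading the file for every sheet, which is what happens when pd.read_excel is called with the path inside a loop.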