Import all csv files in a directory as pandas DataFrames and name them after the csv filenames
Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/40058133/
Asked by user
I'm trying to write a script that imports all .csv files in a directory into my workspace as DataFrames. Each DataFrame should be named after its csv file (minus the .csv extension).
This is what I have so far, but I'm struggling to understand how to assign the correct name to the DataFrame in the loop. I've seen posts that suggest using exec(), but this does not seem like a great solution.
import glob
import os
import pandas as pd

path = "../3_Data/Benefits"  # dir path
all_files = glob.glob(os.path.join(path, "*.csv"))  # make list of paths
for file in all_files:
    dfn = file.split('\\')[-1].split('.')[0]  # create string for df name
    dfn = pd.read_csv(file, skiprows=5)  # This line should assign to the value stored in dfn
Any help appreciated, thanks.
Answered by Romain
A DataFrame has no name, but its index can have a name. This is how to set it.
import glob
import os
import pandas as pd

path = "./data/"
all_files = glob.glob(os.path.join(path, "*.csv"))  # make list of paths

for file in all_files:
    # Getting the file name without extension
    file_name = os.path.splitext(os.path.basename(file))[0]
    # Reading the file content to create a DataFrame
    dfn = pd.read_csv(file)
    # Setting the file name (without extension) as the index name
    dfn.index.name = file_name

# Example showing the Name in the print output

#       FirstYear  LastYear
# Name
# 0          1990      2007
# 1          2001      2001
# 2          2001      2008
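Note that the loop above rebinds dfn on every iteration, so only the last DataFrame is still reachable once it finishes. A common way to keep all of them addressable by file name without exec() is to store them in a dictionary keyed by the name. The following is a minimal sketch of that idea, not part of Romain's answer; the helper name load_csvs and its read_csv_kwargs parameter are hypothetical and introduced only for illustration.

```python
import glob
import os

import pandas as pd


def load_csvs(path, **read_csv_kwargs):
    """Return {file name without extension: DataFrame} for every .csv in path.

    load_csvs is a hypothetical helper; extra keyword arguments
    (for example skiprows=5) are passed straight to pd.read_csv.
    """
    frames = {}
    for file in sorted(glob.glob(os.path.join(path, "*.csv"))):
        # File name without directory and without the .csv extension
        name = os.path.splitext(os.path.basename(file))[0]
        frames[name] = pd.read_csv(file, **read_csv_kwargs)
    return frames
```

With this in place, dfs = load_csvs("./data/") would give a dictionary where dfs["Name"] is the DataFrame read from Name.csv, assuming such a file exists in that directory. A dictionary keeps the names as ordinary data, so they can be listed, looped over, or looked up without creating variables dynamically.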