Dynamically naming DataFrames in Pandas

Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/32577939/

Date: 2020-09-13 23:53:33  Source: igfitidea

Tags: python, python-2.7, pandas

Asked by Harrison Wallace

Imagine I have 2 data frames as such:

import pandas as pd
foo = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
bar = pd.DataFrame({'c': [7, 8, 9], 'd': [10, 11, 12]})

I want to subset each of these data frames and put the result in a new data frame with a dynamic name. Whenever I look up dynamic naming in Python, the advice is: don't do it, use a dictionary. I can't quite figure out how to make that work, though. Essentially I want the following:

foo_first = foo[0:1]
bar_first = bar[0:1]

But I want to be able to do it by looping through a list. If I try to do it with a dictionary, I imagine it would look something like this:

dfs_list = [foo, bar]
dfs_dict = {}

for x in dfs_list:
    dfs_dict[x+'_first']=foo[0:1]

Which does not work.

You might be wondering what I'm actually trying to do, since my example is so arbitrary. In my real-world case, I have several data frames indexed by date. I want to create new data frames, named after the old ones, for the current year and month. So if foo and bar were giant datasets with date indexes, I want to automate:

foo_year = foo['2015-01-01':'2015-12-31']
bar_year = bar['2015-01-01':'2015-12-31']
foo_month = foo['2015-08-01':'2015-08-31']
bar_month = bar['2015-08-01':'2015-08-31']

Any help?

Answered by Blckknght

I can't think of any reason you couldn't use a dictionary of DataFrames. This will let you avoid needing to treat the variable names as data:

whole_dataframes = {"foo": foo, "bar": bar}
first_dataframes = {name: value[:1] for name, value in whole_dataframes.items()}
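Each subset is then looked up by its string key rather than by a generated variable name. A minimal sketch, assuming `foo` and `bar` are defined as in the question; here `first_dataframes["foo"]` plays the role the question gave to `foo_first`:

```python
import pandas as pd

foo = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
bar = pd.DataFrame({'c': [7, 8, 9], 'd': [10, 11, 12]})

# Map each name (a plain string) to its DataFrame.
whole_dataframes = {"foo": foo, "bar": bar}

# Build the subsets under the same keys.
first_dataframes = {name: value[:1] for name, value in whole_dataframes.items()}

# Look a subset up by name instead of using a variable called foo_first.
print(first_dataframes["foo"])
print(first_dataframes["bar"])
```

The keys are strings, so they can come from a list, a file, or user input, which is what the loop in the question was trying to achieve.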

I'm using the foo and bar variables you've described to initialize the first dict, but you can probably skip that step and create the values directly in the dict:

whole_dataframes = {}
whole_dataframes["foo"] = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6]})
whole_dataframes["bar"] = pd.DataFrame({'c':[7,8,9], 'd':[10,11,12]})
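The same pattern should carry over to the date-indexed case from the question. The sketch below is an illustration under assumptions: the data are made-up stand-ins with a daily `DatetimeIndex`, and the names `year_dataframes` and `month_dataframes` are hypothetical. It keeps one dict per period instead of creating `foo_year`, `bar_year`, `foo_month` and `bar_month`:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real date-indexed data sets.
dates = pd.date_range("2014-01-01", "2016-12-31", freq="D")
whole_dataframes = {
    "foo": pd.DataFrame({"a": np.arange(len(dates))}, index=dates),
    "bar": pd.DataFrame({"c": np.arange(len(dates)) * 2}, index=dates),
}

# Slicing a DatetimeIndex with date strings includes both endpoints.
year_dataframes = {
    name: df.loc["2015-01-01":"2015-12-31"]
    for name, df in whole_dataframes.items()
}
month_dataframes = {
    name: df.loc["2015-08-01":"2015-08-31"]
    for name, df in whole_dataframes.items()
}

# Access by name and period, e.g. the August 2015 rows of bar.
print(month_dataframes["bar"].head())
```

Adding another data set then only means adding one more key to `whole_dataframes`; the two comprehensions pick it up without any new variable names.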