Python: creating multiple dataframes in a loop

Question

Asked by Luis Ibáñez Herrera

I have a list, with each entry being a company name

companies = ['AA', 'AAPL', 'BA', ....., 'YHOO']

I want to create a new dataframe for each entry in the list.

Something like

(pseudocode)

for c in companies:
     c = pd.DataFrame()

I have searched for a way to do this but can't find it. Any ideas?

Answer 1

Accepted answer by maxymoo

You can do this (although obviously use exec with extreme caution if this is going to be public-facing code):

for c in companies:
    exec('{} = pd.DataFrame()'.format(c))

Answer 2

Answer by holdenweb

Just to underline my comment to @maxymoo's answer, it's almost invariably a bad idea ("code smell") to add names dynamically to a Python namespace. There are a number of reasons, the most salient being:

只是为了强调我对@maxymoo 的回答的评论，将名称动态添加到 Python 命名空间几乎总是一个坏主意（“代码味道”）。原因有很多，最突出的是：

Created names might easily conflict with variables already used by your logic.
Since the names are dynamically created, you typically also end up using dynamic techniques to retrieve the data.
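To make the first point concrete, here is a minimal sketch of a name collision. The list contents and the variable `count` are hypothetical, and plain empty lists stand in for DataFrames so the snippet has no dependencies; the exec call runs in an explicit namespace dict rather than the real module namespace.

```python
# 'namespace' stands in for a module namespace that already uses 'count'
namespace = {"count": 3}
names = ["AA", "count"]  # a dynamically supplied name happens to match

for name in names:
    # Same pattern as the exec-based approach, run in an explicit namespace
    exec("{} = []".format(name), namespace)

# The original value 3 has been silently replaced by an empty list
clobbered = namespace["count"]

# With a dict, the keys live in their own space and nothing is overwritten
count = 3
frames = {name: [] for name in names}
```

After the loop, the value originally bound to `count` in the namespace is gone, while the dict version leaves the ordinary variable `count` untouched.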

This is why dicts were included in the language. The correct way to proceed is:

d = {}
for name in companies:
    d[name] = pd.DataFrame()

Nowadays you can write a single dict comprehension expression to do the same thing, but some people find it less readable:

d = {name: pd.DataFrame() for name in companies}

Once d is created, the DataFrame for company x can be retrieved as d[x], so you can look up a specific company quite easily. To operate on all companies you would typically use a loop like:

for name, df in d.items():
    # operate on DataFrame 'df' for company 'name'
    pass  # placeholder so the loop is valid as written
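Putting the pieces together, a small end-to-end sketch might look like the following. The shortened company list and the column names `date` and `close` are assumptions for illustration only, not part of the original question.

```python
import pandas as pd

companies = ["AA", "AAPL", "BA"]  # shortened, illustrative list

# One empty DataFrame per company, keyed by name (column names are hypothetical)
d = {name: pd.DataFrame(columns=["date", "close"]) for name in companies}

# Look up a single company's DataFrame by its key
aapl = d["AAPL"]

# Operate on every company: here we just record how many rows each frame has
row_counts = {name: len(df) for name, df in d.items()}
```

Because the names are ordinary dict keys, retrieving or iterating over the frames needs no exec or other dynamic lookup.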

In Python 2 you are better off writing

for name, df in d.iteritems():

because this avoids instantiating a list of (name, df) tuples.

Answer 3

Answer by ak3191

Adding to the great answers above: they work flawlessly if you need empty data frames, but you may instead need to create multiple data frames based on some filtering.

Suppose the list you have is a column of some data frame, and you want to make one data frame for each unique company from the bigger data frame:

First, take the unique company names:
```
compuniquenames = df.company.unique()
```

Create a data frame dictionary to store your data frames

companydict = {elem : pd.DataFrame() for elem in compuniquenames}

The two steps above are already covered in this post. Then fill each entry with the rows that match its company:

for key in companydict:
    # keep only the rows of df whose company column equals this key
    companydict[key] = df[df.company == key]

This gives you one data frame per unique company, each containing that company's matching records.
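As an alternative to the filtering loop (a different technique from the one in this answer), pandas' groupby can build the same kind of dictionary in one pass. This is a sketch on made-up data; the `price` column and its values are purely illustrative.

```python
import pandas as pd

# Small illustrative frame; the 'price' column and its values are made up
df = pd.DataFrame({
    "company": ["AA", "AAPL", "AA", "BA"],
    "price": [10.0, 150.0, 11.0, 200.0],
})

# Iterating a groupby yields (key, sub-frame) pairs, one per unique company
companydict = {name: group for name, group in df.groupby("company")}

aa_prices = companydict["AA"]["price"].tolist()
```

Each value in companydict holds only the rows for its company, with the original index labels preserved.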
