Notice: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original: http://stackoverflow.com/questions/22557150/

Time: 2020-09-13 21:50:11  Source: igfitidea

Pandas, build new dataframe with for loop

Tags: python, for-loop, pandas, dataframe

Asked by jonas

I have a really simple problem that I can't solve in Pandas. I start with a dataframe and want to apply some function to it. I want to repeat this many times and build/stack the results of the operations into a new, larger dataframe. I was thinking of doing this with a for loop. Here is a simplified example that I cannot get to work:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 4), columns=list('ABCD'))

large_df = df*0

for i in range(1,10):
    df_new = df*i
    large_df= pd.concat(large_df,df_new)

large_df

Any ideas??

Answered by Dan Allan

It will be fastest to build all of the results first and concatenate once at the end. If you append one result at a time, the memory for the results has to be re-allocated each time, since every concatenation copies the data accumulated so far.
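
Applied to the toy example from the question, that idea might look like the sketch below. The name `frames` and the use of `ignore_index=True` (to renumber the rows) are my own choices for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same toy frame as in the question.
df = pd.DataFrame(np.random.randn(3, 4), columns=list('ABCD'))

# Build every scaled copy first; i = 0 reproduces the initial df*0 block.
frames = [df * i for i in range(10)]

# Concatenate once. pd.concat takes a list of frames as its first argument.
# ignore_index=True is an optional choice that renumbers the rows 0..29.
large_df = pd.concat(frames, ignore_index=True)

print(large_df.shape)  # (30, 4): 10 blocks of 3 rows, 4 columns
```

Leaving out `ignore_index=True` keeps the original row labels 0, 1, 2 repeated in each block, which may or may not be what you want.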

So, if you are applying some_function with a different parameter p on each pass through the loop (like i in your toy example above), I suggest:

pd.concat([df.apply(lambda x: some_function(x, p)) for p in parameters])
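
For a version that runs end to end, here is a minimal sketch. The body of `some_function` and the values in `parameters` are hypothetical stand-ins, since the answer leaves both unspecified; the `keys=` argument is an optional extra that labels each block with the parameter that produced it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 4), columns=list('ABCD'))

def some_function(column, p):
    # Hypothetical stand-in: scale one column (a Series) by p.
    return column * p

parameters = [1, 2, 3]  # hypothetical parameter values

# df.apply calls some_function once per column; the list comprehension
# produces one transformed frame per parameter, and pd.concat stacks them.
# keys= (optional) adds an outer row-index level holding the parameter.
result = pd.concat(
    [df.apply(lambda x: some_function(x, p)) for p in parameters],
    keys=parameters,
)

print(result.shape)  # (9, 4): 3 parameters x 3 rows, 4 columns
```

With the keys in place, `result.loc[2]` selects the block that was computed with p = 2.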