Python Pandas: Dynamically Creating DataFrames

Question

Asked by Kyle

The code below generates the desired output in ONE DataFrame. However, I would like to dynamically create DataFrames in a for loop and then assign the shifted values to each one. For example, DataFrame df_lag_12 would contain only column1_t12 and column2_t12. Any ideas would be greatly appreciated. I attempted to dynamically create 12 DataFrames using exec, but from what I found searching online, this is considered poor practice.

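import pandas as pd

list1 = list(range(0, 20))
list2 = list(range(19, -1, -1))
d = {'column1': list1,
     'column2': list2}
df = pd.DataFrame(d)

# Build one wide DataFrame holding lags 12..1 of every column.
df_lags = pd.DataFrame()
for col in df.columns:
    for i in range(12, 0, -1):
        df_lags[col + '_t' + str(i)] = df[col].shift(i)
    df_lags[col] = df[col].values
print(df_lags)

# Attempt to create 12 separately named DataFrames with exec (discouraged).
# The loop variable is named i so that it does not overwrite df.
for i in range(12, 0, -1):
    exec('model_data_lag_' + str(i) + '=pd.DataFrame()')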

Desired output for the dynamically created DataFrame df_lags_12:

var_list=['column1_t12','column2_t12']
df_lags_12=df_lags[var_list]  
print(df_lags_12)

Answer 1

Answered by jezrael

I think the best approach is to create a dictionary of DataFrames:

d = {}
for i in range(12,0,-1):
    d['t' + str(i)] = df.shift(i).add_suffix('_t' + str(i))

If you need to select specific columns first:

d = {}
cols = ['column1','column2']
for i in range(12,0,-1):
    d['t' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))

Dict comprehension solution:

d = {'t' + str(i): df.shift(i).add_suffix('_t' + str(i)) for i in range(12,0,-1)}

print (d['t10'])
    column1_t10  column2_t10
0           NaN          NaN
1           NaN          NaN
2           NaN          NaN
3           NaN          NaN
4           NaN          NaN
5           NaN          NaN
6           NaN          NaN
7           NaN          NaN
8           NaN          NaN
9           NaN          NaN
10          0.0         19.0
11          1.0         18.0
12          2.0         17.0
13          3.0         16.0
14          4.0         15.0
15          5.0         14.0
16          6.0         13.0
17          7.0         12.0
18          8.0         11.0
19          9.0         10.0
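
As a rough sketch of how the dictionary can stand in for the separately named DataFrames the question asks for, the snippet below rebuilds the example data and uses keys such as 'df_lag_12'. The key naming and the variable name lagged are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Same example data as in the question.
df = pd.DataFrame({'column1': list(range(0, 20)),
                   'column2': list(range(19, -1, -1))})

# One entry per lag; the 'df_lag_<i>' key format is a hypothetical naming choice.
lagged = {f'df_lag_{i}': df.shift(i).add_suffix(f'_t{i}')
          for i in range(12, 0, -1)}

# Look up a single lagged frame by name instead of by variable.
df_lag_12 = lagged['df_lag_12']
print(df_lag_12.columns.tolist())   # ['column1_t12', 'column2_t12']
print(df_lag_12.iloc[12].tolist())  # [0.0, 19.0], the first non-NaN row
```

The lookup lagged['df_lag_12'] plays the role of the variable df_lag_12, and the name can be built from a loop counter without exec.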

EDIT: It is possible with globals(), but a dictionary is much better:

cols = ['column1','column2']
for i in range(12,0,-1):
    # Creates module-level variables df12, df11, ..., df1.
    globals()['df' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))

print (df10)
    column1_t10  column2_t10
0           NaN          NaN
1           NaN          NaN
2           NaN          NaN
3           NaN          NaN
4           NaN          NaN
5           NaN          NaN
6           NaN          NaN
7           NaN          NaN
8           NaN          NaN
9           NaN          NaN
10          0.0         19.0
11          1.0         18.0
12          2.0         17.0
13          3.0         16.0
14          4.0         15.0
15          5.0         14.0
16          6.0         13.0
17          7.0         12.0
18          8.0         11.0
19          9.0         10.0
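
One practical reason to prefer the dictionary, sketched here under the assumption that the lagged frames are processed together later: a dictionary can be looped over or concatenated directly, whereas variables created through globals() have to be looked up by name one at a time. The names lagged, complete_rows and wide are illustrative and not from the original answer.

```python
import pandas as pd

# Same example data as in the question.
df = pd.DataFrame({'column1': list(range(0, 20)),
                   'column2': list(range(19, -1, -1))})

# Integer keys (the lag itself) are an illustrative choice.
lagged = {i: df.shift(i).add_suffix(f'_t{i}') for i in range(12, 0, -1)}

# Iterate over every lagged frame without knowing any variable names:
# count the rows that contain no NaN for each lag.
complete_rows = {lag: int(frame.notna().all(axis=1).sum())
                 for lag, frame in lagged.items()}
print(complete_rows[12])  # 8 rows remain after a shift of 12

# Recombine all lagged frames side by side into one wide DataFrame.
wide = pd.concat(list(lagged.values()), axis=1)
print(wide.shape)  # (20, 24)
```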
