Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/15364050/

Extending a pandas panel frame along the minor axis

Tags: python, pandas, panel

Asked by David Bieber

I would like to extend a Panel of data along the minor axis in pandas. I start by creating a dict of DataFrames to generate a Panel.

import pandas as pd
import numpy as np
rng = pd.date_range('1/1/2013',periods=100,freq='D')
df1 = pd.DataFrame(np.random.randn(100, 4), index = rng, columns = ['A','B','C','D'])
df2 = pd.DataFrame(np.random.randn(100, 4), index = rng, columns = ['A','B','C','D'])
df3 = pd.DataFrame(np.random.randn(100, 4), index = rng, columns = ['A','B','C','D'])
pf = pd.Panel({'df1':df1,'df2':df2,'df3':df3})

As expected, I find I have a panel with the following dimensions:

Dimensions: 3 (items) x 100 (major_axis) x 4 (minor_axis)
Items axis: df1 to df3
Major_axis axis: 2013-01-01 00:00:00 to 2013-04-10 00:00:00
Minor_axis axis: A to D

I would now like to add a new data set to the Minor axis:

pf['df1']['E'] = pd.DataFrame(np.random.randn(100, 1), index = rng)
pf['df2']['E'] = pd.DataFrame(np.random.randn(100, 1), index = rng)
pf['df3']['E'] = pd.DataFrame(np.random.randn(100, 1), index = rng)

After adding this new minor-axis entry, I find that the shape of the panel has not changed:

shape(pf)

[3,100,4]

I am able to access the data for each of the items in the major_axis:

pf.ix['df1',-10:,'E']

2013-04-01    0.168205
2013-04-02    0.677929
2013-04-03    0.845444
2013-04-04    0.431610
2013-04-05    0.501003
2013-04-06   -0.403605
2013-04-07   -0.185033
2013-04-08    0.270093
2013-04-09    1.569180
2013-04-10   -1.374779
Freq: D, Name: E

But if I extend the slice to include more than one item:

pf.ix[:,:,'E']

Then I encounter an error saying that 'E' is unknown.

Can anyone suggest where I am going wrong or a better way of performing this operation?

Answered by Jeff

This doesn't work right now; see https://github.com/pydata/pandas/issues/2578. But you can accomplish what you want this way. This is a pretty cheap operation, as nothing is copied.

In [18]: x = pf.transpose(2,0,1)

In [19]: x
Out[19]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 4 (items) x 3 (major_axis) x 100 (minor_axis)
Items axis: A to D
Major_axis axis: df1 to df3
Minor_axis axis: 2013-01-01 00:00:00 to 2013-04-10 00:00:00

In [20]: x['E'] = new_df

In [21]: x.transpose(1,2,0)
Out[21]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 100 (major_axis) x 5 (minor_axis)
Items axis: df1 to df3
Major_axis axis: 2013-01-01 00:00:00 to 2013-04-10 00:00:00
Minor_axis axis: A to E
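
The axis shuffling above can be sketched on a plain NumPy array, which may help on pandas versions where Panel is no longer available (it was removed in pandas 0.25). This is only an illustration under assumptions: `values` stands in for the panel's data, `new_slice` is a hypothetical array of 'E' values, and `np.concatenate` replaces the `x['E'] = new_df` assignment.

```python
import numpy as np

# Stand-in for the panel's values: 3 items x 100 major x 4 minor.
values = np.random.randn(3, 100, 4)

# Move the minor axis to the front, as pf.transpose(2, 0, 1) does.
x = values.transpose(2, 0, 1)          # shape (4, 3, 100)
is_view = np.shares_memory(values, x)  # the transpose itself copies nothing

# Hypothetical new slice for 'E': one row per item, one column per date.
new_slice = np.random.randn(3, 100)

# Append along the (now leading) minor axis; this step allocates a new array.
x = np.concatenate([x, new_slice[np.newaxis, :, :]], axis=0)  # (5, 3, 100)

# Restore the original axis order, as x.transpose(1, 2, 0) does.
extended = x.transpose(1, 2, 0)        # shape (3, 100, 5)
print(extended.shape)
```

In NumPy the two transposes are views, so the only data movement in this sketch is the concatenation.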

Answered by euri10

It seems the bug has been fixed, but your question interested me.

Since you can now add a slice to a panel along the major and minor axes without transposing, the following two lines take the sizes from the panel's own axes, so you don't have to work out the DataFrame's dimensions by hand:

pf.ix[:,'another major axis',:] = pd.DataFrame(np.random.randn(pf.minor_axis.shape[0],pf.items.shape[0]), index=pf.minor_axis, columns=pf.items)

pf.ix[:, :, 'another minor axis'] = pd.DataFrame(np.random.randn(pf.major_axis.shape[0],pf.items.shape[0]), index=pf.major_axis, columns=pf.items)

However, I wonder whether there is something simpler?

Below is the piece of code that adds slices along various axes.

import pandas as pd
import numpy as np

rng = pd.date_range('25/11/2014', periods=2, freq='D')
df1 = pd.DataFrame(np.random.randn(2, 5), index=rng, columns=['A', 'B', 'C', 'D', 'E'])
df2 = pd.DataFrame(np.random.randn(2, 5), index=rng, columns=['A', 'B', 'C', 'D', 'E'])
df3 = pd.DataFrame(np.random.randn(2, 5), index=rng, columns=['A', 'B', 'C', 'D', 'E'])


pf = pd.Panel({'df1': df1, 'df2': df2, 'df3': df3})

# print("slice before adding df4:\n")
# for i in pf.items:
#     print("{}:\n{}".format(i, pf[i]))

pf['df4'] = pd.DataFrame(np.random.randn(pf.major_axis.shape[0], pf.minor_axis.shape[0]), index=pf.major_axis, columns=pf.minor_axis)
print(pf)

# print("slice after df4 before transposing 1:\n")
# for i in pf.items:
#     print("{}:\n{}".format(i, pf[i]))

x = pf.transpose(1, 0, 2)

x['new major axis item'] = pd.DataFrame(np.random.randn(pf.items.shape[0], pf.minor_axis.shape[0]), index=pf.items,
                                        columns=pf.minor_axis)

pf = x.transpose(1, 0, 2)

print(pf)
# print("slice after:\n")
# for i in pf.items:
#     print("{}:\n{}".format(i, pf[i]))

print("success on adding slice on major axis:")
print(pf.major_xs(key='new major axis item'))
print("trying to add major axis directly")
pf.ix[:,'another major axis',:] = pd.DataFrame(np.random.randn(pf.minor_axis.shape[0],pf.items.shape[0]), index=pf.minor_axis, columns=pf.items)

print(pf.major_xs(key='another major axis'))
print(pf)
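
For readers on pandas 0.25 or later, where `pd.Panel` has been removed (and `.ix` was dropped in 1.0), one possible way to get the same effect is a DataFrame with two-level columns. The sketch below is an assumption about how one might restructure the data, not something taken from the answers above; the names `wide`, `new_e`, and `e_slice` are hypothetical.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2013-01-01', periods=100, freq='D')
items = ['df1', 'df2', 'df3']
frames = {name: pd.DataFrame(np.random.randn(100, 4), index=rng,
                             columns=['A', 'B', 'C', 'D'])
          for name in items}

# Columns become a two-level index: (item, minor).
wide = pd.concat(frames, axis=1)

# Hypothetical new minor-axis entry 'E': one column per item.
new_e = pd.DataFrame(np.random.randn(100, 3), index=rng,
                     columns=pd.MultiIndex.from_product([items, ['E']]))

# Append the new columns and sort so each item's columns sit together.
wide = pd.concat([wide, new_e], axis=1).sort_index(axis=1)

# Rough analogue of pf.ix[:, :, 'E']: dates as rows, items as columns.
e_slice = wide.xs('E', axis=1, level=1)
print(e_slice.shape)
```

Here `wide['df1']` plays the role of `pf['df1']`, and adding 'E' extends every item at once, so the per-item and cross-item views stay consistent.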