Original source: http://stackoverflow.com/questions/48482256/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
What is the pandas.Panel deprecation warning actually recommending?
Asked by cge
I have a package that uses pandas Panels to generate MultiIndex pandas DataFrames. However, whenever I use pandas.Panel, I get the following DeprecationWarning:
DeprecationWarning: Panel is deprecated and will be removed in a future version. The recommended way to represent these types of 3-dimensional data are with a MultiIndex on a DataFrame, via the Panel.to_frame() method. Alternatively, you can use the xarray package http://xarray.pydata.org/en/stable/. Pandas provides a .to_xarray() method to help automate this conversion.
However, I can't understand what the first recommendation here is actually recommending in order to create MultiIndex DataFrames. If Panel is going to be removed, how am I going to be able to use Panel.to_frame?
To clarify: I am not asking what deprecation is, or how to convert my Panels to DataFrames. What I am asking is, if I am using pandas.Panel and then pandas.Panel.to_frame in a library to create MultiIndex DataFrames from 3D ndarrays, and Panels are going to be deprecated, then what is the best option for making those DataFrames without using the Panel API?
E.g., if I'm doing the following, with X as an ndarray with shape (N,J,K):
p = pd.Panel(X, items=item_names, major_axis=names0, minor_axis=names1)
df = p.to_frame()
This is clearly no longer a viable, future-proof option for DataFrame construction, though it was the recommended method in this question.
Accepted answer by ayhan
Consider the following panel:
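import numpy as np
import pandas as pd

# Note: pd.Panel exists only in older pandas releases.
# 5 items, 3 major-axis labels (years), 2 minor-axis labels (countries)
data = np.random.randint(1, 10, (5, 3, 2))
pnl = pd.Panel(
    data,
    items=['item {}'.format(i) for i in range(1, 6)],
    major_axis=[2015, 2016, 2017],
    minor_axis=['US', 'UK']
)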
If you convert this to a DataFrame, this becomes:
             item 1  item 2  item 3  item 4  item 5
major minor
2015  US          9       6       3       2       5
      UK          8       3       7       7       9
2016  US          7       7       8       7       5
      UK          9       1       9       9       1
2017  US          1       8       1       3       1
      UK          6       8       8       1       6
So to_frame() takes the major and minor axes as the row MultiIndex and the items as columns. The shape, originally (5, 3, 2), has become (6, 5). Where you put the MultiIndex is up to you, but if you want exactly the same shape, you can do the following:
data = data.reshape(5, 6).T
df = pd.DataFrame(
    data=data,
    index=pd.MultiIndex.from_product([[2015, 2016, 2017], ['US', 'UK']]),
    columns=['item {}'.format(i) for i in range(1, 6)]
)
which yields the same DataFrame (use the names parameter of pd.MultiIndex.from_product if you want to name your index levels):
         item 1  item 2  item 3  item 4  item 5
2015 US       9       6       3       2       5
     UK       8       3       7       7       9
2016 US       7       7       8       7       5
     UK       9       1       9       9       1
2017 US       1       8       1       3       1
     UK       6       8       8       1       6
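For instance, naming the index levels might look like the sketch below. It assumes the same axis labels as above; the level names 'major' and 'minor' are chosen here only to mirror what Panel.to_frame() printed, and any names would work.

```python
import numpy as np
import pandas as pd

# Same layout as above: 5 items, 3 years (major), 2 countries (minor).
data = np.random.randint(1, 10, (5, 3, 2))

index = pd.MultiIndex.from_product(
    [[2015, 2016, 2017], ['US', 'UK']],
    names=['major', 'minor'],  # level names chosen to mirror the Panel output
)
df = pd.DataFrame(
    data.reshape(5, 6).T,  # (items, major*minor) -> (major*minor, items)
    index=index,
    columns=['item {}'.format(i) for i in range(1, 6)],
)
print(df.index.names)
```

With named levels you can also select by name, e.g. df.xs('US', level='minor'), instead of by position.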
Now instead of pnl['item 1'], you use df['item 1'] (optionally df['item 1'].unstack()); instead of pnl.xs(2015), you use df.xs(2015); and instead of pnl.xs('US', axis='minor'), you use df.xs('US', level=1).
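The DataFrame side of these equivalents can be tried out without Panel at all. The following is a small sketch that uses np.arange in place of the random data (an assumption made only so the values are predictable); the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random data: shape (items, major, minor).
data = np.arange(30).reshape(5, 3, 2)
df = pd.DataFrame(
    data.reshape(5, 6).T,
    index=pd.MultiIndex.from_product(
        [[2015, 2016, 2017], ['US', 'UK']], names=['major', 'minor']
    ),
    columns=['item {}'.format(i) for i in range(1, 6)],
)

item1 = df['item 1'].unstack()   # one item: years as rows, countries as columns
year_2015 = df.xs(2015)          # one year: countries as rows, items as columns
us_only = df.xs('US', level=1)   # one country: years as rows, items as columns

print(item1.shape, year_2015.shape, us_only.shape)
```

Each result is an ordinary 2D DataFrame, which is roughly what slicing a Panel along one axis used to return.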
As you see, this is just a matter of reshaping your initial 3D numpy array to 2D. You add the other (artificial) dimension with the help of MultiIndex.
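Applied to the case in the question, an ndarray X of shape (N, J, K), the same reshape can be wrapped in a small function. This is only a sketch: array3d_to_frame is a hypothetical helper name, not a pandas API, and it does not try to reproduce every detail of Panel.to_frame() (for example, it does not drop rows containing NaN).

```python
import numpy as np
import pandas as pd


def array3d_to_frame(X, items, major_axis, minor_axis):
    """Hypothetical helper: build a MultiIndex DataFrame from a 3D array.

    X has shape (len(items), len(major_axis), len(minor_axis)).
    Rows are indexed by (major, minor) pairs; columns are the items.
    """
    X = np.asarray(X)
    n_items, n_major, n_minor = X.shape
    index = pd.MultiIndex.from_product(
        [major_axis, minor_axis], names=['major', 'minor']
    )
    # Flatten the last two axes, then put items on the column axis.
    flat = X.reshape(n_items, n_major * n_minor).T
    return pd.DataFrame(flat, index=index, columns=list(items))


# Usage with made-up labels
X = np.arange(24).reshape(2, 3, 4)
df = array3d_to_frame(
    X,
    items=['i0', 'i1'],
    major_axis=['a', 'b', 'c'],
    minor_axis=['w', 'x', 'y', 'z'],
)
print(df.shape)
```

The reshape relies on NumPy's default row-major order, so the (major, minor) pairs produced by from_product line up with the flattened last two axes of X.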