Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/42215252/

Date: 2020-08-19 21:24:03  Source: igfitidea

Inconsistency when setting figure size using pandas plot method

Tags: python, pandas, matplotlib, plot

Asked by Bill

I'm trying to use the convenience of the plot method of a pandas DataFrame while adjusting the size of the figure produced. (I'm saving the figures to file as well as displaying them inline in a Jupyter notebook.) I found the method below successful most of the time, except when I plot two lines on the same chart: then the figure goes back to the default size.

I suspect this might be due to the differences between plot on a series and plot on a dataframe.

Setup example code:

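import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Two columns of random data, one value per day of 2016 (a leap year, so 366 days)
data = {
    'A': 90 + np.random.randn(366),
    'B': 85 + np.random.randn(366)
}

date_range = pd.date_range('2016-01-01', '2016-12-31')

index = pd.Index(date_range, name='Date')

df = pd.DataFrame(data=data, index=index)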

Control - this code produces the expected result (a wide plot):

fig = plt.figure(figsize=(10,4))

df['A'].plot()
plt.savefig("plot1.png")
plt.show()

Result:

plot1.png

Plotting two lines - figure size is not (10,4)

fig = plt.figure(figsize=(10,4))

df[['A', 'B']].plot()
plt.savefig("plot2.png")
plt.show()

Result:

plot2.png

What's the right way to do this so that the figure size is consistently set regardless of the number of series selected?

Answered by ImportanceOfBeingErnest

The reason for the difference between the two cases is somewhat hidden inside the logic of pandas.DataFrame.plot(). As one can see in the documentation, this method accepts a lot of arguments so that it can handle all kinds of different cases.

Here in the first case, you create a matplotlib figure via fig = plt.figure(figsize=(10,4)) and then plot a single column, which is a Series. The internal logic of the pandas plot function is to check whether a figure is already present in the matplotlib state machine and, if so, to use its current axes to plot the column's values. This works as expected.

However, in the second case, the data consists of two columns. There are several options for handling such a plot, including using different subplots with shared or non-shared axes. For pandas to be able to apply any of those possible requirements, it will by default create a new figure to which it can add the axes to plot to. The new figure does not know about the already existing figure and its size; it has the default size unless you specify the figsize argument.
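The difference described above can be checked directly. The following is a minimal sketch, not part of the original answer; it assumes a recent pandas and matplotlib, uses the non-interactive Agg backend so it runs without a display, and the variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.random.randn(10), "B": np.random.randn(10)})

# Case 1: a Series is drawn on the figure that already exists.
plt.close("all")
fig = plt.figure(figsize=(10, 4))
ax_series = df["A"].plot()
series_reused = ax_series.figure is fig

# Case 2: a DataFrame without ax= makes pandas open its own figure.
plt.close("all")
fig = plt.figure(figsize=(10, 4))
ax_frame = df[["A", "B"]].plot()
frame_reused = ax_frame.figure is fig
n_open = len(plt.get_fignums())  # counts the empty figure plus the pandas one

print(series_reused, frame_reused, n_open)
```

If the behaviour is as described, the Series lands on the pre-made figure while the DataFrame call leaves that figure empty and opens a second one.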

In the comments, you say that a possible solution is to use df[['A', 'B']].plot(figsize=(10,4)). This is correct, but you then need to omit the creation of your initial figure. Otherwise it will produce two figures, which is probably undesired. In a notebook this will not be visible, but if you run this as a regular Python script with plt.show() at the end, two figure windows will open.

So the solution which lets pandas take care of figure creation is:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"A":[2,3,1], "B":[1,2,2]})
df[['A', 'B']].plot(figsize=(10,4))

plt.show()
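Since the question also saves the figure to file, here is a hedged sketch of how that might look when pandas creates the figure: the Axes returned by plot() gives access to its Figure, which can then be saved explicitly. The temporary output directory is an assumption made only so the sketch is self-contained.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"A": [2, 3, 1], "B": [1, 2, 2]})

# No plt.figure() call beforehand: pandas creates the only figure.
ax = df[["A", "B"]].plot(figsize=(10, 4))
fig = ax.figure  # the Figure that pandas created for this plot

# Hypothetical output location, chosen only for this sketch.
out_path = os.path.join(tempfile.mkdtemp(), "plot2.png")
fig.savefig(out_path)

width, height = fig.get_size_inches()
```

Saving through the Figure object rather than plt.savefig() avoids depending on which figure pyplot currently considers active.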

A way to circumvent the creation of a new figure is to supply the ax argument to the pandas.DataFrame.plot(ax=ax) function, where ax is an externally created axes. This axes can be the standard axes you obtain via plt.gca().

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"A":[2,3,1], "B":[1,2,2]})
plt.figure(figsize=(10,4))
df[['A', 'B']].plot(ax = plt.gca())

plt.show()

Alternatively, use the more object-oriented way seen in the answer from Paul H.

Answered by Paul H

Always operate explicitly and directly on your Figure and Axes objects. Don't rely on the pyplot state machine. In your case that means:

fig1, ax1 = plt.subplots(figsize=(10,4))
df['A'].plot(ax=ax1)
fig1.savefig("plot1.png")


fig2, ax2 = plt.subplots(figsize=(10,4))
df[['A', 'B']].plot(ax=ax2)
fig2.savefig("plot2.png")

plt.show()
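One way to apply this pattern uniformly is to wrap it in a small function. The sketch below is an illustration only: plot_columns is a hypothetical helper, not something from pandas or from the answer, and the Agg backend is assumed so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_columns(df, columns, figsize=(10, 4)):
    """Hypothetical helper: plot the selected column(s) on a figure we create."""
    fig, ax = plt.subplots(figsize=figsize)
    # Works for a single column name (Series) and for a list of names (DataFrame).
    df[columns].plot(ax=ax)
    return fig, ax


index = pd.Index(pd.date_range("2016-01-01", "2016-12-31"), name="Date")
df = pd.DataFrame(
    {"A": 90 + np.random.randn(366), "B": 85 + np.random.randn(366)},
    index=index,
)

fig1, ax1 = plot_columns(df, "A")
fig2, ax2 = plot_columns(df, ["A", "B"])
sizes = [tuple(fig.get_size_inches()) for fig in (fig1, fig2)]
```

Because the figure is always created by plt.subplots() and passed in through ax, the size no longer depends on whether one or several series are selected.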