Original question: http://stackoverflow.com/questions/25702017/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Plotting time series using Seaborn FacetGrid
Asked by 8one6
I have a DataFrame (data) with a simple integer index and 5 columns. The columns are Date, Country, AgeGroup, Gender, Stat. (Names changed to protect the innocent.) I would like to produce a FacetGrid where Country defines the row, AgeGroup defines the column, and Gender defines the hue. For each of those combinations, I would like to produce a time series graph. That is, I should get an array of graphs, each of which has 2 time series on it (1 male, 1 female). I can get very close with:
g = sns.FacetGrid(data, row='Country', col='AgeGroup', hue='Gender')
g.map(plt.plot, 'Stat')
However, this just gives me the sample number on the x-axis rather than the dates. Is there a quick fix in this context?
More generally, I understand that the approach with FacetGrid is to make the grid and then map a plotting function to it. If I wanted to roll my own plotting function, what are the conventions it needs to follow? In particular, how can I write my own plotting function (to pass to map for FacetGrid) that accepts multiple columns' worth of data from my dataset?
Answered by mwaskom
I'll answer your more general question first. The rules for functions that you can pass to FacetGrid.map are:
- They must take array-like inputs as positional arguments, with the first argument corresponding to the x axis and the second argument corresponding to the y axis (though more on the second condition shortly).
- They must also accept two keyword arguments: color and label. If you want to use a hue variable, then these should get passed to the underlying plotting function, though you can just catch **kwargs and not do anything with them if it's not relevant to the specific plot you're making.
- When called, they must draw a plot on the "currently active" matplotlib Axes.
There may be cases where your function draws a plot that looks correct without taking x, y positional inputs. I think that's basically what's going on here with the way you're using plt.plot. It can be easier then to just call, e.g., g.set_axis_labels("Date", "Stat") after you use map, which will rename your axes properly. You may also want to do g.set(xticklabels=dates) to get more meaningful ticks.
There is also a more general function, FacetGrid.map_dataframe. The rules here are similar, but the function you pass must accept a dataframe input in a parameter called data, and instead of taking array-like positional inputs it takes strings that correspond to variables in that dataframe. On each iteration through the facets, the function will be called with the input dataframe masked to just the values for that combination of row, col, and hue levels.
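As a rough sketch of how map_dataframe lets one function use several columns at once, the hypothetical plot_with_band below draws a line plus a shaded band. The function name, the Err column, and the toy data are assumptions for the example and do not come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data; "Err" is an invented third column.
data = pd.DataFrame({
    "Country": ["A"] * 4 + ["B"] * 4,
    "Gender": ["F", "F", "M", "M"] * 2,
    "Day": [1, 2, 1, 2, 1, 2, 1, 2],
    "Stat": [1.0, 2.0, 1.5, 2.5, 3.0, 2.0, 3.5, 1.0],
    "Err": [0.1, 0.2, 0.1, 0.2, 0.3, 0.1, 0.2, 0.1],
})

def plot_with_band(x, y, err, data=None, color=None, label=None, **kwargs):
    """Hypothetical function: x, y and err are column names, data is the facet subset."""
    ax = plt.gca()
    ax.plot(data[x], data[y], color=color, label=label)
    # Shade y - err to y + err using a third column of the same subset.
    ax.fill_between(data[x], data[y] - data[err], data[y] + data[err],
                    color=color, alpha=0.2)

g = sns.FacetGrid(data, col="Country", hue="Gender")
g.map_dataframe(plot_with_band, "Day", "Stat", "Err")
g.add_legend()
```

Because the function receives the already-filtered dataframe, it can look up as many columns as it needs by name.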
So in your specific case, you'll need to write a function that we can call plot_by_date that should look something like this:
def plot_by_date(x, y, color=None, label=None):
    ...
(I'd be more helpful on the body, but I don't actually know how to do much with dates and matplotlib.) The end result is that when you call this function it should plot on the currently active Axes. Then do:
g.map(plot_by_date, "Date", "Stat")
And it should work, I think.

