Creating a stacked area plot in Python with a Pandas DataFrame
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/46737999/
Asked by bpdronkers
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
dates = np.arange(1990,2061, 1)
dates = dates.astype('str').astype('datetime64')
df = pd.DataFrame(np.random.randint(0, dates.size, size=(dates.size,3)), columns=list('ABC'))
df['year'] = dates
cols = df.columns.tolist()
cols = [cols[-1]] + cols[:-1]
df = df[cols]
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
ax.stackplot(df['year'], df.drop('year',axis=1))
Based on this code, I'm getting an error "TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''"
I'm trying to figure out how to plot a DataFrame object with years in the first column, and then a stacked area from the subsequent columns (A, B, C).
Also, since I'm a complete beginner here, feel free to comment on my code to make it cleaner or better. As I understand it, if I use Matplotlib instead of the Pandas integrated plot method, I have more functionality to adjust things later on?
Thanks!
Answered by A. Entuluva
I run into two problems running your code.
First, stackplot seems to dislike using string representations of dates. Datetime data types are very finicky sometimes. Either use integers for your 'year' column, or use .values to convert from pandas to numpy datatypes, as described in this question.
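A minimal sketch of those two options, assuming a frame shaped like the one in the question but shortened to five years. The names years_as_int and years_as_numpy are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's frame: a datetime 'year' column plus data.
years = pd.to_datetime([str(y) for y in range(1990, 1995)], format='%Y')
df = pd.DataFrame({'year': years, 'A': [1, 2, 3, 4, 5]})

# Option 1: plain integer years, which matplotlib treats as ordinary numbers.
years_as_int = df['year'].dt.year.to_numpy()

# Option 2: the underlying NumPy datetime64 array instead of a pandas Series.
years_as_numpy = df['year'].values

print(years_as_int)
print(type(years_as_numpy), years_as_numpy.dtype)
```

Either array can then be passed as the x argument in place of df['year'].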
Secondly, according to the documentation for stackplot, when you call stackplot(x, y), if x is an Nx1 array, then y must be MxN, where M is the number of columns. Your df.drop('year',axis=1) will end up as NxM and throw another error at you. If you take the transpose, however, you can make it work.
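The shape requirement can be checked directly. The sketch below assumes integer years and a five-row frame rather than the question's 71 rows; the names values, layers and polys are illustrative, and the Agg backend is only selected so that the example runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

years = np.arange(1990, 1995)  # N = 5 x-values, kept as integers
df = pd.DataFrame(np.random.randint(0, 10, size=(years.size, 3)),
                  columns=list('ABC'))

values = df.to_numpy()  # shape (N, M) = (5, 3): one row per year
layers = values.T       # shape (M, N) = (3, 5): one row per stacked series

fig, ax = plt.subplots()
# stackplot draws one filled area per row of `layers`.
polys = ax.stackplot(years, layers, labels=list(df.columns))
ax.legend(loc='upper left')

print(values.shape, layers.shape, len(polys))
```

Passing values instead of layers would hand stackplot five series of length three against five x-values, which is the mismatch described above.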
If I just replace your final line with
ax.stackplot(df['year'].values, df.drop('year',axis=1).T)
I get a plot that looks like this: