Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/42744189/

Time: 2020-09-14 03:09:44  Source: igfitidea

Python Pandas plot using dataframe column values

Tags: python, pandas, numpy, plot, yahoo-finance

Asked by LKM

I'm trying to plot a graph using dataframes.

I'm using 'pandas_datareader' to get the data.

My code is below:

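# Run in a Jupyter/IPython notebook (the %matplotlib magic is IPython-only)
%matplotlib inline
import datetime as dt
import matplotlib.pyplot as plt
import pandas as pd
import pandas_datareader.data as web

tickers = ["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"]
# Three-year window ending today, as "YYYY-MM-DD" strings
end = dt.datetime.now().strftime("%Y-%m-%d")
start = (dt.datetime.now()-dt.timedelta(days=365*3)).strftime("%Y-%m-%d")
# Download each ticker and tag its rows with the ticker symbol
data = []
for ticker in tickers:
    sub_df = web.get_data_yahoo(ticker, start, end)
    sub_df["name"] = ticker
    data.append(sub_df)
# Stack the per-ticker frames into one long data frame
data = pd.concat(data)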

So the variable data has 8 columns: ['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close', 'name']

The variable data is shown below: [image]

What I want is to plot a graph with the 'Date' values on the x-axis and 'High' on the y-axis, with one line per value of the 'name' column (= ["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"]).

How can I do this?

When I executed data.plot(), the result correctly uses the date as the x-axis, but it draws 6 lines, one per column ['Open','High','Low','Close','Volume','Adj Close'], rather than the 10 lines I want, one per ticker ["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"]. The result is below: [image]

Answered by Psidom

You need to reshape your data so that the names become the column headers of the data frame. Since you want to plot High only, you can extract the High and name columns, transform them to wide format, and then plot:

import matplotlib as mpl
mpl.rcParams['savefig.dpi'] = 120

high = data[["High", "name"]].set_index("name", append=True).High.unstack("name")

# notice here I scale down the BRK-A column so that it will be at the same scale as other columns
high['BRK-A'] = high['BRK-A']/1000
high.head()

ax = high.plot(figsize = (16, 10))
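
The same long-to-wide reshape can also be written with DataFrame.pivot. The following is only a minimal sketch on a tiny synthetic frame; the name long_df and the price values are illustrative assumptions, not the downloaded data from the question:

```python
import pandas as pd

# Synthetic long-format data standing in for the downloaded prices
# (illustrative values; `long_df` is a hypothetical name).
long_df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2017-03-09", "2017-03-10"] * 3),
        "name": ["AAPL", "AAPL", "MSFT", "MSFT", "XOM", "XOM"],
        "High": [138.79, 139.36, 65.20, 65.26, 81.72, 82.47],
    }
)

# Long -> wide: one row per date, one column per ticker.
wide = long_df.pivot(index="Date", columns="name", values="High")
print(wide)

# Calling wide.plot() would now draw one line per ticker.
```

pivot raises an error if a (Date, name) pair appears more than once, which can be a useful sanity check before plotting.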

Answered by MaxU

@Psidom and @Grr have already given you very good answers.

I just wanted to add that pandas_datareader allows us to read all the data into a Pandas.Panel conveniently in one step:

p = web.DataReader(tickers, 'yahoo', start, end)

Now we can easily slice it as we wish:

# I'll intentionally exclude `BRK-A` as it spoils the whole graph
p.loc['High', :, ~p.minor_axis.isin(['BRK-A'])].plot(figsize=(10,8))

Alternatively, you can slice on the fly and save only the High values:

df = web.DataReader(tickers, 'yahoo', start, end).loc['High']

which gives us:

In [68]: df
Out[68]:
                  AAPL        AMZN     BRK-A          FB         GE         GOOG         JNJ       MSFT        WFC        XOM
Date
2014-03-13  539.659988  383.109985  188852.0   71.349998  26.000000  1210.502120   94.199997  38.450001  48.299999  94.570000
2014-03-14  530.890015  378.570007  186507.0   69.430000  25.379999  1190.872020   93.440002  38.139999  48.070000  94.220001
2014-03-17  529.969994  378.850006  185790.0   68.949997  25.629999  1197.072063   94.180000  38.410000  48.169998  94.529999
2014-03-18  531.969986  379.000000  185400.0   69.599998  25.730000  1211.532091   94.239998  39.900002  48.450001  95.250000
2014-03-19  536.239990  379.000000  185489.0   69.290001  25.700001  1211.992061   94.360001  39.549999  48.410000  95.300003
2014-03-20  532.669975  373.000000  186742.0   68.230003  25.370001  1209.612076   94.190002  40.650002  49.360001  94.739998
2014-03-21  533.750000  372.839996  188598.0   67.919998  25.830000  1209.632048   95.930000  40.939999  49.970001  95.989998
...                ...         ...       ...         ...        ...          ...         ...        ...        ...        ...
2017-03-02  140.279999  854.820007  266445.0  137.820007  30.230000   834.510010  124.360001  64.750000  59.790001  84.250000
2017-03-03  139.830002  851.989990  264690.0  137.330002  30.219999   831.359985  123.930000  64.279999  59.240002  83.599998
2017-03-06  139.770004  848.489990  263760.0  137.830002  30.080000   828.880005  124.430000  64.559998  58.880001  82.900002
2017-03-07  139.979996  848.460022  263560.0  138.369995  29.990000   833.409973  124.459999  64.779999  58.520000  83.290001
2017-03-08  139.800003  853.070007  263900.0  137.990005  29.940001   838.150024  124.680000  65.080002  59.130001  82.379997
2017-03-09  138.789993  856.400024  263620.0  138.570007  29.830000   842.000000  126.209999  65.199997  58.869999  81.720001
2017-03-10  139.360001  857.349976  263800.0  139.490005  30.430000   844.909973  126.489998  65.260002  59.180000  82.470001

[755 rows x 10 columns]
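
Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so the .loc['High'] slicing above applies only to older versions. On a newer pandas, a similar "one High column per ticker" frame can be built from MultiIndex columns. This is a rough sketch with hypothetical per-ticker frames and made-up values, not the actual return value of web.DataReader:

```python
import pandas as pd

dates = pd.to_datetime(["2017-03-09", "2017-03-10"])

# Hypothetical per-ticker frames standing in for downloaded data
# (values are illustrative only).
frames = {
    "AAPL": pd.DataFrame({"High": [138.79, 139.36], "Low": [137.05, 138.64]}, index=dates),
    "MSFT": pd.DataFrame({"High": [65.20, 65.26], "Low": [64.48, 64.75]}, index=dates),
}

# Concatenating a dict along the columns gives (ticker, field) MultiIndex columns.
prices = pd.concat(frames, axis=1)

# Cross-section on the second column level: one "High" column per ticker.
high = prices.xs("High", axis=1, level=1)
print(high)
```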

Answered by Grr

You should group your data by name and then plot. Something like data.groupby('name').plot() should get you started. You may need to pass in date as the x value and high as the y value. I can't test it myself at the moment as I am on mobile.

Update

After getting to a computer, I realized I was a bit off. You need to reset the index before grouping, then plot, and finally update the legend. Like so:

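fig, ax = plt.subplots()
# groupby iterates over the names in sorted order, so sort the legend labels to match
names = sorted(data["name"].unique())
# Date is the index, so turn it into a column before plotting it on the x-axis
data.reset_index().groupby('name').plot(x='Date', y='High', ax=ax)
plt.legend(names)
plt.show()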
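
Another way to keep the legend labels tied to the right lines is to loop over the groups and label each line as it is drawn. A minimal sketch, assuming a long-format frame with Date, name and High columns; the frame df and its values are synthetic stand-ins for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic long-format data (hypothetical values).
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2017-03-09", "2017-03-10"] * 2),
        "name": ["MSFT", "MSFT", "AAPL", "AAPL"],
        "High": [65.20, 65.26, 138.79, 139.36],
    }
)

fig, ax = plt.subplots()
# Each group is drawn with its own label, so the legend cannot get out of order.
for name, group in df.groupby("name"):
    ax.plot(group["Date"], group["High"], label=name)
ax.set_xlabel("Date")
ax.set_ylabel("High")
ax.legend()

# Collect the legend labels to confirm there is one per ticker.
labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(labels)
```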

Granted, if you want this graph to make any sense, you will need to adjust the values in some way, as BRK-A is far more expensive than any of the other equities.
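
One possible adjustment, offered only as an illustration and not as what the answer had in mind, is to rebase every series to its first value so that all lines start at 1.0. The frame below is a hypothetical wide-format frame with made-up prices:

```python
import pandas as pd

# Hypothetical wide-format highs, one column per ticker (made-up values).
high = pd.DataFrame(
    {
        "AAPL": [100.0, 110.0, 121.0],
        "BRK-A": [200000.0, 210000.0, 220000.0],
    },
    index=pd.to_datetime(["2017-03-08", "2017-03-09", "2017-03-10"]),
)

# Divide each column by its first row so every series starts at 1.0.
rebased = high / high.iloc[0]
print(rebased)

# rebased.plot() would then show relative change rather than absolute price.
```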
