matplotlib: drawing lines between points ignoring missing data
Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/14399689/
Asked by gravenimage
I have a set of data that I want plotted as a line graph. For each series, some data is missing (but at different points for each series). Currently matplotlib does not draw lines across the missing data. For example,
import matplotlib.pyplot as plt
xs = range(8)
series1 = [1, 3, 3, None, None, 5, 8, 9]
series2 = [2, None, 5, None, 4, None, 3, 2]
plt.plot(xs, series1, linestyle='-', marker='o')
plt.plot(xs, series2, linestyle='-', marker='o')
plt.show()
results in a plot with gaps in the lines. How can I tell matplotlib to draw lines through the gaps? (I'd rather not have to interpolate the data).
Accepted answer by Thorsten Kranz
You can mask the NaN values this way:
import numpy as np
import matplotlib.pyplot as plt
xs = np.arange(8)
series1 = np.array([1, 3, 3, None, None, 5, 8, 9]).astype(np.double)
s1mask = np.isfinite(series1)
series2 = np.array([2, None, 5, None, 4, None, 3, 2]).astype(np.double)
s2mask = np.isfinite(series2)
plt.plot(xs[s1mask], series1[s1mask], linestyle='-', marker='o')
plt.plot(xs[s2mask], series2[s2mask], linestyle='-', marker='o')
plt.show()
This leads to a plot in which each series is drawn as one continuous line connecting its non-missing points.
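The masking step above can be wrapped in a small helper so it does not have to be repeated for every series. This is only a sketch: the name finite_points and the all_series dictionary are made up for illustration and are not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt


def finite_points(xs, ys):
    """Return only the (x, y) pairs whose y value is finite (hypothetical helper)."""
    xs = np.asarray(xs)
    ys = np.array(ys, dtype=float)  # None is converted to NaN here
    mask = np.isfinite(ys)          # True where the y value is a real number
    return xs[mask], ys[mask]


xs = np.arange(8)
# Assumed container for the two example series from the question
all_series = {
    "series1": [1, 3, 3, None, None, 5, 8, 9],
    "series2": [2, None, 5, None, 4, None, 3, 2],
}

fig, ax = plt.subplots()
for label, ys in all_series.items():
    x_ok, y_ok = finite_points(xs, ys)
    # Only valid points are passed in, so the line runs straight across the gaps
    ax.plot(x_ok, y_ok, linestyle="-", marker="o", label=label)
ax.legend()

if __name__ == "__main__":
    plt.show()
```

Because the x values are masked together with the y values, the remaining points keep their original horizontal positions.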


Answer by Adam Cadien
Without interpolation you'll need to remove the None values from the data. This also means you'll need to remove the X-values corresponding to the None values in the series. Here's an (ugly) one-liner for doing that:
x1Clean, series1Clean = zip(*filter(lambda x: x[1] is not None, zip(xs, series1)))
The lambda function returns False for None values, which filters those (x, series) pairs out of the list; the outer zip then re-zips the data back into its original form.
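For readers who find the one-liner hard to follow, the same filtering can be spelled out with list comprehensions. This is an equivalent sketch rather than the answer's own code; the names pairs, x1_clean and series1_clean are illustrative.

```python
xs = range(8)
series1 = [1, 3, 3, None, None, 5, 8, 9]

# Keep only the (x, y) pairs whose y value is present
pairs = [(x, y) for x, y in zip(xs, series1) if y is not None]

# Split the surviving pairs back into separate x and y lists
x1_clean = [x for x, _ in pairs]
series1_clean = [y for _, y in pairs]

print(x1_clean)       # [0, 1, 2, 5, 6, 7]
print(series1_clean)  # [1, 3, 3, 5, 8, 9]
```

Unlike the zip(*...) form, this version also copes with a series that is entirely None: it yields two empty lists instead of failing when the result is unpacked.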
Answer by JimP
For what it may be worth, after some trial and error I would like to add one clarification to Thorsten's solution, hopefully saving time for users who looked elsewhere after having tried this approach.
I could not get this to work on an identical problem while using
from pyplot import *
and attempting to plot with
plot(abscissa[mask],ordinate[mask])
It seemed necessary to use import matplotlib.pyplot as plt to get proper NaN handling, though I cannot say why.
Answer by Nasser Al-Wohaibi
Quoting @Rutger Kassies:
Matplotlib only draws a line between consecutive (valid) data points, and leaves a gap at NaN values.
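To see what the quoted rule means for the question's second series, the sketch below lists the pairs of neighbouring indices where both values are valid, which are the only segments a line plot of the unmasked data would contain. The variable names are illustrative.

```python
import numpy as np

series2 = np.array([2, np.nan, 5, np.nan, 4, np.nan, 3, 2])
valid = np.isfinite(series2)

# A segment is drawn only between two neighbouring points that are both valid
drawn_segments = [
    (i, i + 1)
    for i in range(len(series2) - 1)
    if valid[i] and valid[i + 1]
]
print(drawn_segments)  # [(6, 7)]
```

Every other valid point in series2 has a NaN on both sides, so it shows up as an isolated marker with no line attached.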
A solution if you are using Pandas:
# pd.Series
s.dropna().plot()  # masking (as @Thorsten Kranz suggested)
# pd.DataFrame: forward-fill the gaps, then plot the filled columns
df['a_col_ffill'] = df['a_col'].ffill()
df['b_col_ffill'] = df['b_col'].ffill()
df[['a_col_ffill', 'b_col_ffill']].plot()
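The two variants above do not produce the same picture, which a small comparison makes visible. This sketch reuses the question's first series; the names dropped and filled are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 3, np.nan, np.nan, 5, 8, 9])

# dropna() removes the missing points, so the line jumps straight from x=2 to x=5
dropped = s.dropna()
print(dropped.index.tolist())  # [0, 1, 2, 5, 6, 7]

# ffill() repeats the last valid value, so the line stays flat across the gap
filled = s.ffill()
print(filled.tolist())  # [1.0, 3.0, 3.0, 3.0, 3.0, 5.0, 8.0, 9.0]
```

Forward-filling adds data points that were never measured, so dropna() is the closer match to the question's wish to avoid interpolation.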
Answer by DKB at NYU
Perhaps I missed the point, but I believe Pandas now does this automatically. The example below is a little involved, and requires internet access, but the line for China has lots of gaps in the early years, hence the straight line segments.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# read data from Maddison project
url = 'http://www.ggdc.net/maddison/maddison-project/data/mpd_2013-01.xlsx'
mpd = pd.read_excel(url, skiprows=2, index_col=0, na_values=[' '])
mpd.columns = list(map(str.rstrip, mpd.columns))  # strip trailing spaces from column names
# select countries
countries = ['England/GB/UK', 'USA', 'Japan', 'China', 'India', 'Argentina']
mpd = mpd[countries].dropna()
mpd = mpd.rename(columns={'England/GB/UK': 'UK'})
mpd = np.log(mpd)/np.log(2) # convert to log2
# plots
ax = mpd.plot(lw=2)
ax.set_title('GDP per person', fontsize=14, loc='left')
ax.set_ylabel('GDP Per Capita (1990 USD, log2 scale)')
ax.legend(loc='upper left', fontsize=10, handlelength=2, labelspacing=0.15)
fig = ax.get_figure()
fig.show()
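For readers without network access, the effect of the dropna() call in the example above can be checked on the question's two small series. This is a sketch under the assumption that the behaviour of interest is DataFrame.dropna(); the frame df and the dictionary per_column are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "series1": [1, 3, 3, None, None, 5, 8, 9],
        "series2": [2, None, 5, None, 4, None, 3, 2],
    },
    index=range(8),
)

# DataFrame.dropna() discards every row that has a gap in any column
rows_kept = df.dropna().index.tolist()
print(rows_kept)  # [0, 2, 6, 7]

# Dropping per column keeps every valid point of each individual series
per_column = {name: col.dropna() for name, col in df.items()}
print(per_column["series1"].index.tolist())  # [0, 1, 2, 5, 6, 7]
print(per_column["series2"].index.tolist())  # [0, 2, 4, 6, 7]
```

Calling dropna() on the whole frame therefore also removes valid points from series that happen to share a row with a gap, whereas the per-column form keeps them.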
Answer by Markus Dutschke
A solution with pandas:
import matplotlib.pyplot as plt
import pandas as pd
def splitSerToArr(ser):
    # return the index (x values) and the data (y values) as a pair for plt.plot
    return [ser.index, ser.to_numpy()]
xs = range(8)
series1 = [1, 3, 3, None, None, 5, 8, 9]
series2 = [2, None, 5, None, 4, None, 3, 2]
s1 = pd.Series(series1, index=xs)
s2 = pd.Series(series2, index=xs)
plt.plot( *splitSerToArr(s1.dropna()), linestyle='-', marker='o')
plt.plot( *splitSerToArr(s2.dropna()), linestyle='-', marker='o')
plt.show()
The splitSerToArr function is very handy when plotting with Pandas. In the output, both series are drawn as continuous lines through their non-missing points.
Answer by Jayen
Another solution for pandas DataFrames:
ax = df.plot(style='o-')  # draw markers and lines so each series appears in the legend
colors = [line.get_color() for line in ax.lines]  # remember the color of each series
df = df.interpolate(limit_area='inside')  # fill only the gaps between valid values
lines = ax.plot(df.index, df.values)  # add connecting lines (these get a new set of colors)
for color, line in zip(colors, lines):
    line.set_color(color)  # recolor each new line to match its original series
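What interpolate(limit_area='inside') does to a single column can be checked on a short series. This is a sketch with made-up values; it relies on the default linear interpolation method.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan])

# Only NaNs surrounded by valid values are filled; leading and trailing NaNs stay
filled = s.interpolate(limit_area="inside")
print(filled.tolist())  # [nan, 1.0, 2.0, 3.0, 4.0, nan]
```

With linear interpolation the filled values lie on the straight line between the neighbouring valid points, so the extra lines look the same as lines drawn directly across the gaps, while the markers from the first plot still show only the measured points.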

