Python matplotlib: drawing lines between points ignoring missing data

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/14399689/

Time: 2020-08-18 11:18:04  Source: igfitidea

python matplotlib

Asked by gravenimage

I have a set of data that I want to plot as a line graph. For each series, some data is missing (at different positions in each series). Currently matplotlib does not draw lines across the missing data. For example,

import matplotlib.pyplot as plt

xs = range(8)
series1 = [1, 3, 3, None, None, 5, 8, 9]
series2 = [2, None, 5, None, 4, None, 3, 2]

plt.plot(xs, series1, linestyle='-', marker='o')
plt.plot(xs, series2, linestyle='-', marker='o')

plt.show()

results in a plot with gaps in the lines. How can I tell matplotlib to draw lines through the gaps? (I'd rather not have to interpolate the data).

Accepted answer by Thorsten Kranz

You can mask the NaN values this way:

import numpy as np
import matplotlib.pyplot as plt

xs = np.arange(8)
# Casting to double turns each None into NaN
series1 = np.array([1, 3, 3, None, None, 5, 8, 9]).astype(np.double)
s1mask = np.isfinite(series1)  # True where a value is present
series2 = np.array([2, None, 5, None, 4, None, 3, 2]).astype(np.double)
s2mask = np.isfinite(series2)

# Plot only the valid points, so each line runs straight through the gaps
plt.plot(xs[s1mask], series1[s1mask], linestyle='-', marker='o')
plt.plot(xs[s2mask], series2[s2mask], linestyle='-', marker='o')

plt.show()

This leads to:

[Image: Plot]
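If you have several series to plot, the masking step above can be wrapped in a small helper. This is only a sketch: the names valid_points and plot_ignoring_gaps are made up for illustration and are not part of matplotlib or NumPy. The helper takes an existing Axes object, so it does not import matplotlib itself.

```python
import numpy as np


def valid_points(xs, ys):
    """Return the x and y values for which y is present, as float arrays."""
    xs = np.asarray(xs, dtype=np.double)
    # Same cast as in the answer above: None becomes NaN
    ys = np.array(ys, dtype=object).astype(np.double)
    mask = np.isfinite(ys)
    return xs[mask], ys[mask]


def plot_ignoring_gaps(ax, xs, ys, **plot_kwargs):
    """Plot on a matplotlib Axes, connecting the points on either side of a gap."""
    clean_xs, clean_ys = valid_points(xs, ys)
    return ax.plot(clean_xs, clean_ys, **plot_kwargs)


xs = range(8)
series1 = [1, 3, 3, None, None, 5, 8, 9]
clean_xs, clean_ys = valid_points(xs, series1)
print(clean_xs)
print(clean_ys)
```

With the question's data you would then call something like plot_ignoring_gaps(plt.gca(), xs, series1, linestyle='-', marker='o') once per series, followed by plt.show().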

Answer by Adam Cadien

Without interpolation you'll need to remove the None values from the data. This also means you'll need to remove the X-values that correspond to the None values in the series. Here's an (ugly) one-liner for doing that:

  x1Clean, series1Clean = zip(*filter(lambda x: x[1] is not None, zip(xs, series1)))

The lambda function returns False for None values, which filters those (x, series) pairs out of the list. The outer zip then unpacks the remaining pairs back into separate x and y sequences.
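For what it's worth, the same filtering can be written a bit more readably with a list comprehension. This is just a sketch of the idea; drop_missing is a made-up name, and it also handles the case where every value is None, which would make the bare zip(*...) unpacking fail.

```python
def drop_missing(xs, ys):
    """Return (xs, ys) as lists, with every pair whose y is None removed."""
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    if not pairs:
        # zip(*[]) yields nothing to unpack, so handle the empty case explicitly
        return [], []
    clean_xs, clean_ys = zip(*pairs)
    return list(clean_xs), list(clean_ys)


xs = range(8)
series1 = [1, 3, 3, None, None, 5, 8, 9]
x1_clean, series1_clean = drop_missing(xs, series1)
print(x1_clean)       # [0, 1, 2, 5, 6, 7]
print(series1_clean)  # [1, 3, 3, 5, 8, 9]
```

The two returned lists can be passed straight to plt.plot in place of xs and series1.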

Answer by JimP

For what it may be worth, after some trial and error I would like to add one clarification to Thorsten's solution. Hopefully it saves time for users who tried this approach and then looked elsewhere.

I could not get this to work on an identical problem while using

from pyplot import *

and attempting to plot with

plot(abscissa[mask],ordinate[mask])

It seemed that import matplotlib.pyplot as plt was required to get proper NaN handling, though I cannot say why.

Answer by Nasser Al-Wohaibi

Quoting @Rutger Kassies (link):

Matplotlib only draws a line between consecutive (valid) data points, and leaves a gap at NaN values.

A solution if you are using Pandas:

# pd.Series
s.dropna().plot()  # masking (as @Thorsten Kranz suggests)

# pd.DataFrame
# forward-fill: each gap repeats the last valid value
df['a_col_ffill'] = df['a_col'].ffill()
df['b_col_ffill'] = df['b_col'].ffill()
df[['a_col_ffill', 'b_col_ffill']].plot()
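If the columns of a DataFrame have gaps in different places, the dropna() masking can also be applied one column at a time. The sketch below assumes a frame built from the question's data; the column names and the helper name plot_columns_skipping_nan are made up for illustration. It expects an existing matplotlib Axes as ax.

```python
import pandas as pd


def plot_columns_skipping_nan(df, ax, **plot_kwargs):
    """Draw each column as its own line, dropping that column's NaNs first."""
    for name in df.columns:
        valid = df[name].dropna()
        ax.plot(valid.index, valid.to_numpy(), label=name, **plot_kwargs)
    ax.legend()
    return ax


# Hypothetical frame built from the question's two series
df = pd.DataFrame({
    "a_col": [1, 3, 3, None, None, 5, 8, 9],
    "b_col": [2, None, 5, None, 4, None, 3, 2],
})
print(df["a_col"].dropna().index.tolist())  # [0, 1, 2, 5, 6, 7]
print(df["b_col"].dropna().index.tolist())  # [0, 2, 4, 6, 7]
```

Calling df.dropna() on the whole frame would instead drop every row in which any column is missing, which here would leave only rows 0, 2, 6 and 7.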

Answer by DKB at NYU

Perhaps I missed the point, but I believe Pandas now does this automatically. The example below is a little involved, and requires internet access, but the line for China has lots of gaps in the early years, hence the straight line segments.

import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt

# read data from Maddison project 
url = 'http://www.ggdc.net/maddison/maddison-project/data/mpd_2013-01.xlsx'
mpd = pd.read_excel(url, skiprows=2, index_col=0, na_values=[' ']) 
# strip trailing spaces from the column names (list() is needed on Python 3)
mpd.columns = list(map(str.rstrip, mpd.columns))

# select countries 
countries = ['England/GB/UK', 'USA', 'Japan', 'China', 'India', 'Argentina']
mpd = mpd[countries].dropna()
mpd = mpd.rename(columns={'England/GB/UK': 'UK'})
mpd = np.log(mpd)/np.log(2)  # convert to log2 

# plots
ax = mpd.plot(lw=2)
ax.set_title('GDP per person', fontsize=14, loc='left')
ax.set_ylabel('GDP Per Capita (1990 USD, log2 scale)')
ax.legend(loc='upper left', fontsize=10, handlelength=2, labelspacing=0.15)
fig = ax.get_figure()
fig.show() 

Answer by Markus Dutschke

A solution with pandas:

import matplotlib.pyplot as plt
import pandas as pd

def splitSerToArr(ser):
    # as_matrix() was removed in pandas 1.0; to_numpy() is its replacement
    return [ser.index, ser.to_numpy()]


xs = range(8)
series1 = [1, 3, 3, None, None, 5, 8, 9]
series2 = [2, None, 5, None, 4, None, 3, 2]

s1 = pd.Series(series1, index=xs)
s2 = pd.Series(series2, index=xs)

plt.plot( *splitSerToArr(s1.dropna()), linestyle='-', marker='o')
plt.plot( *splitSerToArr(s2.dropna()), linestyle='-', marker='o')

plt.show()

The splitSerToArr function is very handy when plotting with Pandas. This is the output: [image]

Answer by Jayen

Another solution for pandas DataFrames:

plot = df.plot(style='o-')  # draw the lines so they appear in the legend
colors = [line.get_color() for line in plot.lines]  # get the colors of the markers
df = df.interpolate(limit_area='inside')  # interpolate
lines = plot.plot(df.index, df.values)  # add more lines (with a new set of colors)
for color, line in zip(colors, lines):
    line.set_color(color)  # overwrite the new lines' colors with the same colors as the old lines
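To see what interpolate(limit_area='inside') does on its own, here is a small sketch with made-up values. Only NaNs that sit between two valid values are filled; leading and trailing NaNs are left alone, so the extra lines only bridge the gaps and are not extended past the first or last real point.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a leading, an interior and a trailing gap
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

filled = s.interpolate(limit_area='inside')
print(filled.tolist())  # [nan, 1.0, 2.0, 3.0, nan]
```

Note that the default method='linear' treats the values as equally spaced and ignores the index. If the index is unevenly spaced, method='index' is probably what you want, so that the filled points lie on the straight line between their neighbours.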