Original question: http://stackoverflow.com/questions/4624970/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Finding local maxima/minima with Numpy in a 1D numpy array
Asked by Navi
Can you suggest a function from numpy/scipy that can find local maxima/minima in a 1D numpy array? Obviously the simplest approach is to look at the nearest neighbours, but I would like an accepted solution that is part of the numpy distribution.
Accepted answer by Sven Marnach
If you are looking for all entries in the 1D array a that are smaller than their neighbors, you can try:
numpy.r_[True, a[1:] < a[:-1]] & numpy.r_[a[:-1] < a[1:], True]
You could also smooth your array before this step using numpy.convolve().
I don't think there is a dedicated function for this.
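A minimal sketch of the expression above on a small array. The sample values and the 3-point box kernel are assumptions chosen only for illustration. Note that the True padding makes an endpoint count as a minimum whenever it is below its single neighbor.

```python
import numpy as np

a = np.array([3.0, 1.0, 2.0, 0.5, 4.0, 4.5, 2.0])  # made-up sample data

# True where an entry is smaller than both neighbors (endpoints have one neighbor)
is_min = np.r_[True, a[1:] < a[:-1]] & np.r_[a[:-1] < a[1:], True]
min_idx = np.flatnonzero(is_min)
print(min_idx)  # [1 3 6]

# Optional smoothing before the search: a 3-point moving average (assumed kernel)
kernel = np.ones(3) / 3
smoothed = np.convolve(a, kernel, mode="same")
```

Flipping both comparisons to > gives the local maxima instead.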
Answer by Mike Vella
Update: I wasn't happy with gradient, so I found it more reliable to use numpy.diff. Please let me know if it does what you want.
Regarding the issue of noise: the mathematical problem is to locate maxima/minima. If we want to deal with noise, we can use something like the convolve approach mentioned earlier.
import numpy as np
from matplotlib import pyplot

a = np.array([10.3, 2, 0.9, 4, 5, 6, 7, 34, 2, 5, 25, 3, -26, -20, -29], dtype=float)
gradients = np.diff(a)
print(gradients)

maxima_num = 0
minima_num = 0
max_locations = []
min_locations = []
count = 0
for i in gradients[:-1]:
    count += 1
    # rising then falling: a[count] is a local maximum
    if i > 0 and gradients[count] < 0:
        maxima_num += 1
        max_locations.append(count)
    # falling then rising: a[count] is a local minimum
    if i < 0 and gradients[count] > 0:
        minima_num += 1
        min_locations.append(count)

turning_points = {'maxima_number': maxima_num, 'minima_number': minima_num,
                  'maxima_locations': max_locations, 'minima_locations': min_locations}
print(turning_points)

pyplot.plot(a)
pyplot.show()
Answer by R. C.
For curves with not too much noise, I recommend the following small code snippet:
from numpy import *
# example data with some peaks:
x = linspace(0, 4, 1000)  # the sample count must be an integer, not 1e3
data = .2*sin(10*x) + exp(-abs(2-x)**2)
# this is the line you need:
a = diff(sign(diff(data))).nonzero()[0] + 1  # local min+max
b = (diff(sign(diff(data))) > 0).nonzero()[0] + 1  # local min
c = (diff(sign(diff(data))) < 0).nonzero()[0] + 1  # local max
# graphical output...
from pylab import *
plot(x, data)
plot(x[b], data[b], "o", label="min")
plot(x[c], data[c], "o", label="max")
legend()
show()
The +1 is important because each call to diff shortens the array by one element, so the raw positions are shifted by one relative to the original indices.
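To see why the offset matters, here is a small sketch on made-up data (the array values are an assumption for illustration). Without the +1, the positions point one sample to the left of each peak.

```python
import numpy as np

data = np.array([0, 2, 1, 3, 0])  # made-up data, peaks at indices 1 and 3

turns = np.diff(np.sign(np.diff(data)))  # length is len(data) - 2
raw_max = (turns < 0).nonzero()[0]       # positions within `turns`
max_idx = raw_max + 1                    # positions within `data`

print(data[raw_max])  # [0 1]  (wrong samples)
print(data[max_idx])  # [2 3]  (the actual peaks)
```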
Answer by danodonovan
In SciPy >= 0.11:
import numpy as np
from scipy.signal import argrelextrema
x = np.random.random(12)
# for local maxima
argrelextrema(x, np.greater)
# for local minima
argrelextrema(x, np.less)
This produces:
>>> x
array([ 0.56660112, 0.76309473, 0.69597908, 0.38260156, 0.24346445,
0.56021785, 0.24109326, 0.41884061, 0.35461957, 0.54398472,
0.59572658, 0.92377974])
>>> argrelextrema(x, np.greater)
(array([1, 5, 7]),)
>>> argrelextrema(x, np.less)
(array([4, 6, 8]),)
Note that these are the indices of x that are local maxima/minima. To get the values, try:
>>> x[argrelextrema(x, np.greater)[0]]
scipy.signal also provides argrelmax and argrelmin for finding maxima and minima, respectively.
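Because the example above uses random data, here is a reproducible sketch on a fixed, made-up array. It also illustrates two details that are easy to miss: a strict comparator such as np.greater skips flat tops, and the order parameter of argrelextrema sets how many neighbors on each side a point has to beat.

```python
import numpy as np
from scipy.signal import argrelextrema

x = np.array([0, 2, 1, 3, 3, 1, 4, 0])  # made-up data with a flat top at 3, 3

strict = argrelextrema(x, np.greater)[0]         # the flat top is skipped
loose = argrelextrema(x, np.greater_equal)[0]    # both flat samples are reported
wide = argrelextrema(x, np.greater, order=2)[0]  # must beat 2 neighbors per side
print(strict, loose, wide)  # [1 6] [1 3 4 6] [6]
```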
Answer by BobC
Another approach (more words, less code) that may help:
The locations of local maxima and minima are also the locations of the zero crossings of the first derivative. It is generally much easier to find zero crossings than it is to directly find local maxima and minima.
Unfortunately, the first derivative tends to "amplify" noise, so when significant noise is present in the original data, the first derivative is best used only after the original data has had some degree of smoothing applied.
Since smoothing is, in the simplest sense, a low pass filter, the smoothing is often best (well, most easily) done by using a convolution kernel, and "shaping" that kernel can provide a surprising amount of feature-preserving/enhancing capability. The process of finding an optimal kernel can be automated using a variety of means, but the best may be simple brute force (plenty fast for finding small kernels). A good kernel will (as intended) massively distort the original data, but it will NOT affect the location of the peaks/valleys of interest.
Fortunately, quite often a suitable kernel can be created via a simple SWAG ("educated guess"). The width of the smoothing kernel should be a little wider than the widest expected "interesting" peak in the original data, and its shape will resemble that peak (a single-scaled wavelet). For mean-preserving kernels (what any good smoothing filter should be), the sum of the kernel elements should be precisely equal to 1.00, and the kernel should be symmetric about its center (meaning it will have an odd number of elements).
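The kernel requirements above (odd length, symmetric, elements summing to 1) can be sketched as follows. The Hann window shape, the helper name smoothing_kernel, the width of 31, and the noisy sine test signal are all assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def smoothing_kernel(width):
    """Hypothetical helper: symmetric, odd-length kernel whose elements sum to 1."""
    if width % 2 == 0:
        width += 1  # force an odd length so there is a center element
    kernel = np.hanning(width + 2)[1:-1]  # drop the two zero endpoints
    return kernel / kernel.sum()          # normalize so the mean is preserved

def count_turns(v):
    """Number of sign changes in the first difference (candidate extrema)."""
    return np.count_nonzero(np.diff(np.sign(np.diff(v))))

rng = np.random.default_rng(0)
x = np.linspace(0, 4 * np.pi, 400)
noisy = np.sin(x) + 0.2 * rng.standard_normal(x.size)  # made-up noisy signal

kernel = smoothing_kernel(31)
smoothed = np.convolve(noisy, kernel, mode="same")

print(count_turns(noisy), count_turns(smoothed))
```

The smoothed curve should show far fewer turning points than the raw signal, which is the point of smoothing before looking for zero crossings.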
Given an optimal smoothing kernel (or a small number of kernels optimized for different data content), the degree of smoothing becomes a scaling factor for (the "gain" of) the convolution kernel.
Determining the "correct" (optimal) degree of smoothing (convolution kernel gain) can even be automated: compare the standard deviation of the first derivative data with the standard deviation of the smoothed data. How the ratio of the two standard deviations changes with the degree of smoothing can be used to predict effective smoothing values. A few manual data runs (that are truly representative) should be all that's needed.
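One possible reading of this idea, sketched under several assumptions: the degree of smoothing is taken to be the width of a box (moving-average) kernel, the helper name roughness is hypothetical, and the test signal is a made-up noisy sine. The ratio of the two standard deviations drops quickly while noise is being removed and then levels off, so the width where it stops dropping quickly is a plausible candidate.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 4 * np.pi, 400)
noisy = np.sin(x) + 0.2 * rng.standard_normal(x.size)  # made-up noisy signal

def roughness(signal, width):
    """Hypothetical measure: std of the first difference over std of the smoothed data."""
    kernel = np.ones(width) / width                        # box kernel, sums to 1
    smoothed = np.convolve(signal, kernel, mode="valid")   # "valid" avoids edge effects
    return np.std(np.diff(smoothed)) / np.std(smoothed)

widths = [1, 5, 11, 21, 41]
ratios = [roughness(noisy, w) for w in widths]
for w, r in zip(widths, ratios):
    print(w, round(r, 4))
```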
All the prior solutions posted above compute the first derivative, but they don't treat it as a statistical measure, nor do they attempt to perform feature-preserving/enhancing smoothing (to help subtle peaks "leap above" the noise).
Finally, the bad news: Finding "real" peaks becomes a royal pain when the noise also has features that look like real peaks (overlapping bandwidth). The next more-complex solution is generally to use a longer convolution kernel (a "wider kernel aperture") that takes into account the relationship between adjacent "real" peaks (such as minimum or maximum rates for peak occurrence), or to use multiple convolution passes using kernels having different widths (but only if it is faster: it is a fundamental mathematical truth that linear convolutions performed in sequence can always be convolved together into a single convolution). But it is often far easier to first find a sequence of useful kernels (of varying widths) and convolve them together than it is to directly find the final kernel in a single step.
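The statement that sequential linear convolutions can be merged into a single one can be checked numerically. In this sketch the two kernels and the random test signal are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
signal = rng.standard_normal(200)  # made-up test signal

narrow = np.ones(3) / 3            # 3-point box kernel
wide = np.hanning(11)[1:-1]        # 9-point Hann-shaped kernel
wide = wide / wide.sum()

# Two passes, one kernel after the other
two_pass = np.convolve(np.convolve(signal, narrow), wide)

# One pass with the pre-combined kernel
combined = np.convolve(narrow, wide)
one_pass = np.convolve(signal, combined)

print(np.allclose(two_pass, one_pass))  # True
```

The combined kernel has 3 + 9 - 1 = 11 elements and still sums to 1, so it remains mean-preserving.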
Hopefully this provides enough info to let Google (and perhaps a good stats text) fill in the gaps. I really wish I had the time to provide a worked example, or a link to one. If anyone comes across one online, please post it here!
Answer by prtkp
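import numpy as np

x = np.array([6, 3, 5, 2, 1, 4, 9, 7, 8])
y = np.array([2, 1, 3, 5, 3, 9, 8, 10, 7])

# sort both arrays by x so that y is traversed left to right
sortId = np.argsort(x)
x = x[sortId]
y = y[sortId]
length = len(y)  # was undefined in the original snippet

minm = np.array([], dtype=int)
maxm = np.array([], dtype=int)
i = 0
while i < length-1:
    if i < length - 1:
        # climb while the curve is rising; the top is a local maximum
        while i < length-1 and y[i+1] >= y[i]:
            i += 1
        if i != 0 and i < length-1:
            maxm = np.append(maxm, i)
        i += 1
    if i < length - 1:
        # descend while the curve is falling; the bottom is a local minimum
        while i < length-1 and y[i+1] <= y[i]:
            i += 1
        if i < length-1:
            minm = np.append(minm, i)
        i += 1
print(minm)
print(maxm)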
minm and maxm contain the indices of the minima and maxima, respectively. For a huge data set this will give lots of maxima/minima, so in that case smooth the curve first and then apply this algorithm.
Answer by A STEFANI
Why not use the SciPy built-in function signal.find_peaks_cwt to do the job?
from scipy import signal
import numpy as np
# generate junk data (numpy 1D array)
xs = np.arange(0, np.pi, 0.05)
data = np.sin(xs)
# maxima: use the built-in function to find (max) peaks
max_peakind = signal.find_peaks_cwt(data, np.arange(1,10))
# inverse (in order to find minima); note that data[0] == 0, so 1/data is inf there
inv_data = 1/data
# minima: use the built-in function to find (min) peaks (on the inverted data)
min_peakind = signal.find_peaks_cwt(inv_data, np.arange(1,10))
# show results
print("maxima", data[max_peakind])
print("minima", data[min_peakind])
Results:
maxima [ 0.9995736]
minima [ 0.09146464]
Regards
Answer by Misha Smirnov
None of these solutions worked for me, since I also wanted to find peaks in the center of repeating values. For example, in
ar = np.array([0,1,2,2,2,1,3,3,3,2,5,0])
the answer should be
array([ 3, 7, 10], dtype=int64)
I did this using a loop. I know it's not super clean, but it gets the job done.
import numpy as np

def findLocalMaxima(ar):
    # find local maxima of array, including centers of repeating elements
    maxInd = np.zeros_like(ar)
    peakVar = -np.inf
    i = -1
    while i < len(ar)-1:
        i += 1
        if peakVar < ar[i]:
            peakVar = ar[i]
            # scan forward over the plateau until the value changes
            for j in range(i, len(ar)):
                if peakVar < ar[j]:
                    break  # still rising, so this was not a peak
                elif peakVar == ar[j]:
                    continue  # still on the plateau
                elif peakVar > ar[j]:
                    # plateau ended with a drop: mark its center
                    peakInd = i + np.floor(abs(i-j)/2)
                    maxInd[peakInd.astype(int)] = 1
                    i = j
                    break
        peakVar = ar[i]
    maxInd = np.where(maxInd)[0]
    return maxInd
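As an alternative sketch, not part of this answer: scipy.signal.find_peaks (available in SciPy 1.1 and later) reports the middle sample of a flat peak, rounded down when the plateau has an even number of samples, so it should give the same indices for the array above.

```python
import numpy as np
from scipy.signal import find_peaks

ar = np.array([0, 1, 2, 2, 2, 1, 3, 3, 3, 2, 5, 0])

# find_peaks returns (indices, properties); only the indices are needed here
peaks, _ = find_peaks(ar)
print(peaks)  # [ 3  7 10]
```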
Answer by Dave
While this question is really old, I believe there is a much simpler approach in numpy (a one-liner).
import numpy as np
values = [1,3,9,5,2,5,6,9,7]  # renamed from `list` to avoid shadowing the built-in
np.diff(np.sign(np.diff(values)))  # the one-liner
# output
array([ 0, -2, 0, 2, 0, 0, -2])
To find a local max or min we essentially want to find when the difference between the values in the list (3-1, 9-3...) changes from positive to negative (max) or negative to positive (min). Therefore, first we find the difference. Then we find the sign, and then we find the changes in sign by taking the difference again. (Sort of like a first and second derivative in calculus, only we have discrete data and don't have a continuous function.)
The output in my example has no entries for the endpoints (the first and last values in the list), so it is two elements shorter than the input. Also, just like in calculus, if the second derivative is negative you have a max, and if it is positive you have a min.
Thus we have the following matchup:
[1,  3,  9,  5,  2,  5,  6,  9,  7]
    [0, -2,  0,  2,  0,  0, -2]
        Max     Min         Max
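A short sketch that turns the one-liner's output into indices of the original list. The variable names are assumptions. The second part shows a caveat on made-up data: on a flat top the sign changes arrive as two steps of -1 instead of one step of -2, so testing for < 0 flags both plateau samples, while testing for == -2 would miss the peak entirely.

```python
import numpy as np

values = [1, 3, 9, 5, 2, 5, 6, 9, 7]
curvature = np.diff(np.sign(np.diff(values)))  # [0, -2, 0, 2, 0, 0, -2]

# Shift by one because the result starts at the second element of `values`
max_idx = np.where(curvature < 0)[0] + 1
min_idx = np.where(curvature > 0)[0] + 1
print(max_idx, min_idx)  # [2 7] [4]

flat = [1, 2, 2, 1]  # made-up data with a flat top
flat_curvature = np.diff(np.sign(np.diff(flat)))
print(flat_curvature)  # [-1 -1]
```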
Answer by Cleb
As of SciPy version 1.1, you can also use find_peaks. Below are two examples taken from the documentation itself.
Using the height argument, one can select all maxima above a certain threshold (in this example, all non-negative maxima; this can be very useful if one has to deal with a noisy baseline). If you want to find minima, just multiply your input by -1:
import matplotlib.pyplot as plt
try:
    from scipy.datasets import electrocardiogram  # SciPy >= 1.10
except ImportError:
    from scipy.misc import electrocardiogram  # older SciPy releases
from scipy.signal import find_peaks
import numpy as np
x = electrocardiogram()[2000:4000]
peaks, _ = find_peaks(x, height=0)
plt.plot(x)
plt.plot(peaks, x[peaks], "x")
plt.plot(np.zeros_like(x), "--", color="gray")
plt.show()
Another extremely helpful argument is distance, which defines the minimum distance between two peaks:
peaks, _ = find_peaks(x, distance=150)
# difference between peaks is >= 150
print(np.diff(peaks))
# prints [186 180 177 171 177 169 167 164 158 162 172]
plt.plot(x)
plt.plot(peaks, x[peaks], "x")
plt.show()
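For a quick check that does not need the ECG sample data, here is a sketch on a small made-up array. It exercises the same height and distance arguments, plus the multiply-by--1 trick for minima.

```python
import numpy as np
from scipy.signal import find_peaks

x = np.array([0, 1, 0, 2, 0, 3, 0, -1, 0])  # made-up data

all_peaks, _ = find_peaks(x)               # every strict local maximum
tall_peaks, _ = find_peaks(x, height=1.5)  # only peaks with value >= 1.5
far_peaks, _ = find_peaks(x, distance=3)   # taller peaks win within 3 samples
minima, _ = find_peaks(-x)                 # minima of x are maxima of -x

print(all_peaks, tall_peaks, far_peaks, minima)
# [1 3 5] [3 5] [1 5] [2 4 7]
```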


