Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/45309447/

Time: 2020-08-23 03:00:02  Source: igfitidea

Calculating median - javascript

javascript median

Asked by Muhammed Bayram

I've been trying to calculate the median, but I guess I still have some mathematical issues, as I couldn't get the correct median value and couldn't figure out why. Here's the code:

class StatsCollector {

    constructor() {
        this.inputNumber = 0;
        this.average = 0;

        this.timeout = 19000;

        this.frequencies = new Map();
        for (let i of Array(this.timeout).keys()) {
            this.frequencies.set(i, 0);
        }
    }

    pushValue(responseTimeMs) {
        let req = responseTimeMs;
        if (req > this.timeout) {
            req = this.timeout;
        }

        this.average = (this.average * this.inputNumber + req) / (this.inputNumber + 1);

        console.log(responseTimeMs / 1000)
        let groupIndex = Math.floor(responseTimeMs / 1000);
        this.frequencies.set(groupIndex, this.frequencies.get(groupIndex) + 1);

        this.inputNumber += 1;
    }

    getMedian() {
        let medianElement = 0;
        if (this.inputNumber <= 0) {
            return 0;
        }
        if (this.inputNumber == 1) {
            return this.average
        }
        if (this.inputNumber == 2) {
            return this.average
        }
        if (this.inputNumber > 2) {
            medianElement = this.inputNumber / 2;
        }

        let minCumulativeFreq = 0;
        let maxCumulativeFreq = 0;
        let cumulativeFreq = 0;
        let freqGroup = 0;
        for (let i of Array(20).keys()) {
            if (medianElement <= cumulativeFreq + this.frequencies.get(i)) {
                minCumulativeFreq = cumulativeFreq;
                maxCumulativeFreq = cumulativeFreq + this.frequencies.get(i);
                freqGroup = i;
                break;
            }
            cumulativeFreq += this.frequencies.get(i);
        }

        return (((medianElement - minCumulativeFreq) / (maxCumulativeFreq - minCumulativeFreq)) + (freqGroup)) * 1000;
    }

    getAverage() {
        return this.average;
    }

}

Here's a snapshot of the results when I enter the values

342,654,987,1093,2234,6243,7087,20123

[image: snapshot of the results]

The correct result should be:

Median: 1663.5

Answered by jdmdevdotnet

Change your median method to this:

function median(values){
  if(values.length ===0) return 0;

  values.sort(function(a,b){
    return a-b;
  });

  var half = Math.floor(values.length / 2);

  if (values.length % 2)
    return values[half];

  return (values[half - 1] + values[half]) / 2.0;
}
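
The function above expects an array of raw values, which the question's StatsCollector does not keep (it only stores frequency counts per one-second group). A minimal sketch, assuming it is acceptable to hold every response time in memory, might look like this. The `values` field and the clamping in `pushValue` are assumptions for illustration, not part of the original code.

```javascript
// Sketch only: a variant of the question's StatsCollector that keeps raw values.
class StatsCollector {
  constructor(timeout = 19000) {
    this.timeout = timeout; // same default as the question's code
    this.values = [];       // assumption: raw samples are kept in memory
  }

  pushValue(responseTimeMs) {
    // Assumption: clamp to the timeout, as the question does for the average
    this.values.push(Math.min(responseTimeMs, this.timeout));
  }

  getAverage() {
    if (this.values.length === 0) return 0;
    return this.values.reduce((sum, v) => sum + v, 0) / this.values.length;
  }

  getMedian() {
    if (this.values.length === 0) return 0;
    // Sort a copy so the stored insertion order is left untouched
    const sorted = this.values.slice().sort((a, b) => a - b);
    const half = Math.floor(sorted.length / 2);
    if (sorted.length % 2) return sorted[half];
    return (sorted[half - 1] + sorted[half]) / 2;
  }
}

const stats = new StatsCollector();
[342, 654, 987, 1093, 2234, 6243, 7087, 20123].forEach(v => stats.pushValue(v));
console.log(stats.getMedian()); // 1663.5
```

Storing every sample costs memory proportional to the number of requests; the frequency-group approach in the question avoids that, but it can only estimate the median rather than compute it exactly.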

Answered by boisvert

The solutions above - sort, then find the middle - are fine, but slow on large data sets. Sorting the data first has a complexity of O(n log n).

There is a faster median algorithm, which consists of splitting the array in two around a pivot, then looking for the median only in the part that must contain it. Here is some JavaScript code; the original answer also links to a more detailed explanation.

// Try it on a small array
console.log(quickselect_median([7,3,5])); // 5

function quickselect_median(arr) {
   // Math.floor keeps the index an integer when the length is odd
   const L = arr.length, halfL = Math.floor(L/2);
   if (L % 2 == 1)
      return quickselect(arr, halfL);
   else
      return 0.5 * (quickselect(arr, halfL - 1) + quickselect(arr, halfL));
}

function quickselect(arr, k) {
   // Select the kth element in arr
   // arr: List of numerics
   // k: Index
   // return: The kth element (in numerical order) of arr
   if (arr.length == 1)
      return arr[0];
   else {
      const pivot = arr[0];
      const lows = arr.filter((e)=>(e<pivot));
      const highs = arr.filter((e)=>(e>pivot));
      const pivots = arr.filter((e)=>(e==pivot));
      if (k < lows.length) // the pivot is too high
         return quickselect(lows, k);
      else if (k < lows.length + pivots.length)// We got lucky and guessed the median
         return pivot;
      else // the pivot is too low
         return quickselect(highs, k - lows.length - pivots.length);
   }
}

Astute readers will notice a few things:

  1. I simply transliterated Russel Cohen's Python solution into JS, so all kudos to him.
  2. Several small optimisations are worth doing, and so is parallelisation; the code as written is easy to adapt into either a faster single-threaded or a faster multi-threaded version.
  3. This is the average-case linear-time algorithm; there is also a deterministic linear-time version. See Russel's post for details, including performance data.

ADDITION 19 Sept. 2019:

One comment asks whether this is worth doing in JavaScript. I ran the code in JSPerf and it gives interesting results.

  • if the array has an odd number of elements (one figure to find), sorting is 20% slower than this "fast median" proposition.

  • if there is an even number of elements, the "fast" algorithm is 40% slower, because it filters through the data twice, to find elements number k and k+1 to average. It is possible to write a version of fast median that doesn't do this.

The test used rather small arrays (29 elements in the jsperf test). The effect appears to be more pronounced as arrays get larger. A more general point to make is: it shows these kinds of optimisations are worth doing in Javascript. An awful lot of computation is done in JS, including with large amounts of data (think of dashboards, spreadsheets, data visualisations), and in systems with limited resources (think of mobile and embedded computing).
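
The note above says a fast median can avoid searching twice when the length is even. One possible sketch, not taken from the original answer, uses an in-place quickselect on a copy: once the upper middle element sits at index k, everything to its left is no larger, so the lower middle element is simply the maximum of that left part. The names `selectInPlace` and `fastMedian` are hypothetical, and the random pivot is an assumption made to keep bad cases unlikely.

```javascript
// Sketch only: in-place quickselect (Lomuto partition, random pivot).
// After it returns, a[k] is the k-th smallest and a[0..k-1] are all <= a[k].
function selectInPlace(a, k) {
  let lo = 0, hi = a.length - 1;
  while (lo < hi) {
    // Pick a random pivot and move it to the end of the current range
    const p = lo + Math.floor(Math.random() * (hi - lo + 1));
    [a[p], a[hi]] = [a[hi], a[p]];
    const pivot = a[hi];
    let store = lo;
    for (let i = lo; i < hi; i++) {
      if (a[i] < pivot) {
        [a[i], a[store]] = [a[store], a[i]];
        store++;
      }
    }
    // Put the pivot in its final sorted position
    [a[store], a[hi]] = [a[hi], a[store]];
    if (k === store) return a[k];
    if (k < store) hi = store - 1;
    else lo = store + 1;
  }
  return a[k];
}

function fastMedian(values) {
  if (values.length === 0) return undefined;
  const a = values.slice(); // work on a copy; the caller's array is not reordered
  const k = Math.floor(a.length / 2);
  const upper = selectInPlace(a, k);
  if (a.length % 2 === 1) return upper;
  // Even length: the lower middle is the largest element left of index k
  let lower = a[0];
  for (let i = 1; i < k; i++) {
    if (a[i] > lower) lower = a[i];
  }
  return (lower + upper) / 2;
}

console.log(fastMedian([342, 654, 987, 1093, 2234, 6243, 7087, 20123])); // 1663.5
```

The extra pass over the left half is a single linear scan, so the even case does one selection instead of two. With many duplicate values this simple partition scheme can degrade, so treat it as an illustration rather than a tuned implementation.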

Answered by Dps

var arr = {  
  max: function(array) {
    return Math.max.apply(null, array);
  },

  min: function(array) {
    return Math.min.apply(null, array);
  },

  range: function(array) {
    return arr.max(array) - arr.min(array);
  },

  midrange: function(array) {
    // Midrange is the mean of the smallest and largest values
    return (arr.max(array) + arr.min(array)) / 2;
  },

  sum: function(array) {
    var num = 0;
    for (var i = 0, l = array.length; i < l; i++) num += array[i];
    return num;
  },

  mean: function(array) {
    return arr.sum(array) / array.length;
  },

  median: function(array) {
    array.sort(function(a, b) {
      return a - b;
    });
    var mid = array.length / 2;
    return mid % 1 ? array[mid - 0.5] : (array[mid - 1] + array[mid]) / 2;
  },

  modes: function(array) {
    if (!array.length) return [];
    // Start from an empty result so the first value is not counted twice
    var modeMap = {},
      maxCount = 0,
      modes = [];

    array.forEach(function(val) {
      if (!modeMap[val]) modeMap[val] = 1;
      else modeMap[val]++;

      if (modeMap[val] > maxCount) {
        modes = [val];
        maxCount = modeMap[val];
      }
      else if (modeMap[val] === maxCount) {
        modes.push(val);
        maxCount = modeMap[val];
      }
    });
    return modes;
  },

  variance: function(array) {
    var mean = arr.mean(array);
    return arr.mean(array.map(function(num) {
      return Math.pow(num - mean, 2);
    }));
  },

  standardDeviation: function(array) {
    return Math.sqrt(arr.variance(array));
  },

  meanAbsoluteDeviation: function(array) {
    var mean = arr.mean(array);
    return arr.mean(array.map(function(num) {
      return Math.abs(num - mean);
    }));
  },

  zScores: function(array) {
    var mean = arr.mean(array);
    var standardDeviation = arr.standardDeviation(array);
    return array.map(function(num) {
      return (num - mean) / standardDeviation;
    });
  }
};

Answered by JBallin

Here's another solution:

function median(numbers) {
    const sorted = numbers.slice().sort((a, b) => a - b);
    const middle = Math.floor(sorted.length / 2);

    if (sorted.length % 2 === 0) {
        return (sorted[middle - 1] + sorted[middle]) / 2;
    }

    return sorted[middle];
}

console.log(median([4, 5, 7, 1, 33]));

Answered by hien711

For better performance in terms of time complexity, use a max-heap and a min-heap together to find the median of a stream of values.
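
The two-heap idea can be sketched as follows: a max-heap holds the smaller half of the values seen so far, a min-heap holds the larger half, and the median is read from the heap tops. JavaScript has no built-in heap, so the `Heap` and `RunningMedian` classes below are hypothetical helpers written for this illustration, not an existing library.

```javascript
// Minimal binary heap; higherPriority(a, b) is true when a should sit above b.
class Heap {
  constructor(higherPriority) {
    this.items = [];
    this.higherPriority = higherPriority;
  }
  get size() { return this.items.length; }
  peek() { return this.items[0]; }
  push(value) {
    const items = this.items;
    items.push(value);
    let i = items.length - 1;
    while (i > 0) { // sift up
      const parent = (i - 1) >> 1;
      if (!this.higherPriority(items[i], items[parent])) break;
      [items[i], items[parent]] = [items[parent], items[i]];
      i = parent;
    }
  }
  pop() {
    const items = this.items;
    const top = items[0];
    const last = items.pop();
    if (items.length > 0) {
      items[0] = last;
      let i = 0;
      for (;;) { // sift down
        const left = 2 * i + 1, right = left + 1;
        let best = i;
        if (left < items.length && this.higherPriority(items[left], items[best])) best = left;
        if (right < items.length && this.higherPriority(items[right], items[best])) best = right;
        if (best === i) break;
        [items[i], items[best]] = [items[best], items[i]];
        i = best;
      }
    }
    return top;
  }
}

class RunningMedian {
  constructor() {
    this.low = new Heap((a, b) => a > b);  // max-heap: smaller half
    this.high = new Heap((a, b) => a < b); // min-heap: larger half
  }
  push(value) {
    if (this.low.size === 0 || value <= this.low.peek()) this.low.push(value);
    else this.high.push(value);
    // Rebalance so that low holds as many values as high, or one more
    if (this.low.size > this.high.size + 1) this.high.push(this.low.pop());
    else if (this.high.size > this.low.size) this.low.push(this.high.pop());
  }
  median() {
    if (this.low.size === 0) return undefined;
    if (this.low.size > this.high.size) return this.low.peek();
    return (this.low.peek() + this.high.peek()) / 2;
  }
}

const running = new RunningMedian();
[342, 654, 987, 1093, 2234, 6243, 7087, 20123].forEach(v => running.push(v));
console.log(running.median()); // 1663.5
```

Each `push` costs O(log n) and reading the median is O(1), which suits the question's case of response times arriving one at a time; for a single fixed array, sorting or quickselect is simpler.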

Answered by user3242162

Simpler & more efficient

const median = dataSet => {
  if (dataSet.length === 1) return dataSet[0]
  // Numeric comparator: the default sort() compares elements as strings
  const sorted = ([ ...dataSet ]).sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  // Odd length: the single middle element; even length: mean of the two middle elements
  if (sorted.length % 2 === 1) return sorted[mid]
  return ((sorted[mid - 1] + sorted[mid]) / 2)
}