Original question: http://stackoverflow.com/questions/45309447/
Notice: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Calculating median - javascript
Asked by Muhammed Bayram
I've been trying to calculate the median, but I think I still have a mathematical issue somewhere: I can't get the correct median value and can't figure out why. Here's the code:
class StatsCollector {
  constructor() {
    this.inputNumber = 0;
    this.average = 0;
    this.timeout = 19000;
    this.frequencies = new Map();
    for (let i of Array(this.timeout).keys()) {
      this.frequencies.set(i, 0);
    }
  }
  pushValue(responseTimeMs) {
    let req = responseTimeMs;
    if (req > this.timeout) {
      req = this.timeout;
    }
    this.average = (this.average * this.inputNumber + req) / (this.inputNumber + 1);
    console.log(responseTimeMs / 1000)
    let groupIndex = Math.floor(responseTimeMs / 1000);
    this.frequencies.set(groupIndex, this.frequencies.get(groupIndex) + 1);
    this.inputNumber += 1;
  }
  getMedian() {
    let medianElement = 0;
    if (this.inputNumber <= 0) {
      return 0;
    }
    if (this.inputNumber == 1) {
      return this.average
    }
    if (this.inputNumber == 2) {
      return this.average
    }
    if (this.inputNumber > 2) {
      medianElement = this.inputNumber / 2;
    }
    let minCumulativeFreq = 0;
    let maxCumulativeFreq = 0;
    let cumulativeFreq = 0;
    let freqGroup = 0;
    for (let i of Array(20).keys()) {
      if (medianElement <= cumulativeFreq + this.frequencies.get(i)) {
        minCumulativeFreq = cumulativeFreq;
        maxCumulativeFreq = cumulativeFreq + this.frequencies.get(i);
        freqGroup = i;
        break;
      }
      cumulativeFreq += this.frequencies.get(i);
    }
    return (((medianElement - minCumulativeFreq) / (maxCumulativeFreq - minCumulativeFreq)) + (freqGroup)) * 1000;
  }
  getAverage() {
    return this.average;
  }
}
Here's a snapshot of the results when I enter the following values:
342,654,987,1093,2234,6243,7087,20123


The correct result should be:
Median: 1663.5
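For comparison, here is a minimal sketch of why a bucketed histogram and the raw samples can disagree. `exactMedian` and `groupedMedian` are hypothetical helper names, not part of the question's class; `groupedMedian` is only meant to mirror the interpolation in `getMedian()` above, assuming 1000 ms buckets. Tracing the class by hand on the eight values suggests it would return the grouped estimate rather than 1663.5.

```javascript
// Exact median: keep the raw samples, sort a copy, take the middle.
function exactMedian(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Grouped estimate (assumption: same interpolation idea as getMedian()
// in the question), with values counted in buckets of `width` ms.
function groupedMedian(values, width = 1000) {
  const freq = new Map();
  for (const v of values) {
    const g = Math.floor(v / width);
    freq.set(g, (freq.get(g) || 0) + 1);
  }
  const target = values.length / 2;
  let cumulative = 0;
  // Walk the non-empty buckets in ascending order.
  for (const g of [...freq.keys()].sort((a, b) => a - b)) {
    const f = freq.get(g);
    if (target <= cumulative + f) {
      // Interpolate linearly inside the bucket that contains the target rank.
      return (g + (target - cumulative) / f) * width;
    }
    cumulative += f;
  }
  return NaN; // only reached for an empty input
}

const samples = [342, 654, 987, 1093, 2234, 6243, 7087, 20123];
console.log(exactMedian(samples));   // 1663.5
console.log(groupedMedian(samples)); // 2000
```

The grouped value depends only on which bucket each sample falls into, so it cannot recover 1663.5 unless the raw samples (or much narrower buckets) are kept.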
Answer by jdmdevdotnet
Answer by boisvert
The solutions above, which sort and then take the middle element, are fine but slow on large data sets: sorting the data first has a complexity of O(n log n).
There is a faster median algorithm: split the array in two around a pivot, then keep looking only in the part that must contain the median. Here is some JavaScript code; the original answer also links to a more detailed explanation.
// Trying some array
alert(quickselect_median([7, 3, 5])); // or try [2300,5,4,0,123,2,76,768,28]
function quickselect_median(arr) {
  // Use an integer index: for odd lengths L/2 is fractional, so round down.
  const L = arr.length, halfL = Math.floor(L / 2);
  if (L % 2 == 1)
    return quickselect(arr, halfL);
  else
    return 0.5 * (quickselect(arr, halfL - 1) + quickselect(arr, halfL));
}
function quickselect(arr, k) {
  // Select the kth element in arr
  // arr: non-empty list of numerics
  // k: zero-based index
  // return: the kth element (in numerical order) of arr
  if (arr.length == 1)
    return arr[0];
  else {
    const pivot = arr[0];
    const lows = arr.filter((e) => (e < pivot));
    const highs = arr.filter((e) => (e > pivot));
    const pivots = arr.filter((e) => (e == pivot));
    if (k < lows.length) // the pivot is too high
      return quickselect(lows, k);
    else if (k < lows.length + pivots.length) // We got lucky and guessed the median
      return pivot;
    else // the pivot is too low
      return quickselect(highs, k - lows.length - pivots.length);
  }
}
Astute readers will notice a few things:
- I simply transliterated Russel Cohen's Python solution into JS, so all kudos to him.
- Several small optimisations are worth doing, and so is parallelisation; the code as it stands is easy to adapt into either a faster single-threaded or a faster multi-threaded version.
- This is the average-case linear-time algorithm. There is also a deterministic linear-time version; see Russel's post for details, including performance data.
ADDITION 19 Sept. 2019:
One comment asks whether this is worth doing in JavaScript. I ran the code in JSPerf and it gives interesting results:
- If the array has an odd number of elements (one value to find), sorting is 20% slower than this "fast median" approach.
- If there is an even number of elements, the "fast" algorithm is 40% slower, because it filters through the data twice to find elements number k and k+1 to average. It is possible to write a version of fast median that doesn't do this.
The test used rather small arrays (29 elements in the JSPerf test). The effect appears to be more pronounced as arrays get larger. More generally, it shows that these kinds of optimisations are worth doing in JavaScript. An awful lot of computation is done in JS, including on large amounts of data (think of dashboards, spreadsheets, data visualisations) and on systems with limited resources (think of mobile and embedded computing).
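The benchmark notes above mention that a fast median need not pass over the data twice for even-length arrays. One possible sketch, not the answer author's code: select the upper middle element in place, then take the largest value to its left as the lower middle. `selectInPlace` and `fastMedian` are hypothetical names, and the random-pivot Lomuto partition is an assumption chosen for brevity; it can degrade on inputs with many equal values.

```javascript
// Rearranges `a` in place so that a[k] holds the k-th smallest value,
// everything left of k is <= a[k], and everything right of k is >= a[k].
function selectInPlace(a, k) {
  let lo = 0;
  let hi = a.length - 1;
  while (lo < hi) {
    // Random pivot, moved to the end (Lomuto partition scheme).
    const p = lo + Math.floor(Math.random() * (hi - lo + 1));
    [a[p], a[hi]] = [a[hi], a[p]];
    const pivot = a[hi];
    let store = lo;
    for (let i = lo; i < hi; i++) {
      if (a[i] < pivot) {
        [a[i], a[store]] = [a[store], a[i]];
        store++;
      }
    }
    // Put the pivot in its final sorted position.
    [a[store], a[hi]] = [a[hi], a[store]];
    if (k === store) return a[k];
    if (k < store) hi = store - 1;
    else lo = store + 1;
  }
  return a[k];
}

function fastMedian(values) {
  if (values.length === 0) return NaN;
  const a = values.slice(); // work on a copy, leave the input untouched
  const mid = Math.floor(a.length / 2);
  const upper = selectInPlace(a, mid);
  if (a.length % 2 === 1) return upper;
  // Even length: the lower middle is the largest value left of `mid`,
  // found with one linear scan instead of a second selection.
  let lower = a[0];
  for (let i = 1; i < mid; i++) {
    if (a[i] > lower) lower = a[i];
  }
  return (lower + upper) / 2;
}

console.log(fastMedian([342, 654, 987, 1093, 2234, 6243, 7087, 20123])); // 1663.5
```

The extra scan over the left half costs at most n/2 comparisons, which is usually cheaper than running a second full selection.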
Answer by Dps
var arr = {
  max: function(array) {
    return Math.max.apply(null, array);
  },
  min: function(array) {
    return Math.min.apply(null, array);
  },
  range: function(array) {
    return arr.max(array) - arr.min(array);
  },
  midrange: function(array) {
    // Midrange is the midpoint between min and max, not half the range.
    return (arr.max(array) + arr.min(array)) / 2;
  },
  sum: function(array) {
    var num = 0;
    for (var i = 0, l = array.length; i < l; i++) num += array[i];
    return num;
  },
  mean: function(array) {
    return arr.sum(array) / array.length;
  },
  median: function(array) {
    // Sort a copy numerically so the caller's array is not reordered.
    var sorted = array.slice().sort(function(a, b) {
      return a - b;
    });
    var mid = sorted.length / 2;
    return mid % 1 ? sorted[mid - 0.5] : (sorted[mid - 1] + sorted[mid]) / 2;
  },
  modes: function(array) {
    if (!array.length) return [];
    // Start from an empty result so the first element is not listed twice.
    var modeMap = {},
      maxCount = 0,
      modes = [];
    array.forEach(function(val) {
      if (!modeMap[val]) modeMap[val] = 1;
      else modeMap[val]++;
      if (modeMap[val] > maxCount) {
        modes = [val];
        maxCount = modeMap[val];
      }
      else if (modeMap[val] === maxCount) {
        modes.push(val);
      }
    });
    return modes;
  },
  variance: function(array) {
    // Population variance: mean of squared deviations from the mean.
    var mean = arr.mean(array);
    return arr.mean(array.map(function(num) {
      return Math.pow(num - mean, 2);
    }));
  },
  standardDeviation: function(array) {
    return Math.sqrt(arr.variance(array));
  },
  meanAbsoluteDeviation: function(array) {
    var mean = arr.mean(array);
    return arr.mean(array.map(function(num) {
      return Math.abs(num - mean);
    }));
  },
  zScores: function(array) {
    var mean = arr.mean(array);
    var standardDeviation = arr.standardDeviation(array);
    return array.map(function(num) {
      return (num - mean) / standardDeviation;
    });
  }
};
Answer by JBallin
Here's another solution:
function median(numbers) {
  const sorted = numbers.slice().sort((a, b) => a - b);
  const middle = Math.floor(sorted.length / 2);
  if (sorted.length % 2 === 0) {
    return (sorted[middle - 1] + sorted[middle]) / 2;
  }
  return sorted[middle];
}
console.log(median([4, 5, 7, 1, 33]));
Answer by hien711
For better time complexity when the values arrive as a stream, use a max-heap together with a min-heap to track the running median.
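The answer gives no code, so the following is only a sketch of the two-heap idea under stated assumptions. JavaScript has no built-in heap, so `Heap` is a minimal hand-written binary heap, and `RunningMedian` is a hypothetical class name. The max-heap holds the smaller half of the values and the min-heap the larger half, which makes each insertion O(log n) and each median query O(1).

```javascript
// Minimal binary heap; `before(a, b)` returns true when a should sit above b.
class Heap {
  constructor(before) {
    this.items = [];
    this.before = before;
  }
  get size() { return this.items.length; }
  peek() { return this.items[0]; }
  push(value) {
    const items = this.items;
    items.push(value);
    let i = items.length - 1;
    // Sift the new value up until its parent should stay above it.
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (!this.before(items[i], items[parent])) break;
      [items[i], items[parent]] = [items[parent], items[i]];
      i = parent;
    }
  }
  pop() {
    const items = this.items;
    const top = items[0];
    const last = items.pop();
    if (items.length > 0) {
      // Move the last value to the root and sift it down.
      items[0] = last;
      let i = 0;
      for (;;) {
        const left = 2 * i + 1;
        const right = left + 1;
        let best = i;
        if (left < items.length && this.before(items[left], items[best])) best = left;
        if (right < items.length && this.before(items[right], items[best])) best = right;
        if (best === i) break;
        [items[i], items[best]] = [items[best], items[i]];
        i = best;
      }
    }
    return top;
  }
}

class RunningMedian {
  constructor() {
    this.low = new Heap((a, b) => a > b);  // max-heap: smaller half
    this.high = new Heap((a, b) => a < b); // min-heap: larger half
  }
  push(value) {
    // Route through `low` so every value in `low` stays <= every value in `high`.
    this.low.push(value);
    this.high.push(this.low.pop());
    // Rebalance so `low` holds as many values as `high`, or one more.
    if (this.high.size > this.low.size) this.low.push(this.high.pop());
  }
  median() {
    if (this.low.size === 0) return NaN;
    if (this.low.size > this.high.size) return this.low.peek();
    return (this.low.peek() + this.high.peek()) / 2;
  }
}

const stream = new RunningMedian();
for (const ms of [342, 654, 987, 1093, 2234, 6243, 7087, 20123]) stream.push(ms);
console.log(stream.median()); // 1663.5
```

Unlike the bucketed histogram in the question, this keeps every sample in memory, so it trades space for an exact answer.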
Answer by user3242162
Simpler and more efficient:
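const median = dataSet => {
  if (dataSet.length === 1) return dataSet[0]
  // Sort a copy numerically; the default sort() compares elements as strings.
  const sorted = ([ ...dataSet ]).sort((a, b) => a - b)
  const middle = Math.floor(sorted.length / 2)
  // Odd length: the single middle element is the median.
  if (sorted.length % 2 === 1) return sorted[middle]
  // Even length: average the two middle elements.
  return ((sorted[middle - 1] + sorted[middle]) / 2)
}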
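One pitfall applies to any sort-based median like the ones in this thread: calling `sort()` without a comparator converts the elements to strings and orders them lexicographically. A small illustration with made-up values:

```javascript
const values = [10, 9, 1, 100];

// Default sort compares the string forms "10", "9", "1", "100".
const lexicographic = [...values].sort();
console.log(lexicographic); // [1, 10, 100, 9]

// A numeric comparator gives the order a median calculation needs.
const numeric = [...values].sort((a, b) => a - b);
console.log(numeric); // [1, 9, 10, 100]
```

With the lexicographic order the two middle elements are 10 and 100, so a median computed from it would be 55 instead of 9.5.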

