Java: convert 16-bit PCM to 8-bit

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/5717447/

Time: 2020-10-30 12:27:28  Source: igfitidea

Convert 16 bit pcm to 8 bit

java, algorithm, audio, bit, pcm

Asked by gosho_ot_pochivka

I have pcm audio stored in a byte array. It is 16 bits per sample. I want to make it 8 bit per sample audio.

Can anyone suggest a good algorithm to do that?

I haven't mentioned the bitrate because I think it isn't important for the algorithm - right?

Answered by unwind

I can't see right now why it's not enough to just take the upper byte, i.e. discard the lower 8 bits of each sample.

That of course assumes that the samples are linear; if they're not then maybe you need to do something to linearize them before dropping bits.

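short sixteenBit = (short) 0xfeed;         // cast needed: 0xfeed does not fit a signed short
byte eightBit = (byte) (sixteenBit >> 8);  // keep the upper byte
// eightBit is now (byte) 0xfe, i.e. -2 as a signed Java byte.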
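
Applied to a whole buffer, the same idea might look like the sketch below. It assumes the byte array holds signed, little-endian 16-bit samples (the question does not state the byte order), and the class and method names are hypothetical.

```java
final class HighByteDownsampler {
    /**
     * Converts signed 16-bit little-endian PCM to signed 8-bit PCM by
     * keeping only the high byte of each sample (assumed layout).
     */
    static byte[] toEightBit(byte[] pcm16) {
        if (pcm16.length % 2 != 0) {
            throw new IllegalArgumentException("16-bit PCM needs an even number of bytes");
        }
        byte[] pcm8 = new byte[pcm16.length / 2];
        for (int i = 0; i < pcm8.length; i++) {
            // Little-endian: low byte at index 2*i, high byte at index 2*i + 1.
            pcm8[i] = pcm16[2 * i + 1];
        }
        return pcm8;
    }
}
```

For big-endian data the high byte would sit at index 2*i instead.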

As suggested by AShelly in a comment, it might be a good idea to round, i.e. add 1 if the byte we're discarding is higher than half its maximum:

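if (eightBit < Byte.MAX_VALUE && (sixteenBit & 0xff) > 0x80) eightBit++;

The test against the maximum value implements clamping, so we don't risk adding 1 to the largest value and wrapping around, which would be bad. For an unsigned byte that limit is 0xff (which would wrap to 0x00); for Java's signed byte it is 0x7f (Byte.MAX_VALUE), which would wrap to -128.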
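
As a self-contained illustration, the rounding and clamping could be wrapped in a helper like the following. This is only a sketch assuming signed 16-bit input and signed 8-bit output; the names are hypothetical.

```java
final class RoundingDownsampler {
    /** Rounds a signed 16-bit sample up to the next signed 8-bit value when warranted, clamping at 127. */
    static byte roundToEightBit(short sample) {
        int high = sample >> 8;   // arithmetic shift keeps the sign: -128..127
        int low = sample & 0xff;  // the discarded byte as 0..255
        if (low > 0x80 && high < Byte.MAX_VALUE) {
            high++;               // round up, but never past 127
        }
        return (byte) high;
    }
}
```

With this sketch, 0x12ff rounds up to 0x13, while 0x7fff stays at 0x7f because of the clamp.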

Answered by user2774867

16-bit samples are usually signed, and 8-bit samples are usually unsigned, so the simplest answer is that you need to convert the 16-bit samples from signed (16-bit samples are almost always stored as a range from -32768 to +32767) to unsigned and then take the top 8 bits of the result. In C, this could be expressed as output = (unsigned char)((unsigned short)(input + 32768) >> 8). This is a good start, and might be good enough for your needs, but it won't sound very nice. It sounds rough because of "quantization noise".
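
A rough Java counterpart of that C expression is sketched below. Java has no unsigned byte type, so this hypothetical helper returns the 0..255 value as an int; casting it to byte for storage and reading it back with `b & 0xff` is one common way to handle that.

```java
final class UnsignedEightBit {
    /** Maps a signed 16-bit sample (-32768..32767) to an unsigned 8-bit value (0..255). */
    static int toUnsignedEightBit(short sample) {
        // Adding 32768 shifts the range to 0..65535; the shift keeps the top 8 bits.
        return (sample + 32768) >> 8;
    }
}
```

Under this mapping, -32768 becomes 0, silence (0) becomes 128, and 32767 becomes 255.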

Quantization noise is the difference between the original input and your algorithm's output. No matter what you do, you're going to have noise, and the noise will be "half a bit" on average. There's nothing you can do about that, but there are ways to make the noise less noticeable.

The main problem with the quantization noise is that it tends to form patterns. If the difference between input and output were completely random, things would actually sound fine, but instead the output will repeatedly be too high for a certain part of the waveform and too low for the next part. Your ear picks up on this pattern.

To have a result that sounds good, you need to add dithering. Dithering is a technique that tries to smooth out the quantization noise. The simplest dithering just removes the patterns from the noise so that the noise patterns don't distract from the actual signal patterns. Better dithering can go a step further and take steps to reduce the noise by adding together the error values from multiple samples and then adding in a correction when the total error gets large enough to be worth correcting.
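
The two flavours described above might be sketched as follows. This is an illustration, not the answer author's code or SoX's implementation: the first method adds triangular (TPDF-style) random noise before quantizing, and the second carries each sample's quantization error into the next sample (first-order error feedback). All names are hypothetical, and the output is assumed to be unsigned 8-bit (0..255).

```java
import java.util.Random;

final class DitherSketch {
    /** Simple dither: add triangular noise of roughly +/-1 output step, then round. */
    static int[] tpdfDither(short[] in, Random rng) {
        int[] out = new int[in.length];
        for (int i = 0; i < in.length; i++) {
            int unsigned = in[i] + 32768;                    // 0..65535
            int noise = rng.nextInt(256) - rng.nextInt(256); // triangular, -255..255
            out[i] = clamp((unsigned + 128 + noise) >> 8, 0, 255);
        }
        return out;
    }

    /** Error feedback: accumulate the quantization error and correct later samples. */
    static int[] errorFeedback(short[] in) {
        int[] out = new int[in.length];
        int carried = 0; // leftover error from earlier samples, in 16-bit units
        for (int i = 0; i < in.length; i++) {
            int target = in[i] + 32768 + carried;
            int q = clamp((target + 128) >> 8, 0, 255); // nearest 8-bit step
            // Bound the carried error so clipping at full scale cannot make it grow.
            carried = clamp(target - (q << 8), -128, 127);
            out[i] = q;
        }
        return out;
    }

    private static int clamp(int v, int lo, int hi) {
        return Math.max(lo, Math.min(hi, v));
    }
}
```

With error feedback, a constant input that falls between two 8-bit steps alternates between the neighbouring values so that the average output tracks the input level instead of being stuck on one side.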

You can find explanations and code samples for various dithering algorithms online. One good area to investigate might be the SoX tool, http://en.wikipedia.org/wiki/SoX. Check the source for its dithering effect, and experiment with converting various sounds from 16-bit to 8-bit with and without dithering enabled. You will be surprised by the difference in quality that dithering can make when converting to 8-bit sound.

Answered by Akash Raghav

byteData = (byte) (((shortData + 32768) >> 8) & 0xFF);

This worked for me.

Answered by William Morrison

Normalize the 16 bit samples, then rescale by the maximum value of your 8 bit sample.

This yields a more accurate conversion as the lower 8 bits of each sample aren't being discarded. However, my solution is more computationally expensive than the selected answer.
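
One possible reading of this approach is sketched below. The answer gives no code, so the details are assumptions: the signed 16-bit sample is first mapped to the range 0.0 to 1.0, then scaled by 255 (the maximum unsigned 8-bit value) and rounded. The names are hypothetical.

```java
final class NormalizeRescale {
    /** Maps a signed 16-bit sample to an unsigned 8-bit value via a 0.0..1.0 intermediate. */
    static int rescale(short sample) {
        double normalized = (sample + 32768) / 65535.0; // 0.0 .. 1.0
        return (int) Math.round(normalized * 255.0);    // 0 .. 255
    }
}
```

The floating-point division and rounding per sample are where the extra computational cost mentioned above comes from.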
