Javascript: lossless compression method to shorten a string before base64 encoding to make it shorter?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me) at StackOverflow. Original question: http://stackoverflow.com/questions/4144704/

Lossless compression method to shorten string before base64 encoding to make it shorter?

Tags: javascript, compression, base64, huffman-code, lzw

Asked by bennedich

I just built a small webapp for previewing HTML documents. It generates URLs containing the HTML (and all inline CSS and Javascript) as base64-encoded data. Problem is, the URLs quickly get rather long. What is the "de facto" standard way (preferably in Javascript) to compress the string first without data loss?

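For illustration, a minimal sketch of the scheme described above, under assumptions: the host, path, and fragment format are invented here, and only btoa/atob are standard browser APIs:

    // Hypothetical sketch: the whole document is base64-encoded straight
    // into the URL, which is why the URLs grow so quickly.
    const html = "<html><body><h1>Hello</h1></body></html>";
    const url = "https://example.com/preview#" + btoa(html);

    // On the preview page, decode it back out of the fragment:
    const restored = atob(location.hash.slice(1));
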
PS: I read about Huffman and Lempel-Ziv in school some time ago, and I remember really enjoying LZW :)

EDIT:

Solution found; it seems rawStr => utf8Str => lzwStr => base64Str is the way to go. I'm now working on adding Huffman compression between the utf8 and lzw steps. The problem so far is that too many characters become very long when encoded to base64.

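A sketch of that pipeline, under assumptions: lzwCompress is an LZW routine like the one sketched under the accepted answer below (its output string may contain char codes above 255), and the utf8 step uses the old unescape(encodeURIComponent(...)) trick to re-express the string as single-byte characters:

    // rawStr => utf8Str => lzwStr => base64Str, as described above.
    function encodeForUrl(rawStr) {
      // rawStr => utf8Str: single-byte chars only, so LZW sees raw bytes.
      const utf8Str = unescape(encodeURIComponent(rawStr));
      // utf8Str => lzwStr: assumed LZW routine (sketched further down).
      const lzwStr = lzwCompress(utf8Str);
      // lzwStr => base64Str: btoa only accepts chars 0-255, so pack each
      // 16-bit LZW code into two bytes first.
      let bytes = "";
      for (let i = 0; i < lzwStr.length; i++) {
        const code = lzwStr.charCodeAt(i);
        bytes += String.fromCharCode(code >> 8, code & 0xff);
      }
      return btoa(bytes);
    }
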
Accepted answer by David Murdoch

Check out this answer. It mentions functions for LZW compression/decompression (via http://jsolait.net/, specifically http://jsolait.net/browser/trunk/jsolait/lib/codecs.js).

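For reference, a minimal LZW sketch in plain Javascript; this is not the jsolait code itself, and it assumes input char codes 0-255 and fewer than 65536 dictionary entries:

    function lzwCompress(input) {
      // Seed the dictionary with all single-byte strings.
      const dict = new Map();
      for (let i = 0; i < 256; i++) dict.set(String.fromCharCode(i), i);
      let phrase = "", next = 256;
      const out = [];
      for (let i = 0; i < input.length; i++) {
        const combined = phrase + input.charAt(i);
        if (dict.has(combined)) {
          phrase = combined;           // keep extending the current match
        } else {
          out.push(dict.get(phrase));  // emit code for the longest match
          dict.set(combined, next++);  // learn the new phrase
          phrase = input.charAt(i);
        }
      }
      if (phrase !== "") out.push(dict.get(phrase));
      return String.fromCharCode.apply(null, out);
    }

    function lzwDecompress(input) {
      const dict = new Map();
      for (let i = 0; i < 256; i++) dict.set(i, String.fromCharCode(i));
      let next = 256;
      let prev = String.fromCharCode(input.charCodeAt(0));
      let out = prev;
      for (let i = 1; i < input.length; i++) {
        const code = input.charCodeAt(i);
        // A code not yet in the dictionary is the classic LZW corner case:
        // it must decode to prev plus prev's first character.
        const entry = dict.has(code) ? dict.get(code) : prev + prev.charAt(0);
        out += entry;
        dict.set(next++, prev + entry.charAt(0));
        prev = entry;
      }
      return out;
    }
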
Answered by James Gaunt

You will struggle to get much compression at all on a URL; URLs are too short and don't contain enough redundant information to benefit much from Huffman/LZW-style algorithms.

If you have constraints on the space of possible URLs (e.g. all content tends to be in the same set of folders), you could hard-code some parts of the URLs for expansion on the client; i.e., cheat.

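One hypothetical way to cheat in Javascript: substitute shared boilerplate fragments for short tokens before compressing, and reverse the substitution on the client. The fragment table below is invented for illustration; real entries would come from whatever boilerplate your documents actually share:

    // Hypothetical fragment table (made up for this example).
    const FRAGMENTS = [
      ["<!DOCTYPE html><html><head>", "\u0001"],
      ["</head><body>", "\u0002"],
      ["</body></html>", "\u0003"],
    ];

    // Replace each known fragment with its one-character token.
    function shrink(html) {
      return FRAGMENTS.reduce((s, [frag, token]) => s.split(frag).join(token), html);
    }

    // Client side: restore the fragments.
    function expand(packed) {
      return FRAGMENTS.reduce((s, [frag, token]) => s.split(token).join(frag), packed);
    }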