C++ 尝试实现

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/1036504/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-27 18:29:39  来源:igfitidea点击:

Trie implementation

c++cdata-structurestrie

提问by Anton Kazennikov

Is there any speed- and cache-efficient implementations of trie in C/C++? I know what a trie is, but I don't want reinvent the wheel, implementing it myself.

在 C/C++ 中是否有速度和缓存效率高的 trie 实现?我知道什么是特里,但我不想重新发明轮子,自己实现它。

采纳答案by SashaN

if you are looking for an ANSI C implementation you can "steal" it from FreeBSD. The file you are looking for is called radix.c. It's used for managing routing data in kernel.

如果您正在寻找 ANSI C 实现,您可以从 FreeBSD “窃取”它。您要查找的文件名为radix.c。它用于管理内核中的路由数据。

回答by eloj

I realize the question was about ready implementations, but for reference...

我意识到问题是关于现成的实现,但仅供参考......

Before you jump on Judy you should have read "A Performance Comparison of Judy to Hash Tables". Then googling the title will likely give you a lifetime of discussion and rebutals to read.

在您开始学习 Judy 之前,您应该阅读“ Judy 与哈希表的性能比较”。然后谷歌搜索标题可能会给你一生的讨论和反驳阅读。

The one explicitly cache-conscious trie I know of is the HAT-trie.

我所知道的一个明确的缓存感知树是HAT-trie

The HAT-trie, when implemented correctly, is very cool. However, for prefix search you need a sorting step on the hash buckets, which somewhat clashes with the idea of a prefix structure.

HAT-trie 在正确实施时非常酷。但是,对于前缀搜索,您需要对哈希桶进行排序,这与前缀结构的想法有些冲突。

A somewhat simpler trie is the burst-triewhich essentially gives you an interpolation between a standard tree of some sort (like a BST) and a trie. I like it conceptually and it's much easier to implement.

一个稍微简单的特里是突发特里,它本质上为您提供某种标准树(如 BST)和特里之间的插值。我在概念上喜欢它,而且它更容易实现。

回答by SPWorley

I've had good luck with libTrie. It may not be specifically cache optimized but the performance has always been decent for my applications.

我在libTrie 上很幸运。它可能没有经过专门的缓存优化,但对于我的应用程序来说,性能一直不错。

回答by tgoodhart

GCC ships with a handful of data structures as part of its "Policy-based data structures". This includes a few trie implementations.

GCC 附带了一些数据结构,作为其“基于策略的数据结构”的一部分。这包括一些尝试实现。

http://gcc.gnu.org/onlinedocs/libstdc++/ext/pb_ds/trie_based_containers.html

http://gcc.gnu.org/onlinedocs/libstdc++/ext/pb_ds/trie_based_containers.html

回答by nik

References,

参考,

  • A Double-Array Trie implementationarticle (includes a Cimplementation)
  • TRASH- A dynamic LC-trie and hash data structure -- (a 2006 PDF reference describing a dynamic LC-trie used in the Linux kernel to implement address lookup in the IP routing table
  • 一篇Double-Array Trie 实现文章(包含一个C实现)
  • TRASH- 动态 LC-trie 和哈希数据结构 --(2006 年的 PDF 参考,描述了在 Linux 内核中用于实现 IP 路由表中的地址查找的动态 LC-trie

回答by bill

Judy arrays: Very fast and memory efficient ordered sparse dynamic arrays for bits, integers and strings. Judy arrays are faster and more memory efficient than any binary-search-tree (incl. avl & red-black-trees).

Judy 数组:非常快速且内存高效的有序稀疏动态数组,用于位、整数和字符串。Judy 数组比任何二叉搜索树(包括 avl 和红黑树)更快,内存效率更高。

回答by Kokizzu

Cedar, HAT-Trie, and JudyArrayis quite awesome, you can find the benchmark here.

CedarHAT-TrieJudyArray非常棒,你可以在这里找到基准。

benchmark result

基准结果

回答by amadvance

You can also try TommyDS at http://tommyds.sourceforge.net/

您也可以在http://tommyds.sourceforge.net/尝试 TommyDS

See the benchmarks page on the site for a speed comparison with nedtries and judy.

有关与 nedtries 和 judy 的速度比较,请参阅网站上的基准测试页面。

回答by Jasper Bekkers

Cache optimizations are something you'll probably are going to have to do, because you'll have to fit the data into a single cacheline which generally is something like 64 bytes (which will probably work if you start combining data, such as pointers). But it's tricky :-)

缓存优化是您可能需要做的事情,因为您必须将数据放入单个缓存行中,该缓存行通常大约为 64 字节(如果您开始组合数据,例如指针,这可能会起作用) . 但这很棘手:-)

回答by Nate

Burst Trie'sseem to be a bit more space efficient. I'm not sure how much cache-efficiency you can get out of any index since CPU-caches are so tiny. However, this kind of trie is plenty compact enough to keep large data sets in RAM (where a regular Trie would not).

Burst Trie似乎更节省空间。由于 CPU 缓存非常小,我不确定您可以从任何索引中获得多少缓存效率。然而,这种 trie 足够紧凑,足以将大数据集保存在 RAM 中(常规 Trie 不会)。

I wrote a Scala implementation of a burst trie that also incorporates some space-saving techniques that I found in GWT's type-ahead implementation.

我编写了一个burst trie 的Scala 实现,它也结合了我在GWT 的提前输入实现中发现的一些节省空间的技术。

https://github.com/nbauernfeind/scala-burst-trie

https://github.com/nbauernfeind/scala-burst-trie