Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/15272/

Time: 2020-08-01 08:50:34  Source: igfitidea

Aging Data Structure in C#

Asked by thelsdj

I want a data structure that allows querying how many items were added in the last X minutes. An item may be just a simple identifier or a more complex data structure. Preferably the timestamp would be stored in the item itself rather than outside it (as a hash key or similar), since I wouldn't want problems with multiple items having the same timestamp.

So far it seems that with LINQ I could easily filter items with a timestamp greater than a given time and aggregate a count. However, I'm hesitant to work .NET 3.5-specific features into my production environment yet. Are there any other suggestions for a similar data structure?
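
For reference, the LINQ filter-and-count idea mentioned above might look roughly like the following. This is only a sketch: the `TimestampedItem` and `RecentItemCounter` names are hypothetical and do not come from the question, and the timestamp is assumed to be stored as a `DateTime` on the item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type: the timestamp lives inside the item itself.
public sealed class TimestampedItem
{
    public string Id { get; }
    public DateTime Timestamp { get; }

    public TimestampedItem(string id, DateTime timestamp)
    {
        Id = id;
        Timestamp = timestamp;
    }
}

public static class RecentItemCounter
{
    // Counts the items whose timestamp is later than (now - window).
    // Every call scans the whole collection, so a query is O(n).
    public static int CountSince(IEnumerable<TimestampedItem> items, DateTime now, TimeSpan window)
    {
        DateTime cutoff = now - window;
        return items.Count(item => item.Timestamp > cutoff);
    }
}
```

Note that this only counts; it does not remove old items, so the collection would still grow in a long-running program.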

The other part that I'm interested in is aging old data out. If I'm only going to be asking for counts of items less than 6 hours old, I would like anything older than that to be removed from my data structure, because this may be a long-running program.

Accepted answer by Lasse V. Karlsen

A simple linked list can be used for this.

Basically, you add new items to the end and remove items that are too old from the start. It is a cheap data structure.

Example code:

list.push_end(new_data)
while list.head.age >= age_limit:
    list.pop_head()
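
In C#, the pseudocode above could be sketched with `LinkedList<T>` roughly as follows. The `AgingList` class and its member names are hypothetical, and the sketch assumes that items arrive in non-decreasing timestamp order, so the oldest item is always at the head. For brevity it stores the timestamp next to the item in a `KeyValuePair` instead of inside the item.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper around a linked list: add at the tail, age out from the head.
public sealed class AgingList<T>
{
    private readonly LinkedList<KeyValuePair<DateTime, T>> _items =
        new LinkedList<KeyValuePair<DateTime, T>>();
    private readonly TimeSpan _ageLimit;

    public AgingList(TimeSpan ageLimit)
    {
        _ageLimit = ageLimit;
    }

    public int Count => _items.Count;

    // Assumes timestamps are added in non-decreasing order.
    public void Add(T item, DateTime timestamp)
    {
        _items.AddLast(new KeyValuePair<DateTime, T>(timestamp, item));
        Prune(timestamp);
    }

    // Removes head items whose age has reached the limit.
    public void Prune(DateTime now)
    {
        while (_items.First != null && now - _items.First.Value.Key >= _ageLimit)
        {
            _items.RemoveFirst();
        }
    }

    // Counts items newer than (now - window) by walking back from the tail
    // and stopping at the first item that is too old.
    public int CountSince(DateTime now, TimeSpan window)
    {
        Prune(now);
        DateTime cutoff = now - window;
        int count = 0;
        for (var node = _items.Last; node != null && node.Value.Key > cutoff; node = node.Previous)
        {
            count++;
        }
        return count;
    }
}
```

Adding and pruning are cheap per item, while a count costs time proportional to the number of items inside the queried window.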

If the list is busy enough to warrant chopping off more than one item at a time, then I agree with dmo: use a tree structure or something similar that allows pruning at a higher level.

Answer by dmo

I think that an important consideration will be the frequency of querying vs. adding/removing. If you will do frequent querying (especially if you'll have a large collection), a B-tree may be the way to go:

http://en.wikipedia.org/wiki/B-tree

You could have some thread go through and clean up this tree periodically or make it part of the search (again, depending on the usage). Basically, you'll do a tree search to find the spot "x minutes ago", then count the number of children on the nodes with newer times. If you keep the number of children under the nodes up to date, this sum can be done quickly.
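
A full B-tree with per-node child counts is more than fits in a short example, so the sketch below swaps in a simpler stand-in: a sorted `List<DateTime>` with a binary search. It is not the B-tree described above, but it illustrates the same "search for the cutoff, then count everything newer" idea in O(log n) per query. The `SortedTimestampIndex` class and its members are hypothetical names, and the sketch assumes timestamps are added in non-decreasing order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the tree: a sorted list of timestamps.
public sealed class SortedTimestampIndex
{
    private readonly List<DateTime> _timestamps = new List<DateTime>();

    // Assumes timestamps arrive in non-decreasing order, which keeps the list sorted.
    public void Add(DateTime timestamp)
    {
        if (_timestamps.Count > 0 && timestamp < _timestamps[_timestamps.Count - 1])
        {
            throw new ArgumentException("Timestamps must be added in non-decreasing order.", nameof(timestamp));
        }
        _timestamps.Add(timestamp);
    }

    // Binary search for the index of the first timestamp strictly newer than the cutoff.
    private int FirstIndexAfter(DateTime cutoff)
    {
        int lo = 0;
        int hi = _timestamps.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (_timestamps[mid] <= cutoff)
            {
                lo = mid + 1;
            }
            else
            {
                hi = mid;
            }
        }
        return lo;
    }

    // Everything after the found position is newer than the cutoff.
    public int CountNewerThan(DateTime cutoff)
    {
        return _timestamps.Count - FirstIndexAfter(cutoff);
    }

    // Removes every timestamp at or before the cutoff in a single call.
    public void PruneUpTo(DateTime cutoff)
    {
        _timestamps.RemoveRange(0, FirstIndexAfter(cutoff));
    }
}
```

Pruning from the front of a `List<T>` shifts the remaining elements, so this trade-off favors frequent queries with occasional bulk clean-up, which matches the usage pattern discussed above.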

Answer by ehosca

A cache with sliding expiration will do the job.

Stuff your items in and the cache handles the aging.

http://www.sharedcache.com/cms/
