Which NoSQL database should I use for logging?

Disclaimer: the question and answers below are taken from a popular StackOverflow thread and are provided under the CC BY-SA 4.0 license. You are free to use and share them, but you must do so under the same license and attribute them to the original authors (not me). Original: http://stackoverflow.com/questions/10525725/

Which NoSQL database should I use for logging?

.net mongodb logging

Asked by ikrain

Do you have any experience logging to NoSQL databases for scalable apps? I have done some research on NoSQL databases for logging and found that MongoDB seems to be a good choice. Also, I found log4mongo-net, which seems to be a very straightforward option.

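For reference, log4mongo-net is essentially a log4net appender that writes each log event as a document into a MongoDB collection. A minimal sketch of that idea using the official MongoDB C# driver directly (the connection string, database, collection and field names below are placeholders, not log4mongo-net's actual defaults):

    using System;
    using MongoDB.Bson;
    using MongoDB.Driver;

    class MongoLoggingSketch
    {
        static void Main()
        {
            // Placeholder connection string, database and collection names.
            var client = new MongoClient("mongodb://localhost:27017");
            var database = client.GetDatabase("logging");
            var logs = database.GetCollection<BsonDocument>("app_log");

            // A log entry is just a document, so fields can vary per message.
            var entry = new BsonDocument
            {
                { "timestamp", DateTime.UtcNow },
                { "level", "INFO" },
                { "logger", "MyApp.Web" },
                { "message", "User signed in" }
            };

            logs.InsertOne(entry);
        }
    }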

Would you recommend this kind of approach? Are there any other suggestions?

Answered by yamen

I've decided to revise this accepted answer as the state of the art has moved significantly in the last 18 months, and much better alternatives exist.

New Answer

MongoDB is a sub-par choice for a scalable logging solution. There are the usual reasons for this (write performance under load for example). I'd like to put forward one more, which is that it only solves a single use case in a logging solution.

A strong logging solution needs to cover at least the following stages:

  • Collection
  • Transport
  • Processing
  • Storage
  • Search
  • Visualisation

MongoDB as a choice only solves the Storage use case (albeit somewhat poorly). Once the complete chain is analysed, there are more appropriate solutions.

@KazukiOhta mentions a few options. My preferred end-to-end solution these days involves:

  • Logstash-Forwarder for collection
  • Logstash for transport and processing
  • ElasticSearch for storage and search
  • Kibana3 for visualisation

Using ElasticSearch as the underlying log data store means the current best-of-breed NoSQL solution is handling the logging and searching use case. The fact that Logstash-Forwarder / Logstash / ElasticSearch / Kibana3 are all under the ElasticSearch umbrella makes for an even more compelling argument.

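As a rough sketch of the storage/search end of that chain, indexing a log event into ElasticSearch is a single HTTP call; the node address, index name and document fields below are assumptions, the URL path differs slightly between ElasticSearch versions, and in the full chain Logstash would do this for you:

    using System;
    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;

    class ElasticIndexSketch
    {
        static async Task Main()
        {
            using var http = new HttpClient();

            // One log event as JSON, similar to what Logstash would ship.
            var json = "{ \"@timestamp\": \"" + DateTime.UtcNow.ToString("o") +
                       "\", \"level\": \"ERROR\", \"message\": \"Payment failed\" }";

            // Date-based index name (placeholder); newer ElasticSearch versions
            // use /_doc, older ones used a custom type name in the path.
            var url = "http://localhost:9200/logs-" + DateTime.UtcNow.ToString("yyyy.MM.dd") + "/_doc";

            var response = await http.PostAsync(url,
                new StringContent(json, Encoding.UTF8, "application/json"));

            Console.WriteLine((int)response.StatusCode);
        }
    }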

Since Logstash can also act as a Graphite proxy, a very similar chain can be built for the associated problem of collecting and analysing metrics (not just logs).

Old Answer

MongoDB Capped Collections are extremely popular and suitable for logging, with the added bonus of being 'schema-less', which is usually a semantic fit for logging. Often we only know what we want to log well into a project, or after certain issues have been found in production. Relational databases or strict schemas tend to be difficult to change in these cases, and attempts to make them 'flexible' tend just to make them 'slow' and difficult to use or understand.

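A capped collection has a fixed size and silently recycles the oldest entries, which is usually the behaviour you want for a rolling log. A minimal sketch of creating one from .NET with the C# driver (names and sizes below are placeholders):

    using MongoDB.Driver;

    class CappedCollectionSketch
    {
        static void Main()
        {
            var client = new MongoClient("mongodb://localhost:27017");
            var database = client.GetDatabase("logging");

            // Capped collections must be created explicitly with a maximum size;
            // once full, the oldest documents are overwritten automatically.
            database.CreateCollection("app_log", new CreateCollectionOptions
            {
                Capped = true,
                MaxSize = 512 * 1024 * 1024,   // roughly 512 MB on disk
                MaxDocuments = 1000000         // optional cap on document count
            });
        }
    }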

But if you'd like to manage your logs in the dark and have lasers going and make it look like you're from space, there's always Graylog2, which uses MongoDB as part of its overall infrastructure but provides a whole lot more on top, such as a common, extensible format, a dedicated log collection server, a distributed architecture and a funky UI.

Answered by Kazuki Ohta

I've seen a lot of companies using MongoDB to store application logs. Its schema-freeness is really flexible for application logs, whose schema tends to change from time to time. Also, its Capped Collection feature is really useful because it automatically purges old data to keep the data set fitting into memory.

People aggregate the logs with normal grouping or MapReduce, but it's not that fast. In particular, MongoDB's MapReduce only works within a single thread, and its JavaScript execution overhead is huge. The new aggregation framework could solve this problem.

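For example, counting log entries per level maps naturally onto the aggregation framework instead of MapReduce; a hedged sketch with the C# driver (collection and field names are assumptions):

    using MongoDB.Bson;
    using MongoDB.Driver;

    class LogAggregationSketch
    {
        static void Main()
        {
            var client = new MongoClient("mongodb://localhost:27017");
            var logs = client.GetDatabase("logging").GetCollection<BsonDocument>("app_log");

            // Group log documents by their "level" field and count each group;
            // the whole pipeline runs inside the server, no client-side JavaScript.
            var perLevel = logs.Aggregate()
                .Group(new BsonDocument
                {
                    { "_id", "$level" },
                    { "count", new BsonDocument("$sum", 1) }
                })
                .ToList();

            foreach (var doc in perLevel)
            {
                System.Console.WriteLine(doc);
            }
        }
    }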

When you use MongoDB for logging, the concern is lock contention caused by high write throughput. Although MongoDB's insert is fire-and-forget style by default, calling insert() a lot causes heavy write lock contention. This could affect application performance and prevent readers from aggregating / filtering the stored logs.

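One in-app mitigation (a sketch of the general batching idea, not any particular library) is to buffer log documents and flush them with a single batched insert rather than one insert() per message; a real implementation would also need thread safety and a periodic flush:

    using System;
    using System.Collections.Generic;
    using MongoDB.Bson;
    using MongoDB.Driver;

    class BatchedLogWriter
    {
        private readonly IMongoCollection<BsonDocument> _logs;
        private readonly List<BsonDocument> _buffer = new List<BsonDocument>();

        public BatchedLogWriter(IMongoCollection<BsonDocument> logs)
        {
            _logs = logs;
        }

        public void Log(string level, string message)
        {
            _buffer.Add(new BsonDocument
            {
                { "timestamp", DateTime.UtcNow },
                { "level", level },
                { "message", message }
            });

            // Flush in batches so the write lock is taken far less often.
            if (_buffer.Count >= 100)
            {
                Flush();
            }
        }

        public void Flush()
        {
            if (_buffer.Count == 0) return;
            _logs.InsertMany(_buffer);
            _buffer.Clear();
        }
    }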

One solution might be using a log collector framework such as Fluentd, Logstash, or Flume. These daemons are supposed to be launched on every application node, and take the logs from the app processes.

Fluentd plus MongoDB

They buffer the logs and asynchronously write out the data to other systems like MongoDB / PostgreSQL / etc. The writes are done in batches, so it's a lot more efficient than writing directly from the apps. This link describes how to put logs into Fluentd from a PHP program.

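From the application's point of view, handing a record to a local Fluentd daemon can be as simple as an HTTP POST, assuming Fluentd's in_http input plugin is enabled on its default port 9880 (the tag and payload below are made-up placeholders):

    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;

    class FluentdForwardSketch
    {
        static async Task Main()
        {
            using var http = new HttpClient();

            // One JSON record; Fluentd buffers records like this and batch-writes
            // them to MongoDB (or another store) on our behalf.
            var json = "{ \"level\": \"WARN\", \"message\": \"Cache miss rate is high\" }";

            // "myapp.web" is an arbitrary tag used for routing inside Fluentd.
            await http.PostAsync(
                "http://localhost:9880/myapp.web",
                new StringContent(json, Encoding.UTF8, "application/json"));
        }
    }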

Here are some tutorials about MongoDB + Fluentd.

MongoDB's problem is that it starts slowing down when the data volume exceeds the memory size. At that point, you can switch to other solutions like Apache Hadoop or Cassandra. If you have the distributed logging layer mentioned above, you can instantly switch to another solution as you grow. This tutorial describes how to store logs in HDFS by using Fluentd.

Answered by Chris

You should specify what kind of log messages your app produces. If you are only logging lots and lots of simple log messages, MongoDB is a very good choice as it scales so well. But if you need complex authentication or a lot of hierarchy, I would use a traditional RDBMS.
