Note: this page is a bilingual (Chinese/English) translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/26885297/

Date: 2020-09-08 07:50:37  Source: igfitidea

High Performance DB for Fast Read and Fast Write. No Update or Delete

Tags: performance, aerospike, database, nosql

Asked by Reddy

I am looking for a database or storage mechanism where I can write and read data with high performance.

This storage will be used to log important information across multiple systems. Since the data being logged is critical and will be used to show history, read performance should be very fast. Since we never update or delete these records, or do any kind of join, I am looking for the right solution. We will probably archive the data in the long run, but that is fine to deal with.

I tried looking at different sources to understand the different NoSQL databases, but an expert's opinion is always better :)

Must Have:
1. Fast Read without fail
2. Fast Write without fail
3. Random access Performance
4. Replication-style failover: if one node goes down, another should immediately be up and working
5. Concurrent write/read data

Good to Have:
1. Content search, e.g. analysing the data for auditing, with or without indexes

Not Required:
1. Transactions are not required at all
2. Update never happens
3. Delete never happens
4. Joins are not required

Reference: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

Answered by kporter

Disclosure: Kevin Porter has been a Senior Software Engineer at Aerospike, Inc. since May 2013. (ref)

Be sure to consider Aerospike; Aerospike dominates in the adtech space, where high-throughput reads and writes are required. Aerospike is frequently touted as having "the speed of Redis with the scalability of Cassandra." For searching/querying, see Aerospike's secondary index documentation.

For more information see the discussion/articles below:

  1. Aerospike vs Cassandra
  2. Aerospike vs Redis and Mongo
  3. Aerospike Benchmarks

Lastly, verify the performance for yourself with the "One million TPS on EC2" instructions.

Answered by Carlo Bertuccini

Let me be the Cassandra sponsor.

Disclaimer: I am not saying Cassandra is better than the others, because I don't know MongoDB, Redis, or the rest deeply enough, and I don't want to get into that kind of argument.

The reason I suggest Cassandra is that your needs match perfectly what Cassandra offers, and your "not required" list is a set of features that are either not supported in Cassandra (joins, for instance) or considered an anti-pattern (deletes and, in some situations, updates).

From your "Must Have" list, point by point:

  1. Fast Read without fail: Supported. You can choose the consistency level of each read operation, deciding how important it is to retrieve the freshest information versus how important speed is.

  2. Fast Write without fail: Same as point 1.

  3. Random access Performance: In the Cassandra world you have to consider many parameters to get good random-access performance, but the most important one that comes to mind is the data model. If you create a data model that scales horizontally (have a look here) and you avoid hotspots, you get what you need. If you model your DB well, you should get O(1) for each operation, since the data is structured around the queries.

  4. Replication: Here Cassandra is even better than you might think. If one node goes down, nothing changes for the cluster and everything(*) keeps working perfectly. Cassandra has no single point of failure. I can tell you that with an older Cassandra version I've had an uptime of more than 3 years.

  5. Concurrent write/read data: Cassandra uses the LWW policy (last-write-wins) to handle concurrent writes on the same key. The system supports multiple concurrent reads and writes and, with newer protocols, also async operations.
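
As a rough illustration of point 3, here is a minimal Python sketch of a partition key for log data. It assumes (this is not from the answer) that each system's events are grouped into one partition per day. The names `partition_key`, `node_for`, and `NODES` are hypothetical, and the MD5-based placement is only a stand-in for Cassandra's own partitioner.

```python
import hashlib
from collections import Counter
from datetime import datetime, timezone

NODES = 4  # hypothetical cluster size, for illustration only


def partition_key(system_id: str, ts: datetime) -> str:
    """Hypothetical partition key: one partition per system per day."""
    return f"{system_id}:{ts.strftime('%Y-%m-%d')}"


def node_for(key: str, nodes: int = NODES) -> int:
    """Map a partition key to a node by hashing (stand-in for a real partitioner)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % nodes


# Count how the partitions of 50 systems over 30 days land on the nodes.
load = Counter(
    node_for(partition_key(f"system-{s}", datetime(2020, 9, d, tzinfo=timezone.utc)))
    for s in range(50)
    for d in range(1, 31)
)
print(sorted(load.items()))
```

Because the key combines the system and the day, writes from many systems spread over the nodes instead of piling onto one, and reading one system's history for a given day is a single-partition lookup.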

Cassandra offers lots of other interesting features: linear horizontal scaling is the one I appreciate most, but there is also the fact that you can know the instant at which every piece of data was last written (the LWW timestamp), the counters feature, and so on.
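
To make the last-write-wins idea concrete, here is a toy Python sketch. It is not Cassandra's implementation: `Cell` and `LwwStore` are hypothetical names, timestamps are plain integers supplied by the caller, and ties between equal timestamps are simply ignored.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Cell:
    value: str
    write_ts: int  # write timestamp, e.g. microseconds since the epoch


class LwwStore:
    """Toy last-write-wins store: per key, the newest timestamp wins."""

    def __init__(self) -> None:
        self._cells: Dict[str, Cell] = {}

    def write(self, key: str, value: str, write_ts: int) -> None:
        current = self._cells.get(key)
        # Keep the incoming write only if it is newer than what is stored.
        if current is None or write_ts > current.write_ts:
            self._cells[key] = Cell(value, write_ts)

    def read(self, key: str) -> Optional[Cell]:
        return self._cells.get(key)


store = LwwStore()
store.write("event-1", "from client B", write_ts=200)
store.write("event-1", "from client A", write_ts=100)  # arrives later, but is older
print(store.read("event-1"))
```

The stored timestamp is what lets you ask when a value was last written; in an append-only workload like the one in the question, each key is normally written once, so conflicts should be rare.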

(*) - provided you don't use Consistency Level ALL which, IMHO, should NEVER be used in such a system.
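
A small Python sketch of why the footnote matters, assuming a single data center and the usual quorum rule of floor(RF / 2) + 1 replicas. The helpers `required_acks` and `can_serve` are hypothetical names for illustration, not part of any Cassandra driver.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Replicas that must respond for a request at the given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def can_serve(level: str, replication_factor: int, live_replicas: int) -> bool:
    """True if enough replicas are alive to satisfy the consistency level."""
    return live_replicas >= required_acks(level, replication_factor)


# Replication factor 3 with one replica down leaves 2 live replicas.
for level in ("ONE", "QUORUM", "ALL"):
    print(level, can_serve(level, replication_factor=3, live_replicas=2))
```

With one of three replicas down, requests at ONE and QUORUM can still be served while requests at ALL cannot, which is why ALL undermines the "one node goes down and nothing changes" property.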

Answered by Peter Corless

Here are a few more links on how you can span in-memory and disk storage (DRAM, SSD, and disk) with Aerospike:

http://www.aerospike.com/hybrid-memory/

http://www.aerospike.com/docs/architecture/storage.html

I think everyone is right in terms of matching the specific DB to your specific use case. For instance, Aerospike is optimal for key-value data. Other options might be better.

By way of analogy, I'll always remember how, decades ago, a sister of mine once borrowed my computer and wrote her term paper in Microsoft Excel. Line after line was a different row of a spreadsheet. It looked ugly as heck, but, uh, okay. She got the task done. She cursed and swore at how difficult it was to edit the thing. No kidding!

Choosing the right NoSQL database for the task will make your job a breeze; deciding on the wrong basic tool for the task at hand could have you cursing a blue streak.

Of course, every vendor's going to defend their product. I think it's best the community answer the question. Here's another Stack Overflow thread answering a similar question:

Has anyone worked with Aerospike? How does it compare to MongoDB?

btw: Do you have any more specific insights for us on what type of problem you are trying to solve?
