MongoDB poor write performance on large collections with 50,000,000+ documents

Note: this content is provided under the CC BY-SA 4.0 license. If you use or share it, you must attribute it to the original authors (not me): Stack Overflow.

Original URL: http://stackoverflow.com/questions/24868171/
Asked by Guido Krämer
I have a MongoDB instance that stores product data for 204,639,403 items. That data has already been split up, by the item's country, into four logical databases running on the same physical machine in the same MongoDB process.
Here is a list with the number of documents per logical database:
- CoUk: 56,719,977
- De: 61,216,165
- Fr: 52,280,460
- It: 34,422,801
My problem is that database write performance is getting worse. In particular, writes to the largest of the four databases (De) have become really bad: according to iotop, the mongod process uses 99% of the IO time with less than 3 MB of writes and 1.5 MB of reads per second. This leads to long database locks; 100%+ lock has become normal according to mongostat, even when all processes writing to and reading from the other country databases have been stopped. The current slave reaches a load of up to 6, while the replica set master has a load of 2-3 at the same time, so this leads to replication lag, too.
Each database has the same data and index structure; I am using the largest database (De) for the further examples only.
This is a random item taken from the database, just as an example. The structure is optimized to gather all important data with a single read:
{
    "_id" : ObjectId("533b675dba0e381ecf4daa86"),
    "ProductId" : "XGW1-E002F-DW",
    "Title" : "Sample item",
    "OfferNew" : {
        "Count" : 7,
        "LowestPrice" : 2631,
        "OfferCondition" : "NEW"
    },
    "Country" : "de",
    "ImageUrl" : "http://….jpg",
    "OfferHistoryNew" : [
        …
        {
            "Date" : ISODate("2014-06-01T23:22:10.940+02:00"),
            "Value" : {
                "Count" : 10,
                "LowestPrice" : 2171,
                "OfferCondition" : "NEW"
            }
        }
    ],
    "Processed" : ISODate("2014-06-09T23:22:10.940+02:00"),
    "Eans" : [
        "9781241461959"
    ],
    "OfferUsed" : {
        "Count" : 1,
        "LowestPrice" : 5660,
        "OfferCondition" : "USED"
    },
    "Categories" : [
        NumberLong(186606),
        NumberLong(541686),
        NumberLong(288100),
        NumberLong(143),
        NumberLong(15777241)
    ]
}
Typical queries range from simple ones, such as a lookup by ProductId or an EAN, to refinements by category sorted by A rank, or refinements by category with an A-rank range (1 up to 10,000, for example) sorted by B rank… .
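For illustration, those query shapes might look roughly like the following in pymongo. This is a minimal sketch: the connection string is a placeholder, and the database/collection names ("De", "Item") are taken from the stats shown further below:

from pymongo import MongoClient, DESCENDING

client = MongoClient("mongodb://localhost:27017/")  # placeholder
items = client["De"]["Item"]

# Simple point lookups by ProductId or EAN (both indexed).
doc = items.find_one({"ProductId": "XGW1-E002F-DW"})
doc = items.find_one({"Eans": "9781241461959"})

# Refinement by category, sorted by A rank
# (matches the Categories_1_RankA_-1 index).
top_ranked = items.find({"Categories": 186606}).sort("RankA", DESCENDING).limit(50)

# Refinement by category plus an A-rank range, sorted by B rank. This matches
# the shape of Categories_1_RankA_1_RankB_-1, but note that with a range on
# RankA, MongoDB cannot also use the index to deliver the RankB sort order.
ranged = (
    items.find({"Categories": 186606, "RankA": {"$gte": 1, "$lte": 10000}})
    .sort("RankB", DESCENDING)
    .limit(50)
)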
These are the stats from the largest DB:
{
    "ns" : "De.Item",
    "count" : 61216165,
    "size" : 43915150656,
    "avgObjSize" : 717,
    "storageSize" : 45795192544,
    "numExtents" : 42,
    "nindexes" : 6,
    "lastExtentSize" : 2146426864,
    "paddingFactor" : 1,
    "systemFlags" : 0,
    "userFlags" : 1,
    "totalIndexSize" : 41356824320,
    "indexSizes" : {
        "_id_" : 2544027808,
        "RankA_1" : 1718096464,
        "Categories_1_RankA_1_RankB_-1" : 16383534832,
        "Eans_1" : 2846073776,
        "Categories_1_RankA_-1" : 15115290064,
        "ProductId_1" : 2749801376
    },
    "ok" : 1
}
It is worth mentioning that the index size is nearly half of the storage size.
Each country DB has to handle 3-5 million updates/inserts per day; my target is to perform the write operations in less than five hours during the night.
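As a rough sketch of how such a nightly batch could be submitted, assuming a hypothetical nightly_updates() source for the day's changes: unordered bulk writes let the server batch the work and continue past individual failures:

from pymongo import MongoClient, UpdateOne

client = MongoClient("mongodb://localhost:27017/")  # placeholder
items = client["De"]["Item"]

def flush(ops):
    # ordered=False lets MongoDB process the batch without stopping
    # at the first failed operation.
    if ops:
        items.bulk_write(ops, ordered=False)

batch = []
for change in nightly_updates():  # hypothetical source of the day's updates
    batch.append(UpdateOne(
        {"ProductId": change["ProductId"]},
        {
            "$set": {"OfferNew": change["OfferNew"], "Processed": change["Processed"]},
            "$push": {"OfferHistoryNew": change["HistoryEntry"]},
        },
        upsert=True,
    ))
    if len(batch) >= 1000:  # flush in chunks to bound memory use
        flush(batch)
        batch = []
flush(batch)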
Currently it is a replica set with two servers, each with 32 GB of RAM and a RAID1 of 2 TB HDDs. Simple optimizations like the deadline I/O scheduler and noatime have already been made.
I have worked out some optimization strategies:
- Reducing the number of indexes:
    - The default _id could use the ProductId instead of the default MongoDB ObjectId, which would save 6-7% of the total index size per DB.
    - Trying to remove the Categories_1_RankA_-1 index; maybe the Categories_1_RankA_1_RankB_-1 index could handle those queries, too. Does sorting still perform well when only a prefix of the index is used (see the sketch after this list)? Another approach would be storing an index matching Categories_1_RankA_1_RankB_-1 in a separate collection that refers to the main collection.
- Reducing the amount of raw data by using smaller keys: instead of 'Categories', 'Eans', 'OfferHistoryNew'… I could use 'a', 'b', 'c'… This should be easy since I use http://mongojack.org/, but I don't know how worthwhile it will be.
- Replacing the RAID1 with a RAID0; this could easily be tested by taking the slave down, reinstalling it, and re-adding it to the replica set… .
- Testing stronger hardware: SSDs and more memory, which should handle reads and writes faster.
- Using MongoDB's sharding capabilities:
    - I read that each shard has to hold the whole database index?
    - I have concerns that the query structure might not fit a sharded environment well. Using the product ID as the shard key does not seem to fit all query types, and sharding by category is complicated, too: a single item can be listed in multiple main and sub categories… . My concerns could be wrong; I have never used sharding in a production environment.
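A minimal pymongo sketch of how two of these ideas could be validated before committing to them. The connection string is a placeholder, "ItemSlim" is a hypothetical prototype collection, and the explain() output format varies between MongoDB versions, so treat this as illustrative only:

from pymongo import MongoClient, DESCENDING

client = MongoClient("mongodb://localhost:27017/")  # placeholder
db = client["De"]

# 1. Before dropping Categories_1_RankA_-1, check whether the compound
#    Categories_1_RankA_1_RankB_-1 index can serve the single-rank sort
#    as an index prefix. If the plan reports an in-memory sort stage,
#    the extra index is still earning its keep.
plan = (
    db["Item"]
    .find({"Categories": 186606})
    .sort("RankA", DESCENDING)
    .explain()
)
print(plan)  # inspect the chosen index and whether a sort stage appears

# 2. Prototype the slimmed-down schema: using ProductId as _id removes one
#    index entirely, and single-letter keys shrink every document, since
#    MongoDB stores field names inside each document.
slim = db["ItemSlim"]  # hypothetical prototype collection
slim.insert_one({
    "_id": "XGW1-E002F-DW",  # ProductId doubles as the primary key
    "t": "Sample item",      # Title
    "c": [186606, 541686],   # Categories
    "e": ["9781241461959"],  # Eans
})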
But there should be other optimization strategies, too, that did not come to my mind and that I would like to hear about!
Which optimization strategy sounds most promising, or is a mixture of several optimizations needed here?
Accepted answer by qSlug
Most likely you are running into issues due to record growth; see http://docs.mongodb.org/manual/core/write-performance/#document-growth.
Mongo prefers records of fixed (or at least bounded) size. Increasing the record size beyond the pre-allocated storage will cause the document to be moved to another location on disk, multiplying your I/O with each write. Consider pre-allocating "enough" space for your average document on insert, if your document sizes are relatively homogeneous. Otherwise consider splitting rapidly growing nested arrays into a separate collection, thereby replacing updates with inserts. Also check your fragmentation and consider compacting your databases from time to time, so that you have a higher density of documents per block, which will cut down on hard page faults.
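For instance, if OfferHistoryNew is the fast-growing part of the documents above, a sketch of that restructuring could look like this (the OfferHistory side collection is hypothetical): each history entry becomes an insert into its own collection, so the main document keeps a stable size and is never moved on disk:

from datetime import datetime, timezone
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017/")  # placeholder
db = client["De"]
history = db["OfferHistory"]  # hypothetical side collection

# One index to read an item's history back in date order.
history.create_index([("ProductId", ASCENDING), ("Date", ASCENDING)])

# Nightly job: append history as an insert (fixed-size documents, no moves)
# and update only the bounded fields on the main document.
entry = {
    "ProductId": "XGW1-E002F-DW",
    "Date": datetime.now(timezone.utc),
    "Value": {"Count": 10, "LowestPrice": 2171, "OfferCondition": "NEW"},
}
history.insert_one(entry)
db["Item"].update_one(
    {"ProductId": entry["ProductId"]},
    {"$set": {"OfferNew": entry["Value"], "Processed": entry["Date"]}},
)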
Answered by jrullmann
Would you consider using a database with better throughput that supports documents? I've heard success stories with TokuMX. And FoundationDB (where I'm an engineer) has very good performance with highly concurrent write loads and large documents. Happy to answer further questions about FoundationDB.