Original question: http://stackoverflow.com/questions/6520439/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to configure MongoDB Java driver MongoOptions for production use?
Asked by Dan Polites
I've been searching the web looking for best practices for configuring MongoOptions for the MongoDB Java driver and I haven't come up with much other than the API. This search started after I ran into the "com.mongodb.DBPortPool$SemaphoresOut: Out of semaphores to get db connection" error and by increasing the connections/multiplier I was able to solve that problem. I'm looking for links to or your best practices in configuring these options for production.
The options for the 2.4 driver include: http://api.mongodb.org/java/2.4/com/mongodb/MongoOptions.html
- autoConnectRetry
- connectionsPerHost
- connectTimeout
- maxWaitTime
- socketTimeout
- threadsAllowedToBlockForConnectionMultiplier
The newer drivers have more options and I would be interested in hearing about those as well.
Answered by Remon van Vliet
Updated to 2.9 :
autoConnectRetry simply means the driver will automatically attempt to reconnect to the server(s) after unexpected disconnects. In production environments you usually want this set to true.
connectionsPerHost is the number of physical connections a single Mongo instance (it's a singleton, so you usually have one per application) can establish to a mongod/mongos process. At the time of writing the Java driver will eventually establish this number of connections even if the actual query throughput is low (in other words, you will see the "conn" statistic in mongostat rise until it hits this number per app server).
There is no need to set this higher than 100 in most cases, but this setting is one of those "test it and see" things. Do note that you will have to set this low enough that the total number of connections to your server does not exceed
db.serverStatus().connections.available
In production we currently have this at 40.
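As a rough illustration of the sizing rule above, the sketch below multiplies the number of application servers by connectionsPerHost and compares the result with the server's available connections. The class name, method names, and the figures of 8 app servers and 800 available connections are hypothetical; it also assumes one Mongo instance per app server, as the answer suggests is typical.

```java
class ConnectionBudget {
    // Connections the application tier opens once every pool has filled up,
    // assuming one Mongo instance (one pool) per application server.
    static int totalConnections(int appServers, int connectionsPerHost) {
        return appServers * connectionsPerHost;
    }

    // Largest connectionsPerHost that keeps the whole tier within the limit
    // reported by db.serverStatus().connections.available.
    static int maxConnectionsPerHost(int availableConnections, int appServers) {
        return availableConnections / appServers;
    }

    public static void main(String[] args) {
        int appServers = 8;             // hypothetical deployment size
        int availableConnections = 800; // hypothetical serverStatus value
        int connectionsPerHost = 40;    // the value quoted in the answer

        int total = totalConnections(appServers, connectionsPerHost);
        System.out.println("total connections: " + total);
        System.out.println("fits: " + (total <= availableConnections));
        System.out.println("upper bound per host: "
                + maxConnectionsPerHost(availableConnections, appServers));
    }
}
```

Other clients of the same server (shells, monitoring, other applications) also consume connections, so in practice you would probably leave some headroom below the computed upper bound.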
connectTimeout. As the name suggests, the number of milliseconds the driver will wait before a connection attempt is aborted. Set the timeout to something long (15-30 seconds) unless there's a realistic, expected chance this will get in the way of otherwise successful connection attempts. Normally, if a connection attempt takes longer than a couple of seconds, your network infrastructure isn't capable of high throughput.
maxWaitTime. Number of ms a thread will wait for a connection to become available on the connection pool, and raises an exception if this does not happen in time. Keep default.
socketTimeout. Standard socket timeout value. Set to 60 seconds (60000).
threadsAllowedToBlockForConnectionMultiplier. Multiplier for connectionsPerHost that denotes the number of threads that are allowed to wait for connections to become available if the pool is currently exhausted. This is the setting that will cause the "com.mongodb.DBPortPool$SemaphoresOut: Out of semaphores to get db connection" exception. The driver throws this exception once the number of waiting threads exceeds connectionsPerHost multiplied by threadsAllowedToBlockForConnectionMultiplier. For example, if connectionsPerHost is 10 and this value is 5, up to 50 threads can block before the aforementioned exception is thrown.
If you expect big peaks in throughput that could temporarily cause large queues, increase this value. We have it at 1500 at the moment for exactly that reason. If your query load consistently outpaces the server, you should just improve your hardware/scaling situation accordingly.
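The waiting-thread limit can be pictured with a small semaphore model. This is only a sketch of the arithmetic described above (connectionsPerHost multiplied by the multiplier), not the driver's actual code; the class and method names are made up for illustration.

```java
import java.util.concurrent.Semaphore;

class WaitQueueModel {
    // One permit per thread that is allowed to wait for a pooled connection.
    private final Semaphore waiters;

    WaitQueueModel(int connectionsPerHost, int multiplier) {
        this.waiters = new Semaphore(connectionsPerHost * multiplier);
    }

    // Returns false at the point where the driver would raise the
    // "Out of semaphores to get db connection" exception.
    boolean tryEnterQueue() {
        return waiters.tryAcquire();
    }

    // Called when a waiting thread has obtained its connection or given up.
    void leaveQueue() {
        waiters.release();
    }

    public static void main(String[] args) {
        WaitQueueModel model = new WaitQueueModel(10, 5);
        int admitted = 0;
        while (model.tryEnterQueue()) {
            admitted++;
        }
        System.out.println(admitted); // prints 50
    }
}
```

With connectionsPerHost at 40 and a multiplier of 1500, the same model would admit 60000 waiting threads, which shows why a large multiplier effectively hides the exception during traffic peaks.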
readPreference. (UPDATED, 2.8+) Used to determine the default read preference; replaces "slaveOk". Set up a ReadPreference through one of the class factory methods. A full description of the most common settings can be found at the end of this post.
w. (UPDATED, 2.6+) This value determines the "safety" of the write. When this value is -1 the write will not report any errors, regardless of network or database errors. WriteConcern.NONE is the appropriate predefined WriteConcern for this. If w is 0 then network errors will make the write fail but mongo errors will not. This is typically referred to as "fire and forget" writes and should be used when performance is more important than consistency and durability. Use WriteConcern.NORMAL for this mode.
If you set w to 1 or higher the write is considered safe. Safe writes perform the write and follow it up with a request to the server to make sure the write succeeded, or to retrieve an error value if it did not (in other words, the driver sends a getLastError() command after you write). Note that the connection is reserved until this getLastError() command has completed. As a result of that and the additional command, the throughput will be significantly lower than for writes with w <= 0. With a w value of exactly 1, MongoDB guarantees the write succeeded (or verifiably failed) on the instance you sent the write to.
In the case of replica sets you can use higher values for w, which tell MongoDB to send the write to at least "w" members of the replica set before returning (or more accurately, to wait for the replication of your write to "w" members). You can also set w to the string "majority", which tells MongoDB to perform the write on the majority of replica set members (WriteConcern.MAJORITY). Typically you should set this to 1 unless you need raw performance (-1 or 0) or replicated writes (>1). Values higher than 1 have a considerable impact on write throughput.
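To make the w levels concrete, here is a minimal sketch against the 2.x driver API. It assumes a mongod reachable on localhost:27017 and uses hypothetical database and collection names ("appdb", "events"); WriteConcern.SAFE is used here as the predefined constant for w = 1, which the text above does not name explicitly.

```java
import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.Mongo;
import com.mongodb.WriteConcern;

public class WriteConcernSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical host and port; adjust for your deployment.
        Mongo mongo = new Mongo("localhost", 27017);

        // Default for every write issued through this Mongo instance: w = 1.
        mongo.setWriteConcern(WriteConcern.SAFE);

        DBCollection events = mongo.getDB("appdb").getCollection("events");

        // Fire and forget (w = 0): only network errors surface.
        events.insert(new BasicDBObject("type", "click"), WriteConcern.NORMAL);

        // Wait for a majority of replica set members; slower but replicated.
        // This only makes sense when connected to a replica set.
        events.insert(new BasicDBObject("type", "order"), WriteConcern.MAJORITY);

        mongo.close();
    }
}
```

Setting a safe default and relaxing it per operation keeps the cheap writes explicit, which is one way to follow the "set this to 1 unless you need raw performance" advice.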
fsync. Durability option that forces mongo to flush to disk after each write when enabled. I've never had any durability issues related to a write backlog so we have this on false (the default) in production.
j. (NEW, 2.7+) Boolean that, when set to true, forces MongoDB to wait for a successful journaling group commit before returning. If you have journaling enabled you can enable this for additional durability. Refer to http://www.mongodb.org/display/DOCS/Journaling to see what journaling gets you (and thus why you might want to enable this flag).
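Pulling the recommendations above together, a configuration might look roughly like the following. This is a sketch that assumes the public fields of the 2.9-era MongoOptions class; the class name ProductionMongoConfig and the create method are hypothetical, and the values are simply the ones quoted in this answer rather than universal defaults.

```java
import com.mongodb.Mongo;
import com.mongodb.MongoOptions;
import com.mongodb.ReadPreference;
import com.mongodb.ServerAddress;

import java.net.UnknownHostException;

public class ProductionMongoConfig {
    // Builds the single Mongo instance an application would normally share.
    public static Mongo create(String host, int port) throws UnknownHostException {
        MongoOptions options = new MongoOptions();

        options.autoConnectRetry = true;   // reconnect after unexpected disconnects
        options.connectionsPerHost = 40;   // the value the answer uses in production
        options.connectTimeout = 15000;    // ms, low end of the suggested 15-30 seconds
        options.socketTimeout = 60000;     // ms, 60 seconds
        // maxWaitTime is deliberately left at its default.

        // Allows up to 40 * 1500 threads to wait for a pooled connection.
        options.threadsAllowedToBlockForConnectionMultiplier = 1500;

        options.readPreference = ReadPreference.primary(); // consistent reads
        options.w = 1;          // safe writes acknowledged by the target instance
        options.fsync = false;  // the default
        options.j = false;      // set to true for journaled durability

        return new Mongo(new ServerAddress(host, port), options);
    }
}
```

Treat these numbers as a starting point to test under your own load, in line with the "test it and see" advice above.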
ReadPreference
The ReadPreference class allows you to configure which mongod instances queries are routed to if you are working with replica sets. The following options are available:
ReadPreference.primary(): All reads go to the repset primary member only. Use this if you require all queries to return consistent (the most recently written) data. This is the default.
ReadPreference.primaryPreferred(): All reads go to the repset primary member if possible but may query secondary members if the primary node is not available. As such if the primary becomes unavailable reads become eventually consistent, but only if the primary is unavailable.
ReadPreference.secondary(): All reads go to secondary repset members and the primary member is used for writes only. Use this only if you can live with eventually consistent reads. Additional repset members can be used to scale up read performance although there are limits to the amount of (voting) members a repset can have.
ReadPreference.secondaryPreferred(): All reads go to secondary repset members if any of them are available. The primary member is used exclusively for writes unless all secondary members become unavailable. Other than the fallback to the primary member for reads this is the same as ReadPreference.secondary().
ReadPreference.nearest(): Reads go to the nearest repset member available to the database client. Use only if eventually consistent reads are acceptable. The nearest member is the member with the lowest latency between the client and the various repset members. Since busy members will eventually have higher latencies, this should also automatically balance read load, although in my experience secondary(Preferred) seems to do so better if member latencies are relatively consistent.
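As a usage sketch for the factory methods listed above, the following sets a default read preference on the Mongo instance and overrides it for a single query. It assumes the 2.9+ driver, a replica set reachable through localhost:27017, and hypothetical database and collection names ("appdb", "reports").

```java
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.Mongo;
import com.mongodb.ReadPreference;

public class ReadPreferenceSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical host and port; adjust for your deployment.
        Mongo mongo = new Mongo("localhost", 27017);

        // Default for all reads: primary, falling back to secondaries.
        mongo.setReadPreference(ReadPreference.primaryPreferred());

        DBCollection reports = mongo.getDB("appdb").getCollection("reports");

        // Per-query override for a read that tolerates eventual consistency.
        DBCursor cursor = reports.find()
                .setReadPreference(ReadPreference.secondaryPreferred());
        try {
            while (cursor.hasNext()) {
                System.out.println(cursor.next());
            }
        } finally {
            cursor.close();
        }

        mongo.close();
    }
}
```

Keeping the default strict and relaxing it only for selected queries limits eventually consistent reads to the places where they are acceptable.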
Note: All of the above have tag-enabled versions of the same method, which return TaggableReadPreference instances instead. A full description of replica set tags can be found here: Replica Set Tags