Java Cassandra Datastax Driver - Connection Pool
Original source: http://stackoverflow.com/questions/20421763/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Cassandra Datastax Driver - Connection Pool
Asked by Anakin001
I'm trying to understand the connection pooling in Datastax Cassandra Driver, so I can better use it in my web service.
I have version 1.0 of the documentation. It says:
The Java driver uses connections asynchronously, so multiple requests can be submitted on the same connection at the same time.
What do they mean by connection? When connecting to a cluster, we have: a Builder, a Cluster and a Session. Which one of them is the connection?
For example, there is this parameter:
maxSimultaneousRequestsPerConnection - number of simultaneous requests on all connections to a host after which more connections are created.
So, these connections are automatically created, in the case of connection pooling (which is what I would expect). But what exactly are the connections? Cluster objects? Sessions?
I'm trying to decide what to keep 'static' in my web service. For the moment, I decided to keep the Builder static, so for every call I create a new Cluster and a new Session. Is this ok? If the Cluster is the Connection, then it should be ok. But is it? Now, the logger says, for every call:
2013:12:06 12:05:50 DEBUG Cluster:742 - Starting new cluster with contact points
2013:12:06 12:05:50 DEBUG ControlConnection:216 - [Control connection] Refreshing node list and token map
2013:12:06 12:05:50 DEBUG ControlConnection:219 - [Control connection] Refreshing schema
2013:12:06 12:05:50 DEBUG ControlConnection:147 - [Control connection] Successfully connected to...
So, it connects to the Cluster every time? It's not what I want, I want to reuse connections.
So, the connection is actually the Session? If this is the case, I should keep the Cluster static, not the Builder.
What method should I call, to be sure I reuse connections, whenever possible?
Accepted answer by C4stor
You are right, the connection is actually in the Session, and the Session is the object you should give to your DAOs to write into Cassandra.
As long as you use the same Session object, you should be reusing connections (you can see the Session as being your connection pool).
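The advice to keep reusing the same Session can be sketched with the initialization-on-demand holder idiom. This is only an illustration under stated assumptions: `StubSession` is a hypothetical stand-in for the driver's Session so that the sketch runs without the driver on the classpath, and `SessionHolder` is an invented name, not part of any driver API. With the real driver, the held object would be the result of something like `Cluster.builder().addContactPoint(host).build().connect()`.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SessionReuseDemo {

    // Hypothetical stand-in for the driver's Session (an assumption for this
    // sketch); it only counts how many times it has been constructed.
    static final class StubSession {
        static final AtomicInteger CREATED = new AtomicInteger();

        StubSession() {
            CREATED.incrementAndGet();
        }

        String execute(String cql) {
            return "executed: " + cql;
        }
    }

    // Holder idiom: the JVM initializes Lazy.SESSION once, on first access,
    // and every later call gets the same instance back.
    static final class SessionHolder {
        private SessionHolder() {}

        private static final class Lazy {
            // With the real driver this would be the Session returned by
            // Cluster.connect(), built once for the whole web service.
            static final StubSession SESSION = new StubSession();
        }

        static StubSession get() {
            return Lazy.SESSION;
        }
    }

    public static void main(String[] args) {
        // Three simulated web service calls share one session.
        for (int i = 0; i < 3; i++) {
            SessionHolder.get().execute("SELECT release_version FROM system.local");
        }
        System.out.println("sessions created: " + StubSession.CREATED.get());
    }
}
```

The point of the design is that each web service call asks the holder for the session instead of building a new Cluster and Session, so the "Starting new cluster" log lines from the question would appear once rather than on every call.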
Edit (2017/4/10): I clarified this answer following @William Price's. Please be aware that this answer is 4 years old, and Cassandra has changed a fair bit in the meantime!
Answer by LynAs
Just an update for the community. You can configure the connection pool in the following way:
private static Cluster cluster;
// cluster must already be built (via Cluster.builder()) before this call
cluster.getConfiguration().getPoolingOptions().setMaxConnectionsPerHost(HostDistance.LOCAL, 100);
Answer by William Price
The accepted answer (at the time of this writing) gives the correct advice:
As long as you use the same Session object, you [will] be reusing connections.
However, some parts were originally oversimplified. I hope the following provides insight into the scope of each object type and their respective purposes.
Builder ≠ Cluster ≠ Session ≠ Connection ≠ Statement
A Cluster.Builder is used to configure and create a Cluster
A Cluster represents the entire Cassandra ring
A ring consists of multiple nodes (hosts), and the ring can support one or more keyspaces. You can query a Cluster object about cluster- (ring-) level properties.
I also think of it as the object that represents the calling application to the ring. You communicated your application's needs (e.g. encryption, compression, etc.) to the builder, but it is this object that first implements/communicates with the actual C* ring. If your application uses more than one authentication credential for different users/purposes, you likely have different Cluster objects even if they connect to the same ring.
A Session itself is not a connection, but it manages them
A session may need to talk to all nodes in the ring, which cannot be done with a single TCP connection except in the special case of rings that contain exactly one (1) node. The Session manages a connection pool, and that pool will generally have at least one connection for each node in the ring. This is why you should re-use Session objects as much as possible. An application does not directly manage or access connections.
A Session is accessed from the Cluster object; it is usually "bound" to a single keyspace at a time, which becomes the default keyspace for the statements executed from that session. A statement can use a fully-qualified table name (e.g. keyspacename.tablename) to access tables in other keyspaces, so it's not required to use multiple sessions to access data across keyspaces. Using multiple sessions to talk to the same ring increases the total number of TCP connections required.
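To make the cost of extra sessions concrete, here is a rough back-of-the-envelope sketch. It assumes every session keeps the same fixed number of connections open to every node and ignores the Cluster's control connection; real pools grow and shrink with load. The class name, method name, and node counts are invented for illustration.

```java
final class ConnectionFootprint {

    // Rough estimate only: sessions x nodes x connections held per node.
    // Assumes a fixed pool size per node, which is a simplification.
    static int estimateTcpConnections(int sessions, int nodes, int connectionsPerNode) {
        if (sessions < 0 || nodes < 0 || connectionsPerNode < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        return Math.multiplyExact(Math.multiplyExact(sessions, nodes), connectionsPerNode);
    }

    public static void main(String[] args) {
        // Hypothetical ring of 6 nodes with 2 connections per node.
        System.out.println("one session:    " + estimateTcpConnections(1, 6, 2));
        System.out.println("three sessions: " + estimateTcpConnections(3, 6, 2));
    }
}
```

Under these assumptions the connection count scales linearly with the number of sessions, which is why one session with fully-qualified table names is usually preferable to one session per keyspace.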
A Statement executes within a Session
Statements can be prepared or not, and each one either mutates data or queries it (and in some cases, both). The fastest, most efficient statements need to communicate with at most one node, and a Session from a topology-aware Cluster should contact only that node (or one of its peers) on a single TCP connection. The least efficient statements must touch all replicas (a majority of nodes), but that will be handled by the coordinator node on the ring itself, so even for these statements the Session will only use a single connection from the application.
Also, versions 2 and 3 of the Cassandra binary protocol used by the driver use multiplexing on the connections. So while a single statement requires at least one TCP connection, that single connection can potentially service up to 128 or 32k+ asynchronous requests simultaneously, depending on the protocol version (respectively).
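The multiplexing limits above translate into a simple ceiling division. The sketch below takes 128 and 32768 as the per-connection request limits for protocol versions 2 and 3, and computes the minimum number of connections to one node needed to keep a given number of requests in flight. The class and method names are invented for illustration, and the driver's pooling options may impose a lower per-connection limit than the protocol maximum.

```java
final class MultiplexingEstimate {

    // Protocol-level upper bounds on simultaneous requests per connection.
    static final int MAX_STREAMS_V2 = 128;
    static final int MAX_STREAMS_V3 = 32768;

    // Minimum connections to a single node for the given number of in-flight
    // requests: ceiling(inFlight / maxStreamsPerConnection).
    static int minConnections(int inFlight, int maxStreamsPerConnection) {
        if (inFlight < 0 || maxStreamsPerConnection <= 0) {
            throw new IllegalArgumentException("inFlight must be >= 0 and the limit > 0");
        }
        // Widen to long so the addition cannot overflow.
        return (int) ((inFlight + (long) maxStreamsPerConnection - 1) / maxStreamsPerConnection);
    }

    public static void main(String[] args) {
        int inFlight = 1000; // hypothetical load
        System.out.println("v2: " + minConnections(inFlight, MAX_STREAMS_V2) + " connections");
        System.out.println("v3: " + minConnections(inFlight, MAX_STREAMS_V3) + " connections");
    }
}
```

For the hypothetical load of 1000 in-flight requests, this gives 8 connections under protocol version 2 and 1 under version 3, which is why a small pool per node is usually enough.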