Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/11682197/

Time: 2020-10-31 06:02:01  Source: igfitidea

Keyspace schema import and export in Cassandra

java, cassandra

Asked by Sunil Kumar

I have a Cassandra 1.1.2 installation on my system as a single-node cluster, with three keyspaces: hotel, student and employee. I want to dump the schema of the hotel keyspace, along with its column family data if possible, and restore the dump on another Cassandra cluster. Can anyone explain in detail how I should do this?

Accepted answer by Tamil

You can use the sstable2json and json2sstable Cassandra tools.

Check out the DataStax documentation on the same, and this too.

Usage: sstable2json [-f outfile] <sstable> [-k key [-k key [...]]]
Usage: json2sstable -K keyspace -c column_family <json> <sstable>
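
Since the question is tagged java, here is a minimal sketch of assembling those two command lines from Java, following the usage lines above exactly. The class name, the column family name "rooms", and all file paths are hypothetical placeholders, not values from the question. The sketch only prints the commands; it does not run the tools.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SstableToolCommands {

    // Builds: sstable2json [-f outfile] <sstable> [-k key [-k key [...]]]
    public static List<String> exportCommand(String sstablePath, String outFile, List<String> keys) {
        List<String> cmd = new ArrayList<>();
        cmd.add("sstable2json");
        if (outFile != null) {          // -f is optional in the usage line
            cmd.add("-f");
            cmd.add(outFile);
        }
        cmd.add(sstablePath);
        for (String key : keys) {       // each key gets its own -k flag
            cmd.add("-k");
            cmd.add(key);
        }
        return cmd;
    }

    // Builds: json2sstable -K keyspace -c column_family <json> <sstable>
    public static List<String> importCommand(String keyspace, String columnFamily,
                                             String jsonPath, String sstablePath) {
        return Arrays.asList("json2sstable", "-K", keyspace, "-c", columnFamily,
                             jsonPath, sstablePath);
    }

    public static void main(String[] args) {
        // Hypothetical paths and column family name, for illustration only.
        List<String> export = exportCommand("hotel-rooms-Data.db", "rooms.json",
                                            new ArrayList<String>());
        List<String> restore = importCommand("hotel", "rooms", "rooms.json",
                                             "restored-rooms-Data.db");
        System.out.println(String.join(" ", export));
        System.out.println(String.join(" ", restore));
    }
}
```

To execute a command instead of printing it, the list can be handed to new ProcessBuilder(cmd).inheritIO().start(), assuming the Cassandra tools are on the PATH.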

You can always execute cassandra-cli commands from a file:

cassandra-cli -h HOST -p PORT -f fileName

You can put all your create statements into a file and execute this command.

To get CLI scripts that create the keyspaces and column families, use the following command in the cassandra-cli interface:

show schema
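
The two steps above (dump the schema with show schema, then replay a script with -f) could be driven from Java roughly as follows. This is a sketch under assumptions: the class name is hypothetical, "localhost" and port 9160 (the default Thrift port) stand in for HOST and PORT, and cassandra-cli is assumed to be on the PATH. It writes a one-line script file and prints the command that would run it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class CliScriptRunner {

    // Writes the given cassandra-cli statements to a temporary script file.
    public static Path writeScript(String statements) throws IOException {
        Path script = Files.createTempFile("schema-export", ".cli");
        Files.write(script, statements.getBytes(StandardCharsets.UTF_8));
        return script;
    }

    // Builds: cassandra-cli -h HOST -p PORT -f fileName
    public static List<String> cliCommand(String host, int port, Path script) {
        return Arrays.asList("cassandra-cli", "-h", host,
                             "-p", Integer.toString(port),
                             "-f", script.toString());
    }

    public static void main(String[] args) throws IOException {
        // Script containing only the schema dump statement.
        Path script = writeScript("show schema;\n");
        List<String> cmd = cliCommand("localhost", 9160, script); // assumed host and port
        System.out.println(String.join(" ", cmd));

        // To actually run it and capture the output in a file (hypothetical name):
        // new ProcessBuilder(cmd).redirectOutput(new java.io.File("schema.cli")).start();
    }
}
```

Check the captured file before replaying it on the target cluster with the same -f option, since it may contain lines other than the create statements.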

But in case you want to create a cluster of two nodes, you don't need to do any of the above. Just starting the other node with a different token range and the same cluster name will do. Cassandra will internally manage to stream the data and schema information.

Answer by user3360277

I don't recommend using sstable2json and json2sstable to load a large amount of data. They use the Jackson API to create the dataset and transform it to JSON format, which implies loading all of the data into memory to create a single JSON representation.

That is fine for a small amount of data, but imagine loading a large dataset of more than 40 million rows, about 25 GB of data: these tools simply don't work well. I already asked the DataStax guys about it, without getting clarification.

For large datasets, just copying the Cassandra data files from one cluster to another may solve the problem. In my case, I was trying to migrate from a Cassandra 1.0.6 cluster to 1.2.1, and the data files were not compatible between these versions.

What is the solution? I'm writing my own export/import tool to solve this. I hope to post a link to this tool soon.
