Original source: http://stackoverflow.com/questions/11682197/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Keyspace schema import and export in Cassandra
Asked by Sunil Kumar
I have a Cassandra 1.1.2 installation on my system as a single-node cluster with three keyspaces: hotel, student, and employee. I want to dump the schema of the hotel keyspace, along with its column family data if possible, and restore the dump on another Cassandra cluster. Can anyone explain in detail how I should do this?
Accepted answer by Tamil
You can use the sstable2json and json2sstable Cassandra tools.
Check out the DataStax documentation on the same, and this too.
Usage: sstable2json [-f outfile] <sstable> [-k key [-k key [...]]]
Usage: json2sstable -K keyspace -c column_family <json> <sstable>
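As a rough sketch of how the two usage lines above fit together, the following Python helper only builds the argument lists for an export and a matching import; it does not run anything. The SSTable file names, the directories, and the column family name `rooms` are hypothetical placeholders, not values from the question.

```python
from typing import List


def export_command(sstable: str, outfile: str) -> List[str]:
    """Build: sstable2json -f <outfile> <sstable>."""
    return ["sstable2json", "-f", outfile, sstable]


def import_command(keyspace: str, column_family: str,
                   json_file: str, sstable: str) -> List[str]:
    """Build: json2sstable -K <keyspace> -c <column_family> <json> <sstable>."""
    return ["json2sstable", "-K", keyspace, "-c", column_family,
            json_file, sstable]


if __name__ == "__main__":
    # Hypothetical paths; adjust to the real data directory and SSTable names.
    source = "/var/lib/cassandra/data/hotel/rooms/hotel-rooms-hd-1-Data.db"
    dump = "/tmp/hotel-rooms.json"
    target = "/tmp/restore/hotel-rooms-hd-1-Data.db"

    print(" ".join(export_command(source, dump)))
    print(" ".join(import_command("hotel", "rooms", dump, target)))
```

To actually run a command, the returned list could be passed to `subprocess.run(cmd, check=True)` on a machine where the Cassandra tools are on the PATH.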
You can also execute cassandra-cli commands from a file:
cassandra-cli -h HOST -p PORT -f fileName
You can put all your create statements into a file and run the command above.
To get the CLI script that creates your keyspaces and column families, run the following command in the cassandra-cli interface:
show schema;
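One way to combine the two ideas above is to put the `show schema` statement into a script file, run it with `-f`, and keep the printed output as the schema dump. A minimal sketch follows. The host, the port 9160 (assumed here to be the Thrift port the CLI connects to), and the file names are illustrative assumptions. The captured output may also contain connection messages or other keyspaces that need trimming before it is replayed on another cluster.

```python
import subprocess
from typing import List


def cli_command(host: str, port: int, script: str) -> List[str]:
    """Build: cassandra-cli -h HOST -p PORT -f fileName."""
    return ["cassandra-cli", "-h", host, "-p", str(port), "-f", script]


def dump_schema(host: str, port: int, script: str, outfile: str) -> None:
    """Write a one-line CLI script and save what cassandra-cli prints for it."""
    with open(script, "w") as handle:
        # cassandra-cli statements are terminated with a semicolon.
        handle.write("show schema;\n")
    result = subprocess.run(cli_command(host, port, script),
                            capture_output=True, text=True, check=True)
    with open(outfile, "w") as handle:
        handle.write(result.stdout)


if __name__ == "__main__":
    # Printed only; call dump_schema(...) where cassandra-cli is on the PATH.
    print(" ".join(cli_command("localhost", 9160, "show_schema.txt")))
```

Replaying the saved statements on the target cluster would then be the same `cli_command(...)` call with the target host and the saved file, passed to `subprocess.run`.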
But if you just want to create a two-node cluster, you don't need to do any of the above. Starting the other node with a different token range and the same cluster name is enough; Cassandra will stream the data and schema information internally.
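For the "different token range" part, a common convention with RandomPartitioner is to space the `initial_token` values evenly over the range 0 to 2^127. The sketch below assumes RandomPartitioner is in use (the answer does not say which partitioner applies) and simply computes evenly spaced tokens for a given node count.

```python
from typing import List

# Size of the token space, assuming RandomPartitioner is in use.
TOKEN_SPACE = 2 ** 127


def balanced_tokens(node_count: int) -> List[int]:
    """Return evenly spaced tokens, one per node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * TOKEN_SPACE // node_count for i in range(node_count)]


if __name__ == "__main__":
    for index, token in enumerate(balanced_tokens(2)):
        print(f"node {index}: initial_token = {token}")
```

An existing node keeps whatever token it already has, so in practice only the new node's value needs to be chosen so that it differs from it.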
Answer by user3360277
I don't recommend using sstable2json and json2sstable to load a large amount of data. They use the Jackson API to build the dataset and transform it to JSON format, which means loading all of the data into memory to create a single JSON representation.
That is fine for a small amount of data, but imagine loading a large dataset of more than 40 million rows, about 25 GB of data: these tools simply don't work well. I already asked the DataStax guys about it, but got no clarification.
For large datasets, simply copying the Cassandra data files from one cluster to another may solve the problem. In my case, I was trying to migrate from a Cassandra 1.0.6 cluster to 1.2.1, and the data files were not compatible between these versions.
What is the solution? I'm writing my own export/import tool to solve this, and I hope to post a link to it soon.
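The memory problem described in this answer comes from building one JSON document for the whole dataset. A generic way around that, shown here only as an illustration and not as the author's tool, is to write one JSON object per line as rows are read. The `rows` iterable and its `(key, columns)` shape are hypothetical stand-ins for whatever client actually reads the column family.

```python
import json
from typing import Dict, Iterable, Iterator, TextIO, Tuple

# Hypothetical row shape: a row key plus a mapping of column names to values.
Row = Tuple[str, Dict[str, str]]


def export_rows(rows: Iterable[Row], out: TextIO) -> int:
    """Write each row as its own JSON line and return the number written.

    Rows are serialized one by one, so memory use does not grow with the
    dataset as long as `rows` itself is produced lazily.
    """
    count = 0
    for key, columns in rows:
        out.write(json.dumps({"key": key, "columns": columns}) + "\n")
        count += 1
    return count


def import_rows(source: Iterable[str]) -> Iterator[Row]:
    """Lazily parse rows back from a JSON-lines stream, skipping blank lines."""
    for line in source:
        if line.strip():
            record = json.loads(line)
            yield record["key"], record["columns"]
```

Because both functions work on streams, the export can write straight to a file opened with `open(path, "w")` and the import can iterate over the same file later without reading it whole.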