PHP: How to use Elasticsearch on top of a pre-existing SQL database?

Notice: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/17856457/

Date: 2020-08-25 16:34:55  Source: igfitidea

How to use Elastic Search on top of a pre-existing SQL Database?

Tags: php, javascript, sql, json, elasticsearch

Asked by Miles M.

I've been reading through a lot of good documentation about how to implement Elasticsearch on a website with JavaScript or PHP.

Very good introduction to ES.


Very complete documentation here and here.

A whole CRUD.


Elastic search with PHP: here, here, and here.


I'm listing those URLs because I want to understand how to apply one or more of those resources when I already have a pre-existing SQL database.

I'm missing the point somewhere: they say Elasticsearch creates its own indexes and database with MongoDB, so I don't understand how I can use my (gigantic) SQL database with it. Let's say I have a MySQL database, and I would like to use Elasticsearch to make my searches faster and to offer users pre-made queries: how do I do that? How does ES work on top of, or alongside, MySQL? How do I transfer this gigantic data set (over 8 GB) into ES so that it is fully efficient from the start?

Many Thanks


Accepted answer by Tim

I am using jdbc-river with MySQL. It is very fast. You can configure it to poll data continually, or to run one-time imports (the one-shot strategy).

e.g.


curl -XPUT http://es-server:9200/_river/my_river/_meta -d '
{
    "type" : "jdbc",
    "jdbc" : {
        "strategy" : "simple",
        "poll" : "5s",
        "scale" : 0,
        "autocommit" : false,
        "fetchsize" : 10,
        "max_rows" : 0,
        "max_retries" : 3,
        "max_retries_wait" : "10s",
        "driver" : "com.mysql.jdbc.Driver",
        "url" : "jdbc:mysql://mysql-server:3306/mydb",
        "user" : "root",
        "password" : "password*",
        "sql" : "select c.id, c.brandCode, c.companyCode from category c"
    },
    "index" : {
        "index" : "mainIndex",
        "type" : "category",
        "bulk_size" : 30,
        "max_bulk_requests" : 100,
        "index_settings" : null,
        "type_mapping" : null,
        "versioning" : false,
        "acknowledge" : false
    }
}'
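
To make the idea behind a one-shot import concrete, here is a minimal sketch of the same flow done by hand: read rows from SQL and turn them into newline-delimited payloads in the format the Elasticsearch `_bulk` endpoint accepts. This is not how jdbc-river is implemented internally; it only illustrates the principle. Assumptions: an in-memory SQLite table stands in for the MySQL `category` table, the index name `main_index` and the helper `rows_to_bulk` are hypothetical, and the `_type` field from the config above is left out because recent Elasticsearch versions no longer use mapping types.

```python
import json
import sqlite3


def rows_to_bulk(rows, index_name, batch_size):
    """Yield NDJSON payloads for the _bulk API, at most batch_size documents each."""
    lines = []
    count = 0
    for row in rows:
        doc = dict(row)
        # Action line: index the document, reusing the SQL primary key as _id
        # so that re-running the import overwrites instead of duplicating.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc["id"])}}))
        # Source line: the document body itself.
        lines.append(json.dumps(doc))
        count += 1
        if count == batch_size:
            # A bulk body must end with a newline.
            yield "\n".join(lines) + "\n"
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"


# Stand-in for the MySQL database (assumption: same column names as the answer's query).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY, brandCode TEXT, companyCode TEXT)")
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [(1, "B1", "C1"), (2, "B2", "C1"), (3, "B3", "C2")],
)

cursor = conn.execute("SELECT id, brandCode, companyCode FROM category")
payloads = list(rows_to_bulk(cursor, "main_index", batch_size=2))

for payload in payloads:
    print(payload)
```

Each payload could then be sent with a POST to `http://es-server:9200/_bulk` using the `Content-Type: application/x-ndjson` header. For a data set of several gigabytes, streaming rows from the cursor in batches like this avoids loading the whole table into memory.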

Answer by Tony O'Hagan

If you need a more performant and scalable alternative to the polling offered by jdbc-river, I recommend that you watch this presentation, which explains how to perform incremental syncing from SQL Server into Elasticsearch:

The principles discussed in the video also apply to other RDBMS -> NoSQL replication applications.
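
As a rough illustration of what incremental syncing can look like, the sketch below uses a high-water mark: every row carries a version number that increases on each change, and each sync run fetches only rows above the last version it has seen. This is one common approach and not necessarily the one described in the presentation. Assumptions: an in-memory SQLite table stands in for the real database, and the `version` column and the helper `fetch_changes` are hypothetical (SQL Server's `rowversion` type can play a similar role).

```python
import sqlite3


def fetch_changes(conn, last_version):
    """Return rows changed since last_version, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, brandCode, version FROM category WHERE version > ? ORDER BY version",
        (last_version,),
    ).fetchall()
    # If nothing changed, keep the old mark so the next run starts from the same place.
    new_mark = rows[-1][2] if rows else last_version
    return rows, new_mark


# Stand-in database with a per-row version column (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY, brandCode TEXT, version INTEGER)")
conn.executemany("INSERT INTO category VALUES (?, ?, ?)", [(1, "B1", 1), (2, "B2", 2)])

# Initial sync: everything is newer than version 0.
first_batch, mark = fetch_changes(conn, 0)

# A row changes in SQL; the application (or a trigger) bumps its version.
conn.execute("UPDATE category SET brandCode = 'B1x', version = 3 WHERE id = 1")

# Next sync: only the changed row comes back and needs re-indexing.
second_batch, mark = fetch_changes(conn, mark)
print(second_batch)
```

Only the rows in each batch need to be re-sent to Elasticsearch, which keeps the load on both systems proportional to the number of changes and not to the size of the table. Deleted rows are not covered by this sketch and would need separate handling, for example a soft-delete flag.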