Import/Index a JSON file into Elasticsearch

Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/15936616/

Time: 2020-09-03 19:40:07  Source: igfitidea

Import/Index a JSON file into Elasticsearch

Tags: json, elasticsearch

Asked by Shawn Roller

I am new to Elasticsearch and have been entering data manually up until this point. For example I've done something like this:

$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elastic Search"
}'

I now have a .json file and I want to index this into Elasticsearch. I've tried something like this too, but no success:

curl -XPOST 'http://jfblouvmlxecs01:9200/test/test/1' -d lane.json

How do I import a .json file? Are there steps I need to take first to ensure the mapping is correct?

Accepted answer by javanna

The right command if you want to use a file with curl is this:

curl -XPOST 'http://jfblouvmlxecs01:9200/test/_doc/1' -d @lane.json

Elasticsearch is schemaless, therefore you don't necessarily need a mapping. If you send the json as it is and you use the default mapping, every field will be indexed and analyzed using the standard analyzer.
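
The two steps above (index a file, then rely on the default mapping) can be sketched as follows. This is a minimal sketch, not the answerer's code: the node address http://localhost:9200, the index name test, and the function names index_doc and show_mapping are all placeholders. Elasticsearch 6.0 and later reject request bodies that lack a Content-Type header, so the sketch sends one.

```shell
#!/bin/sh
# Assumed default; override ES_URL for your own cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# index_doc INDEX ID FILE: send FILE as the body of a single document.
# The @ prefix makes curl read the body from the file instead of
# sending the literal file name as the data.
index_doc() {
    curl -s -XPOST "$ES_URL/$1/_doc/$2" \
        -H 'Content-Type: application/json' \
        -d "@$3"
}

# show_mapping INDEX: print the mapping Elasticsearch derived for INDEX.
show_mapping() {
    curl -s -XGET "$ES_URL/$1/_mapping?pretty"
}
```

For example, `index_doc test 1 lane.json && show_mapping test` indexes the file and then prints the field types that were inferred, which is a quick way to decide whether an explicit mapping is needed.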

If you want to interact with Elasticsearch through the command line, you may want to have a look at elasticshell, which should be a little bit handier than curl.

2019-07-10: It should be noted that custom mapping types are deprecated and should not be used. I updated the type in the URL above to make it easier to see which part is the index and which is the type, as having both named "test" was confusing.

Answer by KenH

Per the current docs, https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html:

If you're providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn't preserve newlines.

Example:

$ curl -s -XPOST localhost:9200/_bulk --data-binary @requests
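
A slightly more defensive version of that command, as a sketch under stated assumptions: the node address and the function names ends_with_newline and bulk_post are placeholders. Because the bulk format is line oriented, the sketch also checks that the file ends with a newline before sending it, and it sets a Content-Type header, which recent Elasticsearch versions require.

```shell
#!/bin/sh
# Assumed default; override ES_URL for your own cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# ends_with_newline FILE: succeed if the last byte of FILE is a newline.
ends_with_newline() {
    [ $(tail -c 1 "$1" | wc -l) -eq 1 ]
}

# bulk_post FILE: send FILE to the _bulk endpoint byte for byte.
# --data-binary keeps the newlines that plain -d would strip.
bulk_post() {
    if ! ends_with_newline "$1"; then
        echo "bulk_post: $1 must end with a newline" >&2
        return 1
    fi
    curl -s -XPOST "$ES_URL/_bulk" \
        -H 'Content-Type: application/x-ndjson' \
        --data-binary "@$1"
}
```

Calling `bulk_post requests` is then equivalent to the command above, with the extra check up front.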

Answer by Evan

We made a little tool for this type of thing: https://github.com/taskrabbit/elasticsearch-dump

Answer by MosheZada

I'm the author of elasticsearch_loader. I wrote ESL for this exact problem.

You can download it with pip:

pip install elasticsearch-loader

And then you will be able to load json files into elasticsearch by issuing:

elasticsearch_loader --index incidents --type incident json file1.json file2.json

Answer by Ram Pratap

Adding to KenH's answer

$ curl -s -XPOST localhost:9200/_bulk --data-binary @requests

You can replace @requests with @complete_path_to_json_file

Note: the @ before the file path is important.

Answer by Gajendra D Ambi

I just made sure that I was in the same directory as the json file and then simply ran this:

curl -s -H "Content-Type: application/json" -XPOST localhost:9200/product/default/_bulk?pretty --data-binary @product.json

So make sure you are in the same directory too, and run it this way. Note: product/default/ in the command is specific to my environment. You can omit it or replace it with whatever is relevant to you.
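
A sketch of the same idea wrapped in a function that also inspects the reply. The function name bulk_into and the node address are placeholders, and the sketch drops the default type segment from the path on the assumption that mapping types are not being used. Putting the index in the URL gives every action line in the file a default index, and the bulk response carries a top-level errors flag that is worth checking, since the HTTP request can succeed while individual items fail.

```shell
#!/bin/sh
# Assumed default; override ES_URL for your own cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# bulk_into INDEX FILE: bulk-load FILE into INDEX, print the response,
# and return non-zero if the response reports per-item failures.
bulk_into() {
    response=$(curl -s -H 'Content-Type: application/json' \
        -XPOST "$ES_URL/$1/_bulk?pretty" --data-binary "@$2") || return 1
    printf '%s\n' "$response"
    # Match the flag with or without the spaces that ?pretty adds.
    if printf '%s\n' "$response" |
        grep -Eq '"errors"[[:space:]]*:[[:space:]]*true'; then
        return 1
    fi
}
```

Run from the directory that holds the file, `bulk_into product product.json` prints the pretty-printed response and exits non-zero when any item was rejected.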

Answer by Piyush Mittal

Just get Postman from https://www.getpostman.com/docs/environments and give it the file location with the /test/test/1/_bulk?pretty command.

Answer by Greg Dougherty

One thing I've not seen anyone mention: the JSON file must have one line specifying the index the next line belongs to, for every line of the "pure" JSON file.

i.e.

{"index":{"_index":"shakespeare","_type":"act","_id":0}}
{"line_id":1,"play_name":"Henry IV","speech_number":"","line_number":"","speaker":"","text_entry":"ACT I"}

Without that, nothing works, and it won't tell you why.
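
If the "pure" file already has one JSON document per line, the action lines can be generated rather than typed. A minimal sketch, assuming that layout; the function name to_bulk is a placeholder, and the action line carries only _index (no _type or _id), so Elasticsearch would assign the ids.

```shell
#!/bin/sh
# to_bulk INDEX FILE: print FILE in bulk format, putting an action line
# in front of every non-empty line. Each input line must already be one
# complete JSON document.
to_bulk() {
    awk -v index_name="$1" '
        NF { printf "{\"index\":{\"_index\":\"%s\"}}\n", index_name; print }
    ' "$2"
}
```

For example, `to_bulk shakespeare docs.json > requests` writes a file that can be sent with `curl -s -XPOST localhost:9200/_bulk --data-binary @requests`; docs.json is a hypothetical input name.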

Answer by MLS

You are using

$ curl -s -XPOST localhost:9200/_bulk --data-binary @requests

If 'requests' is a json file then you have to change this to

$ curl -s -XPOST localhost:9200/_bulk --data-binary @requests.json

Before this, if your json file does not already contain index lines, you have to insert an index line before each line inside the json file. You can do this with jq. See the link below: http://kevinmarsh.com/2014/10/23/using-jq-to-import-json-into-elasticsearch.html

Go to the Elasticsearch tutorials (for example, the Shakespeare tutorial), download the sample json file they use, and have a look at it. In front of each json object (each individual line) there is an index line. This is what you should end up with after running the jq command. This format is mandatory for the bulk API; plain json files won't work.
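
One possible jq invocation for that conversion, as a sketch: it is not necessarily the command from the linked article, it assumes the input file holds a single JSON array of documents, and the function name array_to_bulk is a placeholder.

```shell
#!/bin/sh
# array_to_bulk INDEX FILE: FILE holds a JSON array of documents; print
# one action line followed by one compact document line per element.
array_to_bulk() {
    jq -c --arg idx "$1" '.[] | {"index": {"_index": $idx}}, .' "$2"
}
```

The output can be redirected to a file, for example `array_to_bulk test data.json > requests.json`, and then posted with the --data-binary command shown earlier; data.json is a hypothetical input name.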

Answer by Yaroslav Gaponov

I wrote some code to expose the Elasticsearch API via a Filesystem API.

It is a good idea for clean export/import of data, for example.

I created a prototype, elasticdriver. It is based on FUSE.

demo
