Note: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/7584808/

Time: 2020-10-30 20:35:49  Source: igfitidea

Google image search: How do I construct a reverse image search URL?

java google-image-search

Asked by maks

How can I programmatically convert an image in Java to "some string" that I can pass as a parameter to Google image search? I have tried a Base64 conversion of the image, but the result differs from what Google produces in its image search engine. This is the conversion I wrote (Java 7):

import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.bind.DatatypeConverter;
...
            Path p = Paths.get("my_photo.JPG");
            try (PrintWriter write = new PrintWriter("base64.txt")) {
                // Read the whole file; InputStream.available() is not a reliable file size
                byte[] bytes = Files.readAllBytes(p);
                String base64 = DatatypeConverter.printBase64Binary(bytes);
                write.println(base64);
            } catch (IOException ex) {
                ex.printStackTrace();
            }

The output of this simple program differs from the string in Google's URL. I mean the string that follows tbs=sbi:AMhZZ...

Answered by mikerobi

This is my best guess for how the image search works:

The data in the URL is not an encoded form of the image. The data is an image fingerprint used for fuzzy matching.

Notice that uploading an image for searching is a two-step process. First, the browser uploads the image via the URL http://images.google.com/searchbyimage/upload, and the Google server returns the fingerprint. The browser is then redirected to a search page with a query string based on the fingerprint.

Unless Google publishes the algorithm for generating the fingerprint, you will be unable to generate the search query string from within your application. Until then, you can have your application post the image to the upload URI. You should be able to parse the response and construct the query string.
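
As a rough illustration of the "parse the response" step, here is a minimal sketch that pulls the tbs value out of a redirect URL. It assumes the redirect target carries the fingerprint in a tbs query parameter, as in the URL from the question; the class name, method name, and sample URL are made up for the example, and it needs Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class TbsParameter {
    // Returns the decoded value of the "tbs" query parameter, or null if it is absent.
    static String extractTbs(String redirectUrl) {
        int q = redirectUrl.indexOf('?');
        if (q < 0) {
            return null;
        }
        String query = redirectUrl.substring(q + 1);
        int hash = query.indexOf('#');
        if (hash >= 0) {
            query = query.substring(0, hash); // drop any fragment
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("tbs")) {
                return URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up redirect URL; the fingerprint is a placeholder, not real data
        String url = "https://www.google.com/search?tbs=sbi:PLACEHOLDER&hl=en";
        System.out.println(extractTbs(url));
    }
}
```

The returned value can then be appended to a search URL of your own. Whether Google accepts a reused fingerprint is not something this sketch verifies.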

EDIT

These are the keys and values sent to the server when I uploaded a file.

image_url       =
btnG            = Search
encoded_image   = // the binary image content goes here
image_content   =
filename        =
hl              = en
bih             = 507
biw             = 1920

"bih" and "biw" look like dimensions, but do not correspond to the uploaded file.

Use this information at your own risk. It is an undocumented API that could change and break your application.
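
To show how those keys could be put on the wire without a third-party library, here is a minimal sketch that serializes them as a multipart/form-data body. The field names and values are the ones captured above; the class name, the boundary, and the octet-stream content type for the image part are assumptions made for the example, and nothing is actually sent.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class UploadFormBody {
    // Text fields as captured in the browser; the values may differ per session
    static Map<String, String> capturedFields() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("image_url", "");
        fields.put("btnG", "Search");
        fields.put("image_content", "");
        fields.put("filename", "");
        fields.put("hl", "en");
        fields.put("bih", "507");
        fields.put("biw", "1920");
        return fields;
    }

    // Serializes the text fields, then the image as the "encoded_image" file part.
    static byte[] build(Map<String, String> fields, String fileName,
                        byte[] imageBytes, String boundary) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<String, String> f : fields.entrySet()) {
            write(out, "--" + boundary + "\r\n"
                    + "Content-Disposition: form-data; name=\"" + f.getKey() + "\"\r\n\r\n"
                    + f.getValue() + "\r\n");
        }
        write(out, "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"encoded_image\"; filename=\""
                + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n");
        out.write(imageBytes, 0, imageBytes.length); // raw binary, not Base64
        write(out, "\r\n--" + boundary + "--\r\n");  // closing delimiter
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] b = text.getBytes(StandardCharsets.UTF_8);
        out.write(b, 0, b.length);
    }

    public static void main(String[] args) {
        byte[] fakeImage = "not a real image".getBytes(StandardCharsets.UTF_8);
        byte[] body = build(capturedFields(), "my_photo.JPG", fakeImage, "example-boundary");
        System.out.println(body.length + " bytes");
    }
}
```

The body would be POSTed to the upload URL with the header Content-Type: multipart/form-data; boundary=example-boundary. Note that the image goes in as raw bytes, which is one reason a Base64 string of the file never matches anything in the resulting URL.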

Answered by Ajit

Using Google's image search:

import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntity;
import org.apache.http.entity.mime.content.FileBody;
import org.apache.http.entity.mime.content.StringBody;
import org.apache.http.impl.client.DefaultHttpClient;

public class HttpFileUpload {
  public static void main(String args[]){
    try {
      HttpClient client = new DefaultHttpClient();
      String url="https://www.google.co.in/searchbyimage/upload";
      String imageFile="c:\\temp\\shirt.jpg"; // backslashes must be escaped in a Java string literal
      HttpPost post = new HttpPost(url);

      MultipartEntity entity = new MultipartEntity();
      entity.addPart("encoded_image", new FileBody(new File(imageFile)));
      entity.addPart("image_url",new StringBody(""));
      entity.addPart("image_content",new StringBody(""));
      entity.addPart("filename",new StringBody(""));
      entity.addPart("hl",new StringBody("en")); // "hl" (letter l), as in the form fields captured from the browser
      entity.addPart("bih",new StringBody("179"));
      entity.addPart("biw",new StringBody("1600"));

      post.setEntity(entity);
      HttpResponse response = client.execute(post);
      BufferedReader rd = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));         

      String line = "";
      while ((line = rd.readLine()) != null) {
        // Print lines that contain the redirect link (HREF), dropping the first 8 characters
        if (line.indexOf("HREF")>0)
          System.out.println(line.substring(8));
      }

    }catch (ClientProtocolException cpx){
      cpx.printStackTrace();
    }catch (IOException ioex){
      ioex.printStackTrace();
    }
 }
}

Answered by golimar

Based on @Ajit's answer, this does the same thing using the curl command (Linux / Cygwin / etc.):

curl -s -F "image_url=" -F "image_content=" -F "filename=" -F "hl=en" -F "bih=179" -F "biw=1600" -F "encoded_image=@my_image_file.jpg" https://www.google.co.in/searchbyimage/upload

This prints a URL on standard output. You can download that URL with curl or wget, but you may have to change the User-Agent to that of a graphical web browser such as Chrome.
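
The same User-Agent trick can be sketched in Java with the java.net.http client (Java 11 or later). The User-Agent string, class name, and sample URL below are only examples, not values Google is known to require; the sketch builds the request but does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class BrowserLikeRequest {
    // Example browser User-Agent; an assumption, any current browser string could be used
    static final String USER_AGENT =
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
            + "Chrome/120.0.0.0 Safari/537.36";

    // Builds a GET request for the result URL with a browser-like User-Agent header.
    static HttpRequest build(String resultUrl) {
        return HttpRequest.newBuilder(URI.create(resultUrl))
                .header("User-Agent", USER_AGENT)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder URL standing in for the one printed by the upload step
        HttpRequest request = build("https://www.google.com/search?tbs=sbi:PLACEHOLDER");
        System.out.println(request.headers().firstValue("User-Agent").orElse(""));
        // To fetch the page:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```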

Answered by shark

Use the Google Vision API for that. Google also provides many examples.
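
For orientation, here is a minimal sketch of what a request body for the Vision REST endpoint (POST https://vision.googleapis.com/v1/images:annotate) could look like. Choosing the WEB_DETECTION feature is my assumption about what comes closest to a reverse image search; the answer does not say which feature to use. The class name is made up, and authentication and sending the request are left out. Unlike the undocumented upload form, this API does take the image as a Base64 string.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class VisionWebDetectionBody {
    // Builds the JSON body for one image with a single WEB_DETECTION feature.
    static String build(byte[] imageBytes) {
        // Standard Base64 output contains no characters that need JSON escaping
        String content = Base64.getEncoder().encodeToString(imageBytes);
        return "{\"requests\":[{"
                + "\"image\":{\"content\":\"" + content + "\"},"
                + "\"features\":[{\"type\":\"WEB_DETECTION\"}]"
                + "}]}";
    }

    public static void main(String[] args) {
        // Placeholder bytes; in practice use Files.readAllBytes on the image file
        byte[] fakeImage = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(build(fakeImage));
    }
}
```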