Java: Apache HTTPClient streaming HTTP POST request?

Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/17012334/

Time: 2020-11-01 00:43:05  Source: igfitidea

Apache HTTPClient Streaming HTTP POST Request?

Tags: java, http, apache-httpclient-4.x

Asked by sigpwned

I'm trying to build a "full-duplex" HTTP streaming request using Apache HTTPClient.

In my first attempt, I tried using the following request code:

URL url=new URL(/* code goes here */);

HttpPost request=new HttpPost(url.toString());

request.addHeader("Connection", "close");

PipedOutputStream requestOutput=new PipedOutputStream();
PipedInputStream requestInput=new PipedInputStream(requestOutput, DEFAULT_PIPE_SIZE);
ContentType requestContentType=getContentType();
InputStreamEntity requestEntity=new InputStreamEntity(requestInput, -1, requestContentType);
request.setEntity(requestEntity);

HttpEntity responseEntity=null;
HttpResponse response=getHttpClient().execute(request); // <-- Hanging here
try {
    if(response.getStatusLine().getStatusCode() != 200)
        throw new IOException("Unexpected status code: "+response.getStatusLine().getStatusCode());

    responseEntity = response.getEntity();
}
finally {
    if(responseEntity == null)
        request.abort();
}

InputStream responseInput=responseEntity.getContent();
ContentType responseContentType;
if(responseEntity.getContentType() != null)
    responseContentType = ContentType.parse(responseEntity.getContentType().getValue());
else
    responseContentType = DEFAULT_CONTENT_TYPE;

Reader responseStream=decode(responseInput, responseContentType);
Writer requestStream=encode(requestOutput, getContentType());

The request hangs at the line indicated above. It seems that the code is trying to send the entire request before it gets the response. In retrospect, this makes sense. However, it's not what I was hoping for. :)

Instead, I was hoping to send the request headers with Transfer-Encoding: chunked, receive a response header of HTTP/1.1 200 OK with a Transfer-Encoding: chunked header of its own, and then have a full-duplex streaming HTTP connection to work with.
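
For readers unfamiliar with what Transfer-Encoding: chunked puts on the wire, here is a minimal sketch of the HTTP/1.1 chunk framing: each chunk is its size in hexadecimal, CRLF, the payload, CRLF, and the body ends with a zero-size chunk. The class and method names (ChunkedFramingSketch, encodeChunk, lastChunk) are made up for this illustration and are not part of HTTPClient, which does this framing internally when the entity length is unknown.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Apache HTTPClient.
final class ChunkedFramingSketch {
    private static final byte[] CRLF = {'\r', '\n'};

    /** Frames one non-empty payload as a single HTTP/1.1 chunk. */
    static byte[] encodeChunk(byte[] payload) {
        if (payload.length == 0) {
            // A zero-size chunk means "end of body", so it gets its own method.
            throw new IllegalArgumentException("empty payload; use lastChunk()");
        }
        byte[] size = Integer.toHexString(payload.length)
                .getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(size, 0, size.length);       // chunk size in hex
        out.write(CRLF, 0, CRLF.length);
        out.write(payload, 0, payload.length); // chunk data
        out.write(CRLF, 0, CRLF.length);
        return out.toByteArray();
    }

    /** The terminating zero-size chunk, with no trailer fields. */
    static byte[] lastChunk() {
        return "0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] chunk = encodeChunk("hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println("framed chunk length: " + chunk.length);
        System.out.println("last chunk length: " + lastChunk().length);
    }
}
```

Because every chunk is self-delimiting, a sender can emit chunks as data becomes available without knowing the total length up front, which is what makes the incremental sending described above possible at the protocol level.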

Happily, HTTPClient has another NIO-based asynchronous client with good usage examples (like this one). My questions are:

  1. Is my interpretation of the synchronous HTTPClient behavior correct? Or is there something I can do to continue using the (simpler) synchronous HTTPClient in the manner I described?
  2. Does the NIO-based client wait to send the whole request before seeking a response? Or will I be able to send the request incrementally and receive the response incrementally at the same time?

If HTTPClient will not support this modality, is there another HTTP client library that will? Or should I be planning to write a (minimal) HTTP client to support this modality?

Answered by Chris

Here is my view after skimming the code:

  1. I don't entirely agree that a non-200 response means failure. Every 2XX status code indicates success. Check the wiki for more details.

  2. For any TCP request, I would recommend receiving the entire response to confirm that it is valid. I say this because a partial response will usually be treated as a bad response, since most client implementations cannot make use of it. (Imagine a case where the server is responding with 2MB of data and goes down partway through.)
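
The first point above can be sketched as a small range check that replaces the `!= 200` comparison in the question. This is only an illustration: StatusCheckSketch and isSuccess are hypothetical names, and the int passed in is assumed to be whatever response.getStatusLine().getStatusCode() returns.

```java
// Hypothetical helper; the name is an assumption made for this example.
final class StatusCheckSketch {

    /** Returns true for any status code in the 2XX (success) class. */
    static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }

    public static void main(String[] args) {
        int[] samples = {200, 201, 204, 302, 404, 500};
        for (int code : samples) {
            System.out.println(code + " success? " + isSuccess(code));
        }
    }
}
```

In the question's code, the condition would then read `if (!StatusCheckSketch.isSuccess(response.getStatusLine().getStatusCode()))`, so that responses such as 201 or 204 are not rejected.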

Answered by Robert Christian

A separate thread must be writing to the OutputStream for your code to work.

  • The code above provides the HTTPClient with a PipedInputStream.
  • A PipedInputStream makes bytes available as they are written to the corresponding PipedOutputStream.
  • The code above does not write to the OutputStream (which must be done by a separate thread).
  • Therefore the code hangs exactly where your comment is.
  • Under the hood, the Apache client calls "inputStream.read()", which in the case of piped streams blocks until outputStream.write(bytes) has been called (by a separate thread).
  • Since you aren't pumping bytes into the associated OutputStream from a separate thread, the InputStream just sits and waits for the OutputStream to be written to by "some other thread."
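
The points above can be illustrated with plain JDK classes, without HTTPClient at all. In this minimal sketch (the class name, the pump method, and the 1024-byte pipe size are assumptions made for the example), the reading loop stands in for the Apache client consuming the request entity, and a separate writer thread feeds the pipe. If the writer thread is never started, the first read() blocks, which is the hang described in the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; names are assumptions for this sketch.
final class PipedWriterThreadSketch {

    /** Sends message through a pipe from a writer thread and reads it back. */
    static String pump(String message) {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out, 1024);

            // The writer runs on its own thread; closing the output stream
            // is what lets the reader see end-of-stream (-1).
            Thread writer = new Thread(() -> {
                try (PipedOutputStream o = out) {
                    o.write(message.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }, "pipe-writer");
            writer.start();

            // This loop plays the role of the HTTP client reading the entity.
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            int n;
            while ((n = in.read(buffer)) != -1) {
                received.write(buffer, 0, n);
            }
            writer.join();
            in.close();
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writer", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(pump("request body written from another thread"));
    }
}
```

Applied to the question's code, the same idea means starting the thread that writes to requestOutput before calling execute(request), and closing requestOutput when the request body is complete.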

From the JavaDocs:

A piped input stream should be connected to a piped output stream; the piped input stream then provides whatever data bytes are written to the piped output stream.

Typically, data is read from a PipedInputStream object by one thread and data is written to the corresponding PipedOutputStream by some other thread.

Attempting to use both objects from a single thread is not recommended, as it may deadlock the thread.

The piped input stream contains a buffer, decoupling read operations from write operations, within limits. A pipe is said to be "broken" if a thread that was providing data bytes to the connected piped output stream is no longer alive.

Note: Since piped streams and concurrency were not mentioned in your problem statement, it seems to me that they may not be necessary. As a sanity check, first try wrapping a ByteArrayInputStream in the Entity object instead; that should help you narrow down the issue.

Update

Incidentally, I wrote an inversion of Apache's HTTP Client API, [PipedApacheClientOutputStream], which provides an OutputStream interface for HTTP POST using Apache Commons HTTP Client 4.3.4. This may be close to what you are looking for.

The calling code looks like this:

// Calling code manages the thread pool
ExecutorService es = Executors.newCachedThreadPool(
  new ThreadFactoryBuilder()
    .setNameFormat("apache-client-executor-thread-%d")
    .build());

// Build configuration
PipedApacheClientOutputStreamConfig config = new PipedApacheClientOutputStreamConfig();
config.setUrl("http://localhost:3000");
config.setPipeBufferSizeBytes(1024);
config.setThreadPool(es);
config.setHttpClient(HttpClientBuilder.create().build());

// Instantiate OutputStream
PipedApacheClientOutputStream os = new PipedApacheClientOutputStream(config);

// Write to OutputStream
os.write(...);

try {
  os.close();
} catch (IOException e) {
  logger.error(e.getLocalizedMessage(), e);
}

// Do stuff with HTTP response
...

// Close the HTTP response
os.getResponse().close();

// Finally, shut down the thread pool.
// If you are interested in the POST result, this must happen
// only after the response has been retrieved.
es.shutdown();

Note: In practice the same client, executor service, and config will likely be reused throughout the life of the application, so the outer setup and shutdown code in the example above will likely live in bootstrap/init and finalization code rather than directly inline with the OutputStream instantiation.
