Original question: http://stackoverflow.com/questions/3507594/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Ruby on Rails 3: Streaming data through Rails to client
Asked by jkndrkn
I am working on a Ruby on Rails app that communicates with RackSpace cloudfiles (similar to Amazon S3 but lacking some features).
Because cloudfiles lacks per-object access permissions and query string authentication, downloads to users have to be mediated through an application.
In Rails 2.3, it looks like you can dynamically build a response as follows:
# Streams about 180 MB of generated data to the browser.
render :text => proc { |response, output|
  10_000_000.times do |i|
    output.write("This is line #{i}\n")
  end
}
(from http://api.rubyonrails.org/classes/ActionController/Base.html#M000464)
Instead of 10_000_000.times..., I could put my cloudfiles stream generation code in there.
Trouble is, this is the output I get when I attempt to use this technique in Rails 3.
#<Proc:0x000000010989a6e8@/Users/jderiksen/lt/lt-uber/site/app/controllers/prospect_uploads_controller.rb:75>
It looks like the proc object's call method is not being called. Any other ideas?
Accepted answer by Steven Yelton
It looks like this isn't available in Rails 3:
https://rails.lighthouseapp.com/projects/8994/tickets/2546-render-text-proc
This appeared to work for me in my controller:
self.response_body = proc { |response, output|
  output.write "Hello world"
}
Answered by John
Assign to response_body an object that responds to #each:
class Streamer
  def each
    10_000_000.times do |i|
      yield "This is line #{i}\n"
    end
  end
end
self.response_body = Streamer.new
If you are using Ruby 1.9.x or the Backports gem, you can write this more compactly using Enumerator.new:
self.response_body = Enumerator.new do |y|
  10_000_000.times do |i|
    y << "This is line #{i}\n"
  end
end
Note that when and if the data is flushed depends on the Rack handler and underlying server being used. I have confirmed that Mongrel, for instance, will stream the data, but other users have reported that WEBrick, for instance, buffers it until the response is closed. There is no way to force the response to flush.
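To make the server dependence concrete, here is a minimal plain-Ruby sketch. It is not Rack's or any server's actual code: LineBody, stream_to, and buffer_then_send are hypothetical names standing in for a response body, a streaming server, and a buffering server. Both consumers call #each on the same kind of body; they differ only in when the bytes reach the client.

```ruby
require "stringio"

# A body in the Rack sense: any object whose #each yields strings.
class LineBody
  def initialize(count)
    @count = count
  end

  def each
    @count.times { |i| yield "This is line #{i}\n" }
  end
end

# Hypothetical streaming consumer: writes each chunk as soon as it is produced.
def stream_to(io, body)
  body.each { |chunk| io.write(chunk) }
end

# Hypothetical buffering consumer: collects the whole body, then writes once.
def buffer_then_send(io, body)
  buffered = String.new
  body.each { |chunk| buffered << chunk }
  io.write(buffered)
end

streamed = StringIO.new
buffered = StringIO.new
stream_to(streamed, LineBody.new(3))
buffer_then_send(buffered, LineBody.new(3))
```

The client receives identical bytes either way, which is why buffering is easy to miss in testing: only the timing and the server's memory use differ.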
In Rails 3.0.x, there are several additional gotchas:
- In development mode, doing things such as accessing model classes from within the enumeration can be problematic due to bad interactions with class reloading. This is an open bug in Rails 3.0.x.
- A bug in the interaction between Rack and Rails causes #each to be called twice for each request. This is another open bug. You can work around it with the following monkey patch:

  class Rack::Response
    def close
      @body.close if @body.respond_to?(:close)
    end
  end
Both problems are fixed in Rails 3.1, where HTTP streaming is a marquee feature.
Note that the other common suggestion, self.response_body = proc {|response, output| ...}, does work in Rails 3.0.x, but has been deprecated (and will no longer actually stream the data) in 3.1. Assigning an object that responds to #each works in all Rails 3 versions.
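If you already have proc-style code and want to move it to the #each form, one possible approach is a small adapter. The sketch below is an assumption of mine, not part of Rails: ProcBody is a hypothetical class, and it passes nil for the response argument because the example block does not use it.

```ruby
# Hypothetical adapter: wraps a proc-style block (response, output)
# in an object that responds to #each.
class ProcBody
  def initialize(&block)
    @block = block
  end

  def each(&chunk_handler)
    # Minimal stand-in for the old "output" object: every write
    # becomes one yielded chunk.
    output = Object.new
    output.define_singleton_method(:write) { |data| chunk_handler.call(data.to_s) }
    @block.call(nil, output) # nil stands in for the response (assumption)
  end
end

body = ProcBody.new do |_response, output|
  output.write "Hello"
  output.write " world"
end

chunks = []
body.each { |chunk| chunks << chunk }
```

In a controller this would be assigned the same way as the Streamer object above, for example self.response_body = ProcBody.new { |response, output| ... }.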
Answered by paneer_tikka
Thanks to all the posts above, here is fully working code to stream large CSVs. This code:
- Does not require any additional gems.
- Uses Model.find_each() so as to not bloat memory with all matching objects.
- Has been tested on rails 3.2.5, ruby 1.9.3 and heroku using unicorn, with single dyno.
- Adds a GC.start at every 500 rows, so as not to blow the heroku dyno's allowed memory.
- You may need to adjust the GC.start depending on your Model's memory footprint. I have successfully used this to stream 105K models into a csv of 9.7MB without any problems.
Controller Method:
# Array#to_csv comes from Ruby's csv library, so the app must `require 'csv'`.
def csv_export
  respond_to do |format|
    format.csv {
      @filename = "responses-#{Date.today.to_s(:db)}.csv"
      self.response.headers["Content-Type"] ||= 'text/csv'
      self.response.headers["Content-Disposition"] = "attachment; filename=#{@filename}"
      self.response.headers['Last-Modified'] = Time.now.ctime.to_s
      self.response_body = Enumerator.new do |y|
        i = 0
        Model.find_each do |m|
          # Emit the header row once, before the first record.
          if i == 0
            y << Model.csv_header.to_csv
          end
          y << m.csv_array.to_csv
          i += 1
          # Periodically force a GC run to keep memory bounded.
          GC.start if i % 500 == 0
        end
      end
    }
  end
end
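The enumerator at the heart of the controller method can be tried outside Rails. The following is a minimal sketch under stated assumptions: Row is a hypothetical stand-in for the model, a plain array replaces Model.find_each, and csv_body is an illustrative helper name. It only shows the order in which the chunks are produced.

```ruby
require "csv"

# Hypothetical stand-in for the ActiveRecord model used in the answer.
class Row < Struct.new(:id, :route, :username)
  def self.csv_header
    ["ID", "Route", "username"]
  end

  def csv_array
    [id, route, username]
  end
end

# Builds a lazy body: one CSV line per chunk, header first.
def csv_body(rows)
  Enumerator.new do |y|
    y << Row.csv_header.to_csv
    rows.each_with_index do |row, i|
      y << row.csv_array.to_csv
      GC.start if ((i + 1) % 500).zero? # same periodic GC idea as the answer
    end
  end
end

rows = [Row.new(1, "/a", "alice"), Row.new(2, "/b", "bob")]
chunks = csv_body(rows).to_a
```

Each element of chunks is one complete CSV line, so a streaming server can send rows to the client as they are generated instead of holding the whole file in memory.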
config/unicorn.rb
# Set to 3 instead of 4 as per http://michaelvanrooijen.com/articles/2011/06/01-more-concurrency-on-a-single-heroku-dyno-with-the-new-celadon-cedar-stack/
worker_processes 3
# Change timeout to 120s to allow downloading of large streamed CSVs on slow networks
timeout 120
#Enable streaming
port = ENV["PORT"].to_i
listen port, :tcp_nopush => false
Model.rb
def self.csv_header
  ["ID", "Route", "username"]
end

def csv_array
  [id, route, username]
end
Answered by Exequiel
If you are assigning to response_body an object that responds to #each and the response is still buffered until it is closed, try this in the controller action:
self.response.headers['Last-Modified'] = Time.now.to_s
Answered by moumar
Just for the record, Rails >= 3.1 has an easy way to stream data: assign an object that responds to #each to the controller's response.
Everything is explained here: http://blog.sparqcode.com/2012/02/04/streaming-data-with-rails-3-1-or-3-2/
Answered by shuji.koike
In addition, you will have to set the 'Content-Length' header yourself.
If you don't, Rack will have to wait (buffering the body data in memory) to determine the length, which defeats the methods described above.
In my case, I could determine the length. If you can't, you need to make Rack start sending the body without a 'Content-Length' header. Try adding "use Rack::Chunked" to config.ru, after the 'require' and before the 'run'. (Thanks arkadiy)
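When the length is unknown, the body is typically sent with HTTP/1.1 chunked transfer encoding, where each piece is prefixed with its size in hexadecimal and the body ends with a zero-length chunk. The sketch below illustrates that wire format in plain Ruby; chunked is a hypothetical helper written for this example, not Rack::Chunked's actual implementation.

```ruby
# Wraps any #each body so that every part is framed as an
# HTTP/1.1 chunk: "<hex size>\r\n<data>\r\n", ending with "0\r\n\r\n".
def chunked(body)
  Enumerator.new do |y|
    body.each do |part|
      next if part.empty? # an empty chunk would signal end of body
      y << "#{part.bytesize.to_s(16)}\r\n#{part}\r\n"
    end
    y << "0\r\n\r\n" # terminating chunk
  end
end

encoded = chunked(["Hello, ", "world"]).to_a.join
```

Because every chunk carries its own size, the server can start sending before it knows the total length, which is what makes streaming possible without a Content-Length header.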
Answered by Matt Hucke
This solved my problem as well. I have gzipped CSV files that I want to send to the user as uncompressed CSV, so I read them a line at a time using a GzipReader.
These lines are also helpful if you're trying to deliver a big file as a download:
self.response.headers["Content-Type"] = "application/octet-stream"
self.response.headers["Content-Disposition"] = "attachment; filename=#{filename}"
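The answer does not show its reading code, so the following is only a sketch of how the line-at-a-time gzip approach might look. gunzip_lines is a hypothetical helper, and the compressed data is built in memory here so that the example is self-contained; in practice the input would be a File.

```ruby
require "zlib"
require "stringio"

# Lazily yields decompressed lines from a gzip-compressed IO.
def gunzip_lines(io)
  Enumerator.new do |y|
    gz = Zlib::GzipReader.new(io)
    begin
      gz.each_line { |line| y << line }
    ensure
      gz.close
    end
  end
end

# Build a small gzip payload in memory for demonstration purposes.
compressed = StringIO.new
writer = Zlib::GzipWriter.new(compressed)
writer.write("id,name\n1,alice\n2,bob\n")
writer.finish # flushes the gzip stream but leaves the StringIO open
compressed.rewind

lines = gunzip_lines(compressed).to_a
```

In a controller, the enumerator would be assigned directly, for example self.response_body = gunzip_lines(File.open(path, "rb")), where path is whatever location your application uses.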
Answered by Daniel Cadenas
Yes, response_body is the Rails 3 way of doing this for the moment: https://rails.lighthouseapp.com/projects/8994/tickets/4554-render-text-proc-regression
Answered by Martin
I commented on the Lighthouse ticket. I just wanted to say that the self.response_body = proc approach worked for me, though I needed to use Mongrel instead of WEBrick to succeed.
Martin
Answered by Yogesh Nachnani
Applying John's solution along with Exequiel's suggestion worked for me.
The statement
self.response.headers['Last-Modified'] = Time.now.to_s
marks the response as non-cacheable in Rack.
After investigating further, I figured one could also use this:
headers['Cache-Control'] = 'no-cache'
This, to me, is slightly more intuitive. It conveys the intent to anyone else who may be reading my code. Also, if a future version of Rack stops checking for Last-Modified, a lot of code may break, and it may take folks a while to figure out why.
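One way to picture why either header helps: caching middleware that would otherwise read the whole body can skip that work when the response is already marked. The predicate below is a rough illustration of that idea only. skip_buffering? is a hypothetical name, and this is not Rack's actual source.

```ruby
# Hypothetical check: should a caching layer leave the body untouched?
# True when the response already carries Last-Modified or is marked no-cache.
def skip_buffering?(headers)
  headers.key?("Last-Modified") ||
    headers["Cache-Control"].to_s.include?("no-cache")
end

with_last_modified = skip_buffering?("Last-Modified" => Time.now.to_s)
with_no_cache      = skip_buffering?("Cache-Control" => "no-cache")
with_neither       = skip_buffering?("Content-Type" => "text/csv")
```

Under this model both headers have the same effect, and Cache-Control: no-cache states the intent more directly.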

