Ruby on Rails: Bulk insert records into an Active Record table

Disclaimer: this content is from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/15317837/


Bulk Insert records into Active Record table

Tags: ruby-on-rails, activerecord, bulk-insert

Asked by Hyman R-G

I found that my Model.create! statements were taking a very long time to run when I added a large number of records at once. I looked at ActiveRecord-Import, but it didn't work with an array of hashes (which is what I have, and which I think is pretty common). How can I improve the performance?


Accepted answer by Hyman R-G

I started running into problems with large numbers of records (> 10000), so I modified the code to work in groups of 1000 records at a time. Here is a link to the new code:


https://gist.github.com/Hymanrg/76ade1724bd816292e4e

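The gist itself is not reproduced here, but the core idea of chunking a large record set into fixed-size groups before inserting can be sketched in plain Ruby with `each_slice`. The 1000-record group size matches the answer; the `bulk_insert` stand-in is illustrative, not the gist's actual code:

```ruby
# Sketch: process a large record set in groups of 1000 at a time.
# In real code, each group would go to a multi-row INSERT helper.
records = (1..2500).map { |i| { id: i } }

batches = []
records.each_slice(1000) do |group|
  batches << group # in real code: bulk_insert(group)
end

puts batches.map(&:size).inspect # => [1000, 1000, 500]
```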

Answered by Harish Shetty

Use the activerecord-import gem. Let us say you are reading a CSV file and generating a Product catalogue, and you want to insert records in batches of 1000:


batch, batch_size = [], 1_000
CSV.foreach("/data/new_products.csv", :headers => true) do |row|
  batch << Product.new(row)

  if batch.size >= batch_size
    Product.import batch
    batch = []
  end
end
Product.import batch
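The buffer-and-flush pattern above can be extracted into a small reusable helper. This is a plain-Ruby sketch; the `Batcher` class name and block-based flush are my own illustration, not part of activerecord-import:

```ruby
# A tiny batching helper: collects items and flushes them in
# fixed-size groups via the block you supply (e.g. Product.import).
class Batcher
  def initialize(size, &flush)
    @size, @flush, @buffer = size, flush, []
  end

  def <<(item)
    @buffer << item
    flush! if @buffer.size >= @size
  end

  # Flush any remaining items (call once after the loop).
  def flush!
    @flush.call(@buffer) unless @buffer.empty?
    @buffer = []
  end
end

flushed = []
batcher = Batcher.new(2) { |group| flushed << group.dup }
[1, 2, 3, 4, 5].each { |n| batcher << n }
batcher.flush! # don't forget the final partial group
puts flushed.inspect # => [[1, 2], [3, 4], [5]]
```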

Answered by Hyman R-G

Thanks to Chris Heald (@cheald) for his 2009 article, which showed me that the best way to go was a multi-row INSERT command.


I added the following code to my initializers/active_record.rb file, changed my Model.create!(...) calls to Model.import!(...), and away it goes. A couple of caveats:


1) It does not validate the data.
2) It uses the form of the SQL INSERT command that reads like ...


INSERT INTO <table> (field-1, field-2, ...) 
       VALUES (value-1-1, value-1-2, ...), (value-2-1, value-2-2, ...), ...

... which may not be the correct syntax for all databases, but it works with Postgres. It would not be difficult to alter the code to emit the appropriate syntax for your SQL dialect.


In my particular case, inserting 19K+ records into a simple table on my development machine (a MacBook Pro with 8GB RAM, a 2.4GHz Intel Core i5, and an SSD) went from 223 seconds using Model.create! to 7.2 seconds using Model.import!.


class ActiveRecord::Base

  # Bulk-insert an Array of Hashes as a single multi-row INSERT.
  # Note: this bypasses validations and callbacks entirely.
  def self.import!(record_list)
    raise ArgumentError, "record_list not an Array of Hashes" unless record_list.is_a?(Array) && record_list.all? { |rec| rec.is_a?(Hash) }
    key_list, value_list = convert_record_list(record_list)
    sql = "INSERT INTO #{self.table_name} (#{key_list.join(", ")}) VALUES #{value_list.map { |rec| "(#{rec.join(", ")})" }.join(", ")}"
    self.connection.insert_sql(sql)
  end

  # Collect the union of keys across all records, then quote each
  # record's values in that key order (missing keys become NULL).
  def self.convert_record_list(record_list)
    key_list = record_list.map(&:keys).flatten.uniq.sort

    value_list = record_list.map do |rec|
      key_list.map { |key| ActiveRecord::Base.connection.quote(rec[key]) }
    end

    [key_list, value_list]
  end
end
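To see what import! actually sends to the database, here is a database-free sketch of the same string construction. `toy_quote` is a simplified stand-in for `ActiveRecord::Base.connection.quote` that handles only strings, numbers, and nil; the `products` table name and record fields are made up for illustration:

```ruby
# Stand-in for connection.quote: strings get single quotes
# (embedded quotes doubled), numbers pass through, nil becomes NULL.
def toy_quote(value)
  case value
  when nil     then "NULL"
  when Numeric then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'"
  end
end

records = [{ name: "Widget", price: 5 }, { name: "Gadget" }]

# Union of keys across all records, in a stable order; a record
# that lacks a key contributes NULL for that column.
keys   = records.map(&:keys).flatten.uniq.sort_by(&:to_s)
values = records.map { |rec| keys.map { |k| toy_quote(rec[k]) } }

sql = "INSERT INTO products (#{keys.join(", ")}) " \
      "VALUES #{values.map { |rec| "(#{rec.join(", ")})" }.join(", ")}"
puts sql
# => INSERT INTO products (name, price) VALUES ('Widget', 5), ('Gadget', NULL)
```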

Answered by Luke

You can also use the activerecord-insert_many gem. Just build an array of hashes:


events = [{name: "Movie Night", time: "10:00"}, {name: "Tutoring", time: "7:00"}, ...]

Event.insert_many(events)

Answered by tvw

Using a transaction speeds up bulk inserts a lot!


Model.transaction do
  many.times { Model.create! }
end

If multiple models are involved, open a transaction for each affected model:


Model1.transaction do
  Model2.transaction do
    many.times do
      m1 = Model1.create!
      m1.add_model2
    end
  end
end