Ruby on Rails - Import Data from a CSV file
Original source: http://stackoverflow.com/questions/4410794/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by freshest
I would like to import data from a CSV file into an existing database table. I do not want to save the CSV file, just take the data from it and put it into the existing table. I am using Ruby 1.9.2 and Rails 3.
This is my table:
create_table "mouldings", :force => true do |t|
  t.string   "suppliers_code"
  t.datetime "created_at"
  t.datetime "updated_at"
  t.string   "name"
  t.integer  "supplier_id"
  t.decimal  "length", :precision => 3, :scale => 2
  t.decimal  "cost", :precision => 4, :scale => 2
  t.integer  "width"
  t.integer  "depth"
end
Can you give me some code that shows the best way to do this? Thanks.
Answered by yfeldblum
require 'csv'
csv_text = File.read('...')
csv = CSV.parse(csv_text, :headers => true)
csv.each do |row|
  Moulding.create!(row.to_hash)
end
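To make it clearer what Moulding.create! receives in the snippet above, here is a minimal sketch using only the standard csv library. The sample data is hypothetical; only the column names come from the question's table. With :headers => true, each row converts to a hash keyed by the header strings, and every value is still a string at this point.

```ruby
require 'csv'

# Hypothetical sample data; the header names mirror columns from the question.
csv_text = <<~DATA
  suppliers_code,name,cost
  A1,Oak,12.50
  B2,Pine,8.75
DATA

# Each CSV::Row becomes a plain Hash with string keys and string values.
rows = CSV.parse(csv_text, headers: true).map(&:to_hash)

rows.each { |attrs| puts attrs.keys.join(', ') }
```

Because the keys match the column names, each hash can be passed straight to create!, which is what the answer relies on.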
Answered by Tom De Leu
A simpler version of yfeldblum's answer that also works well with large files:
require 'csv'
CSV.foreach(filename, :headers => true) do |row|
  Moulding.create!(row.to_hash)
end
No need for with_indifferent_access or symbolize_keys, and no need to read in the file to a string first.
It doesn't keep the whole file in memory at once; it reads the file line by line and creates one Moulding per line.
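The line-by-line behaviour can be tried without a Rails app. This is only a sketch: the file contents are made up, a Tempfile stands in for the real CSV, and an array called imported stands in for the database.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical file contents, written to a temp file so the sketch is self-contained.
file = Tempfile.new(['mouldings', '.csv'])
file.write("suppliers_code,name,width\nA1,Oak,20\nB2,Pine,35\n")
file.close

imported = []
CSV.foreach(file.path, headers: true) do |row|
  # In a Rails app, Moulding.create!(row.to_hash) would go here.
  imported << row.to_hash
end
file.unlink
```

The block is called once per data row as the file is read, so memory use does not grow with the file size (apart from the imported array, which exists only for the demonstration).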
Answered by Tilo
The smarter_csv gem was specifically created for this use case: to read data from a CSV file and quickly create database entries.
require 'smarter_csv'
options = {}
SmarterCSV.process('input_file.csv', options) do |chunk|
  chunk.each do |data_hash|
    Moulding.create!( data_hash )
  end
end
You can use the chunk_size option to read N CSV rows at a time, and then use Resque in the inner loop to generate jobs that create the new records, rather than creating them right away. This way you can spread the load of generating entries across multiple workers.
See also: https://github.com/tilo/smarter_csv
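To illustrate the idea of processing N rows at a time, here is a rough approximation using only the standard library. It is not the smarter_csv API: it swaps in CSV.foreach plus each_slice, and the file contents and chunk size are hypothetical.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical input, written to a temp file so the sketch is self-contained.
file = Tempfile.new(['input_file', '.csv'])
file.write("name,width\nOak,20\nPine,35\nAsh,15\n")
file.close

chunk_size = 2 # hypothetical value

# Without a block, CSV.foreach returns an Enumerator over the rows.
rows = CSV.foreach(file.path, headers: true)
chunks = rows.each_slice(chunk_size).map { |slice| slice.map(&:to_hash) }
file.unlink
```

Each element of chunks is an array of up to chunk_size hashes, which is the unit of work you would hand to a background job.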
Answered by Seamus Abshere
You might try Upsert:
require 'upsert' # add this to your Gemfile
require 'csv'
u = Upsert.new Moulding.connection, Moulding.table_name
CSV.foreach(file, headers: true) do |row|
  selector = { name: row['name'] } # this treats "name" as the primary key and prevents the creation of duplicates by name
  setter = row.to_hash
  u.row selector, setter
end
If this is what you want, you might also consider getting rid of the auto-increment primary key from the table and setting the primary key to name. Alternatively, if some combination of attributes forms a primary key, use that as the selector. No index is necessary; one will just make it faster.
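The selector/setter idea can be pictured with a plain hash standing in for the table. This sketch does not use the upsert gem at all; the in-memory table and the sample data are assumptions made purely to show that a repeated name updates the existing entry instead of adding a second one.

```ruby
require 'csv'

# Hypothetical data: "Oak" appears twice with different costs.
csv_text = "name,cost\nOak,12.50\nPine,8.75\nOak,13.00\n"

table = {} # in-memory stand-in for the mouldings table, keyed by name
CSV.parse(csv_text, headers: true).each do |row|
  selector = row['name']   # which record to look for
  setter   = row.to_hash   # the values to write
  # Insert when the name is new, otherwise overwrite the existing values.
  table[selector] = (table[selector] || {}).merge(setter)
end
```

After the loop the table holds two entries, and the one for Oak carries the values from its last occurrence in the file.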
Answered by Kalyan Maddu
This may help. It has code examples too:
http://csv-mapper.rubyforge.org/
Or, for a rake task that does the same:
Answered by Lorem Ipsum Dolor
It is better to wrap the database-related work inside a transaction block. The code snippet below is the full process of seeding a set of languages into a Language model:
require 'csv'
namespace :lan do
  desc 'Seed initial languages data with language & code'
  task init_data: :environment do
    puts '>>> Initializing Languages Data Table'
    ActiveRecord::Base.transaction do
      csv_path = File.expand_path('languages.csv', File.dirname(__FILE__))
      csv_str = File.read(csv_path)
      csv = CSV.new(csv_str).to_a
      csv.each do |lan_set|
        lan_code = lan_set[0]
        lan_str = lan_set[1]
        Language.create!(language: lan_str, code: lan_code)
        print '.'
      end
    end
    puts ''
    puts '>>> Languages Database Table Initialization Completed'
  end
end
The snippet below is part of the languages.csv file:
aa,Afar
ab,Abkhazian
af,Afrikaans
ak,Akan
am,Amharic
ar,Arabic
as,Assamese
ay,Aymara
az,Azerbaijani
ba,Bashkir
...
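Since this file has no header row, each parsed row is just an array of two fields. A small sketch of that parsing step, using three rows from the sample above and no database, might look like this. The languages array is a stand-in for the Language.create! calls.

```ruby
require 'csv'

# First rows of the sample file shown above.
csv_str = "aa,Afar\nab,Abkhazian\naf,Afrikaans\n"

# Without headers, CSV.parse returns an array of arrays; the block
# parameters pick the code and the language name out of each row.
languages = CSV.parse(csv_str).map do |lan_code, lan_str|
  { code: lan_code, language: lan_str }
end
```

Each resulting hash has the same keys that the rake task passes to Language.create!.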
Answered by Michael Nera
Use this gem: https://rubygems.org/gems/active_record_importer
class Moulding < ActiveRecord::Base
  acts_as_importable
end
You can then use:
Moulding.import!(file: File.open(PATH_TO_FILE))
Just be sure that your headers match the column names of your table.
Answered by Ipsagel
A better way is to put it in a rake task. Create an import.rake file inside /lib/tasks/ and put this code in that file.
desc "Imports a CSV file into an ActiveRecord table"
task :csv_model_import, [:filename, :model] => [:environment] do |task,args|
  lines = File.new(args[:filename], "r:ISO-8859-1").readlines
  header = lines.shift.strip
  keys = header.split(',')
  lines.each do |line|
    values = line.strip.split(',')
    attributes = Hash[keys.zip values]
    Module.const_get(args[:model]).create(attributes)
  end
end
After that, run this command in your terminal: rake csv_model_import[file.csv,Name_of_the_Model]
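One caveat with splitting lines on commas by hand: a quoted field that itself contains a comma gets cut in two. The sketch below compares the manual split with CSV.parse_line from the standard library on a made-up line. It is offered as a possible adjustment, not as part of the original answer.

```ruby
require 'csv'

# Hypothetical line with a comma inside a quoted field.
line = 'A1,"Oak, dark stain",20'

naive  = line.strip.split(',')  # splits inside the quotes as well
parsed = CSV.parse_line(line)   # honours the quoting rules
```

Here naive ends up with four pieces while parsed has the three intended fields, so swapping split(',') for CSV.parse_line in the task would make it tolerate quoted values.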
Answered by Yaroslav
I know it's an old question, but it is still among the first 10 links on Google.
Saving rows one by one is not very efficient because it issues a database call on every loop iteration. You are better off avoiding that, especially when you need to insert large amounts of data.
It's better (and significantly faster) to use batch insert.
INSERT INTO `mouldings` (suppliers_code, name, cost)
VALUES
('s1', 'supplier1', 1.111),
('s2', 'supplier2', '2.222')
You can build such a query manually and then run Model.connection.execute(RAW SQL STRING) (not recommended), or use the activerecord-import gem (first released on 11 Aug 2010). In that case, just put the data in an array rows and call Model.import rows
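As a sketch of preparing data for a batch insert, the snippet below collects the CSV rows into an array of value arrays using only the standard library. The sample data and the chosen column list are hypothetical, and the final import call is left as a comment because it assumes activerecord-import is loaded and that Moulding.import accepts a column list followed by the rows.

```ruby
require 'csv'

# Hypothetical input with one more column than we want to insert.
csv_text = "suppliers_code,name,cost,width\ns1,supplier1,1.11,20\ns2,supplier2,2.22,35\n"

columns = %w[suppliers_code name cost]

# Pick only the wanted columns from each row, in a fixed order.
rows = CSV.parse(csv_text, headers: true).map { |row| row.fields(*columns) }

# With activerecord-import available (assumption), one call inserts everything:
# Moulding.import columns, rows
```

Collecting first and inserting once keeps the database round trips to a minimum, which is the point of the answer.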
Answered by ysk
It's better to use CSV::Table together with String#encode(universal_newline: true), which converts CRLF and CR to LF.
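A minimal sketch of that combination, assuming input with mixed line endings (the sample string is made up): the text is normalized first, then parsed with headers so that the result is a CSV::Table.

```ruby
require 'csv'

# Hypothetical input mixing CRLF, bare CR and LF line endings.
raw = "name,width\r\nOak,20\rPine,35\n"

normalized = raw.encode(universal_newline: true) # CRLF and CR become LF
table = CSV.parse(normalized, headers: true)     # a CSV::Table

names = table['name'] # column access by header
```

Normalizing up front avoids parse errors or merged rows when a file was exported on a different platform.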

