Ruby on Rails: Count the length (number of lines) of a CSV file?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/4662438/

Time: 2020-09-03 00:04:11  Source: igfitidea

Count the length (number of lines) of a CSV file?

Tags: ruby-on-rails, ruby, csv

Asked by Mathias

I have a form (Rails) which allows me to load a .csv file using the file_field. In the view:

    <% form_for(:upcsv, :html => {:multipart => true}) do |f| %>
    <table>
        <tr>
            <td><%= f.label("File:") %></td>
            <td><%= f.file_field(:filename) %></td>
        </tr>
    </table>
        <%= f.submit("Submit") %>
    <% end %>

Clicking Submit redirects me to another page (create.html.erb). The file was loaded fine, and I was able to read the contents just fine in this second page. I am trying to show the number of lines in the .csv file in this second page.

My controller (semi-pseudocode):

class UpcsvController < ApplicationController
    def index
    end

    def create
        file = params[:upcsv][:filename]
        ...
        #params[:upcsv][:file_length] = file.length # Show number of lines in the file
        #params[:upcsv][:file_length] = file.size
        ...
    end
end

Both file.length and file.size return '91' when my file contains only 7 lines. From the Rails documentation that I read, once the Submit button is clicked, Rails creates a temp file of the uploaded file, and params[:upcsv][:filename] contains the contents of the temp/uploaded file and not the path to the file. I don't know how to extract the number of lines in my original file. What is the correct way to get the number of lines in the file?

My create.html.erb:

<table>
    <tr>
        <td>File length:</td>
        <td><%= params[:upcsv][:file_length] %></td>
    </tr>
</table>

I'm really new at Rails (just started last week), so please bear with my stupid questions.

Thank you!

Update: apparently that number '91' is the number of individual characters (including the newline at the end of each line) in my file. Each line in my file has 12 digits + 1 newline = 13 characters, and 91 / 13 = 7.
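
To make the arithmetic in the update concrete, here is a minimal sketch. It assumes a StringIO as a stand-in for the uploaded file, and the method names (sample_upload, byte_size, line_count) are made up for illustration; they are not part of Rails.

```ruby
require 'stringio'

# Hypothetical stand-in for the uploaded file: 7 lines of 12 digits each.
def sample_upload
  StringIO.new("123456789012\n" * 7)
end

# Size of the content in bytes, which is what size/length report here.
def byte_size(io)
  io.size
end

# Number of lines, counted by iterating over the IO line by line.
def line_count(io)
  io.rewind
  io.each_line.count
end

puts byte_size(sample_upload)   # 91 (7 lines * 13 bytes)
puts line_count(sample_upload)  # 7
```

The size is a byte count, so it only equals lines * 13 because every line in this sample has the same width.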

Accepted answer by gicappa

Another way to read the number of lines is:

file.readlines.size

Answer by Jaco Pretorius

All of the solutions listed here actually load the entire file into memory in order to get the number of lines. If you're on a Unix-based system a much faster, easier and memory-efficient solution is:

`wc -l #{your_file_path}`.to_i
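
If shelling out is not an option (for example on a non-Unix host), a pure-Ruby alternative that also avoids loading the whole file could look like the sketch below. This is a swapped-in technique, not what the answer above proposes; count_lines is a hypothetical helper name and the temp file exists only for the demonstration.

```ruby
require 'tempfile'

# Count lines by streaming the file; File.foreach yields one line at a
# time, so the whole file is never held in memory at once.
def count_lines(path)
  File.foreach(path).count
end

Tempfile.create(['sample', '.csv']) do |f|
  f.write("a,b\n1,2\n3,4\n")
  f.flush
  puts count_lines(f.path) # 3
end
```

Note that wc -l counts newline characters, so the two approaches can differ by one when the last line has no trailing newline.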

Answer by roman

.length and .size are actually synonyms. To get the row count of the CSV file you have to actually parse it. Simply counting the newlines in the file won't work, because string fields in a CSV can contain line breaks. A simple way to get the row count would be:

CSV.read(params[:upcsv][:filename]).length
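
The point about line breaks inside fields can be seen with a small in-memory sample. This is only a sketch: SAMPLE_CSV and the two helper methods are invented for illustration, and CSV.parse is used instead of CSV.read so that no file is needed.

```ruby
require 'csv'

# Hypothetical sample: a header plus one record whose quoted field
# contains a line break.
SAMPLE_CSV = "id,note\n1,\"first line\nsecond line\"\n"

# Physical lines in the text, regardless of CSV quoting.
def physical_line_count(text)
  text.lines.count
end

# Rows as seen by the CSV parser (header row included).
def record_count(text)
  CSV.parse(text).length
end

puts physical_line_count(SAMPLE_CSV) # 3
puts record_count(SAMPLE_CSV)        # 2
```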

Answer by Taimoor Changaiz

CSV.foreach(file_path, headers: true).count

The above excludes the header row when counting rows.

CSV.read(file_path).count
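
The two calls in this answer give different numbers for a file with a header row. A minimal sketch, assuming a throwaway temp file and the hypothetical helper names data_row_count and total_row_count:

```ruby
require 'csv'
require 'tempfile'

# Data rows only: with headers: true the first row is consumed as the header.
def data_row_count(path)
  CSV.foreach(path, headers: true).count
end

# Every parsed row, header included.
def total_row_count(path)
  CSV.read(path).count
end

Tempfile.create(['people', '.csv']) do |f|
  f.write("name,age\nAda,36\nLinus,28\n")
  f.flush
  puts data_row_count(f.path)  # 2
  puts total_row_count(f.path) # 3
end
```

CSV.foreach streams the file row by row, while CSV.read builds the full array of rows first, so the first form is also the lighter one for large files.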

Answer by jamesdlivesinatree

your_csv.count should do the trick.

Answer by pcv

If your CSV file doesn't fit in memory (so you can't use readlines), you can do:

def self.line_count(f)
  i = 0
  CSV.foreach(f) {|_| i += 1}
  i
end

Unlike wc -l, this counts actual records, not lines. The two can differ if there are newlines inside field values.

Answer by boulder_ruby

Just to demonstrate what IO#readlines does:

If you had a file like this: "asdflkjasdlkfjsdakf\n asdfjljdaslkdfjlsadjfasdflkj\n asldfjksdjfa\n"

In Rails you'd do, say:

file = File.open(File.join(Rails.root, 'lib', 'file.json'))
lines_ary = IO.readlines(file)
lines_ary.count #=> 3

IO#readlines converts a file into an array of strings using the \n (newlines) as separators, much like commas so often do, so it's basically like

str.split(/\n/)

In fact, if you did

 x = file.read

this

 x.split(/\n/)

would do almost the same thing as file.readlines (the difference is that readlines keeps the trailing "\n" on each line).
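
A small comparison of the two approaches, as a sketch: it assumes a StringIO in place of a real file, and TEXT and the via_* helpers are hypothetical names. The element count is the same either way, but readlines keeps the line terminators unless chomp: true is passed.

```ruby
require 'stringio'

# Hypothetical three-line sample text.
TEXT = "alpha\nbeta\ngamma\n"

# readlines keeps the trailing "\n" on each element by default.
def via_readlines(text)
  StringIO.new(text).readlines
end

# With chomp: true the line terminators are stripped.
def via_readlines_chomp(text)
  StringIO.new(text).readlines(chomp: true)
end

# split drops the separators (and any trailing empty string).
def via_split(text)
  text.split(/\n/)
end

p via_readlines(TEXT)        # ["alpha\n", "beta\n", "gamma\n"]
p via_readlines_chomp(TEXT)  # ["alpha", "beta", "gamma"]
p via_split(TEXT)            # ["alpha", "beta", "gamma"]
```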

** IO#readlines can be really handy when dealing with files which have a repeating line structure ("child_id", "parent_ary", "child_id", "parent_ary",...) etc
