Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/7996629/

Date: 2020-09-09 21:09:25  Source: igfitidea

How do I read the Nth line of a file and print it to a new file?

Tags: bash, shell, unix

Asked by captain

I have a folder called foo. It contains other folders, which may contain subfolders and text files. I want to find every file whose name begins with year, read its Nth line, and print that line to a new file. For example, foo has a file called year1, and the subfolders have files called year2, year3, etc. The program should print the 1st line of year1 to a file called writeout, then the 2nd line of year2 to writeout, and so on.

I also don't really understand how to write a for loop over files.

So far I have:

#!/bin/bash

for year* in ~/foo
do
  # Here I tried writing some code using the sed command, but I can't think of anything else.
done

I also get a message in the terminal which says `year*' not a valid identifier. Any ideas?

Answered by shellter

Sed can help you.

Recall that sed will normally process all lines in a file AND print each line in the file.

You can turn off that feature, and have sed only print lines of interest by matching a pattern or line number.

So, to print the 2nd line of file2, you can run:

sed -n '2p' file2 > newFile2

To print the 2nd line and then stop processing, add the q (quit) command. You also need braces to group the two commands together:

sed -n '2{p;q;}' file2 > newFile2

(If you are processing large files, this can save a lot of time.)

To make that more general, replace the number with a variable that holds it:

  lineNo=3
  sed -n "${lineNo}{p;q;}" file3 > newFile3

If you want all of the extracted lines to go into one file, use the shell's append redirection (>>):

 for lineNo in 1 2 3 4 5 ; do
     sed -n  "${lineNo}{p;q;}" file${lineNo} >> aggregateFile
 done
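
The loop above can be tried end to end with throwaway data. The following is a minimal sketch, not part of the original answer: the helper name nth_line, the temporary directory, and the sample file contents are all made up for illustration.

```shell
#!/bin/bash
# nth_line is a hypothetical helper: print line $1 of file $2, then quit.
nth_line() {
    sed -n "${1}{p;q;}" "$2"
}

# Build three sample files (file1..file3), each with three lines.
tmp=$(mktemp -d)
for k in 1 2 3; do
    printf 'file%s line1\nfile%s line2\nfile%s line3\n' "$k" "$k" "$k" > "$tmp/file$k"
done

# Append line N of fileN to one aggregate file, as in the loop above.
for lineNo in 1 2 3; do
    nth_line "$lineNo" "$tmp/file${lineNo}" >> "$tmp/aggregateFile"
done

cat "$tmp/aggregateFile"   # shows line 1 of file1, line 2 of file2, line 3 of file3
rm -rf "$tmp"
```

Because >> appends, rerunning the loop against an existing aggregate file adds the lines again; truncate the file first if that is not wanted.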

The other answers, which use the output of find to drive the file list, are an excellent approach.

I hope this helps.

Answered by Karoly Horvath

Here is one way to do it:

awk "NR==$YEAR" "$file"
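
A possible variation, sketched here as an assumption rather than as the answerer's method: pass the line number to awk with -v instead of interpolating it into the program text, and exit once the line has been printed so the rest of a large file is not read. The function name awk_nth_line is hypothetical.

```shell
#!/bin/bash
# awk_nth_line is a hypothetical helper: print line $1 of file $2.
awk_nth_line() {
    # -v n=... hands the number to awk as a variable;
    # exit stops reading as soon as the wanted line has been printed.
    awk -v n="$1" 'NR == n { print; exit }' "$2"
}

# Usage with the variables from the one-liner above:
#   awk_nth_line "$YEAR" "$file"
```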

Answered by Emil Sit

Use find to locate the files you want, and then sed to extract the line you want:

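find foo -type f -name 'year*' |
while read file; do
    # Keep only the number at the end of the file name (year2 -> 2)
    line=$(echo $file | sed 's/.*year\([0-9]*\)$/\1/')
    # Print just that line, then quit
    sed -n -e "${line}{p;q;}" $file
done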

This approach:

  • Uses find to produce a list of files whose names start with the string "year".
  • Pipes the file list to a while loop to avoid long command lines.
  • Uses sed to extract the desired line number from the name of the file.
  • Uses sed to print just the desired line and then immediately quit. (You can leave out the q and just write ${line}p, which would work but be potentially less efficient if $file is big. Also, q may not be fully supported on all versions of sed.)

It will not work properly for files with spaces in their names though.
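
One way around that limitation, sketched under the assumption that bash and a find that supports -print0 (GNU and BSD find do) are available: separate file names with NUL bytes and quote every expansion. The function name print_year_lines is hypothetical, and the shell's ${file##*year} expansion is swapped in for the sed call that extracts the number.

```shell
#!/bin/bash
# print_year_lines is a hypothetical helper: for every file named yearN
# under directory $1, print line N of that file.
print_year_lines() {
    find "$1" -type f -name 'year*' -print0 |
    while IFS= read -r -d '' file; do
        # Strip everything up to the last "year", leaving the suffix.
        n=${file##*year}
        # Skip names whose suffix is empty or not purely digits.
        case $n in
            ''|*[!0-9]*) continue ;;
        esac
        # Line numbers start at 1; skip a suffix of 0.
        [ "$n" -gt 0 ] || continue
        sed -n "${n}{p;q;}" "$file"
    done
}

# Usage: print_year_lines foo > writeout
```

The order of the output follows the order in which find visits the files, which is not guaranteed to be year1, year2, year3.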

Answered by Okx

One approach that works, provided you pass two arguments (a line number and a file name):

$ touch myfile
$ touch mycommand
$ chmod +x mycommand
$ touch yearfiles
$ find / -type f -name 'year*' >> yearfiles
$ nano mycommand
$ touch foo

Type this:

#!/bin/bash
# $1 is the line number, $2 is the file name
head -n "$1" "$2" >> myfile
tail -n 1 myfile >> foo

Use ^X, y, and enter to save. Then run mycommand:

$ ./mycommand 2 yearfiles
$ cat foo
year2

Presuming your yearfiles are:

year1, year2, year3

Additionally, now that this is set up, you just have to run $ ./mycommand LINENUMBER FILENAME from now on.

Answered by Karol Król

Here you go

sed "${index}q;d" "${input_file}" > "${output_file}"
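
The one-liner above is terse, so here is a short sketch of why it selects exactly one line. The function name qd_nth_line is made up for illustration; the sed script itself is the one from the answer.

```shell
#!/bin/bash
# How "Nq;d" works:
#   - on lines before N, the address N does not match, so only d runs
#     and the line is deleted without being printed;
#   - on line N, q runs first: sed prints the current line (auto-print
#     is on because -n is not given) and exits, so d is never reached.
qd_nth_line() {
    # $1 = line number, $2 = input file (hypothetical parameter roles)
    sed "${1}q;d" "$2"
}

printf 'one\ntwo\nthree\n' | sed '2q;d'   # prints: two
```

If the file has fewer than N lines, every line is deleted and nothing is printed.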

Answered by parmeet

1. time head -5 emp.lst | tail -1
Execution time:
real 0m0.004s
user 0m0.001s
sys 0m0.001s

or

2. awk 'NR==5' emp.lst
Execution time:
real 0m0.003s
user 0m0.000s
sys 0m0.002s

or

3. sed -n '5p' emp.lst
Execution time:
real 0m0.001s
user 0m0.000s
sys 0m0.001s

or

4. Using a trick with the cut command (press Enter after the opening quote of -d, so that the delimiter is a newline):
cut -d '
' -f 5 emp.lst
Execution time:
real 0m0.001s

Answered by thiton

Your task has two sub-tasks: Find the name of all the year files, and then extract the Nth line. Consider the following script:

for file in $(find foo -name 'year*'); do
     YEAR=$(echo "$file" | sed -e 's/.*year\([0-9]*\)$/\1/')
     head -n "$YEAR" "$file" | tail -n 1
done

The find call finds the matching files for you in the directory foo. The second line extracts only the digits at the end of the filename from the filename. The third line then extracts the first N lines from the file, keeping only the last of the first N lines (read: only the Nth line).
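
One caveat with the head and tail pipeline: if the file has fewer than N lines, head returns the whole file and tail then prints its last line, which is not the Nth line. A minimal sketch of a guard follows; the function name nth_line_checked is hypothetical, and N is assumed to be a positive integer.

```shell
#!/bin/bash
# nth_line_checked is a hypothetical helper: print line $1 of file $2,
# or print nothing and return 1 if the file has fewer than $1 lines.
nth_line_checked() {
    local n=$1 file=$2 total
    # wc -l counts newline characters, so a final line without a
    # trailing newline is not counted. $(( )) strips any padding.
    total=$(( $(wc -l < "$file") ))
    [ "$n" -le "$total" ] || return 1
    # First N lines, then only the last of them.
    head -n "$n" "$file" | tail -n 1
}

# Usage with the variables from the script above:
#   nth_line_checked "$YEAR" "$file"
```

The sed -n "${N}{p;q;}" form from the other answers does not need this guard, since it prints nothing when line N does not exist.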
