bash: How do I read the Nth line of a file and print it to a new file?
Note: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Original question: http://stackoverflow.com/questions/7996629/
How do I read the Nth line of a file and print it to a new file?
Asked by captain
I have a folder called foo. Foo has some other folders, which might have subfolders and text files. I want to find every file whose name begins with year, read its Nth line, and print it to a new file. For example, foo has a file called year1 and the subfolders have files called year2, year3, etc. The program will print the 1st line of year1 to a file called writeout, then it will print the 2nd line of year2 to the file writeout, and so on.
I also didn't really understand how to do a for loop for a file.
So far I have:
#!/bin/bash
for year* in ~/foo
do
Here I tried writing some code using the sed command but I can't think of something else.
done
I also get a message in the terminal which says `year*' not a valid identifier. Any ideas?
Answered by shellter
Sed can help you.
Recall that sed will normally process all lines in a file AND print each line in the file.
You can turn off that feature, and have sed only print lines of interest by matching a pattern or line number.
So, to print the 2nd line of file2, you can say
sed -n '2p' file2 > newFile2
To print the 2nd line and then stop processing add the q (for quit) command (you also need braces to group the 2 commands together), i.e.
sed -n '2{p;q;}' file2 > newFile2
(if you are processing large files, this can be quite a time saving).
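As a quick check of the two forms above, here is a minimal sketch that runs both against a small throwaway file. The names demo_dir and sample.txt are made up for this illustration; both commands should select the same line, and only the second stops reading once it has printed.

```shell
#!/bin/bash
# Sketch: both sed forms select the same line; only the second stops early.
# demo_dir and sample.txt are hypothetical names used for this illustration.
demo_dir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$demo_dir/sample.txt"

# Prints line 2, but sed still reads the rest of the file.
with_p=$(sed -n '2p' "$demo_dir/sample.txt")

# Prints line 2 and quits immediately afterwards.
with_pq=$(sed -n '2{p;q;}' "$demo_dir/sample.txt")

echo "2p      -> $with_p"
echo "2{p;q;} -> $with_pq"

# Remove the scratch directory again.
rm -rf "$demo_dir"
```

On a three-line file the difference is invisible; it only matters when the wanted line is near the top of a very large file.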
To make that more general, you can change the number to a variable that will hold a number, i.e.
lineNo=3
sed -n "${lineNo}{p;q;}" file3 > newFile3
If you want all of your sliced lines to go into 1 file, then use the shell's append redirection (>>), i.e.
for lineNo in 1 2 3 4 5 ; do
sed -n "${lineNo}{p;q;}" file${lineNo} >> aggregateFile
done
The other answers, which use the results of find ... to drive your file list, are an excellent approach.
I hope this helps.
Answered by Karoly Horvath
Here is one way to do it:
awk "NR==$YEAR" $file
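The one-liner above splices the shell variable directly into the awk program text. A variation, offered only as a sketch, passes the number in with -v and exits as soon as the line has been printed. The function name nth_line_awk is hypothetical, not something from the answer.

```shell
#!/bin/bash
# nth_line_awk is a hypothetical helper name, not part of the original answer.
nth_line_awk() {
    local n=$1 file=$2
    # -v hands the shell value to awk as the variable n;
    # exit stops awk as soon as the wanted line has been printed.
    awk -v n="$n" 'NR == n { print; exit }' "$file"
}

# Small demonstration on a throwaway file.
tmp=$(mktemp)
printf '%s\n' first second third > "$tmp"
nth_line_awk 3 "$tmp"
rm -f "$tmp"
```

Quoting "$file" also keeps file names with spaces intact, which the unquoted form does not.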
Answered by Emil Sit
Use find to locate the files you want, and then sed to extract what you want:
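# Quote the pattern so the shell does not expand year* before find sees it.
find foo -type f -name 'year*' |
while read file; do
    # Keep only the digits at the end of the name (\1 is the captured group).
    line=$(echo $file | sed 's/.*year\([0-9]*\)$/\1/')
    # Print that line number, then quit.
    sed -n -e "${line}{p;q;}" $file
done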
This approach:
- Uses find to produce a list of files with a name starting with the string "year".
- Pipes the file list to a while loop to avoid long command lines.
- Uses sed to extract the desired line number from the name of the file.
- Uses sed to print just the desired line and then immediately quit. (You can leave out the q and just write ${line}p, which would work but be potentially less efficient if $file is big. Also, q may not be fully supported on all versions of sed.)
It will not work properly for files with spaces in their names though.
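The trouble with spaces comes from reading names line by line and expanding them unquoted. One possible way around it, sketched here under the assumption that bash is the shell and that find supports -print0 (GNU and BSD find do), is to separate names with NUL bytes. The function name print_year_lines is hypothetical, and the digits are taken with a parameter expansion instead of sed.

```shell
#!/bin/bash
# Sketch only: assumes bash and a find that supports -print0.
# print_year_lines is a hypothetical name; it takes the directory to search.
print_year_lines() {
    local dir=$1 file line
    # -print0 and read -d '' separate names with NUL bytes, so spaces survive.
    find "$dir" -type f -name 'year*' -print0 |
    while IFS= read -r -d '' file; do
        # Strip everything up to the last "year" to leave the trailing digits.
        line=${file##*year}
        # Skip names that do not end in a plain number.
        case $line in ''|*[!0-9]*) continue ;; esac
        # Print that line number of the file, then stop reading it.
        sed -n "${line}{p;q;}" "$file"
    done
}
```

It could then be called as print_year_lines foo > writeout. The order of the output follows whatever order find reports the files in.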
Answered by Okx
The best way that always works, provided you provide 2 arguments:
$ touch myfile
$ touch mycommand
$ chmod +x mycommand
$ touch yearfiles
$ find / -type f -name 'year*' >> yearfiles
$ nano mycommand
$ touch foo
Type this:
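#!/bin/bash
# $1 is the line number, $2 is the file to read from
head -n "$1" "$2" >> myfile
# the last line appended to myfile is line $1 of the input
tail -n 1 myfile >> foo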
Use ^X, y, and Enter to save. Then run mycommand:
$ ./mycommand 2 yearfiles
$ cat foo
year2
Presuming your year files are:
year1, year2, year3
Additionally, now that you have set this up, you just have to use $ ./mycommand LINENUMBER FILENAME from now on.
Answered by Karol Król
Here you go
sed "${index}q;d" "${input_file}" > "${output_file}"
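For readers wondering why that works without -n, here is a small sketch with the two commands explained in comments. The wrapper name nth_line_sed is hypothetical; index and input_file mirror the variable names in the command above.

```shell
#!/bin/bash
# Sketch of how "Nq;d" selects one line. nth_line_sed is a hypothetical name.
nth_line_sed() {
    local index=$1 input_file=$2
    # Before line $index, "q" does not apply and "d" deletes the line unprinted.
    # On line $index, "q" prints the current line (no -n is given) and exits.
    sed "${index}q;d" "$input_file"
}

# Small demonstration on a throwaway file.
tmp=$(mktemp)
printf '%s\n' one two three four > "$tmp"
nth_line_sed 3 "$tmp"
rm -f "$tmp"
```

Because sed exits at the chosen line, nothing after it is read.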
Answered by parmeet
1. time head -5 emp.lst | tail -1
Execution time:
real 0m0.004s
user 0m0.001s
sys 0m0.001s
or
2. awk 'NR==5' emp.lst
Execution time:
real 0m0.003s
user 0m0.000s
sys 0m0.002s
or
3. sed -n '5p' emp.lst
Execution time:
real 0m0.001s
user 0m0.000s
sys 0m0.001s
or
4. Using a cute trick, we can also get this with the cut command:
cut -d "
" -f 5 emp.lst
# after -d, open the quote and press Enter, so that the delimiter is a newline
Execution time:
real 0m0.001s
Answered by thiton
Your task has two sub-tasks: Find the name of all the year files, and then extract the Nth line. Consider the following script:
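for file in $(find foo -name 'year*'); do
    # Keep only the digits at the end of the file name (\1 is the captured group).
    YEAR=$(echo "$file" | sed -e 's/.*year\([0-9]*\)$/\1/')
    # Take the first $YEAR lines, then keep only the last of them.
    head -n "$YEAR" "$file" | tail -n 1
done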
The find call finds the matching files for you in the directory foo. The second line extracts only the digits at the end of the filename. The third line then extracts the first N lines from the file and keeps only the last of them (read: only the Nth line).
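One thing to watch with the head and tail combination: if a file has fewer than N lines, head returns the whole file and tail then prints its last line, which is not the Nth. A guarded variant might look like the sketch below. The function name nth_line_head_tail is hypothetical, and the check relies on wc -l, which counts newline characters.

```shell
#!/bin/bash
# nth_line_head_tail is a hypothetical helper name used for illustration.
nth_line_head_tail() {
    local n=$1 file=$2 total
    # Count lines first: head | tail alone would return the last line
    # of a file that has fewer than n lines.
    # wc -l counts newlines, so an unterminated final line is not counted.
    total=$(( $(wc -l < "$file") ))
    if [ "$n" -ge 1 ] && [ "$n" -le "$total" ]; then
        head -n "$n" "$file" | tail -n 1
    fi
}
```

Called as nth_line_head_tail 2 year2, it prints the second line of year2, and it prints nothing at all when the file is too short.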