How can I remove the first line of a text file using a bash/sed script?

Question

Asked by Brent

I need to repeatedly remove the first line from a huge text file using a bash script.

Right now I am using sed -i -e "1d" $FILE, but the deletion takes around a minute.

Is there a more efficient way to accomplish this?

Answer 1

Answered by Aaron Digulla

Try tail:

tail -n +2 "$FILE"

-n x: Just print the last x lines. tail -n 5 would give you the last 5 lines of the input. The + sign inverts the argument and makes tail print everything except the first x-1 lines. tail -n +1 would print the whole file, tail -n +2 everything but the first line, and so on.
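
As a quick check of the +N semantics described above, here is a small sketch on a throwaway four-line input. The sample lines and variable names are made up for illustration.

```shell
# Four-line sample input, invented for this illustration.
sample=$(printf 'one\ntwo\nthree\nfour\n')

# Without '+': only the last 2 lines.
last_two=$(printf '%s\n' "$sample" | tail -n 2)

# With '+': output starts at line 2, so the first line is skipped.
from_second=$(printf '%s\n' "$sample" | tail -n +2)

# '+1' starts at line 1, so the whole input is printed.
whole=$(printf '%s\n' "$sample" | tail -n +1)

printf '%s\n---\n%s\n---\n%s\n' "$last_two" "$from_second" "$whole"
```

Here last_two holds "three" and "four", from_second holds "two" through "four", and whole is identical to the input.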

GNU tail is much faster than sed. tail is also available on BSD, and the -n +2 flag is consistent across both implementations. Check the FreeBSD or OS X man pages for more.

The BSD version can be much slower than sed, though. I wonder how they managed that; tail should just read a file line by line, while sed does fairly complex work such as interpreting a script and applying regular expressions.

Note: You may be tempted to use

# THIS WILL GIVE YOU AN EMPTY FILE!
tail -n +2 "$FILE" > "$FILE"

but this will give you an empty file. The reason is that the redirection (>) is set up by the shell before tail is invoked:

Shell truncates file $FILE
Shell creates a new process for tail
Shell redirects stdout of the tail process to $FILE
tail reads from the now empty $FILE
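
The truncation described above is easy to reproduce safely. This is a minimal sketch that works on a scratch file inside a temporary directory; the file name and contents are placeholders.

```shell
# Work in a private temp directory so nothing real is touched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
FILE="$workdir/demo.txt"
printf 'first\nsecond\nthird\n' > "$FILE"

# The shell opens "$FILE" for writing (truncating it) before tail runs,
# so tail finds nothing left to read.
tail -n +2 "$FILE" > "$FILE"

# Arithmetic expansion strips any padding that wc may add.
size_after=$(($(wc -c < "$FILE")))
echo "bytes left: $size_after"
```

The file ends up with zero bytes even though it had three lines before the command ran.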

If you want to remove the first line inside the file, you should use:

tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"

The && makes sure that the file is not overwritten when there is a problem: mv only runs if tail exits successfully.
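
If this is done repeatedly, the temp-file pattern above could be wrapped in a small function. The sketch below is one possible shape, not part of the original answer: the name drop_first_line is hypothetical, and it assumes mktemp is available. Note that mktemp creates the new file with restrictive permissions, so the original file's mode and ownership are not preserved.

```shell
# drop_first_line FILE: hypothetical helper, shown for illustration only.
drop_first_line() {
    file=$1
    # Create the temp file next to the original so that mv is a rename
    # on the same filesystem rather than a cross-device copy.
    tmp=$(mktemp "$file.XXXXXX") || return 1
    if tail -n +2 "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        # Leave the original untouched and clean up the partial copy.
        rm -f "$tmp"
        return 1
    fi
}

# Usage on a scratch file.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
FILE="$workdir/data.txt"
printf 'header\nrow1\nrow2\n' > "$FILE"
drop_first_line "$FILE"
cat "$FILE"
```

After the call, the scratch file contains only row1 and row2.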

Answer 2

Answered by amit

You can use -i to update the file without using the '>' operator. The following command deletes the first line and saves the result back to the same file.

sed -i '1d' filename
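
One caveat worth sketching: -i is not specified by POSIX, and its argument handling differs between implementations. GNU sed accepts a bare -i, while BSD/macOS sed expects a backup suffix after it. Attaching a suffix directly to the option should be accepted by both. The file names below are placeholders.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
FILE="$workdir/data.txt"
printf 'header\nrow1\nrow2\n' > "$FILE"

# -i.bak edits in place and keeps the original as data.txt.bak;
# the backup is removed only if sed succeeded.
sed -i.bak '1d' "$FILE" && rm -f "$FILE.bak"

cat "$FILE"
```

sed still rewrites the whole file behind the scenes, so this is a convenience rather than a speed-up.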

Answer 3

Answered by Nasri Najib

For those on SunOS, whose sed is not the GNU version, the following code will help:

sed '1d' test.dat > tmp.dat && mv tmp.dat test.dat

Answer 4

Answered by paxdiablo

No, that's about as efficient as you're going to get. You could write a C program which could do the job a little faster (less startup time and processing arguments) but it will probably tend towards the same speed as sed as files get large (and I assume they're large if it's taking a minute).

But your question suffers from the same problem as so many others in that it presupposes the solution. If you were to tell us in detail what you're trying to do rather than how, we may be able to suggest a better option.

For example, if this is a file A that some other program B processes, one solution would be to not strip off the first line, but modify program B to process it differently.

Let's say all your programs append to this file A and program B currently reads and processes the first line before deleting it.

You could re-engineer program B so that it doesn't try to delete the first line but instead maintains a persistent (probably file-based) offset into file A so that, next time it runs, it can seek to that offset, process the line there, and update the offset.

Then, at a quiet time (midnight?), it could do special processing of file A to delete all lines currently processed and set the offset back to 0.

It will certainly be faster for a program to open a file and seek than to open and rewrite it. This discussion assumes you have control over program B, of course. I don't know if that's the case, but there may be other possible solutions if you provide further information.
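
The offset idea can be sketched in shell as well. Everything here is an assumption made for illustration: the function names next_line and compact, the state file holding a single byte offset, and the requirement that every line in the data file ends with a newline and that nothing appends to the file while compact runs.

```shell
# next_line DATA STATE: print the next unprocessed line of DATA and store
# the byte offset of the line after it in STATE.
next_line() {
    data=$1
    state=$2
    offset=0
    [ -f "$state" ] && offset=$(cat "$state")
    size=$(($(wc -c < "$data")))
    # Nothing new has been appended since the last call.
    [ "$offset" -lt "$size" ] || return 1
    # tail -c +K starts output at byte K (1-based), hence offset + 1.
    line=$(tail -c "+$((offset + 1))" "$data" | head -n 1)
    printf '%s\n' "$line"
    # Advance past the line's bytes plus its trailing newline.
    bytes=$(printf '%s' "$line" | wc -c)
    echo "$((offset + bytes + 1))" > "$state"
}

# compact DATA STATE: drop everything already processed, reset the offset.
compact() {
    data=$1
    state=$2
    offset=0
    [ -f "$state" ] && offset=$(cat "$state")
    tmp=$(mktemp "$data.XXXXXX") || return 1
    tail -c "+$((offset + 1))" "$data" > "$tmp" &&
        mv "$tmp" "$data" &&
        echo 0 > "$state"
}

# Usage on a scratch queue file.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf 'job1\njob2\njob3\n' > "$workdir/queue"
first=$(next_line "$workdir/queue" "$workdir/offset")
second=$(next_line "$workdir/queue" "$workdir/offset")
compact "$workdir/queue" "$workdir/offset"
remaining=$(cat "$workdir/queue")
echo "$first $second / left in file: $remaining"
```

Each call to next_line reads only from the stored offset onward, and the expensive rewrite happens once in compact instead of once per line.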

Answer 5

Answered by alexis

You can edit the files in place: just use perl's -i flag, like this:

perl -ni -e 'print unless $. == 1' filename.txt

This makes the first line disappear, as you ask. Perl will need to read and copy the entire file, but it arranges for the output to be saved under the name of the original file.

Answer 6

Answered by Robert Gamble

As Pax said, you probably aren't going to get any faster than this. The reason is that almost no filesystems support truncating from the beginning of a file, so this is going to be an O(n) operation where n is the size of the file. What you can do much faster, though, is overwrite the first line with the same number of bytes (maybe with spaces or a comment), which might work for you depending on exactly what you are trying to do (what is that, by the way?).
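
One way the overwrite idea might look in shell is sketched below, using dd with conv=notrunc to write over the start of the file without truncating it. It assumes the first line ends with a newline, and that whatever reads the file can tolerate a line of spaces. The file name and contents are placeholders.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
FILE="$workdir/data.txt"
printf 'header line\nrow1\nrow2\n' > "$FILE"
size_before=$(($(wc -c < "$FILE")))

# Byte length of the first line, not counting its newline.
len=$(($(head -n 1 "$FILE" | wc -c) - 1))

# Write that many spaces at the start of the file. conv=notrunc tells dd
# not to truncate the output file, so the remaining bytes stay untouched.
printf '%*s' "$len" '' | dd of="$FILE" conv=notrunc 2>/dev/null

size_after=$(($(wc -c < "$FILE")))
second_line=$(sed -n '2p' "$FILE")
echo "size before: $size_before, size after: $size_after"
```

Only the bytes of the first line are written, so the cost depends on the length of that line rather than on the size of the file.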

Answer 7

Answered by agc

The sponge utility (part of the moreutils package) avoids the need to juggle a temp file:

tail -n +2 "$FILE" | sponge "$FILE"

Answer 8

Answered by Ingo Baab

You can easily do this with:

cat filename | sed 1d > filename_without_first_line

on the command line; or, to remove the first line of a file permanently, use sed's in-place mode with the -i flag:

sed -i 1d filename

Answer 9

Answered by Mark Reed

If you want to modify the file in place, you could always use the original ed instead of its streaming successor sed:

ed "$FILE" <<<$'1d\nwq\n'

The ed command was the original UNIX text editor, from before there were even full-screen terminals, much less graphical workstations. The ex editor, best known as what you're using when typing at the colon prompt in vi, is an extended version of ed, so many of the same commands work. While ed is meant to be used interactively, it can also be used in batch mode by sending a string of commands to it, which is what this solution does.

The sequence <<<$'1d\nwq\n' takes advantage of Bash's support for here-strings (<<<) and ANSI-C quoting ($'...') to feed the ed command input consisting of two lines: 1d, which deletes line 1, and then wq, which writes the file back out to disk and then quits the editing session.
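
Here-strings are a Bash feature, so for other shells the same two commands can be piped in with printf instead. This is a sketch under the assumption that ed is installed, which minimal systems do not always guarantee; the guard and file names are illustrative only.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
FILE="$workdir/data.txt"
printf 'header\nrow1\nrow2\n' > "$FILE"

if command -v ed >/dev/null 2>&1; then
    # -s suppresses ed's byte-count output; the commands are the same two
    # lines as in the here-string version: delete line 1, write and quit.
    printf '1d\nwq\n' | ed -s "$FILE"
else
    echo "ed is not installed; skipping" >&2
fi

cat "$FILE"
```

As with the other in-place approaches, ed still has to read the whole file and write it back out.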

Answer 10

Answered by serup

This should show all lines except the first:

cat textfile.txt | tail -n +2
