bash: remove a specific line from a file without using sed or awk

Question

Asked by user2773624

I need to remove a specific line number from a file using a bash script.

I get the line number from the grep command with the -n option.

I cannot use sed for a variety of reasons, not the least of which is that it is not installed on all the systems this script needs to run on, and installing it is not an option.

awk is out of the question because in testing, on different machines with different UNIX/Linux OS's (RHEL, SunOS, Solaris, Ubuntu, etc.), it gives (sometimes wildly) different results on each. So, no awk.

The file in question is just a flat text file, with one record per line, so nothing fancy needs to be done except removing the line by number.

If at all possible, I need to avoid doing something like extracting the contents of the file, not including the line I want gone, and then overwriting the original file.

Answer 1

Answered by Digital Trauma

Since you have grep, the obvious thing to do is:

$ grep -v "line to remove" file.txt > /tmp/tmp
$ mv /tmp/tmp file.txt
$
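
Because the question starts from a line number rather than a pattern, the same temporary-file approach can be keyed on the number instead. This is only a sketch: the function name remove_line_by_number and the ".tmp.$$" scratch name are made up for illustration, and it assumes cut is available alongside grep.

```shell
#!/bin/sh
# Hypothetical helper: remove line N from FILE using only grep, cut and mv.
remove_line_by_number() {
    n=$1
    file=$2
    tmp="$file.tmp.$$"    # assumed scratch name next to the original
    # grep -n '' prefixes every line with "NUMBER:"; the second grep drops
    # the line numbered n; cut strips the prefix again.
    grep -n '' "$file" | grep -v "^$n:" | cut -d: -f2- > "$tmp" &&
        mv "$tmp" "$file"
}
```

It would be called as: remove_line_by_number 2 file.txt. Anchoring the second grep on "^$n:" keeps line 1 from also matching lines 10 to 19.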

But it sounds like you don't want to use any temporary files. I assume the input file is large and this is an embedded system where memory and storage are in short supply. Ideally you need a solution that edits the file in place. I think this might be possible with dd, but I haven't figured it out yet :(

Update: I figured out how to edit the file in place with dd. grep, head, cut and wc are also needed. If these are not available, they can probably be worked around for the most part:

#!/bin/bash
# usage: grepremove PATTERN FILE

# get the line number to remove (first line matching the pattern)
rline=$(grep -n "$1" "$2" | head -n1 | cut -d: -f1)
# stop if the pattern does not match anything
[ -n "$rline" ] || exit 1
# number of bytes before the line to be removed
hbytes=$(head -n$((rline-1)) "$2" | wc -c)
# number of bytes to remove (first matching line only, newline included)
rbytes=$(grep "$1" "$2" | head -n1 | wc -c)
# original file size
fsize=$(wc -c < "$2")
# dd will start reading the file after the line to be removed
ddskip=$((hbytes + rbytes))
# dd will start writing at the beginning of the line to be removed
ddseek=$hbytes
# dd will move this many bytes
ddcount=$((fsize - hbytes - rbytes))
# the expected new file size
newsize=$((fsize - rbytes))
# move the bytes with dd.  strace confirms the file is edited in place
dd bs=1 if="$2" skip=$ddskip seek=$ddseek conv=notrunc count=$ddcount of="$2"
# truncate the leftover bytes at the end of the file
dd bs=1 if="$2" skip=$newsize seek=$newsize count=0 of="$2"

Run it thusly:

$ cat > file.txt
line 1
line two
line 3
$ ./grepremove "tw" file.txt
7+0 records in
7+0 records out
0+0 records in
0+0 records out
$ cat file.txt
line 1
line 3
$
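
Since the question supplies a line number, the same byte-shifting idea can be driven by the number directly. The sketch below is an adaptation, not the script above: the function name remove_line_dd is hypothetical, it assumes every line (including the last) ends with a newline, and it truncates by reading from /dev/null instead of from the file itself.

```shell
#!/bin/sh
# Hypothetical helper: remove line N from FILE in place with head, tail, wc and dd.
remove_line_dd() {
    n=$1
    file=$2
    total=$(( $(wc -l < "$file") ))
    # refuse line numbers outside the file
    [ "$n" -ge 1 ] && [ "$n" -le "$total" ] || return 1
    upto=$(head -n "$n" "$file" | wc -c)                 # bytes up to the end of line n
    rbytes=$(head -n "$n" "$file" | tail -n 1 | wc -c)   # bytes in line n itself
    hbytes=$((upto - rbytes))                            # bytes before line n
    fsize=$(wc -c < "$file")
    # shift everything after line n down over it, without truncating
    dd bs=1 if="$file" of="$file" conv=notrunc \
        skip=$((hbytes + rbytes)) seek=$((hbytes)) \
        count=$((fsize - hbytes - rbytes)) 2>/dev/null
    # cut off the leftover bytes at the end (no conv=notrunc here)
    dd bs=1 if=/dev/null of="$file" seek=$((fsize - rbytes)) count=0 2>/dev/null
}
```

It would be called as: remove_line_dd 2 file.txt. With bs=1 the copy is slow on large files; a larger block size would need the offsets converted to blocks, which this sketch does not attempt.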

Suffice it to say that dd is a very dangerous tool. You can easily overwrite files or entire disks unintentionally. Be very careful!

Answer 2

Answered by iruvar

Try ed. The here-document-based example below deletes line 2 from test.txt:

ed -s test.txt <<!
2d
w
!
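
In a script the line number usually arrives in a variable. One way to feed it to ed is sketched below; the wrapper name remove_line_ed is hypothetical, and it only works where ed is actually installed. Note that ed implementations typically keep their buffer in a scratch file of their own, so this avoids managing a temporary file yourself rather than avoiding one entirely.

```shell
#!/bin/sh
# Hypothetical wrapper: delete line N of FILE with ed (ed must be installed).
remove_line_ed() {
    n=$1
    file=$2
    # "Nd" deletes line N, "w" writes the buffer back, "q" quits
    printf '%s\n' "${n}d" w q | ed -s "$file"
}
```

It would be called as: remove_line_ed 2 test.txt.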

Answer 3

Answered by kojiro

If n is the line you want to omit:

{
  head -n $(( n-1 )) file
  tail -n +$(( n+1 )) file
} > newfile
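
To end up with the original file name again, the new file still has to replace the old one. A minimal sketch of the whole round trip follows; the function name remove_line_head_tail and the ".new.$$" scratch name are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: drop line N of FILE via head and tail, then swap files.
remove_line_head_tail() {
    n=$1
    file=$2
    tmp="$file.new.$$"    # assumed scratch name
    {
        # some head implementations reject a count of 0, so skip head for n=1
        [ "$n" -gt 1 ] && head -n "$((n - 1))" "$file"
        tail -n "+$((n + 1))" "$file"
    } > "$tmp" && mv "$tmp" "$file"
}
```

It would be called as: remove_line_head_tail 2 file. This is exactly the extract-and-overwrite pattern the question hoped to avoid, so it only fits when a second copy of the file is affordable.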

Answer 4

Answered by technosaurus

You can do it without grep, using POSIX shell builtins, which should be available on any *nix.

# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r LINE || [ -n "$LINE" ]; do
  case "$LINE" in
    *thing_you_are_grepping_for*) continue ;;
    *) printf '%s\n' "$LINE" ;;
  esac
done <infile >outfile

Answer 5

Answered by Digital Trauma

Given that dd is deemed too dangerous for this in-place line removal, we need some other method that gives fairly fine-grained control over the file system calls. My initial urge is to write something in C, but while possible, I think that is a bit of overkill. Instead it is worth looking to common scripting (not shell-scripting) languages, as these typically have fairly low-level file APIs which map to the file syscalls in a fairly straightforward manner. I'm guessing this can be done using Python, Perl, Tcl or one of many other scripting languages that might be available. I'm most familiar with Tcl, so here we go:

#!/bin/sh
# \
exec tclsh "$0" "$@"

package require Tclx

set removeline [lindex $argv 0]
set filename [lindex $argv 1]

set infile [open $filename RDONLY]
for {set lineNumber 1} {$lineNumber < $removeline} {incr lineNumber} {
    if {[eof $infile]} {
        close $infile
        puts "EOF at line $lineNumber"
        exit
    }
    gets $infile line
}
set bytecount [tell $infile]
gets $infile rmline

set outfile [open $filename RDWR]
seek $outfile $bytecount start

while {[gets $infile line] >= 0} {
    puts $outfile $line
}

ftruncate -fileid $outfile [tell $outfile]
close $infile
close $outfile

Note that on my particular box I have Tcl 8.4, so I had to load the Tclx package in order to use the ftruncate command. In Tcl 8.5, there is chan truncate, which could be used instead.

You can pass the line number you want to remove and the filename to this script.

In short, the script does this:

open the file for reading
read the first n-1 lines
get the offset of the start of the next line (line n)
read line n
open the file with a new FD for writing
move the file location of the write FD to the offset of the start of line n
continue reading the remaining lines from the read FD and write them to the write FD until the whole read FD is read
truncate the write FD

The file is edited exactly in place. No temporary files are used.

I'm pretty sure this can be re-written in Python or Perl or ... if necessary.

Update

Ok, so in-place line removal can be done in almost-pure bash, using similar techniques to the Tcl script above. But the big caveat is that you need to have the truncate command available. I do have it on my Ubuntu 12.04 VM, but not on my older Redhat-based box. Here is the script:

#!/bin/bash

# line number to remove and file to edit
n=$1
filename=$2
# fd 3 reads the file, fd 4 writes to the same file
exec 3<> "$filename"
exec 4<> "$filename"
linecount=1
bytecount=0
while IFS="" read -r line <&3 ; do
    if [[ $linecount == $n ]]; then
        echo "omitting line $linecount: $line"
    else
        echo "$line" >&4
        # ${#line} counts characters, so this assumes single-byte text
        ((bytecount += ${#line} + 1))
    fi
    ((linecount++))
done
exec 3>&-
exec 4>&-

truncate -s $bytecount "$filename"
#### or if you can tolerate dd, just to do the truncate:
# dd of="$filename" bs=1 seek=$bytecount count=0
#### or if you have python
# python -c "open(\"$filename\", \"ab\").truncate($bytecount)"

I would love to hear of a more generic (bash-only?) way to do the partial truncate at the end and complete this answer. Of course the truncate can be done with dd as well, but I think that was already ruled out for my earlier answer.

And for the record, this site lists how to do an in-place file truncation in many different languages, in case any of these could be used in your environment.

Answer 6

Answered by tripleee

If you can indicate under which circumstances on which platform(s) the most obvious Awk script is failing for you, perhaps we can devise a workaround.

awk "NR!=$N" infile >outfile

Of course, obtaining $N with grep just to feed it to Awk is pretty bass-ackwards. This will delete the line containing the first occurrence of foo:

awk '/foo/ { if (!p++) next } 1' infile >outfile

Answer 7

Answered by tripleee

Based on Digital Trauma's answer, I found an improvement that just needs grep and echo, but no tempfile:

echo $(grep -v PATTERN file.txt) > file.txt

Depending on the kind of lines your file contains and whether your pattern requires a more complex syntax or not, you can enclose the grep command in double quotes, which also keeps the line breaks that the unquoted form collapses into spaces:

echo "$(grep -v PATTERN file.txt)" > file.txt

(useful when deleting from your crontab)
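
The difference between the unquoted and the quoted command substitution is easy to see on a small file. The sketch below uses a scratch file from mktemp (an assumption, not part of the answer) and captures what each form would write, instead of overwriting anything.

```shell
#!/bin/sh
# Scratch file for the demonstration (mktemp is assumed to be available).
demo=$(mktemp)
printf 'line 1\nline two\nline 3\n' > "$demo"

# Unquoted: the shell word-splits grep's output, so echo joins the
# remaining lines with single spaces.
unquoted=$(echo $(grep -v 'tw' "$demo"))

# Quoted: the line breaks inside the substitution are preserved.
quoted=$(echo "$(grep -v 'tw' "$demo")")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
rm -f "$demo"
```

Either form works without a temporary file only because the shell expands the command substitution before it performs the > redirection, so grep has read the whole file before it is truncated. The entire content is therefore held in memory, and command substitution strips trailing newlines, so trailing blank lines in the file are lost.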
