bash: print all but select fields in awk

Note: this page is a translation of a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original question on Stack Overflow: http://stackoverflow.com/questions/6458414/

Date: 2020-09-09 20:41:35. Source: igfitidea

print all but select fields in awk

bash awk

Asked by Stedy

I have a large file with hundreds of columns. I want to remove only the third and fourth columns and print the rest to a file. My initial idea was an awk script like awk '{print $1, $2, for (i=$5; i <= NF; i++) print $i }' file > outfile. However, this code does not work.

I then tried:

awk '{for(i = 1; i<=NF; i++)
if(i == 3 || i == 4) continue
else
print($i)}' file > outfile

But this just printed everything out in one field. It would be possible to split this up into two scripts and combine them with Unix paste, but this seems like something that should be doable in one line.

Answered by Carl Norum

Your first try was pretty close. Modifying it to use printf and including the field separators worked for me:

awk '{printf $1FS$2; for (i=5; i <= NF; i++) printf FS$i; print NL }'
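
One caveat with passing field data straight to printf is that awk treats it as a format string, so a field containing % may be misinterpreted. A minimal sketch of a more defensive variant follows; the sample line is made up for illustration, and OFS is used where the answer uses FS.

```shell
# Made-up sample input; note the literal % in field 1.
printf '100%% b c d e f\n' |
awk '{
    printf "%s%s%s", $1, OFS, $2                        # first two fields
    for (i = 5; i <= NF; i++) printf "%s%s", OFS, $i    # fields 5..NF
    print ""                                            # terminate the line
}'
```

With an explicit "%s" format, the field contents are only ever treated as data.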

Answered by thomascirca

What about something like:

cat SOURCEFILE | cut -f1-2,5- >> DESTFILE

It prints the first two columns, skips the 3rd and 4th, and then prints from the 5th onwards to the end.
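
As a quick check of the field list, here is a small sketch on made-up input. cut splits on tabs by default, and -d names a different single-character delimiter. cut can also read the file directly (cut -f1-2,5- SOURCEFILE), so the cat is optional.

```shell
# Made-up tab-delimited sample: keep fields 1-2 and 5 onwards.
printf 'a\tb\tc\td\te\tf\n1\t2\t3\t4\t5\t6\n' | cut -f1-2,5-

# Same field list on comma-separated input, using -d to set the delimiter.
printf 'a,b,c,d,e,f\n' | cut -d, -f1-2,5-
```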

Answered by matchew

Say you have a tab-delimited file that looks like the following:

temp.txt

field1 field2 field3 field4 field5 field6
field1 field2 field3 field4 field5 field6
field1 field2 field3 field4 field5 field6

Running the following removes fields 3 and 4 and prints the rest of each line through to the end:

awk '{print $1"\t"$2"\t"substr($0, index($0,$5))}' temp.txt

field1 field2 field5 field6
field1 field2 field5 field6
field1 field2 field5 field6
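
One thing to watch with the substr/index approach: index($0, $5) returns the first place the text of field 5 occurs in the line, which is not necessarily where field 5 starts. The sketch below uses a made-up line to show the effect, followed by a loop-based alternative that avoids the search. The explicit -F and OFS settings are assumptions for tab-delimited data.

```shell
# Made-up line where the text of field 5 ("a") also appears in field 1.
# index() matches at position 1, so the whole line is appended after $2.
printf 'a\tb\tc\td\ta\tz\n' |
awk -F'\t' '{ print $1 "\t" $2 "\t" substr($0, index($0, $5)) }'

# Building the output field by field gives fields 1, 2, 5, 6 as intended.
printf 'a\tb\tc\td\ta\tz\n' |
awk -F'\t' -v OFS='\t' '{
    out = $1 OFS $2
    for (i = 5; i <= NF; i++) out = out OFS $i
    print out
}'
```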

My examples print to stdout. > newFile will send stdout to newFile, and >> newFile will append to newFile.

So you may want to use the following:

awk '{print $1"\t"$2"\t"substr($0, index($0,$5))}' temp.txt > newFile.txt

Some will argue for cut:

cut -f1,2,5- temp.txt

This produces the same output. cut is great for simplicity, but it does not handle inconsistent delimiters, for example a mixture of different whitespace characters. However, in this case cut may be what you are after.
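
To make the delimiter point concrete, here is a small sketch on a made-up line that mixes single spaces, double spaces and a tab. cut treats every single delimiter character as a field boundary, while awk's default splitting collapses runs of whitespace.

```shell
# cut with a space delimiter: the double spaces create empty fields and
# the tab is not a boundary, so the selected fields are not the intended ones.
printf 'f1  f2 f3\tf4 f5  f6\n' | cut -d' ' -f1,2,5-

# awk's default field splitting treats any run of blanks or tabs as one separator.
printf 'f1  f2 f3\tf4 f5  f6\n' |
awk '{ out = $1 OFS $2; for (i = 5; i <= NF; i++) out = out OFS $i; print out }'
```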

You could also accomplish this in Perl, Python, Ruby, and many others, but this is the simplest awk solution.

Answered by jim

How about just setting the third and fourth columns to an empty string:

echo 1 2 3 4 5 6 7 8 9 10 |
awk -F" " '{ $3=""; $4=""; print}'
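
Note that emptying a field keeps its separators, so the output has a run of spaces where fields 3 and 4 used to be. If that matters, one possible follow-up (an assumption, not part of the answer) is to squeeze repeated spaces with tr. This is only safe when the fields themselves never contain spaces.

```shell
# Fields 3 and 4 are emptied, but their separators remain in the output.
echo 1 2 3 4 5 6 7 8 9 10 | awk '{ $3 = ""; $4 = ""; print }'

# Squeezing runs of spaces afterwards yields single-spaced output.
echo 1 2 3 4 5 6 7 8 9 10 | awk '{ $3 = ""; $4 = ""; print }' | tr -s ' '
```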

Answered by progz

Yes, it's possible to just set the third and fourth columns to an empty string; but, in addition, field $1 should be set to itself ($1=$1) to make awk actually consume the input field separator (delimiter) : on the entire current line $0 in one go.

echo 1:2:3:4:5:6:7:8:9:10 | awk -F: '{ $1=$1; $3=""; $4=""; print $0 }'

Answered by NeronLeVelu

The hard but generic way (overkill for a simple one-liner):

awk -v "Exclude=3:4:5" '
   # load exclusion
   BEGIN{
      Count=split(Exclude, aTmp, ":")
      for( i = 1; i <= Count; i++) aExc[ aTmp[ i]]=1
      }

   # treat each line, taking only wanted field
   {
    Result=""
    for( i = 1; i <= NF; i++) {
       # field to take ?
       if( ! aExc[ i]) {
         # first element or add a separator before
         if( Result != "") Result=Result OFS $i
          else Result=$i
         }
       }

    print Result
   }' YourFile
  • you can specify any fields that you want to exclude
    • list the field indexes in the variable Exclude on the first line, separated by :
  • separators end up in the right place and in the right quantity
  • the code is "expanded" for better understanding
  • the result is not exactly the input minus the excluded fields, because the output separator is used instead of the original separator (e.g. two spaces or a tab become one space with the default behaviour)
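
For repeated use, the same idea could be wrapped in a shell function. The sketch below is one possible packaging, not the answer's own code: the name drop_fields and its argument order are hypothetical, and a first-field flag replaces the empty-string test so that kept fields that happen to be empty still get their separator.

```shell
# drop_fields is a hypothetical helper name used only for illustration.
# Usage: drop_fields LIST [FILE...]   (LIST is colon-separated, e.g. 3:4)
drop_fields() {
    exclude=$1
    shift
    awk -v "Exclude=$exclude" '
        BEGIN {
            # turn the colon-separated list into a lookup table
            n = split(Exclude, tmp, ":")
            for (i = 1; i <= n; i++) skip[tmp[i]] = 1
        }
        {
            out = ""; first = 1
            for (i = 1; i <= NF; i++) {
                if (i in skip) continue          # excluded field
                out = (first ? $i : out OFS $i)  # separator only between kept fields
                first = 0
            }
            print out
        }' "$@"
}

# Reads stdin when no file is given.
echo 1 2 3 4 5 6 | drop_fields 3:4
```

As with the answer's version, fields are rejoined with the output separator, so the original spacing is not preserved.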