bash: select rows based on the value in a column

Question

Asked by Tom

I have a tab-delimited table from which I want to print all lines where column X is greater than Y. I have attempted the code below, but I am new to awk and unsure how to make it work on a specific column.

awk '$X >= Y {print} ' Table.txt | cat > Wanted_lines

Y is a value from 1 to 100.

If the input were like the one below, with column X being the second column:

The wanted output would be:

The first 2 lines of the file are:

1   OTU1    243622  208679  121420  265864  0   0   2   0   0   11  1   5   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   839604  OTU1    -   Archaea 100%    Euryarchaeota   100%    Methanobacteria 100%    Methanobacteriales  100%    Methanobacteriaceae 100%    Methanobrevibacter  100%
2   OTU2    84366   120817  15834   74737   0   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   295755  OTU2    -   Archaea 100%    Euryarchaeota   100%    Methanobacteria 100%    Methanobacteriales  100%    Methanobacteriaceae 100%    Methanobrevibacter  100%

Answer 1

Accepted answer by Anthony Rutledge

First

awk's default field separator (FS) splits on any run of spaces or tabs, so it works on space-delimited or tab-delimited files without extra options.
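One caveat is worth a quick sketch: if a tab-delimited file has fields that themselves contain spaces, the default separator counts columns differently than an explicit tab separator does. The sample data below is invented for illustration, and `-F'\t'` is simply the standard way to set FS to a tab.

```shell
# Hypothetical sample: three tab-separated fields, the second containing a space.
sample=$(printf 'OTU1\tHomo sapiens\t90\nOTU2\tMus musculus\t70\n')

# Default FS: any run of spaces or tabs separates fields, so field 3 is a word.
default_third=$(printf '%s\n' "$sample" | awk '{print $3}')

# Explicit tab separator: field 3 is the numeric column.
tab_third=$(printf '%s\n' "$sample" | awk -F'\t' '{print $3}')

# Filter on the numeric column using the tab separator.
kept=$(printf '%s\n' "$sample" | awk -F'\t' '$3 >= 80')

printf 'default FS, field 3:\n%s\n' "$default_third"
printf 'tab FS, field 3:\n%s\n' "$tab_third"
printf 'kept rows:\n%s\n' "$kept"
```

If no field contains a space, both forms see the same columns and the default is fine.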

Secondly

awk '$x > FLOOR' Table.txt

Where $x is the target column, and FLOOR is the actual numeric floor (e.g. 5000).

Example file: awktest

500  100
400  1100
1000 400
1200 500


awk '$1 > 1000' awktest

1200 500

awk '$1 >= 1000' awktest

1000 400
1200 500

Thus, you should be able to use a relational expression to print the lines where x > y, in the form:

awk '$x > $y' awktest

Where $x is a numeric column such as $1.

Where $y is a numeric column such as $2.

Example:

awk '$1 > $2' awktest

or ...

awk '$2 > $1' awktest

awk numbers are floating point, so you can compare decimals, too.
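A minimal sketch of a decimal comparison, using invented sample values rather than the asker's data:

```shell
# Hypothetical sample with decimal values in column 2.
decimals=$(printf 'a 0.5\nb 2.75\nc 10\nd 9.9\n' | awk '$2 > 2.5')

# Rows whose second column is numerically greater than 2.5.
printf '%s\n' "$decimals"
```

The row with 10 is kept, which shows the comparison is numeric: compared as strings, "10" would sort before "2.5".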

Answer 2

Answer by ghoti

So...

The {print} in '$X >= Y {print}' is redundant, as the default action in awk is to print.
| cat > file is UUOC (a useless use of cat); awk's output can be redirected directly with > file.
Your expected output shows lines where that value is 80 or above. This answer assumes the output is what you really want, despite the lack of code to handle it.
I don't see how your last input example relates to things. Is there particular output you'd like from that input?

Consider:

$ awk '$X >= Y' X=2 Y=80 input.txt
3    100
4    100
5    80
7    90
$ awk '$X >= Y' X=2 Y=90 input.txt
3    100
4    100
7    90

The notation above relies on the following statement from man awk:

Any file of the form var=value is treated as an assignment, not a filename, and is executed at the time it would have been opened if it were a filename.

This is functionally equivalent to:

$ awk -v X=2 -v Y=80 '$X >= Y' input.txt

Either of these notations for getting shell variables into your awk script will do just fine. I believe any version of awk you come across (bsdawk, gawk, mawk) should handle both equally well.
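The two notations do differ in one respect that rarely matters for a simple filter: a -v assignment happens before the program starts, while an operand assignment happens only when awk reaches it in the argument list, after any BEGIN block has run. A small sketch with made-up input (the `-` operand stands for standard input):

```shell
# -v assigns before the program starts, so BEGIN can see the value.
with_v=$(awk -v Y=80 'BEGIN { print "Y=" Y }')

# An operand assignment is applied only when awk reaches it, after BEGIN,
# so BEGIN sees an empty Y while the filter on the input still sees 80.
with_operand=$(printf '1 90\n' | awk 'BEGIN { print "Y=" Y } $2 >= Y' Y=80 -)

printf '%s\n' "$with_v"
printf '%s\n' "$with_operand"
```

For a pattern such as '$X >= Y' with no BEGIN block, the two forms behave the same.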

Within a shell script, you might see something like this:

#!/usr/bin/env bash

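if [[ $# != 2 ]]; then
  printf 'Please supply column and floor values as parameters.\n'
  exit 1
elif [[ $1 =~ [^0-9] ]] || [[ $2 =~ [^0-9] ]]; then
  # Reject any argument that contains a non-digit character.
  printf 'Invalid parameters.\n'
  exit 1
fi

# Pass the validated column number and floor to awk as variables.
awk '$X >= Y' X="$1" Y="$2" input.txt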

Answer 3

Answer by Juan Diego Godoy Robles

Try:

awk -v num_col=$X -v limit=$Y '$num_col + 0 >= limit + 0' Table.txt > Wanted_lines

Example:

$ cat Table.txt
1    30
2    50
3    100
4    100
5    80
6    79
7    90


$ X=2
$ Y=80
$ awk -v num_col=$X -v limit=$Y '$num_col + 0 >= limit + 0' Table.txt
3    100
4    100
5    80
7    90

Alternatively (hacky and NOT recommended), the awk quoting could be broken up this way:

$  awk '$'"${X}"' + 0 >= '"${Y}"' + 0' Table.txt

This is what you need to get rid of the % symbol in your actual file:

$ awk -v num_col=43 -v limit=80 '{sub(/%/,"",$num_col)}$num_col + 0 >= limit + 0 ' Table.txt
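A possible variant, offered as a sketch rather than as part of the answer: when awk converts a string such as "100%" to a number, it reads the leading numeric part, so the `+ 0` alone may be enough for the comparison. That avoids sub(), which modifies the field and makes awk rebuild the printed line with the output separator. The sample data and column number below are invented, and `-F'\t'` assumes a tab-delimited file.

```shell
# Hypothetical sample: column 2 carries a trailing percent sign.
pct=$(printf 'a\t100%%\nb\t79%%\nc\t80%%\n')

# Adding 0 forces a numeric conversion that reads the leading digits of "100%".
# The record is not modified, so rows print with their tabs and % signs intact.
numeric=$(printf '%s\n' "$pct" |
  awk -F'\t' -v num_col=2 -v limit=80 '$num_col + 0 >= limit + 0')

printf '%s\n' "$numeric"
```

Without the `+ 0`, a field like "100%" is treated as a string and the comparison is no longer numeric.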
