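bash: find negative numbers in a file using grep
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/27169078/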
Find negative numbers in file with grep
Asked by aironman
I have a script that reads a file. The file looks like this:
711324865,438918283,2
-333308476,886548365,2
1378685449,-911401007,2
-435117907,560922996,2
259073357,714183955,2
...
The script:
#!/bin/bash
while IFS=, read childId parentId parentLevel
do
grep "$parentId" parent_child_output_level2.csv
resul=$?
echo "child is $childId, parent is $parentId parentLevel is $parentLevel resul is $resul"
done < parent_child_output_level1.csv
But it is not working: resul always comes back as 1, which is incorrect.
I know that because I can run the following command, which I think is equivalent:
[core@dub-vcd-vms165 generated-and-saved-to-hdfs]$ grep "\-911401007" parent_child_output_level2.csv
-911401007,-157143722,3
Please help.
Answered by Avinash Raj
A grep command that prints only the negative numbers:
$ grep -oP '(^|,)\K-\d+' file.csv
-333308476
-911401007
-435117907
(^|,) matches the start of a line or a comma.
\K discards the previously matched characters.
-\d+ matches - followed by one or more digits.
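To see the three pieces of the pattern working together, here is a minimal sketch run against a few rows copied from the question. The variable names sample_rows and negatives are illustrative only, and -P assumes a GNU grep built with PCRE support.

```shell
# Three rows copied from the question's sample data.
sample_rows='711324865,438918283,2
-333308476,886548365,2
1378685449,-911401007,2'

# -o prints only the matched text; -P enables Perl-compatible regexes (GNU grep).
# (^|,) anchors at a field start, \K drops that anchor from the output,
# and -\d+ is the negative number itself.
negatives=$(printf '%s\n' "$sample_rows" | grep -oP '(^|,)\K-\d+')

# prints -333308476 and -911401007, one per line
printf '%s\n' "$negatives"
```

The first row produces no output because neither of its fields starts with a minus sign.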
Answered by bgoldst
Your title is inconsistent with your question. Your title asks how to grep for negative numbers, which Avinash Raj answered well, although I'd suggest you don't even need the (Perl-style) (^|,)\K prefix, which acts like a positive look-behind, to match start-of-field, because if the file is well-formed, then -\d+ would match all negative numbers just as well. So you could just run (edit: realized that with a leading - you need -- to prevent grep from taking the pattern as an option):
grep -oP -- '-\d+' file.csv;
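The same option-parsing behaviour likely matters for the loop in the question, where grep "$parentId" receives values such as -911401007 that begin with a dash. The sketch below is one possible adjustment, not the asker's or the answerer's code: the temporary files and their contents are made-up stand-ins for the two CSV files, and -q is added only to keep the output short.

```shell
# Made-up stand-ins for parent_child_output_level1.csv and ..._level2.csv.
level1=$(mktemp)
level2=$(mktemp)
printf '%s\n' '1378685449,-911401007,2' '259073357,714183955,2' > "$level1"
printf '%s\n' '-911401007,-157143722,3' > "$level2"

results=""
while IFS=, read -r childId parentId parentLevel; do
    # "--" ends option parsing, so a value starting with "-" is used as the pattern.
    # -q suppresses the matching lines; only the exit status is used here.
    grep -q -- "$parentId" "$level2"
    resul=$?
    results="$results$parentId:$resul "
    echo "child is $childId, parent is $parentId parentLevel is $parentLevel resul is $resul"
done < "$level1"

rm -f "$level1" "$level2"
```

With this sample data the first row reports resul 0 (found) and the second reports resul 1 (not found). Note that the pattern is still unanchored, so it can match anywhere on a line.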
Your question includes a script whose intention seems to be to grep for any number (positive or negative) in the first field (childId) of one file (parent_child_output_level2.csv) that occurs in the second field (parentId) of another file (parent_child_output_level1.csv). To accomplish this, I wouldn't use grep, because you're trying to do an exact numerical equality test, which can even be done as an exact string equality test assuming your numbers are always consistently represented (e.g. no redundant leading zeroes). Repeatedly grepping through the entire file just to search for a number in one column is also wasteful of CPU.
Here's what I would do:
parentIdList=($(cut -d, -f2 parent_child_output_level1.csv));
childIdList=($(cut -d, -f1 parent_child_output_level2.csv));
for parentId in "${parentIdList[@]}"; do
    for childId in "${childIdList[@]}"; do
        if [[ "$childId" == "$parentId" ]]; then
            echo "$parentId";
        fi;
    done;
done;
With this approach, you precompute both the parent id list and the child id list just once, using cut to extract the appropriate field from each file. Then you can use the shell-builtin for loop, shell-builtin if conditional, and shell-builtin [[ test command to accomplish the check, and finally finish with a shell-builtin echo to print the matches. Everything is shell-builtin, after the initial command substitutions that run the cut external executable.
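As a variation on the nested loops, and not the answerer's own code, the child ids can be indexed once in a bash associative array so that each parent id costs a single lookup. This is a sketch under stated assumptions: it needs bash 4 or later for declare -A, and the temporary files, their contents, and the names isChild, matches, skip and rest are made up for illustration.

```shell
#!/bin/bash
# Made-up stand-ins for parent_child_output_level1.csv and ..._level2.csv.
level1=$(mktemp)
level2=$(mktemp)
printf '%s\n' '1378685449,-911401007,2' '259073357,714183955,2' > "$level1"
printf '%s\n' '-911401007,-157143722,3' '555,1378685449,3' > "$level2"

# Index every childId (first field) of the second file once.
declare -A isChild
while IFS=, read -r childId rest; do
    isChild["$childId"]=1
done < "$level2"

# One lookup per parentId (second field) instead of an inner loop.
matches=()
while IFS=, read -r skip parentId rest; do
    if [[ -n "${isChild[$parentId]:-}" ]]; then
        matches+=("$parentId")
    fi
done < "$level1"

# prints -911401007
printf '%s\n' "${matches[@]}"
rm -f "$level1" "$level2"
```

Like the original, this compares ids as exact strings, so it relies on the numbers being written consistently in both files.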
If you also want to filter these results down to negative numbers, you could grep for ^- in the results of the above script, or grep for it in the results of each (or just the first) cut command, or add the following line just inside the outer for loop:
if [[ "${parentId:0:1}" != '-' ]]; then continue; fi;
Alternative approach:
if [[ "$parentId" != -* ]]; then continue; fi;
Either approach will skip non-negatives.
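A small sketch comparing the two tests on a handful of ids taken from the question's sample data. The array names ids, bySubstring and byGlob are illustrative only, and the tests are written in their positive form (collect negatives) rather than with continue.

```shell
#!/bin/bash
ids=(711324865 -333308476 -911401007 259073357)

bySubstring=()
byGlob=()
for parentId in "${ids[@]}"; do
    # Substring test: the first character is a minus sign.
    if [[ "${parentId:0:1}" == '-' ]]; then
        bySubstring+=("$parentId")
    fi
    # Glob test: the value matches the pattern "minus sign, then anything".
    if [[ "$parentId" == -* ]]; then
        byGlob+=("$parentId")
    fi
done

# Both lines print: -333308476 -911401007
echo "${bySubstring[*]}"
echo "${byGlob[*]}"
```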