Linux: How do I grep for all words that are less than 4 characters?
Original question: http://stackoverflow.com/questions/4982052/
Notice: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
How do I grep for all words that are less than 4 characters?
Asked by TIMEX
I have a dictionary with words separated by line breaks.
Accepted answer by Michael Goldshteyn
You can just do:
egrep -x '.{1,3}' myfile
This will also skip blank lines, which are technically not words. Unfortunately, the regex above counts apostrophes in contractions, and hyphens in hyphenated compound words, as characters. Hyphenated compounds are not a problem at such a low character count, but contractions this short do exist (e.g., I'm), and I am not sure whether you want their apostrophes counted. You can try a regex such as:
egrep -x '\w{1,3}' myfile
..., but this will only match letters, digits, and underscores, so it will not match contractions or hyphenated compound words at all.
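To make the difference between the two patterns concrete, here is a small sketch. The word list is a made-up sample, not the asker's dictionary, and it uses grep -E in place of egrep (the two are equivalent). It also assumes a grep that supports \w, such as GNU grep.

```shell
# Hypothetical sample dictionary, one entry per line (includes a blank line).
words=$(printf '%s\n' cat "I'm" a-b tree '' x1 go)

# Any 1 to 3 characters: apostrophes and hyphens count toward the length.
any_chars=$(printf '%s\n' "$words" | grep -E -x '.{1,3}')

# 1 to 3 word characters only (\w: letters, digits, underscore).
word_chars=$(printf '%s\n' "$words" | grep -E -x '\w{1,3}')

printf 'any chars:\n%s\n\nword chars:\n%s\n' "$any_chars" "$word_chars"
```

With this sample, the first pattern should keep I'm and a-b while the second drops them; both skip the blank line and the four-letter word.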
Answer by Paul Tomblin
Like this:
grep -v "^...." my_file
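One thing to be aware of with the inverted match: it keeps every line that does not start with four characters, which includes empty lines. The sketch below illustrates this on a made-up sample list; the second grep stage is just one possible way to drop the empty lines, not part of the original answer.

```shell
# Hypothetical sample dictionary with a blank line in the middle.
words=$(printf '%s\n' cat tree '' go)

# -v inverts the match: drop every line that starts with 4 characters.
# Blank lines do not match the pattern, so they are kept.
short_or_blank=$(printf '%s\n' "$words" | grep -v '^....')

# Optional second stage: keep only lines with at least one character.
short_only=$(printf '%s\n' "$words" | grep -v '^....' | grep '.')

printf 'with blanks:\n%s\n\nwithout blanks:\n%s\n' "$short_or_blank" "$short_only"
```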
Answer by Mark Byers
Try this regular expression:
grep -E '^.{1,3}$' your_dictionary
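Anchoring the pattern with ^ and $ should give the same result as passing -x (match whole lines only) with the unanchored pattern. A quick way to check, using a temporary file with made-up contents standing in for your_dictionary:

```shell
# Hypothetical stand-in for the dictionary file.
dict=$(mktemp)
printf '%s\n' cat tree '' go > "$dict"

# Explicit anchors versus the -x (whole line) option.
anchored=$(grep -E '^.{1,3}$' "$dict")
whole_line=$(grep -E -x '.{1,3}' "$dict")

if [ "$anchored" = "$whole_line" ]; then
    echo "both forms select the same lines"
fi

rm -f "$dict"
```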