Linux: How to grep for all words with fewer than 4 characters?

Question

Asked by TIMEX

I have a dictionary with words separated by line breaks.

Answer 1

Accepted answer by Michael Goldshteyn

You can just do:

egrep -x '.{1,3}' myfile

This will also skip blank lines, which are technically not words. Unfortunately, the above regex counts apostrophes in contractions, as well as hyphens in hyphenated compound words, as letters. Hyphenated compound words are not a problem at such a low letter count, but contractions are possible (e.g., I'm), and I am not sure whether you want their apostrophes counted. You can try a regex such as:

egrep -x '\w{1,3}' myfile

..., but \w matches only letters, digits, and underscores, so this will not match contractions or hyphenated compound words at all.
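To make the difference between the two patterns concrete, here is a rough sketch that runs both against a small made-up word list. The file name words.txt and its contents are assumptions for the example, and grep -E is used in place of egrep because recent GNU grep releases warn that egrep is obsolescent. Note that \w is an extension that may not be available in every grep.

```shell
# Hypothetical sample dictionary: one entry per line, including one empty line.
printf '%s\n' "a" "cat" "I'm" "x-y" "tree" "" "go" > words.txt

# Any 1 to 3 characters on the whole line: apostrophes and hyphens count.
echo "Any 1-3 characters:"
grep -E -x '.{1,3}' words.txt

# Only 1 to 3 word characters (letters, digits, underscore).
echo "Only 1-3 word characters:"
grep -E -x '\w{1,3}' words.txt
```

With this sample, the first command should list a, cat, I'm, x-y and go, while the second should list only a, cat and go. The empty line and tree are left out in both cases.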

Answer 2

Answer by Paul Tomblin

Like this: grep -v "^...." my_file
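One thing to be aware of with the inverted match: it keeps every line that does not start with four characters, and that includes empty lines. A small sketch, assuming a made-up file called sample.txt:

```shell
# Hypothetical sample file with one empty line in the middle.
printf '%s\n' "cat" "" "tree" "go" > sample.txt

# -v inverts the match, so the empty line is printed along with the short words.
grep -v "^...." sample.txt

# Adding -c counts the kept lines instead of printing them.
grep -vc "^...." sample.txt

# One possible way to drop the empty lines too: keep only lines with at least one character.
grep -v "^...." sample.txt | grep .
```

If empty lines cannot occur in the dictionary, the extra filter is not needed.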

Answer 3

Answer by Mark Byers

Try this regular expression:

grep -E '^.{1,3}$' your_dictionary
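The explicit ^ and $ anchors here do the same job as the -x option used in the accepted answer. As a quick check, assuming a made-up word list called dict.txt:

```shell
# Hypothetical word list (dict.txt is an assumed name).
printf '%s\n' "an" "owl" "bird" "" "ox" > dict.txt

# Explicit anchors: ^ and $ tie the pattern to the start and end of the line.
anchored=$(grep -E '^.{1,3}$' dict.txt)

# -x asks grep to match whole lines only, so the anchors can be left out.
whole_line=$(grep -E -x '.{1,3}' dict.txt)

if [ "$anchored" = "$whole_line" ]; then
    echo "both forms select the same lines"
fi
```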
