Uniq Command Examples in Linux

Date: 2020-03-05 15:30:00  Source: igfitidea

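Learn how to use the uniq command on Unix and Linux through these practical examples.

The uniq command in Unix and Linux filters out repeated lines of text.
It can be used on its own, but it is usually combined with other commands, for example to identify redundant information in a file.

Here is the syntax of the uniq command:

uniq [options] [input-file [output-file]]

When uniq is run without file arguments, it reads from stdin and writes to stdout.

Although stdin can be fed from the clipboard (copy and paste), that is not the most practical way to use it.

Instead, we will usually want to run the command on a file that we suspect contains duplicate information.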
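For completeness, here is a minimal sketch of the stdin form, where the input comes from another command through a pipe rather than from a file. The sample words and the variable name are made up for illustration.

```shell
# Feed three lines to uniq on stdin; no file arguments are given,
# so uniq reads stdin and writes to stdout.
# The sample words are illustrative, not taken from the article's files.
deduped=$(printf 'apple\napple\norange\n' | uniq)

# Show the result: the two adjacent "apple" lines collapse into one.
printf '%s\n' "$deduped"
```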

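One limitation of uniq is that it only detects duplicate lines that are adjacent to each other.

This is simple enough, but here is an example so we can see it in practice.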

[Hyman@theitroad ~]$cat apple.txt
apple
apple
orange
orange
apple 
orange

[Hyman@theitroad ~]$uniq apple.txt 
apple
orange
apple 
orange

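So we know right away that we cannot trust this program to identify every duplicate on its own.
There are ways around this, usually involving the sort command.

I will show that later in this article.
First, let me walk through some examples to get familiar with uniq before mixing in other commands and possibly confusing things.

7 examples of the uniq command in Linux

I used a real system log, edited for demonstration purposes.
Most of the file is already ordered so that identical lines are adjacent, but I left a few lines "out of place" to show how uniq behaves.

Download the sample text file: https://gist.github.com/igipc/7dada8c6e57fd5b854f9d2dae72dddb0

Example 1: Using the uniq command in its default form

Although I have already shown this, let's run the default syntax on the sample file.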

[Hyman@theitroad ~]$uniq sample_log_file.txt 
/usr/lib/gdm3/gdm-x-session[1443]: (II) No input driver specified, ignoring this device.
/usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: device is a keyboard
/usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: device removed
/usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: is tagged by udev as: Keyboard
/usr/lib/gdm3/gdm-x-session[1443]: (II) No input driver specified, ignoring this device.
/usr/lib/gdm3/gdm-x-session[1443]: (II) systemd-logind: got fd for /dev/input/event10 13:74 fd 55 paused 0
/usr/lib/gdm3/gdm-x-session[1443]: (II) This device may have been added with another device file.
PackageKit: get-updates transaction /354_eebeebaa from uid 1000 finished with success after 1514ms
wpa_supplicant[898]: RRM: Ignoring radio measurement request: Not RRM network

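We can see that many repeated lines have been merged, but there is still redundant information.
This is due to the limitation described earlier: only adjacent duplicates are merged.
Let's look at more examples and explore some of the options built into the uniq command-line utility.

Example 2: Writing the filtered output to a destination file

We may want to save this output so that it can be easily edited or kept.
Instead of the usual stdout (the terminal), we can direct the output to a separate file.
It is important to note that this form cannot be used to overwrite the original file; the output file must be a different file.

[Hyman@theitroad ~]$uniq sample_log_file.txt uniq_log_output.txt

Here are the contents of the output file: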

[Hyman@theitroad ~]$cat uniq_log_output.txt 
/usr/lib/gdm3/gdm-x-session[1443]: (II) No input driver specified, ignoring this device.
/usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: device is a keyboard
/usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: device removed
/usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: is tagged by udev as: Keyboard
/usr/lib/gdm3/gdm-x-session[1443]: (II) No input driver specified, ignoring this device.
/usr/lib/gdm3/gdm-x-session[1443]: (II) systemd-logind: got fd for /dev/input/event10 13:74 fd 55 paused 0
/usr/lib/gdm3/gdm-x-session[1443]: (II) This device may have been added with another device file.
PackageKit: get-updates transaction /354_eebeebaa from uid 1000 finished with success after 1514ms
wpa_supplicant[898]: RRM: Ignoring radio measurement request: Not RRM network
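Since the output file has to differ from the input file, one common workaround for replacing a file with its filtered version is to write to a temporary file and then move it over the original. The sketch below is an assumption about how this could be done, not part of the original article; it uses throwaway files from mktemp, and all variable names are hypothetical.

```shell
# Create a scratch input file so the sketch does not touch real data.
input=$(mktemp)
printf 'a\na\nb\nb\nb\n' > "$input"

# Write the filtered result to a second temporary file first,
# then replace the original only if uniq succeeded.
tmp=$(mktemp)
uniq "$input" "$tmp" && mv "$tmp" "$input"

# Capture and show the new contents, then clean up the scratch file.
result=$(cat "$input")
printf '%s\n' "$result"
rm -f "$input"
```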

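Example 3: Counting repeated lines with -c

This option is fairly self-explanatory.
The program prefixes each output line with the number of times it occurred consecutively.

[Hyman@theitroad ~]$uniq -c sample_log_file.txt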
      2 /usr/lib/gdm3/gdm-x-session[1443]: (II) No input driver specified, ignoring this device.
      2 /usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: device is a keyboard
      1 /usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: device removed
      2 /usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: is tagged by udev as: Keyboard
      5 /usr/lib/gdm3/gdm-x-session[1443]: (II) No input driver specified, ignoring this device.
      1 /usr/lib/gdm3/gdm-x-session[1443]: (II) systemd-logind: got fd for /dev/input/event10 13:74 fd 55 paused 0
      7 /usr/lib/gdm3/gdm-x-session[1443]: (II) This device may have been added with another device file.
      1 PackageKit: get-updates transaction /354_eebeebaa from uid 1000 finished with success after 1514ms
      8 wpa_supplicant[898]: RRM: Ignoring radio measurement request: Not RRM network
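The output above lists the same "No input driver specified" line twice with different counts. A small sketch with made-up input shows why: -c counts each run of adjacent identical lines separately. The input words and variable name are illustrative assumptions.

```shell
# "error" appears three times in total, but only the first two are adjacent,
# so uniq -c reports two separate runs for it.
counts=$(printf 'error\nerror\nwarn\nerror\n' | uniq -c)

# Prints three lines: a count of 2 for the first "error" run,
# then 1 for "warn", then 1 for the final "error".
# The amount of padding before each count varies between implementations.
printf '%s\n' "$counts"
```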

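Example 4: Printing only repeated lines with -d

With the -d option, uniq prints only the lines that are repeated, one copy for each group of adjacent duplicates.

[Hyman@theitroad ~]$uniq -d sample_log_file.txt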
/usr/lib/gdm3/gdm-x-session[1443]: (II) No input driver specified, ignoring this device.
/usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: device is a keyboard
/usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: is tagged by udev as: Keyboard
/usr/lib/gdm3/gdm-x-session[1443]: (II) No input driver specified, ignoring this device.
/usr/lib/gdm3/gdm-x-session[1443]: (II) This device may have been added with another device file.
wpa_supplicant[898]: RRM: Ignoring radio measurement request: Not RRM network

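Example 5: Printing only unique lines with -u

Here we get the inverse of the previous command's output.
These are the lines that have no adjacent duplicate in the file.

[Hyman@theitroad ~]$uniq -u sample_log_file.txt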
/usr/lib/gdm3/gdm-x-session[1443]: (II) event9  - Intel HID events: device removed
/usr/lib/gdm3/gdm-x-session[1443]: (II) systemd-logind: got fd for /dev/input/event10 13:74 fd 55 paused 0
PackageKit: get-updates transaction /354_eebeebaa from uid 1000 finished with success after 1514ms
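To see how -d and -u complement each other, here is a minimal sketch on a tiny made-up input. The letters and variable names are illustrative only.

```shell
# Sample input: "a" and "c" are repeated on adjacent lines, "b" and "d" are not.
sample=$(printf 'a\na\nb\nc\nc\nc\nd\n')

# -d keeps one copy of each repeated group.
repeated=$(printf '%s\n' "$sample" | uniq -d)

# -u keeps only the lines that were never repeated.
singles=$(printf '%s\n' "$sample" | uniq -u)

printf 'repeated:\n%s\n' "$repeated"
printf 'singles:\n%s\n' "$singles"
```

Every group of adjacent lines lands in exactly one of the two outputs.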

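Example 6: Ignoring fields or characters with -f and -s

These are really two examples, but they work in almost the same way.
I will explain how they work and then comment on the difference between them.

Each of them uses the following syntax:

Skip fields with:
uniq -f N <source_file>
Skip characters with:
uniq -s N <source_file>

In each case, N is the number of items to skip.
After skipping that many items, uniq starts its comparison at that point instead of comparing whole lines.

The -f option skips the given number of fields.
Fields are runs of characters separated by whitespace (spaces or tabs).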

[Hyman@theitroad ~]$cat field_separated_values.txt 
blue fish
blue fish
blue fish
blue class
red fish
green fish
two class
two class

To apply uniq to the second column, skip the first field like this:

[Hyman@theitroad ~]$uniq -f1 field_separated_values.txt  
blue fish
blue class
red fish
two class

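As we can see, because the first field (the color) is ignored, "red fish" and "green fish" are treated as the same line.
If we add the count option here, it shows how many lines were merged into each output line: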

[Hyman@theitroad ~]$uniq -f1 -c field_separated_values.txt  
      3 blue fish
      1 blue class
      2 red fish
      2 two class

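Why would you need that?
Here is a practical case.
Many log files have a timestamp at the start of every line.
To find the unique lines in such a file, use the -f option to skip the leading timestamp field (or as many fields as the timestamp occupies).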
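The timestamp case can be sketched as follows. The log lines are invented for illustration, and the timestamp is assumed to be a single whitespace-free field; a format such as "Mar  5 15:30:00" would need a larger N.

```shell
# Hypothetical log lines: a one-field timestamp followed by the message.
log=$(printf '%s\n' \
  '10:00:01 disk check ok' \
  '10:00:02 disk check ok' \
  '10:00:03 network down' \
  '10:00:04 network down')

# Skip the first field so that only the message text is compared.
events=$(printf '%s\n' "$log" | uniq -f 1)

# The first line of each group is kept, timestamp included.
printf '%s\n' "$events"
```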

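Similarly, the -s option skips a specific number of characters.
In the run below, every line of the file is at most 10 characters long, so after skipping 10 characters there is nothing left to compare and uniq treats all lines as identical, printing only the first one.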

[Hyman@theitroad ~]$uniq -s 10 field_separated_values.txt 
blue fish
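A case where -s is more obviously useful is a fixed-width prefix such as a record number. The following sketch uses invented data and assumes the prefix is always exactly 4 characters wide (three digits plus a space).

```shell
# Hypothetical records: a 3-digit id, a space, then the value.
records=$(printf '%s\n' '001 apple' '002 apple' '003 pear')

# Skip the first 4 characters ("NNN ") and compare only the value.
distinct=$(printf '%s\n' "$records" | uniq -s 4)

# The two adjacent "apple" records collapse into the first one.
printf '%s\n' "$distinct"
```

The practical difference from -f is that -s counts characters regardless of whitespace, so it suits fixed-width data, while -f suits whitespace-separated columns of varying width.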

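Example 7: Comparing only N characters with -w

The -w option lets us specify the exact number of characters to use in the comparison.

For the previous example I switched to simpler text to reduce confusion.
Now let's go back to the sample log file and see what happens when only the first 4 characters are used to find duplicates.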

[Hyman@theitroad ~]$uniq -w 4 sample_log_file.txt 
/usr/lib/gdm3/gdm-x-session[1443]: (II) No input driver specified, ignoring this device.
PackageKit: get-updates transaction /354_eebeebaa from uid 1000 finished with success after 1514ms
wpa_supplicant[898]: RRM: Ignoring radio measurement request: Not RRM network

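From the program's point of view, all lines starting with "/usr" are now identified as "the same".

This can be helpful when we are looking for specific log events.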
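Combining -w with -c gives a rough count of adjacent log lines per prefix. This is a sketch with made-up log lines; note that -w is available in GNU uniq and may be missing from other implementations.

```shell
# Hypothetical log lines whose first 4 characters name the service.
lines=$(printf '%s\n' 'sshd: login ok' 'sshd: login failed' 'cron: job started')

# Compare only the first 4 characters and count each adjacent group.
grouped=$(printf '%s\n' "$lines" | uniq -c -w 4)

# Two lines are printed: the first "sshd" line with a count of 2,
# then the "cron" line with a count of 1.
printf '%s\n' "$grouped"
```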

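Avoiding incomplete matches by using sort and uniq together

We could run these commands separately to achieve the same effect, but if you have never used a pipe (the | character) in Linux, this is a good way to learn about it.

We can use pipes to combine different commands, which saves keystrokes and improves our workflow.
Data flows through the commands from left to right, in the order they are typed.

Here is the sample input I will use: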

[Hyman@theitroad ~]$cat apple.txt 
apple
orange
orange
apple
apple
banana
apple
banana

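Now let's sort the input file and then run uniq on it.
The sort command rearranges the text so that identical items end up next to each other.
When uniq then runs, it finds only 3 unique lines in the file.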

[Hyman@theitroad ~]$sort apple.txt | uniq 
apple
banana
orange

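If we reverse the order, the result changes.
Running uniq first removes only the adjacent duplicates, and sort then merely arranges the remaining lines alphabetically.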

[Hyman@theitroad ~]$uniq apple.txt | sort
apple
apple
apple
banana
banana
orange

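Pipes let us chain several commands together, but it is important to think about their order.

Note that the contents of the file remain unchanged, just as they do when the commands are run individually.
Piping the two commands together also passes the intermediate result along in memory.
Run separately, we could only get the same result by writing the sorted output to a new file, or overwriting the original with it, before running the second command.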
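Two common variations on this pipeline are sketched below, using the same fruit list as above. Neither appears in the original walkthrough: sort -u is a standard sort option that removes duplicates in one step, and the second pipeline ranks lines by how often they occur. The variable names are hypothetical.

```shell
# Same lines as apple.txt above, supplied inline so the sketch is self-contained.
fruits=$(printf '%s\n' apple orange orange apple apple banana apple banana)

# sort -u sorts and removes duplicates in one step,
# giving the same three lines as "sort | uniq".
unique=$(printf '%s\n' "$fruits" | sort -u)
printf '%s\n' "$unique"

# Count every distinct line, then order the counts from highest to lowest.
ranking=$(printf '%s\n' "$fruits" | sort | uniq -c | sort -rn)
printf '%s\n' "$ranking"
```

The first sort is still required in the ranking pipeline, because uniq -c only counts adjacent lines.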
