bash: Select unique or distinct values from a list in a UNIX shell script

Question

Asked by brabster

I have a ksh script that returns a long list of values, newline separated, and I want to see only the unique/distinct values. Is it possible to do this?

For example, say my output is file suffixes in a directory:

tar
gz
java
gz
java
tar
class
class

I want to see a list like:

tar
gz
java
class

Answer 1

Answered by Matthew Scharley

You might want to look at the uniq and sort applications.

./yourscript.ksh | sort | uniq

(FYI, yes, the sort is necessary in this command line: uniq only strips duplicate lines that immediately follow one another.)
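
To see why the sort matters, here is a small sketch that uses the suffix list from the question as stand-in data. The variable names are made up for illustration.

```shell
# Stand-in for the script's output: the suffix list from the question.
suffixes='tar
gz
java
gz
java
tar
class
class'

# uniq alone only collapses adjacent duplicates, so only the
# trailing "class class" pair is merged here.
uniq_only=$(printf '%s\n' "$suffixes" | uniq)

# Sorting first makes all duplicates adjacent, so each value appears once.
sorted_uniq=$(printf '%s\n' "$suffixes" | sort | uniq)

printf 'uniq only:\n%s\n\nsort | uniq:\n%s\n' "$uniq_only" "$sorted_uniq"
```

Note that the sorted pipeline prints the values in sorted order, not in order of first appearance.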

EDIT:

Contrary to what has been posted by Aaron Digulla in relation to uniq's command-line options:

Given the following input:

class
jar
jar
jar
bin
bin
java

uniq will output all lines exactly once:

class
jar
bin
java

uniq -d will output all lines that appear more than once, and it will print them once:

jar
bin

uniq -u will output all lines that appear exactly once, and it will print them once:

class
java
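
The three behaviours can be checked side by side with a short sketch. It feeds the same input as above (duplicates already adjacent) through each variant; the variable names are illustrative only.

```shell
# Same input as in the answer; duplicates are already adjacent.
input='class
jar
jar
jar
bin
bin
java'

all_once=$(printf '%s\n' "$input" | uniq)      # each run of equal lines, once
dups_only=$(printf '%s\n' "$input" | uniq -d)  # only lines that repeat, once each
singles=$(printf '%s\n' "$input" | uniq -u)    # only lines that never repeat

printf 'uniq:\n%s\n\nuniq -d:\n%s\n\nuniq -u:\n%s\n' \
    "$all_once" "$dups_only" "$singles"
```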

Answer 2

Answered by gpojd

./script.sh | sort -u

This is the same as monoxide's answer, but a bit more concise.
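
As a quick sanity check that the two pipelines agree, here is a sketch on made-up sample data (the variable names are hypothetical):

```shell
# Made-up sample with out-of-order duplicates.
sample='tar
gz
java
gz
tar'

with_uniq=$(printf '%s\n' "$sample" | sort | uniq)  # two processes
with_u=$(printf '%s\n' "$sample" | sort -u)         # one process, same result here

if [ "$with_uniq" = "$with_u" ]; then
    echo "sort | uniq and sort -u agree"
fi
```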

Answer 3

Answered by paxdiablo

For larger data sets where sorting may not be desirable, you can also use the following perl script:

./yourscript.ksh | perl -ne 'if (!defined $x{$_}) { print $_; $x{$_} = 1; }'

This simply remembers every line it has output so that it doesn't output it again.

Its advantage over the "sort | uniq" solution is that no sorting is required up front, so the lines also keep their original order.

Answer 4

Answered by Dimitre Radoulov

With zsh you can do this:

% cat infile 
tar
more than one word
gz
java
gz
java
tar
class
class
zsh-5.0.0[t]% print -l "${(fu)$(<infile)}"
tar
more than one word
gz
java
class

Or you can use AWK:

% awk '!_[$0]++' infile
tar
more than one word
gz
java
class

Answer 5

Answered by Aaron Digulla

Pipe them through sort and uniq. This removes all duplicates.

uniq -d gives only the duplicates; uniq -u gives only the unique ones (any line that has duplicates is dropped entirely).

Answer 6

Answered by Ajak6

With AWK you can do the following; I find it faster than sort:

./yourscript.ksh | awk '!a[$0]++'
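
To unpack how the AWK filter `!a[$0]++` works: the array is indexed by the whole line, the expression is true only the first time a line is seen, and a true pattern with no action prints the line. A longhand sketch of the same logic follows; the names dedupe_awk and seen are hypothetical.

```shell
# Hypothetical helper: longhand equivalent of awk '!a[$0]++'.
dedupe_awk() {
    awk '{
        if (!($0 in seen)) {  # first time this exact line appears
            seen[$0] = 1      # remember it
            print             # and pass it through
        }
    }'
}

printf '%s\n' tar 'more than one word' gz java gz java tar class class | dedupe_awk
```

Like the perl approach, this keeps the first occurrence of each line in its original position and needs no sort.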

Answer 7

Answered by FGrose

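Unique, as requested (but not sorted);
uses fewer system resources for fewer than ~70 elements (as tested with time);
written to take input from stdin
(or modify and include in another script):
(Bash)

bag2set () {
    # Reduce a_bag to a_set.
    local -i i j n=${#a_bag[@]}
    for ((i=0; i < n; i++)); do
        if [[ -n ${a_bag[i]} ]]; then
            a_set[i]=${a_bag[i]}
            # Mark the element as consumed (bash stores $'\0' as an empty string).
            a_bag[i]=$'\0'
            for ((j=i+1; j < n; j++)); do
                [[ ${a_set[i]} == ${a_bag[j]} ]] && a_bag[j]=$'\0'
            done
        fi
    done
}
declare -a a_bag=() a_set=()
stdin="$(</dev/stdin)"
declare -i i=0
# Note: the unquoted expansion splits on whitespace, so multi-word lines are broken up.
for e in $stdin; do
    a_bag[i]=$e
    i=$i+1
done
bag2set
echo "${a_set[@]}"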
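
For input whose lines may contain spaces, a pure-shell variant can read line by line instead of word by word. The sketch below is a different technique from the array approach above: it keeps a newline-delimited string of lines already printed and tests membership with case. The names dedupe_lines, seen and line are hypothetical, and like the original it is only sensible for short lists.

```shell
# Hypothetical helper: print each distinct stdin line once, in input order,
# without starting any external process. Uses plain (global) variables
# so that it also runs in ksh and POSIX sh.
dedupe_lines() {
    seen='
'
    while IFS= read -r line; do
        case "$seen" in
            *"
$line
"*) ;;  # already printed: skip
            *)
                printf '%s\n' "$line"
                seen="$seen$line
"
                ;;
        esac
    done
}

printf '%s\n' tar 'more than one word' gz java gz java tar class class | dedupe_lines
```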

Answer 8

Answered by Mary Marty

Here is a better tip for getting the non-duplicate entries in a file:

awk '$0 != x ":FOO" && NR>1 {print x} {x=$0} END {print}' file_name | uniq -f1 -u
