Linux 在数据文件中查找唯一值
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/6951223/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share them, but you must attribute them to the original authors (not me):
StackOverFlow
finding unique values in a data file
提问by Illusionist
I can do this in python but I was wondering if I could do this in Linux
我可以在 python 中做到这一点,但我想知道我是否可以在 Linux 中做到这一点
I have a file like this
我有一个这样的文件
name1 text text 123432re text
name2 text text 12344qp text
name3 text text 134234ts text
I want to find all the different values in the 3rd column for a particular username, let's say name1.
我想找出特定用户名(比如 name1)在第 3 列中出现的所有不同的值。
grep name1 filename gives me all the lines, but there must be some way to list just the distinct values? (I don't want to display duplicate values for the same username.)
grep name1 filename 给了我所有的行,但应该有某种方法只列出不同的值吧?(我不想为同一个用户名显示重复的值。)
采纳答案by Mike Mertsock
grep name1 filename | cut -d ' ' -f 4 | sort -u
This will find all lines that have name1, then get just the fourth column of data and show only unique values.
这将找到所有具有 name1 的行,然后仅获取第四列数据并仅显示唯一值。
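A runnable sketch of the accepted answer; the sample file below is invented for illustration, with a duplicate line and an extra value added so the deduplication is visible:

```shell
# Build a hypothetical sample file (names and values invented).
cat > /tmp/demo_filename <<'EOF'
name1 text text 123432re text
name2 text text 12344qp text
name1 text text 123432re text
name1 text text 999999zz text
EOF

# Keep name1's lines, take the 4th space-separated field, deduplicate.
grep name1 /tmp/demo_filename | cut -d ' ' -f 4 | sort -u
# 123432re
# 999999zz
```

Note that cut counts space-separated fields, so the "3rd column" of values after the name is field 4.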
回答by Michał Šrajer
You can let sort look only at the 4th field as the key, and then ask only for records with unique keys:
您可以让 sort 仅以第 4 个字段作为键排序,然后只要求输出键唯一的记录:
grep name1 | sort -k4 -u
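A sketch of this sort-only variant with a hypothetical sample file. With -u, sort compares only the key (here fields 4 through end of line), so records whose key repeats collapse to a single line:

```shell
# Hypothetical sample file; the first two lines share the same 4th field.
cat > /tmp/demo_sortkey <<'EOF'
name1 a b 123432re text
name1 c d 123432re text
name1 a b 777777qq text
EOF

# Sort on the key starting at field 4 and keep one record per key.
grep name1 /tmp/demo_sortkey | sort -k4 -u
```

Two lines come out, one per distinct value; which of the two 123432re lines survives is implementation-dependent. Also note that grep needs the filename here, as a later answer on this page points out.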
回答by glenn jackman
As an all-in-one awk solution:
作为多合一的 awk 解决方案:
awk '$1 == "name1" && ! seen[$1 " " $4]++ {print $4}' filename
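A runnable sketch of the awk approach (the exact field references are an assumption reconstructed from the sample data): the seen[] array remembers which (name, value) pairs have already been printed, so duplicates are skipped in a single pass:

```shell
# Invented sample file with one duplicate value for name1.
cat > /tmp/demo_awk <<'EOF'
name1 text text 123432re text
name2 text text 12344qp text
name1 text text 123432re text
name1 text text 888888aa text
EOF

# Print the 4th field of name1's lines, first occurrence only.
awk '$1 == "name1" && ! seen[$1 " " $4]++ {print $4}' /tmp/demo_awk
# 123432re
# 888888aa
```

Unlike the sort-based pipelines, this preserves the order in which values first appear.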
回答by Rohan Khude
I tried using cat
我尝试使用 cat
The file contains (here the file is foo.sh; you can use any file name):
文件包含(这里的文件是 foo.sh,你可以使用任何文件名):
$ cat foo.sh
tar
world
class
zip
zip
zip
python
jin
jin
doo
doo
uniq prints each word only once
uniq 每个单词只打印一次
$ cat foo.sh | sort | uniq
class
doo
jin
python
tar
world
zip
uniq -u prints only the words that appear exactly once in the file
uniq -u 只打印在文件中恰好出现一次的单词
$ cat foo.sh | sort | uniq -u
class
python
tar
world
uniq -d prints only the duplicated words, each printed once
uniq -d 只打印重复出现的单词,且每个只打印一次
$ cat foo.sh | sort | uniq -d
doo
jin
zip
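The three uniq variants above can be run side by side on a hypothetical word list; uniq only collapses adjacent lines, hence the sort first:

```shell
# Invented word list with one unique word and two duplicated ones.
cat > /tmp/demo_words <<'EOF'
zip
zip
class
doo
doo
EOF

sort /tmp/demo_words | uniq      # class doo zip  (each word once)
sort /tmp/demo_words | uniq -u   # class          (appears exactly once)
sort /tmp/demo_words | uniq -d   # doo zip        (duplicated words)
```

Note that uniq, uniq -u, and uniq -d partition the input differently: plain uniq outputs every distinct word, while -u and -d split those words into non-repeated and repeated sets.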
回答by Mansur Ali
In my opinion, you need to select the field from which you need the unique values. I was trying to retrieve unique source IPs from IPTables log.
在我看来,您需要选择需要唯一值的字段。我试图从 IPTables 日志中检索唯一的源 IP。
cat /var/log/iptables.log | grep "May 5" | awk '{print }' | sort -u
Here is the output of the above command:
以下是上述命令的输出:
SRC=192.168.10.225
SRC=192.168.10.29
SRC=192.168.20.125
SRC=192.168.20.147
SRC=192.168.20.155
SRC=192.168.20.183
SRC=192.168.20.194
So, the best idea is to select the field first and then filter out the unique data.
所以,最好的办法是先选择字段,然后过滤掉唯一的数据。
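The same idea can be sketched on invented log lines. Because the position of the SRC= field differs between iptables log formats, this variant scans each line for the token starting with SRC= instead of relying on a fixed field number:

```shell
# Hypothetical iptables-style log lines (addresses invented).
cat > /tmp/demo_ipt <<'EOF'
May 5 10:00:01 fw kernel: IN=eth0 OUT= SRC=192.168.10.29 DST=10.0.0.1
May 5 10:00:02 fw kernel: IN=eth0 OUT= SRC=192.168.10.225 DST=10.0.0.1
May 5 10:00:03 fw kernel: IN=eth0 OUT= SRC=192.168.10.29 DST=10.0.0.2
EOF

# Filter by date, pull out the SRC= token wherever it sits, deduplicate.
grep "May 5" /tmp/demo_ipt |
  awk '{for (i = 1; i <= NF; i++) if ($i ~ /^SRC=/) print $i}' |
  sort -u
# SRC=192.168.10.225
# SRC=192.168.10.29
```

Scanning for the token makes the pipeline robust to extra fields (e.g. a varying kernel timestamp) shifting the column positions.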
回答by Sobhit Sharma
The following command worked for me.
以下命令对我有用。
sudo cat AirtelFeb.txt | awk '{print $3}' | sort -u
Here it prints the 3rd column with unique values.
在这里它打印具有唯一值的第三列。
回答by Ivan
IMHO Michał Šrajer got the best answer, but a filename is needed after grep name1. And I've got this fancy solution using an associative array:
恕我直言,Michał Šrajer 的答案最好,但 grep name1 之后还需要一个文件名。另外,我用关联数组写出了这个花哨的解决方案:
user=name1
# Split grep's output on newlines only: one matching line per array element.
IFSOLD=$IFS; IFS=$'\n'; lines=( $(grep "$user" test) ); IFS=$IFSOLD
declare -A index
for item in "${lines[@]}"; {
    sub=( $item )         # re-split the line into whitespace-separated fields
    name=${sub[3]}        # the 4th field holds the value of interest
    index[$name]=$item    # keying the associative array by value drops duplicates
}
for item in "${index[@]}"; { echo "$item"; }