pandas python: count the number of unique elements in a csv column
Warning: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/29634417/
python count number of unique elements in csv column
Asked by pam
I'm trying to get the counts of unique items in a csv column using Python.
Sample CSV file (has no header):
AB,asd
AB,poi
AB,asd
BG,put
BG,asd
I've tried this so far.
import csv
from collections import defaultdict, Counter
input_file = open('Results/1_sample.csv')
csv_reader = csv.reader(input_file, delimiter=',')
data = defaultdict(list)
for row in csv_reader:
    data[row[0]].append(row[1])
for k, v in data.items():
    print(k)
    print(Counter(v))
This gives output in this format:
AB
Counter({'asd': 2, 'poi': 1})
BG
Counter({'asd': 1, 'put': 1})
But I want my output to be like:
AB:2
BG:2
total_unique_count:3 #unique count of column[1], irrespective of the data in column[0]
Answered by Andy Hayden
Answered by Celeo
Use sets:
data = (('AB', 'asd'),
        ('AB', 'poi'),
        ('AB', 'asd'),
        ('BG', 'put'),
        ('BG', 'asd'))
# Deduplicate whole (key, value) pairs first.
unique_items = set(data)
# One key per unique pair, so keys.count(key) is the unique-value count.
keys = [entry[0] for entry in unique_items]
for key in set(keys):
    print("Key '{}' appears {} unique times".format(key, keys.count(key)))
Key 'BG' appears 2 unique times
Key 'AB' appears 2 unique times
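To get the exact output format asked for in the question, including the overall unique count for the second column, the same set idea can be applied while reading the CSV. This is only a minimal sketch: the helper name unique_counts is made up for illustration, an in-memory io.StringIO stands in for the real file, and it assumes every row has exactly two columns with no blank lines.

```python
import csv
import io
from collections import defaultdict


def unique_counts(rows):
    """Return ({key: distinct-value count}, distinct-value count overall).

    Hypothetical helper; assumes each row is exactly (key, value).
    """
    per_key = defaultdict(set)   # key -> set of values seen for that key
    all_values = set()           # values seen in column 1, regardless of key
    for key, value in rows:
        per_key[key].add(value)
        all_values.add(value)
    return {k: len(v) for k, v in per_key.items()}, len(all_values)


# In-memory stand-in for the headerless sample file from the question.
sample = io.StringIO("AB,asd\nAB,poi\nAB,asd\nBG,put\nBG,asd\n")
counts, total = unique_counts(csv.reader(sample))

for key in sorted(counts):
    print("{}:{}".format(key, counts[key]))
print("total_unique_count:{}".format(total))
```

To run it against a real file, replace the StringIO object with an open file handle, for example the Results/1_sample.csv path used in the question.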

