Python: check the number of occurrences in a list against a value

Question

Asked by user469652

lst = [1,2,3,4,1]

I want to find out that 1 occurs twice in this list. Is there an efficient way to do this?

Answer 1

Accepted answer by wkl

lst.count(1) returns the number of times 1 occurs. If you're counting items in a list, O(n) is the best you're going to get.

The general method on a list is list.count(x), which returns the number of times x occurs in the list.
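As a quick illustration of list.count, here is a minimal sketch using the list from the question. The variable names ones and missing are only for illustration.

```python
lst = [1, 2, 3, 4, 1]

# list.count(x) scans the whole list once and counts the items equal to x
ones = lst.count(1)
missing = lst.count(99)  # a value that is absent simply yields 0

print(ones)     # 2
print(missing)  # 0
```

Each call walks the entire list, so counting many different values this way repeats the O(n) scan once per value.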

Answer 2

Answer by Katriel

Are you asking whether every item in the list is unique?

len(set(lst)) == len(lst)

Whether 1 occurs more than once?

lst.count(1) > 1

Note that the above is not maximally efficient, because it won't short-circuit: even after 1 has been seen twice, it will still count the rest of the occurrences. If you want it to short-circuit, you will have to write something a little more complicated.
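One possible short-circuiting version is sketched below. The helper name occurs_at_least and its parameters are hypothetical, not part of the original answer; it relies on itertools.islice to stop pulling items once enough matches have been seen.

```python
from itertools import islice

def occurs_at_least(items, value, n):
    """Return True if `value` appears at least `n` times in `items`."""
    # Lazily yield a 1 for every matching item
    matches = (1 for item in items if item == value)
    # islice stops requesting matches after the n-th one, so the
    # rest of `items` is never examined once the answer is known
    return sum(islice(matches, n)) == n

lst = [1, 2, 3, 4, 1]
print(occurs_at_least(lst, 1, 2))  # True
print(occurs_at_least(lst, 2, 2))  # False
```

When the value is rare or absent, the whole list is still scanned, so the worst case remains O(n).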

Whether the first element occurs more than once?

lst[0] in lst[1:]

How often each element occurs?

import collections
collections.Counter(lst)
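To show what the Counter holds, here is a small sketch using the list from the question. The variable names counts and duplicates are chosen for illustration.

```python
from collections import Counter

lst = [1, 2, 3, 4, 1]
counts = Counter(lst)  # built in a single pass over the list

print(counts[1])  # 2
print(counts[7])  # 0, missing keys count as zero instead of raising KeyError

# Values that occur more than once
duplicates = [value for value, n in counts.items() if n > 1]
print(duplicates)  # [1]
```

Counter requires hashable items, since it is a dict subclass.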

Something else?

Answer 3

Answer by Hugh Bothwell

def valCount(lst):
    res = {}
    for v in lst:
        try:
            res[v] += 1
        except KeyError:
            res[v] = 1
    return res

# Python 3: dict.items() replaces the Python 2 dict.iteritems()
u = [x for x, y in valCount(lst).items() if y > 1]

u is now a list of all values which appear more than once.

Edit:

@katrielalex: thank you for pointing out collections.Counter, of which I was not previously aware. It can also be written more concisely using a collections.defaultdict, as demonstrated in the following tests. All three methods are roughly O(n) and reasonably close in run-time performance (using collections.defaultdict is in fact slightly faster than collections.Counter).

My intention was to give an easy-to-understand response to what seemed a relatively unsophisticated request. Given that, are there any other senses in which you consider it "bad code" or "done poorly"?

import collections
import random
import time

def test1(lst):
    res = {}
    for v in lst:
        try:
            res[v] += 1
        except KeyError:
            res[v] = 1
    return res

def test2(lst):
    res = collections.defaultdict(lambda: 0)
    for v in lst:
        res[v] += 1
    return res

def test3(lst):
    return collections.Counter(lst)

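def rndLst(lstLen):
    # Random list of lstLen integers drawn from 0..lstLen
    r = random.randint
    return [r(0, lstLen) for i in range(lstLen)]  # range replaces Python 2 xrange

def timeFn(fn, *args):
    # time.clock() was removed in Python 3.8; perf_counter() is its replacement
    st = time.perf_counter()
    fn(*args)
    return time.perf_counter() - st

def main():
    reps = 5000

    res = []
    tests = [test1, test2, test3]

    for t in range(reps):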
        lstLen = random.randint(10,50000)
        lst = rndLst(lstLen)
        res.append( [lstLen] + [timeFn(fn, lst) for fn in tests] )

    res.sort()
    return res

And the results, for random lists containing up to 50,000 items, are as follows. [Figure: run time in seconds (vertical axis) against number of items in the list (horizontal axis)]

Answer 4

Answer by Jochen Ritzel

Another way to get all items that occur more than once:

lst = [1,2,3,4,1]
d = {}
for x in lst:
    d[x] = x in d  # becomes True once x has been seen before
print(d[1])  # True
print(d[2])  # False
print([x for x in d if d[x]])  # [1]

Answer 5

Answer by dawg

For multiple occurrences, this gives you the index of each occurrence:

>>> lst=[1,2,3,4,5,1]
>>> tgt=1
>>> found=[]
>>> for index, suspect in enumerate(lst):
...     if tgt == suspect:
...         found.append(index)
...
>>> print(len(found), "found at index:", ", ".join(map(str, found)))
2 found at index: 0, 5
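The same index search can also be written as a list comprehension. This is only an equivalent sketch of the loop above, reusing its variable names.

```python
lst = [1, 2, 3, 4, 5, 1]
tgt = 1

# Keep the index of every item equal to the target
found = [index for index, item in enumerate(lst) if item == tgt]

print(len(found), "found at index:", ", ".join(map(str, found)))
# 2 found at index: 0, 5
```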

If you want the count of each item in the list:

>>> lst=[1,2,3,4,5,2,2,1,5,5,5,5,6]
>>> count={}
>>> for item in lst:
...     count[item]=lst.count(item)
...
>>> count
{1: 2, 2: 3, 3: 1, 4: 1, 5: 5, 6: 1}

Answer 6

Answer by ncmathsadist

You could also sort the list, which is O(n*log(n)), and then check adjacent elements for equality, which is O(n). The total is O(n*log(n)). The disadvantage is that the entire list must be sorted before you can bail out on the first duplicate found.
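A minimal sketch of the sort-then-compare approach follows. The function name has_duplicates_sorted is hypothetical, and the sketch assumes the items can be ordered against each other.

```python
def has_duplicates_sorted(items):
    """Return True if any value appears more than once in `items`."""
    ordered = sorted(items)  # O(n log n); leaves the input list untouched
    # After sorting, equal values sit next to each other
    for previous, current in zip(ordered, ordered[1:]):
        if previous == current:
            return True  # bail out on the first adjacent pair that matches
    return False

print(has_duplicates_sorted([1, 2, 3, 4, 1]))  # True
print(has_duplicates_sorted([3, 1, 2]))        # False
```

Unlike the set-based or Counter-based checks, this works for items that are orderable but not hashable, at the cost of the full sort.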

For a large list with relatively rare duplicates, this could be about the best you can do. The best approach really does depend on the size of the data involved and its nature.
