python 如何合并两个列表并在“线性”时间内对它们进行排序?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/2488889/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-11-04 00:46:12  来源:igfitidea点击:

How can I merge two lists and sort them working in 'linear' time?

pythonlistmerge

提问by Stephan202

I have this, and it works:

我有这个,它有效:

# E. Given two lists sorted in increasing order, create and return a merged
# list of all the elements in sorted order. You may modify the passed in lists.
# Ideally, the solution should work in "linear" time, making a single
# pass of both lists.
def linear_merge(list1, list2):
  finalList = []
  for item in list1:
    finalList.append(item)
  for item in list2:
    finalList.append(item)
  finalList.sort()
  return finalList
  # +++your code here+++
  return

But, I'd really like to learn this stuff well. :) What does 'linear' time mean?

但是,我真的很想好好学习这些东西。:) “线性”时间是什么意思?

回答by Max Shawabkeh

Linear means O(n) in Big O notation, while your code uses a sort()which is most likely O(nlogn).

线性在大 O 表示法中表示 O(n) ,而您的代码使用sort()最有可能的O(nlogn)

The question is asking for the standard merge algorithm. A simple Python implementation would be:

问题是要求标准合并算法。一个简单的 Python 实现是:

def merge(l, m):
    result = []
    i = j = 0
    total = len(l) + len(m)
    while len(result) != total:
        if len(l) == i:
            result += m[j:]
            break
        elif len(m) == j:
            result += l[i:]
            break
        elif l[i] < m[j]:
            result.append(l[i])
            i += 1
        else:
            result.append(m[j])
            j += 1
    return result

>>> merge([1,2,6,7], [1,3,5,9])
[1, 1, 2, 3, 5, 6, 7, 9]

回答by Steve314

Linear time means that the time taken is bounded by some undefined constant times (in this context) the number of items in the two lists you want to merge. Your approach doesn't achieve this - it takes O(n log n) time.

线性时间意味着所花费的时间受一些未定义的常数倍(在此上下文中)要合并的两个列表中的项目数的限制。您的方法没有实现这一点 - 它需要 O(n log n) 时间。

When specifying how long an algorithm takes in terms of the problem size, we ignore details like how fast the machine is, which basically means we ignore all the constant terms. We use "asymptotic notation" for that. These basically describe the shapeof the curve you would plot in a graph of problem size in x against time taken in y. The logic is that a bad curve (one that gets steeper quickly) will always lead to a slower execution time if the problem is big enough. It may be faster on a very small problem (depending on the constants, which probably depends on the machine) but for small problems the execution time isn't generally a big issue anyway.

在根据问题大小指定算法需要多长时间时,我们忽略了机器速度等细节,这基本上意味着我们忽略了所有常数项。为此,我们使用“渐近符号”。这些基本上描述了您将在 x 中的问题大小与 y 中的时间关系图中绘制的曲线形状。逻辑是,如果问题足够大,糟糕的曲线(曲线变得更陡峭)总是会导致执行时间变慢。对于非常小的问题(取决于常数,这可能取决于机器),它可能会更快,但对于小问题,执行时间通常不是一个大问题。

The "big O" specifies an upper bound on execution time. There are related notations for average execution time and lower bounds, but "big O" is the one that gets all the attention.

“big O”指定了执行时间的上限。平均执行时间和下限有相关的表示法,但“大 O”是最受关注的。

  • O(1) is constant time - the problem size doesn't matter.
  • O(log n) is a quite shallow curve - the time increases a bit as the problem gets bigger.
  • O(n) is linear time - each unit increase means it takes a roughly constant amount of extra time. The graph is (roughly) a straight line.
  • O(n log n) curves upwards more steeply as the problem gets more complex, but not by very much. This is the best that a general-purpose sorting algorithm can do.
  • O(n squared) curves upwards a lot more steeply as the problem gets more complex. This is typical for slower sorting algorithms like bubble sort.
  • O(1) 是常数时间 - 问题大小无关紧要。
  • O(log n) 是一条非常浅的曲线——随着问题的增大,时间会增加一点。
  • O(n) 是线性时间——每增加一个单位意味着它需要大致恒定的额外时间。该图(大致)是一条直线。
  • 随着问题变得更复杂,O(n log n) 向上弯曲得更陡峭,但幅度不大。这是通用排序算法所能做的最好的事情。
  • 随着问题变得越来越复杂,O(n squared) 曲线向上陡峭得多。这对于较慢的排序算法(如冒泡排序)来说是典型的。

The nastiest algorithms are classified as "np-hard" or "np-complete" where the "np" means "non-polynomial" - the curve gets steeper quicker than any polynomial. Exponential time is bad, but some are even worse. These kinds of things are still done, but only for very small problems.

最讨厌的算法被归类为“np-hard”或“np-complete”,其中“np”的意思是“非多项式”——曲线比任何多项式都更陡峭。指数时间很糟糕,但有些甚至更糟。这些事情仍然可以完成,但只是针对非常小的问题。

EDITthe last paragraph is wrong, as indicated by the comment. I do have some holes in my algorithm theory, and clearly it's time I checked the things I thoughtI had figured out. In the mean time, I'm not quite sure how to correct that paragraph, so just be warned.

编辑最后一段是错误的,如评论所示。我的算法理论确实有一些漏洞,显然是时候检查我认为我已经弄清楚的事情了。同时,我不太确定如何更正该段落,所以请注意。

For your merging problem, consider that your two input lists are already sorted. The smallest item from your output must be the smallest item from one of your inputs. Get the first item from both and compare the two, and put the smallest in your output. Put the largest back where it came from. You have done a constant amount of work and you have handled one item. Repeat until both lists are exhausted.

对于您的合并问题,请考虑您的两个输入列表已经排序。输出中的最小项必须是输入之一中的最小项。从两者中获取第一项并比较两者,然后将最小的项放入输出中。把最大的背放在它来自的地方。您已经完成了恒定数量的工作,并且您已经处理了一个项目。重复直到两个列表都用完。

Some details... First, putting the item back in the list just to pull it back out again is obviously silly, but it makes the explanation easier. Next - one input list will be exhausted before the other, so you need to cope with that (basically just empty out the rest of the other list and add it to the output). Finally - you don't actually have to remove items from the input lists - again, that's just the explanation. You can just step through them.

一些细节... 首先,将项目放回列表只是为了再次将其拉出显然很愚蠢,但它使解释更容易。接下来 - 一个输入列表将在另一个之前用完,因此您需要解决这个问题(基本上只需清空另一个列表的其余部分并将其添加到输出中)。最后 - 您实际上不必从输入列表中删除项目 - 同样,这只是解释。你可以单步通过它们。

回答by John La Rooy

If you build the result in reverse sorted order, you can use pop()and still be O(N)
pop()from the right end of the list does not require shifting the elements, so is O(1)
Reversing the list before we return it is O(N)

如果以反向排序顺序构建结果,则可以使用pop()并且仍然是 O(N)
pop()从列表的右端不需要移动元素,O(1)
也是如此 在我们返回列表之前反转它是 O(否)

>>> def merge(l, r):
...     result = []
...     while l and r:
...         if l[-1] > r[-1]:
...             result.append(l.pop())
...         else:
...             result.append(r.pop())
...     result+=(l+r)[::-1]
...     result.reverse()
...     return result
... 
>>> merge([1,2,6,7], [1,3,5,9])
[1, 1, 2, 3, 5, 6, 7, 9]

回答by Stephan202

Linear timemeans that the runtime of the program is proportional to the length of the input. In this case the input consists of two lists. If the lists are twice as long, then the program will run approximately twice as long. Technically, we say that the algorithm should be O(n), where nis the size of the input (in this case the length of the two input lists combined).

线性时间意味着程序的运行时间与输入的长度成正比。在这种情况下,输入由两个列表组成。如果列表是两倍长,那么程序将运行大约两倍长。从技术上讲,我们说算法应该是O(n),其中n是输入的大小(在这种情况下,两个输入列表的长度组合)。

This appears to be homework, so I will no supply you with an answer.Even though this is not homework, I am of the opinion that you will be best served by taking a pen and a piece of paper, construct two smallish example lists which are sorted, and figure out how youwould merge those two lists, by hand. Once you figured that out, implementing the algorithm is a piece of cake.

这似乎是家庭作业,所以我不会给你答案。尽管这不是功课,我认为你会是最好采取一支笔和一张纸服务的意见,构建其排序的两个短小的例子清单,并找出如何将合并这两个名单,由专人. 一旦你弄清楚了,实现算法就是小菜一碟。

(If all goes well, you will notice that you need to iterate over each list only once, in a single direction. That means that the algorithm is indeed linear. Good luck!)

(如果一切顺利,你会注意到你只需要在一个方向上迭代每个列表一次。这意味着算法确实是线性的。祝你好运!)

回答by Mike Graham

This thread contains various implementations of a linear-time merge algorithm. Note that for practical purposes, you would use heapq.merge.

该线程包含线性时间合并算法的各种实现。请注意,出于实际目的,您将使用heapq.merge.

回答by Reza Abtin

A simpler version which will require equal sized lists:

一个更简单的版本,需要相同大小的列表:

def merge_sort(L1, L2):
   res = [] 
   for i in range(len(L1)):
      if(L1[i]<L2[i]):
          first = L1[i]  
          secound = L2[i] 
      else:
          first = L2[i]  
          secound = L1[i]  
      res.extend([first,secound]) 
   return res

回答by Gabriel ??erbák

Linear time means O(n) complexity. You can read something about algorithmn comlexity and big-O notation here: http://en.wikipedia.org/wiki/Big_O_notation. You should try to combine those lists not after getting them in the finalList, try to merge them gradually - adding an element, assuring the result is sorted, then add next element... this should give you some ideas.

线性时间意味着 O(n) 复杂度。您可以在此处阅读有关算法复杂性和大 O 符号的内容:http: //en.wikipedia.org/wiki/Big_O_notation。您应该尝试组合这些列表,而不是在将它们放入 finalList 之后,尝试逐渐合并它们 - 添加一个元素,确保结果已排序,然后添加下一个元素......这应该会给你一些想法。

回答by Li0liQ

'Linear time'means that time is an O(n)function, where n - the number of items input (items in the lists).
f(n) = O(n) means that that there exist constants x and y such that x * n <= f(n) <= y * n.

“线性时间”意味着时间是一个O(n)函数,其中 n - 输入的项目数(列表中的项目)。
f(n) = O(n) 意味着存在常数 x 和 y,使得 x * n <= f(n) <= y * n。

def linear_merge(list1, list2):
  finalList = []
  i = 0
  j = 0
  while i < len(list1):
    if j < len(list2):
      if list1[i] < list2[j]:
        finalList.append(list1[i])
        i += 1
      else:
        finalList.append(list2[j])
        j += 1
    else:
      finalList.append(list1[i])
      i += 1
  while j < len(list2):
    finalList.append(list2[j])
    j += 1
  return finalList