Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1774256/

Time: 2020-08-12 22:29:39  Source: igfitidea

Java Code Review: Merge sorted lists into a single sorted list

Tags: java, list, sorting

Asked by Nick Heiner

I want to merge sorted lists into a single list. How is this solution? I believe it runs in O(n) time. Any glaring flaws, inefficiencies, or stylistic issues?

I don't really like the idiom of setting a flag for "this is the first iteration" and using it to make sure "lowest" has a default value. Is there a better way around that?

public static <T extends Comparable<? super T>> List<T> merge(Set<List<T>> lists) {
    List<T> result = new ArrayList<T>();

    int totalSize = 0; // every element in the set
    for (List<T> l : lists) {
        totalSize += l.size();
    }

    boolean first; //awkward
    List<T> lowest = lists.iterator().next(); // the list with the lowest item to add

    while (result.size() < totalSize) { // while we still have something to add
        first = true;

        for (List<T> l : lists) {
            if (! l.isEmpty()) {
                if (first) {
                    lowest = l;
                    first = false;
                }
                else if (l.get(0).compareTo(lowest.get(0)) <= 0) {
                    lowest = l;
                }
            }
        }
        result.add(lowest.get(0));
        lowest.remove(0);
    }
    return result;
}

Note: this isn't homework, but it isn't for production code, either.

Accepted answer by meriton

Your solution is probably the fastest one. Inserting into a sorted collection costs O(log n) per element, so that approach ends up at O(M log M), where M is the total size of the lists.

Adding them to one list and sorting, while easier to read, is still O(M log M).

Your solution is just O(M), treating the number of lists as a constant.

You can clean up your code a bit by sizing the result list, and by using a reference to the lowest list instead of a boolean.

public static <T extends Comparable<? super T>> List<T> merge(Set<List<T>> lists) {
    int totalSize = 0; // every element in the set
    for (List<T> l : lists) {
        totalSize += l.size();
    }

    List<T> result = new ArrayList<T>(totalSize);

    List<T> lowest;

    while (result.size() < totalSize) { // while we still have something to add
        lowest = null;

        for (List<T> l : lists) {
            if (! l.isEmpty()) {
                if (lowest == null) {
                    lowest = l;
                } else if (l.get(0).compareTo(lowest.get(0)) <= 0) {
                    lowest = l;
                }
            }
        }

        result.add(lowest.get(0));
        lowest.remove(0);
    }

    return result;
}

If you're really particular, take a List as input; then lowest can be initialized to lists.get(0) and you can skip the null check. Note that this is only safe while lists.get(0) is non-empty, because lowest.get(0) throws on an empty list.

Answer by Stephen Denne

To expand on Anton's comment:

Place the current head element of each list, along with an indicator of which list it came from, into a heap. Then repeatedly take the top off the heap and put a new item on the heap from the list that the item you just took off belongs to.

Java's PriorityQueue can provide the heap implementation.
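
The description above can be sketched roughly as follows. This is only an illustration, not code from the answer: the class name HeapMergeSketch and the Entry helper are hypothetical, and the sketch assumes the input lists support fast get(index) access (for example ArrayList).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class HeapMergeSketch {
    // Hypothetical helper: one heap entry holds a value plus the "indicator"
    // of where it came from (which list, and at which position).
    private static final class Entry<T extends Comparable<? super T>> implements Comparable<Entry<T>> {
        final T value;
        final int listIndex;
        final int position;

        Entry(T value, int listIndex, int position) {
            this.value = value;
            this.listIndex = listIndex;
            this.position = position;
        }

        @Override
        public int compareTo(Entry<T> other) {
            return value.compareTo(other.value);
        }
    }

    static <T extends Comparable<? super T>> List<T> merge(List<? extends List<? extends T>> lists) {
        PriorityQueue<Entry<T>> heap = new PriorityQueue<>();
        int total = 0;
        // Seed the heap with the first element of every non-empty list.
        for (int i = 0; i < lists.size(); i++) {
            List<? extends T> list = lists.get(i);
            total += list.size();
            if (!list.isEmpty()) {
                heap.add(new Entry<T>(list.get(0), i, 0));
            }
        }

        List<T> result = new ArrayList<>(total);
        while (!heap.isEmpty()) {
            // Take the smallest head, then refill from the same source list.
            Entry<T> smallest = heap.poll();
            result.add(smallest.value);
            List<? extends T> source = lists.get(smallest.listIndex);
            int next = smallest.position + 1;
            if (next < source.size()) {
                heap.add(new Entry<T>(source.get(next), smallest.listIndex, next));
            }
        }
        return result;
    }
}
```

For instance, merging [1, 4, 9], [2, 3, 10] and an empty list this way yields [1, 2, 3, 4, 9, 10], and the input lists are left unmodified.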

Answer by meriton

Efficiency will suck if lists contains an ArrayList, since lowest.remove(0) will take linear time in the length of the list, making your algorithm O(n^2).
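
If you want to keep the original scanning approach, one way to sidestep the remove(0) cost is to track a read position per list instead of removing elements. The following is a minimal sketch under assumptions: the name CursorMergeSketch and the cursor array are illustrative, the input is taken as a List of lists rather than a Set, and get(index) is assumed to be cheap (random-access lists).

```java
import java.util.ArrayList;
import java.util.List;

final class CursorMergeSketch {
    static <T extends Comparable<? super T>> List<T> merge(List<? extends List<? extends T>> lists) {
        // cursor[i] is the index of the next unread element of lists.get(i).
        int[] cursor = new int[lists.size()];

        int totalSize = 0;
        for (List<? extends T> l : lists) {
            totalSize += l.size();
        }

        List<T> result = new ArrayList<>(totalSize);
        while (result.size() < totalSize) {
            int lowest = -1;      // index of the list holding the smallest unread element
            T lowestValue = null; // that element
            for (int i = 0; i < lists.size(); i++) {
                List<? extends T> l = lists.get(i);
                if (cursor[i] >= l.size()) {
                    continue; // this list is exhausted
                }
                T candidate = l.get(cursor[i]);
                // Strict comparison: on ties the earlier list wins.
                if (lowest < 0 || candidate.compareTo(lowestValue) < 0) {
                    lowest = i;
                    lowestValue = candidate;
                }
            }
            result.add(lowestValue);
            cursor[lowest]++; // advance instead of calling remove(0)
        }
        return result;
    }
}
```

This still scans every list for each output element, but it no longer shifts array contents on each step and it leaves the input lists untouched.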

I'd do:

List<T> result = new ArrayList<T>();
for (List<T> list : lists) {
    result.addAll(list);
}
Collections.sort(result);

which is in O(n log n), and leaves far less tedious code to test, debug and maintain.

Answer by CPerkins

Since Balus and meriton have together given an excellent response to your question about the algorithm, I'll speak to your aside about the "first" idiom.

There are definitely other approaches (like setting lowest to a 'magic' value), but I happen to feel that "first" (to which I'd probably give a longer name, but that's being pedantic) is the best, because it's very clear. Presence of a boolean like "first" is a clear signal that your loop will do something special the first time through. It helps the reader.

Of course you don't need it if you take the Balus/meriton approach, but it's a situation which crops up.

Answer by Erik

This is a really old question, but I don't like any of the submitted answers, so this is what I ended up doing.

Just adding them all into one list and sorting is bad because of the log-linear complexity (O(n log(n)) for n total elements). If that's not important to you, then it's definitely the simplest and most straightforward answer. Your initial solution isn't bad, but it's a little messy, and @Dathan pointed out that the complexity is O(m n) for m lists and n total elements. You can reduce this to O(n log(m)) by using a heap to reduce the number of comparisons for each element. I use a helper class that allows me to compare iterators. This way I don't destroy the initial lists, and it should operate with reasonable complexity no matter what type of lists are input. The only flaw I see with the implementation below is that it doesn't support lists with null elements; however, this could be fixed with sentinels if desired.

public static <E extends Comparable<? super E>> List<E> merge(Collection<? extends List<? extends E>> lists) {
    PriorityQueue<CompIterator<E>> queue = new PriorityQueue<CompIterator<E>>();
    for (List<? extends E> list : lists)
        if (!list.isEmpty())
            queue.add(new CompIterator<E>(list.iterator()));

    List<E> merged = new ArrayList<E>();
    while (!queue.isEmpty()) {
        CompIterator<E> next = queue.remove();
        merged.add(next.next());
        if (next.hasNext())
            queue.add(next);
    }
    return merged;
}

private static class CompIterator<E extends Comparable<? super E>> implements Iterator<E>, Comparable<CompIterator<E>> {
    E peekElem;
    Iterator<? extends E> it;

    public CompIterator(Iterator<? extends E> it) {
        this.it = it;
        if (it.hasNext()) peekElem = it.next();
        else peekElem = null;
    }

    @Override
    public boolean hasNext() {
        return peekElem != null;
    }

    @Override
    public E next() {
        E ret = peekElem;
        if (it.hasNext()) peekElem = it.next();
        else peekElem = null;
        return ret;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    @Override
    public int compareTo(CompIterator<E> o) {
        if (peekElem == null) return 1;
        else return peekElem.compareTo(o.peekElem);
    }

}

Every element of the returned list involves two O(log(m)) heap operations, and there is also an initial iteration over all of the lists. Therefore the overall complexity is O(n log(m) + m) for n total elements and m lists, which is asymptotically no worse than concatenating and sorting, and better when m is much smaller than n.
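
As a rough worked example of that bound (the figures n = 10^6 and m = 8 are assumptions chosen purely for illustration, and constant factors are ignored):

```latex
\begin{align*}
\text{heap merge:} \quad & 2\,n\log_2 m + m = 2 \cdot 10^6 \cdot 3 + 8 \approx 6 \times 10^6 \\
\text{concatenate and sort:} \quad & n\log_2 n \approx 10^6 \cdot 19.93 \approx 2 \times 10^7
\end{align*}
```

The gap narrows as m approaches n, since log(m) then approaches log(n).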
