Java: lowercase all HashMap keys

Question

Asked by sestus

I've run into a scenario where I want to lowercase all the keys of a HashMap (don't ask why, I just have to do this). The HashMap has several million entries.

At first, I thought I'd just create a new Map, iterate over the entries of the map that is to be lowercased, and add the respective values. This task should run only about once per day, so I thought I could bear the cost.

Map<String, Long> lowerCaseMap = new HashMap<>(myMap.size());
for (Map.Entry<String, Long> entry : myMap.entrySet()) {
   lowerCaseMap.put(entry.getKey().toLowerCase(), entry.getValue());
}

This, however, caused OutOfMemoryErrors when my server happened to be overloaded at the moment the Map was being copied.

Now my question is, how can I accomplish this task with the smallest memory footprint?

Would it help to remove each key from the old Map as soon as its lowercased version has been added to the new Map?

Could I use Java 8 streams to make this faster (e.g. something like this)?

Map<String, Long> lowerCaseMap = myMap.entrySet().parallelStream()
    .collect(Collectors.toMap(entry -> entry.getKey().toLowerCase(), Map.Entry::getValue));

Update: It seems that the map is a Collections.unmodifiableMap, so I don't have the option of removing each key once its lowercased version has been added to the new Map.

Answer 1

Answered by Kenster

Instead of using HashMap, you could try using a TreeMap with case-insensitive ordering. This would avoid the need to create a lower-case version of each key:

Map<String, Long> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
map.putAll(myMap);

Once you've constructed this map, put() and get() will behave case-insensitively, so you can save and fetch values using all-lowercase keys. Iterating over keys will return them in their original, possibly upper-case forms.
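
The behaviour described above can be checked with a small, self-contained sketch. The class name CaseInsensitiveMapDemo, the helper caseInsensitiveCopy, and the sample keys are made up for illustration; only TreeMap and String.CASE_INSENSITIVE_ORDER come from the answer. One thing worth noticing is that keys differing only in case collapse into a single entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; names and sample data are illustrative only.
class CaseInsensitiveMapDemo {

    // Copies the source entries into a TreeMap that compares keys case-insensitively.
    static Map<String, Long> caseInsensitiveCopy(Map<String, Long> source) {
        Map<String, Long> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        map.putAll(source);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Long> source = new HashMap<>();
        source.put("Alice", 1L);
        source.put("BOB", 2L);

        Map<String, Long> map = caseInsensitiveCopy(source);

        // Lookups ignore case.
        System.out.println(map.get("alice"));       // prints 1
        System.out.println(map.containsKey("bob")); // prints true

        // A put with a key that differs only in case replaces the value,
        // but the spelling stored first ("Alice") is kept as the key.
        map.put("ALICE", 10L);
        System.out.println(map.keySet());           // prints [Alice, BOB]
    }
}
```

Note that this still builds a second map. What it saves is the allocation of a new lowercase String per key, since the existing key objects are reused.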

Answer 2

Answered by loicmathieu

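You cannot remove an entry by calling remove() on the map while iterating over it; you will get a ConcurrentModificationException if you try. (Removal through the iterator's own remove() method is the one supported exception, but it is not available on an unmodifiable map.)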
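
To illustrate the point, the following sketch contrasts the two ways of removing entries during iteration. It assumes a mutable HashMap (the unmodifiable map from the question would reject both with UnsupportedOperationException), and the class and method names are hypothetical. Calling Map.remove() inside a for-each loop fails on the next step of the iteration, whereas removing through the iterator's own remove() is allowed.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class; names and sample data are illustrative only.
class RemoveWhileIteratingDemo {

    // Removes the current key through the map itself while a for-each loop is running.
    // Returns true if the iteration failed with ConcurrentModificationException,
    // which is what happens for a HashMap holding two or more entries.
    static boolean removeViaMapFails(Map<String, Long> map) {
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change the iterator does not know about
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Moves every entry into a new map under a lowercased key, removing it from
    // the source through Iterator.remove() so the iteration stays valid.
    static Map<String, Long> moveLowercased(Map<String, Long> source) {
        Map<String, Long> target = new HashMap<>();
        Iterator<Map.Entry<String, Long>> it = source.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            target.put(entry.getKey().toLowerCase(), entry.getValue());
            it.remove();
        }
        return target;
    }

    public static void main(String[] args) {
        Map<String, Long> first = new HashMap<>();
        first.put("One", 1L);
        first.put("Two", 2L);
        first.put("Three", 3L);
        System.out.println(removeViaMapFails(first)); // prints true

        Map<String, Long> second = new HashMap<>();
        second.put("One", 1L);
        second.put("TWO", 2L);
        Map<String, Long> lowered = moveLowercased(second);
        System.out.println(second.isEmpty());         // prints true
        System.out.println(lowered.get("two"));       // prints 2
    }
}
```

A HashMap does not shrink its internal table when entries are removed, so any memory saved this way is limited to the removed entry objects and the old key strings.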

As the issue is an OutOfMemoryError, not a performance problem, using a parallel stream will not help either.

Although some operations in the Stream API are evaluated lazily, this will still leave two maps in memory at some point, so you will still have the issue.

To work around it, I see only two ways:

Give more memory to your process (by increasing -Xmx on the Java command line). Memory is cheap these days ;)
Split the map and work in chunks: for example, divide the size of the map by ten, process one chunk at a time, and delete the processed entries before processing the next chunk. That way, instead of holding two full copies of the map in memory, you hold only about 1.1 times the map.

For the split algorithm, you can try something like this using the Stream API:

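// Needs java.util.* and java.util.stream.Collectors; fromMap must be a mutable map.
Map<String, String> toMap = new HashMap<>();
// About a tenth of the map per pass; at least 1 so that small maps still make progress.
int chunk = Math.max(1, fromMap.size() / 10);
// Loop until the source is empty so that the remainder of the division is not left behind.
while (!fromMap.isEmpty()) {
    // Copy the next chunk of entries into a list so that fromMap can be modified afterwards.
    List<Map.Entry<String, String>> subEntries = fromMap.entrySet().stream()
        .limit(chunk)
        .collect(Collectors.toList());

    for (Map.Entry<String, String> entry : subEntries) {
        toMap.put(entry.getKey().toLowerCase(), entry.getValue());
        fromMap.remove(entry.getKey());
    }
}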
