Java: Lowercase all HashMap keys
Original question: http://stackoverflow.com/questions/41225031/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Lowercase all HashMap keys
Asked by sestus
I've run into a scenario where I want to lowercase all the keys of a HashMap (don't ask why, I just have to do this). The HashMap has several million entries.
At first, I thought I'd just create a new Map, iterate over the entries of the map to be lowercased, and add the respective values. This task should run only once per day or so, so I thought I could bear this.
Map<String, Long> lowerCaseMap = new HashMap<>(myMap.size());
for (Map.Entry<String, Long> entry : myMap.entrySet()) {
    lowerCaseMap.put(entry.getKey().toLowerCase(), entry.getValue());
}
This, however, caused some OutOfMemory errors when my server was overloaded at the very moment I was about to copy the Map.
Now my question is, how can I accomplish this task with the smallest memory footprint?
Would it help to remove each key from the original map once its lowercased version has been added to the new Map?
Could I utilize Java 8 streams to make this faster? (e.g. something like this)
Map<String, Long> lowerCaseMap = myMap.entrySet().parallelStream()
    .collect(Collectors.toMap(entry -> entry.getKey().toLowerCase(), Map.Entry::getValue));
Update: It seems that it's a Collections.unmodifiableMap, so I don't have the option of removing each key after its lowercased version has been added to the new Map.
Answered by Kenster
Instead of using HashMap, you could try using a TreeMap with case-insensitive ordering. This would avoid the need to create a lower-case version of each key:
Map<String, Long> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
map.putAll(myMap);
Once you've constructed this map, put() and get() will behave case-insensitively, so you can save and fetch values using all-lowercase keys. Iterating over keys will return them in their original, possibly upper-case forms.
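To illustrate the behaviour described above, here is a minimal sketch. The class name CaseInsensitiveLookup, the helper caseInsensitiveView and the sample data are invented for illustration and are not part of the original answer. Note that putAll still builds a second map structure (it reuses the existing key strings rather than creating lowercased copies), and keys that differ only by case collapse into a single entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class CaseInsensitiveLookup {

    // Copies the entries into a TreeMap whose comparator ignores case.
    // The original key objects are reused; no lowercased strings are created.
    static Map<String, Long> caseInsensitiveView(Map<String, Long> source) {
        Map<String, Long> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        map.putAll(source);
        return map;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the real map.
        Map<String, Long> myMap = new HashMap<>();
        myMap.put("Alice", 1L);
        myMap.put("BOB", 2L);

        Map<String, Long> map = caseInsensitiveView(myMap);

        System.out.println(map.get("alice")); // lookup ignores case
        System.out.println(map.get("bob"));
        System.out.println(map.keySet());     // keys keep their original spelling
    }
}
```

A TreeMap lookup costs O(log n) comparisons instead of the HashMap's expected O(1), so this trades some lookup speed for not having to rewrite the keys.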
Answered by loicmathieu
You cannot remove an entry through the map itself (e.g. with map.remove()) while iterating over it. You will get a ConcurrentModificationException if you try to do this.
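The following sketch shows the failure mode in isolation. The class and method names are hypothetical and the sample data is made up. It also shows that the exception concerns removal through the map's own methods; removal through Iterator.remove() is permitted by the collections API, provided the map is mutable.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class RemoveWhileIterating {

    // Removes keys through the map itself inside a for-each loop.
    // Returns true if the iterator detected the change and threw.
    static boolean removeViaMapThrows(Map<String, Long> map) {
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes every entry through the iterator, which is the supported way
    // to delete during iteration.
    static void drainViaIterator(Map<String, Long> map) {
        Iterator<Map.Entry<String, Long>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();
        }
    }

    public static void main(String[] args) {
        Map<String, Long> first = new HashMap<>();
        first.put("A", 1L);
        first.put("B", 2L);
        first.put("C", 3L);
        System.out.println("map.remove() threw: " + removeViaMapThrows(first));

        Map<String, Long> second = new HashMap<>();
        second.put("A", 1L);
        second.put("B", 2L);
        drainViaIterator(second);
        System.out.println("empty after iterator removal: " + second.isEmpty());
    }
}
```

The fail-fast check is documented as best-effort, so code should not rely on the exception for correctness; the sketch only demonstrates the typical single-threaded behaviour of HashMap.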
Since the issue is an OutOfMemoryError, not a performance problem, using a parallel stream will not help either.
Although some Stream API operations are evaluated lazily, this will still lead to having two maps in memory at some point, so you will still have the issue.
To work around it, I see only two ways:
- Give more memory to your process (by increasing -Xmx on the Java command line). Memory is cheap these days ;)
- Split the map and work in chunks: for example, divide the size of the map by ten, process one chunk at a time, and delete the processed entries before processing the next chunk. This way, instead of holding two full copies of the map in memory, you hold only about 1.1 times the map.
For the split algorithm, you can try something like this using the Stream API:
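import java.util.*;
import java.util.stream.Collectors;

// Assumes fromMap is a mutable Map<String, String>.
Map<String, String> toMap = new HashMap<>();
// Integer division can leave a remainder (or yield 0 for small maps),
// so loop until the source map is empty rather than exactly ten times.
int chunk = Math.max(1, fromMap.size() / 10);
while (!fromMap.isEmpty()) {
    // Copy the next chunk of entries into a list so the map can be modified afterwards.
    List<Map.Entry<String, String>> subEntries = fromMap.entrySet().stream().limit(chunk)
            .collect(Collectors.toList());
    for (Map.Entry<String, String> entry : subEntries) {
        toMap.put(entry.getKey().toLowerCase(), entry.getValue());
        fromMap.remove(entry.getKey());
    }
}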
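A variant of the same idea, sketched here as an assumption rather than as part of the original answer: move the entries one at a time through the map's iterator, so that no intermediate list of chunk entries is needed. The class name LowercaseKeysMover and the method moveWithLowercaseKeys are hypothetical, and Locale.ROOT is chosen here only to make the lowercasing independent of the default locale. This requires a mutable source map; on a Collections.unmodifiableMap, as in the question's update, it.remove() throws UnsupportedOperationException.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Locale;
import java.util.Map;

class LowercaseKeysMover {

    // Moves every entry from 'from' into a new map under a lowercased key,
    // removing it from 'from' as it goes. 'from' must be mutable.
    static Map<String, Long> moveWithLowercaseKeys(Map<String, Long> from) {
        Map<String, Long> to = new HashMap<>();
        Iterator<Map.Entry<String, Long>> it = from.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            // Read key and value before removing the entry from the source.
            String key = entry.getKey();
            Long value = entry.getValue();
            it.remove(); // removal goes through the iterator, so no exception
            // Keys that differ only by case overwrite each other here.
            to.put(key.toLowerCase(Locale.ROOT), value);
        }
        return to;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the real map.
        Map<String, Long> fromMap = new HashMap<>();
        fromMap.put("Alice", 1L);
        fromMap.put("BOB", 2L);

        Map<String, Long> toMap = moveWithLowercaseKeys(fromMap);

        System.out.println(toMap.get("alice"));
        System.out.println(toMap.get("bob"));
        System.out.println(fromMap.isEmpty());
    }
}
```

Each entry lives in only one of the two maps at any moment, although a HashMap does not shrink its internal table when entries are removed, so the saving applies to the entries rather than to the source map's bucket array.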