java 原始值的映射替代

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/41726286/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-11-03 06:06:31  来源:igfitidea点击:

Map alternative for primitive values

javajava-8primitive

提问by hotzst

I did some profiling on my application and one of the results turned out that about 18% of memory on the heap is used by objects of type Double. It turns out these objects are the values in Maps, where I cannot use the primitive type.

我对我的应用程序做了一些分析,结果之一是堆上大约 18% 的内存被类型的对象使用Double。事实证明,这些对象是Maps中的值,我不能在其中使用原始类型。

My reasoning is that the primitive type of doubleconsumes less memory than it's object Double. Is there a way to have a map like data structure, that would accept any type as key and a primitive doubleas values?

我的推理是原始类型的double消耗的内存比它的 object 少Double。有没有办法让地图像数据结构一样,可以接受任何类型作为键和一个原始类型double作为值?

Main operations would be:

主要操作将是:

  • insertion (probably only once)
  • Lookup (contains by key)
  • Retrieval (by key)
  • Iteration
  • 插入(可能只有一次)
  • 查找(按关键字包含)
  • 检索(按键)
  • 迭代

Typical maps that I have are:

我拥有的典型地图是:

  • HashMap<T, HashMap<NodeData<T>, Double>> graph
  • HashMap<Point2D, Boolean> onSea(though not a double value)
  • ConcurrentHashMap<Point2D, HashMap<Point2D, Double>>
  • HashMap<T, HashMap<NodeData<T>, Double>> graph
  • HashMap<Point2D, Boolean> onSea(虽然不是双重价值)
  • ConcurrentHashMap<Point2D, HashMap<Point2D, Double>>

All used with Java 8.

全部用于 Java 8。

Addendum

附录

I am mainly not interested in frameworks that have a solution to these type of maps, but on what has to be considered when addressing these issues. If you want, what are the concepts/ideas/approaches behind any such framework. Or the solution may be also on another level, where the maps are replaced with objects following a certain pattern like @Ilmari Karonen pointed out in his answer.

我主要对解决这些类型地图的框架不感兴趣,但对解决这些问题时必须考虑的问题不感兴趣。如果您愿意,任何此类框架背后的概念/想法/方法是什么。或者解决方案也可能在另一个层面上,其中地图被替换为遵循特定模式的对象,如@Ilmari Karonen 在他的回答中指出的那样。

采纳答案by Ilmari Karonen

Others have already suggested several third-party implementations of primitive-valued maps. For completeness, I'd like to mention some ways to get rid of the maps entirelythat you might want to consider. These solutions won't always be possible, but when they are, they will usually be both faster and more memory-efficient than any map can be.

其他人已经建议了几个原始值映射的第三方实现。为了完整起见,我想提及一些您可能想要考虑的完全摆脱地图的方法。这些解决方案并不总是可行的,但是当它们可行时,它们通常比任何地图都更快且内存效率更高。

Alternative 1: Use plain old arrays.

备选方案 1:使用普通的旧数组。

A simple double[]array may not be as sexy as a fancy map, but very little can beat it in compactness and speed of access.

一个简单的double[]数组可能不像花哨的地图那么性感,但很少有人能在紧凑性和访问速度方面击败它。

Of course, arrays come with a bunch of limitations: their size is fixed (although you can always create a new array and copy the old one's content into it) and their keys can only be small positive integers which, for efficiency, should be reasonably dense (i.e. the total number of used keys should be a reasonably large fraction of the highest key value). But if that happens to be the case for your keys, or if you can arrange for it to be the case, arrays of primitive values can be very efficient.

当然,数组有很多限制:它们的大小是固定的(尽管你总是可以创建一个新数组并将旧数组的内容复制到其中)并且它们的键只能是小的正整数,为了效率,应该合理密集(即使用的密钥总数应该是最高密钥值的相当大的一部分)。但是,如果您的键恰好是这种情况,或者如果您可以安排这种情况,原始值数组可能非常有效。

In particular, if you can assign a unique small integer ID to each key object, then you can use that ID as an index into an array. Similarly, if you're already storing your objects in an array (e.g. as part of some more complex data structure) and looking them up by index, then you can simply use the same index to look up any additional metadata values in another array.

特别是,如果您可以为每个键对象分配一个唯一的小整数 ID,那么您可以使用该 ID 作为数组的索引。类似地,如果您已经将对象存储在一个数组中(例如,作为一些更复杂数据结构的一部分)并通过索引查找它们,那么您可以简单地使用相同的索引在另一个数组中查找任何其他元数据值。

You could even dispense with the ID uniqueness requirement, if you implemented some kind of a collision handling mechanism, but at that point you're well on your way towards implementing your own hash table. In some cases that mightactually make sense, but usually at that point it's probably easier to use an existing third-party implementation.

如果您实现了某种冲突处理机制,您甚至可以免除 ID 唯一性要求,但此时您已经在实现自己的哈希表了。在某些情况下,这可能确实有意义,但通常在那时使用现有的第三方实现可能更容易。

Alternative 2: Customize your objects.

备选方案 2:自定义您的对象。

Instead of maintaining a map from key objects into primitive values, why not just make those values into properties of the objects themselves? This is, after all, what object-oriented programming is all about — grouping related data into meaningful objects.

与其维护从关键对象到原始值的映射,为什么不直接将这些值变成对象本身的属性?毕竟,这就是面向对象编程的全部内容——将相关数据分组为有意义的对象。

For example, instead of maintaining a HashMap<Point2D, Boolean> onSea, why not just give your points a boolean onSeaproperty? Of course, you'll need to define your own custom point class for this, but there's no reason why you can't make it extend the standard Point2Dclass if you want, so that you can pass your custom points into any method that expects a Point2D.

例如,与其维护 a HashMap<Point2D, Boolean> onSea,为什么不给您的点一个布尔onSea属性?当然,您需要为此定义自己的自定义点类,但是Point2D如果您愿意,没有理由不能让它扩展标准类,以便您可以将自定义点传递给任何需要Point2D.

Again, this approach may not always work directly, e.g. if you need to work with classes that you cannot modify (but see below), or if the values you want to store are associated with more than one object (as in your ConcurrentHashMap<Point2D, HashMap<Point2D, Double>>).

同样,这种方法可能并不总是直接有效,例如,如果您需要使用无法修改的类(但请参见下文),或者如果您要存储的值与多个对象相关联(如在您的 中ConcurrentHashMap<Point2D, HashMap<Point2D, Double>>)。

However, for the latter case, you may still be able to solve the problem by suitably redesigning your data representation. For example, instead of representing a weighted graph as a Map<Node, Map<Node, Double>>, you can define an Edgeclass like:

但是,对于后一种情况,您仍然可以通过适当地重新设计数据表示来解决问题。例如,Map<Node, Map<Node, Double>>您可以定义一个Edge类,而不是将加权图表示为 a :

class Edge {
    Node a, b;
    double weight;
}

and then add an Edge[](or Vector<Edge>) property to each node that contains any edges connected to that node.

然后向包含连接到该节点的任何边的每个节点添加一个Edge[](或Vector<Edge>) 属性。

Alternative 3: Combine multiple maps into one.

方案三:将多张地图合二为一。

If you have multiple maps with the same keys, and cannot just turn the values into new properties of the key objects as suggested above, consider grouping them into a single metadata class and creating a single map from the keys into objects of that class. For example, instead of a Map<Item, Double> accessFrequencyand a Map<Item, Long> creationTime, consider defining a single metadata class like:

如果您有多个具有相同键的映射,并且不能像上面建议的那样将值转换为键对象的新属性,请考虑将它们分组到单个元数据类中,并从键到该类的对象创建单个映射。例如,代替 aMap<Item, Double> accessFrequency和 a Map<Item, Long> creationTime,考虑定义一个元数据类,如:

class ItemMetadata {
    double accessFrequency;
    long creationTime;
}

and having a single Map<Item, ItemMetadata>to store all the metadata values. This is more memory-efficient than having multiple maps, and may also save time by avoiding redundant map lookups.

并有一个Map<Item, ItemMetadata>存储所有元数据值。这比拥有多个地图更节省内存,并且还可以通过避免冗余地图查找来节省时间。

In some cases, for convenience, you may also wish to include in each metadata object a reference to its corresponding main object, so that you can access both through a single reference to the metadata object. Which naturally segues into...

在某些情况下,为了方便起见,您可能还希望在每个元数据对象中包含对其相应主对象的引用,以便您可以通过对元数据对象的单个引用来访问这两个对象。这自然会变成……

Alternative 4: Use a decorator.

备选方案 4:使用装饰器。

As a combination of the previous two alternatives, if you cannot directly add extra metadata properties into the key objects, consider instead wrapping them with decoratorsthat can hold the extra values. Thus, for example, instead of directly creating your own point class with extra properties, you could simply do something like:

作为前两种选择的组合,如果您不能直接将额外的元数据属性添加到关键对象中,请考虑使用可以容纳额外值的装饰器来包装它们。因此,例如,您可以简单地执行以下操作,而不是直接创建您自己的具有额外属性的点类:

class PointWrapper {
    Point2D point;
    boolean onSea;
    // ...
}

If you like, you can even turn this wrapper into a full-blown decorator by implementing method forwarding, but even just a simple "dumb" wrapper may be sufficient for many purposes.

如果您愿意,您甚至可以通过实现方法转发将这个包装器变成一个成熟的装饰器,但即使只是一个简单的“哑”包装器也可能足以满足许多目的。

This approach is most useful if you can then arrange to store and work with just the wrappers, so that you never need to look up the wrapper corresponding to an unwrapped object. Of course, if you do need to do that occasionally (e.g. because you're only receiving the unwrapped objects from other code), then you can do that with a single Map<Point2D, PointWrapper>, but then you're effectively back at the previous alternative.

如果您可以安排仅存储和使用包装器,则此方法最有用,这样您就无需查找与未包装对象对应的包装器。当然,如果您确实需要偶尔这样做(例如,因为您只从其他代码接收未包装的对象),那么您可以使用单个Map<Point2D, PointWrapper>.

回答by Donald Raab

Eclipse Collectionshas objectand primitive mapsand has Mutable and Immutable versions for both.

Eclipse Collections对象映射原始映射,并且两者都有可变和不可变版本。

MutableObjectDoubleMap<String> doubleMap = ObjectDoubleMaps.mutable.empty();
doubleMap.put("1", 1.0d);
doubleMap.put("2", 2.0d);

MutableObjectBooleanMap<String> booleanMap = ObjectBooleanMaps.mutable.empty();
booleanMap.put("ok", true);

ImmutableObjectDoubleMap<String> immutableMap = doubleMap.toImmutable();
Assert.assertEquals(doubleMap, immutableMap);

A MutableMapcan be used as a factory for an ImmutableMapin Eclipse Collections by calling toImmutableas I have done in the example above. Both mutable and immutable maps share a common parent interface, which in the case of the MutableObjectDoubleMapand ImmutableObjectDoubleMapabove, is named ObjectDoubleMap.

通过调用AMutableMap可以用作ImmutableMapEclipse Collections 中的工厂,toImmutable就像我在上面的示例中所做的那样。两个可变和不可变的映射共享共同父接口,其中在的情况下MutableObjectDoubleMapImmutableObjectDoubleMap上述,被命名ObjectDoubleMap

Eclipse Collections also has synchronized and unmodifiable versions for all mutable containers in the library. The following code will give you a synchronized view wrapped around the primitive maps.

Eclipse Collections 还为库中的所有可变容器提供同步和不可修改的版本。以下代码将为您提供一个围绕原始地图的同步视图。

MutableObjectDoubleMap<String> doubleMap = 
        ObjectDoubleMaps.mutable.<String>empty().asSynchronized();
doubleMap.put("1", 1.0d);
doubleMap.put("2", 2.0d);

MutableObjectBooleanMap<String> booleanMap = 
        ObjectBooleanMaps.mutable.<String>empty().asSynchronized();
booleanMap.put("ok", true);

This performance comparison of large maps was published a couple of years ago.

几年前发布了大型地图的性能比较。

Large HashMap overview: JDK, FastUtil, Goldman Sachs, HPPC, Koloboke, Trove – January 2015 version

大型 HashMap 概述:JDK、FastUtil、Goldman Sachs、HPPC、Koloboke、Trove – 2015 年 1 月版

GS Collections has since been migrated to the Eclipse Foundation and is now Eclipse Collections.

GS Collections 已经迁移到 Eclipse Foundation,现在更名为 Eclipse Collections。

Note: I am a committer for Eclipse Collections.

注意:我是 Eclipse Collections 的提交者。

回答by Nicolas Filotto

What you are looking for is a Object2DoubleOpenHashMapfrom fastutil(Collections Framework with a small memory footprint and fast access and insertion) which provides methods of type double getDouble(Object k)and double put(K k, double v).

您正在寻找的是Object2DoubleOpenHashMap来自fastutil(具有小内存占用和快速访问和插入的集合框架),它提供了类型double getDouble(Object k)double put(K k, double v).

For example:

例如:

// Create a Object2DoubleOpenHashMap instance
Object2DoubleMap<String> map = new Object2DoubleOpenHashMap<>();
// Put a new entry
map.put("foo", 12.50d);
// Access to the entry
double value = map.getDouble("foo");

The class Object2DoubleOpenHashMapis a real implementation of a Mapthat is not thread-safe, however you can still use the utility method Object2DoubleMaps.synchronize(Object2DoubleMap<K> m)to make it thread-safe thanks to a decorator.

该类Object2DoubleOpenHashMap是 a 的真正实现Map,它不是线程安全的,但是Object2DoubleMaps.synchronize(Object2DoubleMap<K> m)由于装饰器,您仍然可以使用实用程序方法使其成为线程安全的。

The creation code would then be:

然后创建代码将是:

// Create a thread safe Object2DoubleMap
Object2DoubleMap<String> map =  Object2DoubleMaps.synchronize(
    new Object2DoubleOpenHashMap<>()
);

回答by hotzst

To have a better estimate of how these various libraries stack up against each other, I put together a small benchmark that checks the performance of:

为了更好地估计这些不同的库如何相互叠加,我整理了一个小型基准测试来检查以下各项的性能:

  • total time for 300'000 insertions
  • average time for a check of containments with 1000 samples that are in the map
  • Memory size of the data structure I had a look at the Map-like structure which takes a Stringas key and doubleas value. The checked frameworks are Eclipse Collection, HPPC, Troveand FastUtil, as well as for comparison the HashMapand ConcurrentHashMap.
  • 300'000 次插入的总时间
  • 使用地图中的 1000 个样本检查容器的平均时间
  • 数据结构的内存大小我查看了Map将 aString作为键和double值的-like 结构。被检查的框架是Eclipse CollectionHPPCTroveFastUtil,以及用于比较的HashMapConcurrentHashMap.

In short, these are the results:

简而言之,这些是结果:

Filling in 300000 into the JDK HashMap took 107ms
Filling in 300000 into the JDK ConcurrentHashMap took 152ms
Filling in 300000 into the Eclipse map took 107ms
Filling in 300000 into the Trove map took 855ms
Filling in 300000 into the HPPC map took 93ms
Filling in 300000 into the FastUtil map took 163ms
1000 lookups average in JDK HashMap took: 550ns
1000 lookups average in JDK Concurrent HashMap took: 748ns
1000 lookups average in Eclipse Map took: 894ns
1000 lookups average in Trove Map took: 1033ns
1000 lookups average in HPPC Map took: 523ns
1000 lookups average in FastUtil Map took: 680ns
JDK HashMap:            43'809'895B
JDK Concurrent HashMap: 43'653'740B => save  0.36%
Eclipse Map:            35'755'084B => save 18.39%
Trove Map:              32'147'798B => save 26.62%
HPPC Map:               27'366'533B => save 37.53%
FastUtil Map:           31'560'889B => save 27.96%

For all the details, as well as the test application take a look at my blog entry.

有关所有详细信息以及测试应用程序,请查看我的博客条目