Java: partition a Set into smaller subsets and process them as batches
Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/19423326/
Partition a Set into smaller Subsets and process as batch
Asked by Pawan
I have a continuously running thread in my application, which holds a HashSet that stores all the symbols in the application. As per the design at the time it was written, the thread's while (true) loop continuously iterates over the HashSet and updates the database for every symbol it contains.
The HashSet may hold up to around 6000 symbols. I don't want to hit the DB with all 6000 symbols at once. Instead I want to divide the HashSet into subsets of 500 each (12 sets), process each subset individually, and have the thread sleep for 15 minutes after each subset, so that I can reduce the pressure on the database.
How can I partition a Set into smaller subsets and process them? (I have seen examples for partitioning an ArrayList or a TreeSet, but didn't find any example for a HashSet.)
This is my code (sample code snippet):
package com.ubsc.rewji.threads;

import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.PriorityBlockingQueue;

public class TaskerThread extends Thread {

    private PriorityBlockingQueue<String> priorityBlocking = new PriorityBlockingQueue<String>();
    String symbols[] = new String[] { "One", "Two", "Three", "Four" };
    Set<String> allSymbolsSet = Collections
            .synchronizedSet(new HashSet<String>(Arrays.asList(symbols)));

    public void addsymbols(String commaDelimSymbolsList) {
        if (commaDelimSymbolsList != null) {
            String[] symAr = commaDelimSymbolsList.split(",");
            for (int i = 0; i < symAr.length; i++) {
                priorityBlocking.add(symAr[i]);
            }
        }
    }

    public void run() {
        while (true) {
            try {
                while (priorityBlocking.peek() != null) {
                    String symbol = priorityBlocking.poll();
                    allSymbolsSet.add(symbol);
                }
                Iterator<String> ite = allSymbolsSet.iterator();
                System.out.println("=======================");
                while (ite.hasNext()) {
                    String symbol = ite.next();
                    if (symbol != null && symbol.trim().length() > 0) {
                        try {
                            updateDB(symbol);
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                }
                Thread.sleep(2000);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    public void updateDB(String symbol) {
        System.out.println("THE SYMBOL BEING UPDATED IS" + " " + symbol);
    }

    public static void main(String args[]) {
        TaskerThread taskThread = new TaskerThread();
        taskThread.start();
        String commaDelimSymbolsList = "ONVO,HJI,HYU,SD,F,SDF,ASA,TRET,TRE,JHG,RWE,XCX,WQE,KLJK,XCZ";
        taskThread.addsymbols(commaDelimSymbolsList);
    }
}
Accepted answer by Amir Pashazadeh
Do something like this:
private static final int PARTITIONS_COUNT = 12;

List<Set<Type>> theSets = new ArrayList<Set<Type>>(PARTITIONS_COUNT);
for (int i = 0; i < PARTITIONS_COUNT; i++) {
    theSets.add(new HashSet<Type>());
}

int index = 0;
for (Type object : originalSet) {
    theSets.get(index++ % PARTITIONS_COUNT).add(object);
}
Now you have partitioned the originalSet into 12 other HashSets.
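The round-robin idea above can be wrapped in a reusable generic method. The following is only a minimal sketch: the class name `RoundRobinPartitioner` and the method name `partitionInto` are hypothetical and do not appear in the original answer.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RoundRobinPartitioner {

    // Hypothetical helper: spreads the elements of 'original' over
    // 'partitionCount' new HashSets, one element at a time in rotation.
    public static <T> List<Set<T>> partitionInto(Set<T> original, int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("partitionCount must be at least 1");
        }
        List<Set<T>> partitions = new ArrayList<>(partitionCount);
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashSet<>());
        }
        int index = 0;
        for (T element : original) {
            partitions.get(index % partitionCount).add(element);
            index++;
        }
        return partitions;
    }

    public static void main(String[] args) {
        // 6000 distinct values stand in for the symbols from the question.
        Set<Integer> symbols = new HashSet<>();
        for (int i = 0; i < 6000; i++) {
            symbols.add(i);
        }
        List<Set<Integer>> partitions = partitionInto(symbols, 12);
        // prints "12 partitions of 500 elements"
        System.out.println(partitions.size() + " partitions of "
                + partitions.get(0).size() + " elements");
    }
}
```

Note that this fixes the number of partitions, not their size: with fewer than 6000 symbols each of the 12 sets simply holds fewer than 500 elements.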
Answer by TwoThe
A very simple fix for your actual problem would be to change your code as follows:
Iterator<String> ite = allSymbolsSet.iterator();
System.out.println("=======================");
int i = 500;
// i-- > 0 allows exactly 500 iterations (--i > 0 would stop after 499)
while ((i-- > 0) && ite.hasNext()) {
A general method would be to use the iterator to take the elements out one by one in a simple loop:
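int i = 500;
// i-- > 0 moves exactly 500 elements (--i > 0 would stop after 499)
while ((i-- > 0) && ite.hasNext()) {
    sublist.add(ite.next());
    ite.remove(); // removes the element from the underlying set
}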
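For the pause between batches described in the question, one option is to take a snapshot of the set and walk it in slices, which leaves the original set intact. This is a rough sketch under stated assumptions: `BatchProcessor`, `processInBatches`, and the `action` callback are hypothetical names standing in for the thread's loop body and `updateDB`, and the snapshot is taken while holding the set's lock because `Collections.synchronizedSet` requires manual synchronization during iteration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

public class BatchProcessor {

    // Hypothetical helper: processes 'symbols' in slices of 'batchSize',
    // sleeping 'pauseMillis' between slices. Returns the number of batches run.
    public static int processInBatches(Set<String> symbols, int batchSize,
                                       long pauseMillis, Consumer<String> action) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<String> snapshot;
        synchronized (symbols) {
            // Copy under the lock so concurrent additions cannot break iteration.
            snapshot = new ArrayList<>(symbols);
        }
        int batches = 0;
        for (int start = 0; start < snapshot.size(); start += batchSize) {
            int end = Math.min(start + batchSize, snapshot.size());
            for (String symbol : snapshot.subList(start, end)) {
                action.accept(symbol);
            }
            batches++;
            if (end < snapshot.size()) {
                try {
                    Thread.sleep(pauseMillis); // no pause after the last batch
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt flag
                    return batches;
                }
            }
        }
        return batches;
    }
}
```

Inside the question's `run()` loop this could be called as `processInBatches(allSymbolsSet, 500, 15 * 60 * 1000L, this::updateDB)`. Symbols added while a pass is running are picked up on the next pass.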
Answer by Andrey Chaschev
Answer by PipoTells
We can use the following approach to divide a Set.
For the sample input in main, the output is [a, b], [c, d] and [e], one set per line. (HashSet does not guarantee iteration order, so the grouping can differ for other inputs.)
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

private static List<Set<String>> partitionSet(Set<String> set, int partitionSize)
{
    // Guard: a size below 1 would otherwise loop forever adding empty sets
    if (partitionSize < 1)
    {
        throw new IllegalArgumentException("partitionSize must be at least 1");
    }
    List<Set<String>> list = new ArrayList<>();
    Iterator<String> iterator = set.iterator();
    while (iterator.hasNext())
    {
        // Fill one subset with up to partitionSize elements
        Set<String> newSet = new HashSet<>();
        for (int j = 0; j < partitionSize && iterator.hasNext(); j++)
        {
            newSet.add(iterator.next());
        }
        list.add(newSet);
    }
    return list;
}

public static void main(String[] args)
{
    Set<String> set = new HashSet<>();
    set.add("a");
    set.add("b");
    set.add("c");
    set.add("d");
    set.add("e");
    int size = 2;
    List<Set<String>> list = partitionSet(set, size);
    for (Set<String> s : list)
    {
        System.out.println(s);
    }
}
Answer by Aman
The Guava solution from @Andrey_chaschev seems the best, but in case it is not possible to use Guava, I believe the following would help:
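import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public static List<Set<String>> partition(Set<String> set, int chunk) {
    if (set == null || set.isEmpty() || chunk < 1)
        return new ArrayList<>();
    List<Set<String>> partitionedList = new ArrayList<>();
    // Number of chunks, rounded up so a partial last chunk is included
    double loopsize = Math.ceil((double) set.size() / (double) chunk);
    for (int i = 0; i < loopsize; i++) {
        // Re-streams the set for each chunk, skipping the elements already taken
        partitionedList.add(set.stream().skip((long) i * chunk).limit(chunk).collect(Collectors.toSet()));
    }
    return partitionedList;
}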
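The method above streams the set once per chunk. As a possible variation, the same chunking can be done in a single pass by grouping elements on a running position counter. This is only a sketch: `StreamPartitioner` is a hypothetical class name, and the stateful counter in the classifier assumes a sequential stream, so it should not be used with `parallelStream()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class StreamPartitioner {

    // Hypothetical helper: groups elements into chunks of at most 'chunkSize'
    // by dividing each element's position in the stream by 'chunkSize'.
    public static <T> List<Set<T>> partition(Set<T> set, int chunkSize) {
        if (set == null || set.isEmpty() || chunkSize < 1) {
            return new ArrayList<>();
        }
        AtomicInteger position = new AtomicInteger();
        // TreeMap keeps the chunks ordered by chunk index (0, 1, 2, ...).
        Map<Integer, Set<T>> grouped = set.stream()
                .collect(Collectors.groupingBy(
                        element -> position.getAndIncrement() / chunkSize,
                        TreeMap::new,
                        Collectors.toSet()));
        return new ArrayList<>(grouped.values());
    }
}
```

With 5 elements and a chunk size of 2 this yields 3 sets, the last one holding the single leftover element.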