Wednesday, March 26, 2014

Overlapping ConcurrentHashMap puts using putIfAbsent






There seems to be an issue with inserting into my hash map. I create about 8 threads, and each thread runs the code below. Each thread receives a char[] array and tokenizes it by looking for spaces. Once a token is found, I need to add it to the map if it doesn't exist yet. If it does exist, I need to add 1 to the current value stored for that token (the key).


Questions you might ask:


Why not convert from char[] to String?


I tried this, and since strings are immutable, I eventually ran out of memory (I am processing a 10 GB file) or spent too long garbage collecting. With Character[], I am able to reuse the same variable and not take up extra space in memory.


What is the issue?


When I am done processing the entire file, I run the code:



for (Entry<Character[], Integer> e : wordCountMap.entrySet()) {
    System.out.println(Arrays.toString(e.getKey()) + " = " + e.getValue());
}


in my main function. The result is fewer than 100 key/value pairs, although I know there should be around 20,000. Somehow the entries seem to overlap.
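One possible explanation, offered as a sketch rather than a confirmed diagnosis: Java arrays inherit hashCode() and equals() from Object, so a map compares array keys by identity, not by contents. If one Character[] instance is reused for every token, the map sees the same key each time. The class and method names below (ArrayKeyDemo, distinctKeysWithReusedArray, equalContentsAreSameKey) are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of the original program.
final class ArrayKeyDemo {

    // Inserts two different "words" through one reused buffer
    // and returns how many keys the map ends up holding.
    static int distinctKeysWithReusedArray() {
        Map<Character[], Integer> map = new ConcurrentHashMap<>();
        Character[] buffer = new Character[8];

        buffer[0] = 'a';
        map.putIfAbsent(buffer, 1); // the key is the array object itself
        buffer[0] = 'b';            // same object, different contents
        map.putIfAbsent(buffer, 1); // treated as already present
        return map.size();
    }

    // Two separate arrays with equal contents, compared the way a map compares keys.
    static boolean equalContentsAreSameKey() {
        Character[] first = {'a'};
        Character[] second = {'a'};
        return first.equals(second); // identity comparison inherited from Object
    }

    public static void main(String[] args) {
        System.out.println("keys with reused array: " + distinctKeysWithReusedArray());
        System.out.println("equals(): " + equalContentsAreSameKey());
        System.out.println("Arrays.equals(): "
                + Arrays.equals(new Character[] {'a'}, new Character[] {'a'}));
    }
}
```

Under this assumption, each thread that reuses a single array would contribute only one key, which would be consistent with seeing far fewer entries than expected.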



Character[] charArray = new Character[8];
for (i = 0; i < newbyte.length; i++) { // newbyte is a char[] from main
    if (newbyte[i] != ' ') {
        charArray[counter] = newbyte[i];
        counter++;
    }
    else {
        check = wordCountMap.putIfAbsent(charArray, 1);
        if (check != null) {
            wordCountMap.put(charArray, wordCountMap.get(charArray) + 1);
        }
        for (j = 0; j < counter; j++) {
            charArray[j] = null;
        } // Null out the array

ConcurrentMap<Character[], Integer> wordCountMap; // this is the definition in main
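For comparison, here is a minimal sketch of the same counting loop using content-based keys and an atomic increment. It assumes Java 8 or later (for ConcurrentMap.merge) and uses String keys, which allocates one object per token, so it trades away the memory reuse described above. The putIfAbsent, get, put sequence in the loop is not atomic as a whole, so two threads can overwrite each other's increments. merge performs the read-modify-write in one step. WordCountSketch and countTokens are hypothetical names, not part of the original program.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch, not the original program.
final class WordCountSketch {

    // Counts space-separated tokens from chars into the shared map.
    static void countTokens(char[] chars, ConcurrentMap<String, Integer> counts) {
        int start = 0;
        for (int i = 0; i <= chars.length; i++) {
            boolean atEnd = (i == chars.length);
            if (atEnd || chars[i] == ' ') {
                if (i > start) {
                    // A String compares by contents, so equal tokens share one key.
                    String token = new String(chars, start, i - start);
                    // Atomically inserts 1 or adds 1 to the existing count.
                    counts.merge(token, 1, Integer::sum);
                }
                start = i + 1;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
        char[] chunk = "to be or not to be".toCharArray();

        // Eight threads, as in the question, all counting the same chunk.
        Thread[] workers = new Thread[8];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> countTokens(chunk, counts));
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(counts);
    }
}
```

If per-token String allocation really is too expensive for a 10 GB input, another option would be a small key class that copies the token's characters and overrides equals() and hashCode() based on contents. That still allocates once per distinct insertion and is not shown here.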







