Object Creation during loops
Problem
I'm trying to parse a CSV file into objects. I already have a very efficient CSV parser (SuperCSV). The problem arises when creating objects from the data being read out.
I've got a nested foreach that iterates first through a map (which represents a CSV row) and then through each key within the map (which represents a column of that row) to create an object:

```
for(Map csvRow : csvRows){
    Contact contact = new Contact();
    contact.setContactId(nextContactId);
    for(String key : csvRow.keySet())
    {
        contact.setDetails(key, csvRow.get(key));
        contactList.addContact(contact);
    }
    nextContactId++;
}
```

My problem arises in that the above code takes roughly 10 seconds to run. This is with a CSV input of about 2MB, but this service has to handle a file of at least 200MB efficiently.
The reason I'm stuck is that the code below runs in roughly 1.6 seconds but only creates a single instance of a Contact object (which is fair enough) and I do see why this is occurring.
```
Contact contact = new Contact();
for(Map csvRow : csvRows){
    contact.setContactId(nextContactId);
    for(String key : csvRow.keySet())
    {
        contact.setDetails(key, csvRow.get(key));
        contactList.addContact(contact);
    }
    nextContactId++;
}
```

Obviously if I do the latter, I end up with a single instance of a Contact object at the end instead of the many thousands that I should have. However, I have had code similar to the latter create and save the objects in about 1.6 seconds (using a while loop rather than the first foreach), so I know it's possible:

```
while((csvRow = readCsvRow()) != null) {
    Contact contact = new Contact();
    contact.setContactId(nextContactId);
    for(String header : headers){
        if(csvRow.get(header) != null)
        {
            contact.setDetails(header, csvRow.get(header));
            contactList.addContact(contact);
        }
    }
    nextContactId++;
}
```

Edit 1:

```
@Ent
```
Solution
HashMaps (Maps in general) require time to do key lookups.
You can save a significant number of these by changing your inner loop:
```
for(String key : csvRow.keySet())
{
    contact.setDetails(key, csvRow.get(key));
    contactList.addContact(contact);
}
```

to be:

```
for(Map.Entry<String,Object> me : csvRow.entrySet())
{
    contact.setDetails(me.getKey(), me.getValue());
    contactList.addContact(contact);
}
```

This will save a bunch of time, but the actual amount will need to be measured.
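To see the difference for yourself, a minimal timing sketch is below. It is a hypothetical stand-in, not the original code: it sums integer values instead of calling `setDetails`, purely to isolate the lookup cost of `keySet()` + `get()` versus a single `entrySet()` pass.

```java
import java.util.HashMap;
import java.util.Map;

public class EntrySetVsKeySet {

    // One hash lookup per key via get(), on top of the iteration itself.
    static long sumViaKeySet(Map<String, Integer> m) {
        long total = 0;
        for (String key : m.keySet()) {
            total += m.get(key); // extra lookup for every key
        }
        return total;
    }

    // Iterates the entries directly; no second lookup needed.
    static long sumViaEntrySet(Map<String, Integer> m) {
        long total = 0;
        for (Map.Entry<String, Integer> e : m.entrySet()) {
            total += e.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        for (int i = 0; i < 1_000_000; i++) {
            m.put("col" + i, i);
        }

        long t0 = System.nanoTime();
        long a = sumViaKeySet(m);
        long t1 = System.nanoTime();
        long b = sumViaEntrySet(m);
        long t2 = System.nanoTime();

        // Both passes must compute the same result.
        System.out.println("keySet:   " + (t1 - t0) / 1_000_000 + " ms (sum " + a + ")");
        System.out.println("entrySet: " + (t2 - t1) / 1_000_000 + " ms (sum " + b + ")");
    }
}
```

A crude loop like this is no substitute for a real profiler, but it demonstrates that both forms compute the same result while the `entrySet()` version does half the hashing work.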
Profile
Little else in your code has any significant performance impact; the prime suspect is the Contact class itself.
You would need to supply the Contact code, or alternatively profile it yourself.
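One quick way to profile it yourself is to time Contact construction in isolation, away from the CSV parsing. The sketch below uses a hypothetical map-backed `Contact` (the real class was not supplied; substitute it and its actual `setDetails`):

```java
import java.util.HashMap;
import java.util.Map;

public class ProfileContact {

    // Hypothetical stand-in for the real Contact class, assumed here to
    // store details in a HashMap. Replace with the actual implementation.
    static class Contact {
        private final Map<String, String> details = new HashMap<>();

        void setDetails(String key, String value) {
            details.put(key, value);
        }
    }

    // Builds `rows` contacts with `cols` detail entries each and
    // returns how many were built.
    static int buildContacts(int rows, int cols) {
        int built = 0;
        for (int i = 0; i < rows; i++) {
            Contact c = new Contact();
            for (int col = 0; col < cols; col++) {
                c.setDetails("col" + col, "value" + i);
            }
            built++;
        }
        return built;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int built = buildContacts(100_000, 10);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // If this is nowhere near 10 seconds, the cost lives elsewhere,
        // e.g. in contactList.addContact or a per-call persistence hit.
        System.out.println(built + " contacts in " + elapsedMs + " ms");
    }
}
```

If this isolated run is fast, the slowdown is not object creation itself but something attached to it (for example, a database write per `addContact`).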
Memory
OK, I have seen this before.
Consider your code: it reads, say, 100,000 records from the file.
First things first: are you using RegEx/substring to break the columns out of the CSV?
If you are, you may be hanging on to a lot more memory than you realize.
Secondly, you are creating 100,000 Maps.
Each Map is an object, and it creates a hash table for its values; each Map.Entry is an object, and so on. At, say, 512 bytes for a 10-column map, 100,000 maps amounts to roughly 50 MB.
Your for-loop is creating all of this infrastructure in memory, and that is a slow process.
The while-loop, I presume, has just a single row in memory at any one time, so the memory management is simpler, faster, smaller, and generally works.
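The while-loop shape can be sketched in plain Java. This is a simplified split-based reader for illustration only, not the real SuperCSV API: the point is that only one row's worth of data is live at a time, so each row becomes garbage before the next is read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class StreamingCsv {

    // Reads the CSV one row at a time. Naive String.split parsing
    // (no quoting support), assuming a header row is present.
    static List<String[]> readRows(Reader in) throws IOException {
        List<String[]> results = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(in)) {
            String[] headers = reader.readLine().split(",");
            String line;
            while ((line = reader.readLine()) != null) {
                String[] row = line.split(",", -1);
                if (row.length != headers.length) {
                    continue; // skip malformed rows
                }
                // Build the contact from this single row here...
                results.add(row);
                // ...after which `line` and any per-row scratch objects
                // are unreachable, keeping memory use flat.
            }
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        String csv = "id,name\n1,Alice\n2,Bob\n";
        List<String[]> rows = readRows(new StringReader(csv));
        System.out.println(rows.size() + " rows"); // prints "2 rows"
    }
}
```

Contrast this with pre-loading every row into its own Map before the loop even starts: the streaming version never holds more than the current row plus whatever you deliberately accumulate.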
Context
StackExchange Code Review Q#43361, answer score: 8