Reading a text file API 15+
Problem
Is there a faster, more efficient way to read a text file than the implementation below, taking a phone's capabilities into account?
```
dictionary = new ArrayList();
long start = System.currentTimeMillis();
int count = 0;
try{
    InputStream inputStream = context.getAssets().open("words.txt");
    InputStreamReader inputStreamReader = new InputStreamReader(inputStream);
    BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
    String word;
    while((word = bufferedReader.readLine()) != null){
        dictionary.add(word);
        count++;
    }
    inputStream.close();
    inputStreamReader.close();
    bufferedReader.close();
}catch(IOException e){
    e.printStackTrace();
}
long end = System.currentTimeMillis();
double t = ((end - start) / 1000.0);
System.out.println("Time to read database from file " + count + " items " + t + " seconds");
```
Output:
Time to read database from file 272403 items 1.112 seconds
Update:
After taking @rolfl's advice into account and doing a little more digging, this is what I came up with. Any further advice or tidying up would be very welcome.
```
dictionary = new ArrayList<>(300000);
long start = System.currentTimeMillis();
InputStream inputStream = null;
try{
    inputStream = context.getAssets().open("words.txt");
}catch(IOException e){
    e.printStackTrace();
}
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
byte[] buff = new byte[1048576];
try{
    for(int i; (i = inputStream.read(buff)) != -1; ){
        byteArrayOutputStream.write(buff, 0, i);
    }
}catch(IOException ex){
    ex.printStackTrace();
}
String[] contents = byteArrayOutputStream.toString().split("\n");
for(int i = 0; i < contents.length; i++){
    dictionary.add(contents[i]);
}
long end = System.currentTimeMillis();
double t = ((end - start) / 1000.0);
```
Solution
- Don't do unnecessary work. You have the `count` variable, but you also have the `dictionary`, which has a `size()` method. There's no need for the count.
- Android supports Java 7 language features, so use them. In this case, try-with-resources would be your friend.
- Guess the size of the ArrayList that you may need. In this case, you should be a little generous and pre-size it at, say, 300,000 entries.
- Android now supports (since KitKat) the diamond operator, so there should be no need to repeat the generic type of the ArrayList on the right-hand side; `new ArrayList<>()` is enough.
- I actually like the while loop you have. It is my preferred way of doing line-by-line IO too.
Here's a 'cleaned up' version of your code:
```
private static final int INITIALSIZE = 300000;
....
long start = System.currentTimeMillis();
dictionary = new ArrayList<>(INITIALSIZE);
try (BufferedReader bufferedReader = new BufferedReader(
        new InputStreamReader(context.getAssets().open("words.txt")))) {
    String word;
    while((word = bufferedReader.readLine()) != null){
        dictionary.add(word);
    }
}catch(IOException e){
    e.printStackTrace();
}
long end = System.currentTimeMillis();
double t = ((end - start) / 1000.0);
System.out.println("Time to read database from file " + dictionary.size()
        + " items " + t + " seconds");
```
So, that's a "simplified" version, how to make it faster?
Well, there are a few things. First up, nothing is certain unless you test it, so run some experiments. Things I would try:
- Specify a buffer size on the BufferedReader, something large like `1024 * 1024` (a megabyte). This should increase the size of the IO operations.
- The pre-sized ArrayList will help.
- Consider reading the whole data file into a `ByteArrayOutputStream`, then converting that in one go into a large String, then splitting the String on line breaks.

In essence, the larger the IO sizes the better, and the larger the cache sizes are, the better.
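The experiments listed above could be set up roughly as follows. This is only a sketch, not a measured recommendation: the class name `WordLoader`, the method names, and the constants are hypothetical, the file is assumed to be UTF-8, and the methods take a plain `InputStream` so they can be tried outside Android as well.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names and sizes are illustrative assumptions.
final class WordLoader {
    // Tuning values to experiment with; measure on a real device.
    private static final int INITIAL_SIZE = 300000;
    private static final int BUFFER_SIZE = 1024 * 1024;
    // Assumption: the word list is UTF-8 encoded.
    private static final String CHARSET = "UTF-8";

    private WordLoader() {}

    // Variant 1: line-by-line reading with a large BufferedReader buffer.
    static List<String> readBuffered(InputStream in) throws IOException {
        List<String> words = new ArrayList<>(INITIAL_SIZE);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, CHARSET), BUFFER_SIZE)) {
            String word;
            while ((word = reader.readLine()) != null) {
                words.add(word);
            }
        }
        return words;
    }

    // Variant 2: read all bytes, decode once, then split on "\n" or "\r\n".
    static List<String> readWhole(InputStream in) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(BUFFER_SIZE);
        try (InputStream source = in) {
            byte[] buffer = new byte[BUFFER_SIZE];
            int read;
            while ((read = source.read(buffer)) != -1) {
                bytes.write(buffer, 0, read);
            }
        }
        String text = bytes.toString(CHARSET);
        List<String> words = new ArrayList<>(INITIAL_SIZE);
        for (String word : text.split("\r?\n")) {
            words.add(word);
        }
        return words;
    }
}
```

On Android, either method could be called with `context.getAssets().open("words.txt")` and timed the same way as the code above. Note that the second variant holds the raw bytes, the decoded String, and the split array in memory at the same time, so any speed gain is traded against a higher peak memory use.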
Context
StackExchange Code Review Q#92709, answer score: 4