Read text file filled with numbers and transfer to an array
Problem
This is my ProcessDataFileParallel.java file. I'm taking numbers from a .txt file and putting them into an array in Java. While it works, I would like to improve the code and possibly replace some algorithms with better alternatives.
```
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;

public class ProcessDataFileParallel {
    public static void main(String[] args) throws IOException {
        int[] array = new int[100000];
        int index = 0;
        Scanner inputFile = null;
        Scanner scan = new Scanner(System.in);
        boolean running = false;
        String emptyString;
        FileReader file = new FileReader("dataset529.txt");
        BufferedReader br = new BufferedReader(file);
        emptyString = br.readLine();
        try {
            while ((emptyString = br.readLine()) != null) {
                array[index++] = Integer.parseInt(emptyString);
            }
        } finally {
            if (br.readLine() == null)
                br.close();
        }
        inputFile = new Scanner(file);
        // Read File from Dataset529.txt
        if (inputFile != null) {
            System.out.println("Number of integers in: " + file);
            try {
                while (inputFile.hasNext() && inputFile.hasNextInt()) {
                    array[index] = inputFile.nextInt();
                    index++;
                    inputFile.next();
                }
            } finally {
                inputFile.close();
            }
            // Print dataset529.txt to Array in Java with For Loop below.
            for (int ai = 0; ai " + " " + array[ai]);
            }
            System.out.println("How many Threads?"); // Scanner - Ask user how many threads
            int n = scan.nextInt(); // Create variable 'n' to handle whatever integer the user specifies. nextInt() is used for the scanner to expect and Int.
            Thread[] myThreads = new Thread[n]; // Variable n placed here to determine how many Threads user wanted
            Worker[] workers = new Worker[n]; // Variable N placed here - the nu
```
Solution
Bug
If the array length is not a multiple of the number of threads, you'll miss the remainder! Quick example:
```
public static void printRange(int length, int n) {
    int range = length / n;
    for (int index = 0; index < n; index++) {
        int start = index * range;
        int end = start + range;
        System.out.printf("[%s, %s)%n", start, end);
    }
}

// Output for printRange(10, 3):
// [0, 3)
// [3, 6)
// [6, 9)

// Output for printRange(10, 6):
// [0, 1)
// [1, 2)
// [2, 3)
// [3, 4)
// [4, 5)
// [5, 6)
```
As you can see, you'll miss the 10th element for `printRange(10, 3)`, and the last four elements (nearly half the array) for `printRange(10, 6)`.
Edit: A simplistic approach is to spin up a new `Thread` to handle the remaining elements, but as you can see from the second example, that thread will end up with a relatively high four elements, whereas the first six threads each process (and return) only one element. An alternative approach is to give some threads one extra element each, so that you have more threads doing slightly more work rather than one single thread doing most of the work.
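The "one extra element per thread" idea can be sketched as follows. This is only an illustration, not the asker's code: the class name `BalancedRanges` and the method `split` are hypothetical, and the sketch assumes contiguous half-open ranges like the ones `printRange` prints.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits [0, length) into n contiguous half-open ranges
// whose sizes differ by at most one, so no element is left over.
class BalancedRanges {

    static List<int[]> split(int length, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        int base = length / n;   // minimum size of every range
        int extra = length % n;  // the first `extra` ranges take one more element
        List<int[]> ranges = new ArrayList<>(n);
        int start = 0;
        for (int i = 0; i < n; i++) {
            int end = start + base + (i < extra ? 1 : 0);
            ranges.add(new int[] {start, end});
            start = end;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] r : split(10, 6)) {
            System.out.printf("[%d, %d)%n", r[0], r[1]);
        }
    }
}
```

For `split(10, 6)` this yields `[0, 2)`, `[2, 4)`, `[4, 6)`, `[6, 8)`, `[8, 9)`, `[9, 10)`: the four leftover elements are spread over the first four ranges instead of landing on one thread.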
Multi-threading
The preferred way of doing multi-threading since Java 5 is to use an `ExecutorService` to help you manage the lifecycle of threads. You should read up more on Oracle's tutorial to understand how they are used.
In addition, since you want each thread to compute and return a result for you, you should be looking at the `Callable` interface. Implementations override the `call()` method to return an appropriate result.
try-with-resources
If you are on Java 7 and above, you should use `try-with-resources` so that the underlying I/O resource is always closed for you:
```
public static void main(String[] args) {
    try (Scanner scanner = new Scanner(System.in)) {
        // ...
    }
}
```
Hard-coding array length
```
int[] array = new int[100000];
```
What happens if your file has more than this number of elements? Actually, you also seem to be processing your file twice, once using the `BufferedReader` approach, and a second time using a `Scanner`. Is this expected?
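A minimal sketch that touches on both points, reading the file once inside `try-with-resources` and letting a list grow instead of fixing the size up front. The class `NumberFileReader` and method `readInts` are hypothetical names, and the sketch assumes every non-blank line holds exactly one integer; unlike the question's code it does not discard the first line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads one integer per non-blank line into an int[].
class NumberFileReader {

    static int[] readInts(Path path) throws IOException {
        List<Integer> numbers = new ArrayList<>();  // grows as needed, no fixed capacity
        // The reader is closed automatically, even if parsing throws.
        try (BufferedReader reader = Files.newBufferedReader(path)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) {
                    numbers.add(Integer.parseInt(line));
                }
            }
        }
        return numbers.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) throws IOException {
        int[] array = readInts(Paths.get("dataset529.txt"));
        System.out.println("Read " + array.length + " integers");
    }
}
```

The resulting array is exactly as long as the number of values read, so later code can rely on `array.length` rather than a separate `index` counter.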
Getting the max value
Your current implementation forgets to actually compare the max of each partition with each other to arrive at the desired result.
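One way the missing combine step might look, sketched with an `ExecutorService` and `Callable` tasks. The asker's `Worker` class is not visible here, so `PartitionedMax` and its `max` method are hypothetical stand-ins; the partition bounds are computed proportionally so that the remainder is covered as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: each task returns the max of its partition,
// and the caller then compares the partial results with each other.
class PartitionedMax {

    static int max(int[] data, int threads) throws InterruptedException, ExecutionException {
        if (data.length == 0 || threads <= 0) {
            throw new IllegalArgumentException("need a non-empty array and at least one thread");
        }
        int parts = Math.min(threads, data.length);  // never create an empty partition
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < parts; i++) {
                // Proportional bounds: the last partition always ends at data.length.
                final int start = (int) ((long) i * data.length / parts);
                final int end = (int) ((long) (i + 1) * data.length / parts);
                tasks.add(() -> {
                    int localMax = data[start];
                    for (int j = start + 1; j < end; j++) {
                        localMax = Math.max(localMax, data[j]);
                    }
                    return localMax;
                });
            }
            // The combine step: compare the max of each partition.
            int overall = Integer.MIN_VALUE;
            for (Future<Integer> partial : pool.invokeAll(tasks)) {
                overall = Math.max(overall, partial.get());
            }
            return overall;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] data = {4, 8, 15, 16, 23, 42, 7, 1, 9, 3};
        System.out.println(max(data, 3));  // prints 42
    }
}
```

`invokeAll` blocks until every task has finished, so the loop over the returned futures only has to collect and compare the partial results.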
Is this meant to be more like an academic exercise on multi-threading? - myself
The reason why I'm asking is that while it looks like you're looking more at a map-reduce approach to your problem, it may not even be necessary in the first place. Processing 100,000 integers (based on your `int[]` declaration) or fewer in a single thread is fast enough on any modern hardware capable of running the JVM. If, however, you are trying to compare billions of numbers and/or are given a (strangely low) memory constraint, then that's when multi-threading will help. However, your question will be quite different at that stage.
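For comparison, a single-threaded baseline is a one-liner on Java 8 and above. This is a sketch under that assumption; `SimpleMax` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.OptionalInt;

// Hypothetical baseline: the maximum of an int[] on the calling thread.
class SimpleMax {

    static OptionalInt max(int[] data) {
        // The result is empty when the array has no elements.
        return Arrays.stream(data).max();
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        max(data).ifPresent(System.out::println);  // prints 42
    }
}
```

Returning an `OptionalInt` makes the empty-array case explicit instead of relying on a sentinel value.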
Getting the max value (part 2)
From Java 8 onwards, there are two approaches of running tasks asynchronously:
- Using an `ExecutorService` to create `Future`s, and then getting the results from each of them.
- Chaining `CompletableFuture`s (Java 8 onwards).
The code below demonstrates both:
```
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class AsyncMax {
    private static final int MAX = 10;

    public static void main(String[] args) {
        System.out.println("Using ExecutorService:");
        ExecutorService service = Executors.newWorkStealingPool();
        List<Future<Integer>> esResults = IntStream.range(0, MAX).parallel()
                .mapToObj(i -> service.submit(() -> produce(i).get()))
                .collect(Collectors.toList());
        esResults.stream()
                .map(f -> get(f))
                .max(Comparator.naturalOrder())
                .ifPresent(System.out::println);
        service.shutdown();

        System.out.println("Using CompletableFuture:");
        CompletableFuture<List<Integer>> cfResults = IntStream.range(0, MAX).parallel()
                .mapToObj(i -> produce(i))
                .map(CompletableFuture::supplyAsync)
                .collect(allOf());
        cfResults.thenApply(results -> results.stream().max(Comparator.naturalOrder()))
                .join()
                .ifPresent(System.out::println);
    }
}
```
For both approaches, `IntStream.range(0, MAX).parallel()` is used to parallelize the production of `MAX` integers, via the `produce(int)` method (shown below). The intermediary `esResults` and `cfResults` variables are created solely to highlight where the asynchronous tasks will 'end', to be followed by how the results are collected. One can certainly daisy-chain the method calls completely.
`ExecutorService`:
- Start an `ExecutorService` implementation, the Java 8 `newWorkStealingPool()` is an example here.
- `submit()` a `Callable` that returns an `Integer` upon completion.
- Collect the results to a `List<Future<Integer>>`, i.e. a `List` of `Future`s that return `Integer`s.
- Stream the `List` by `get()`-ting (method shown below) each resulting `Integer`.
- From the `Stream`, call `max(Comparator.naturalOrde
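`AsyncMax` calls a `get(f)` helper and an `allOf()` collector whose definitions do not appear in this text. The sketch below is a guess at what such helpers could look like, not the answer's actual code: `AsyncHelpers` is a hypothetical class, `get` simply unwraps a `Future` while converting checked exceptions, and `allOf` gathers the futures into one `CompletableFuture` that completes with all results.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-ins for the helpers that AsyncMax refers to.
class AsyncHelpers {

    // Unwraps a Future so it can be used inside Stream.map().
    static <T> T get(Future<T> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt flag
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }

    // Collects futures into a single future holding every result, in encounter order.
    static <T> Collector<CompletableFuture<T>, ?, CompletableFuture<List<T>>> allOf() {
        return Collectors.collectingAndThen(
                Collectors.toList(),
                futures -> CompletableFuture
                        .allOf(futures.toArray(new CompletableFuture<?>[0]))
                        .thenApply(ignored -> futures.stream()
                                .map(CompletableFuture::join)  // already complete, does not block
                                .collect(Collectors.toList())));
    }

    public static void main(String[] args) {
        CompletableFuture<List<Integer>> all = Stream.of(1, 2, 3)
                .map(i -> CompletableFuture.supplyAsync(() -> i * 10))
                .collect(allOf());
        System.out.println(all.join());
    }
}
```

With helpers along these lines (plus `produce`), both halves of `AsyncMax` reduce to "wait for every task, then take the maximum of the collected values".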
Code Snippets
```
public static void printRange(int length, int n) {
    int range = length / n;
    for (int index = 0; index < n; index++) {
        int start = index * range;
        int end = start + range;
        System.out.printf("[%s, %s)%n", start, end);
    }
}

// Output for printRange(10, 3):
// [0, 3)
// [3, 6)
// [6, 9)

// Output for printRange(10, 6):
// [0, 1)
// [1, 2)
// [2, 3)
// [3, 4)
// [4, 5)
// [5, 6)
```
```
public static void main(String[] args) {
    try (Scanner scanner = new Scanner(System.in)) {
        // ...
    }
}
```
```
int[] array = new int[100000];
```
```
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class AsyncMax {
    private static final int MAX = 10;

    public static void main(String[] args) {
        System.out.println("Using ExecutorService:");
        ExecutorService service = Executors.newWorkStealingPool();
        List<Future<Integer>> esResults = IntStream.range(0, MAX).parallel()
                .mapToObj(i -> service.submit(() -> produce(i).get()))
                .collect(Collectors.toList());
        esResults.stream()
                .map(f -> get(f))
                .max(Comparator.naturalOrder())
                .ifPresent(System.out::println);
        service.shutdown();

        System.out.println("Using CompletableFuture:");
        CompletableFuture<List<Integer>> cfResults = IntStream.range(0, MAX).parallel()
                .mapToObj(i -> produce(i))
                .map(CompletableFuture::supplyAsync)
                .collect(allOf());
        cfResults.thenApply(results -> results.stream().max(Comparator.naturalOrder()))
                .join()
                .ifPresent(System.out::println);
    }
}
```
```
private static Supplier<Integer> produce(int i) {
    return () -> {
        try {
            // Simulate work: lower values of i sleep longer.
            Thread.sleep((int) ((1 + Math.random()) * (MAX - i) * 500));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        System.out.printf("Returning [%s] from Thread[%s]%n",
                i, Thread.currentThread().getName());
        return i;
    };
}
```
Context
StackExchange Code Review Q#109205, answer score: 4