
Identifying which paragraph, if any, is a superset of all words in a document

Submitted by: @import:stackexchange-codereview

Problem

Description

I need to find the number of the paragraph that contains every word in the file.

Input file java.txt:


What is a JVM?


What is the most important feature of Java?


Are JVM's platform independent?


What do you mean by platform independence? What is the most important
feature of Java? What is a JVM? Are JVM's platform independent?

Output

paragraph num : 4


Solution code:

```
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class WordsOfParagraphWithMap {
    public void findParagraphWithAllWords() throws IOException {

        File file = new File("C:\\Users\\VENKAT\\Documents\\javaquestions.txt");
        FileReader fileReader = new FileReader(file);
        BufferedReader bufferedReader = new BufferedReader(fileReader);
        int paraNum = 1;
        int count = 0;
        String line;
        int paragraphWithMaxWords = 0;
        int maxWords = 0;

        // Maps each word to the number of the last paragraph it was seen in.
        Map<String, Integer> map = new HashMap<>();

        line = nextLine(bufferedReader);

        do {
            if (line != null) {

                // A blank line ends the current paragraph.
                if (line.trim().isEmpty()) {
                    if (maxWords < count) {
                        paragraphWithMaxWords = paraNum;
                        maxWords = count;
                    }
                    paraNum++;
                    count = 0;
                    line = nextLine(bufferedReader);
                }

                if (line != null) {
                    String[] words = line.split("\\s");
                    for (String word : words) {
                        // Count a word the first time it appears in this paragraph.
                        if (!map.containsKey(word)) {
                            count++;
                        } else {
                            int paraNumber = map.get(word);
                            if (paraNumber != paraNum) {
                                count++;
                            }
                        }
                        map.put(word, paraNum);
                    }
                }
            }
        } while ((line = bufferedReader.readLine()) != null);

        if (maxWor
```

Solution

Your code suffers from a lack of functions. Functions help identify what the core parts of your code do, and they make those parts reusable. They also make the calling code more readable.

By using functions and extracting the core components, the actual logic becomes easy to see. For example, a main method like:

public static void main(String[] args) throws IOException {
    List<Set<String>> paragraphs = composeParagraphs(Paths.get("java.txt"));
    int superIndex = superSetParagraph(paragraphs);
    System.out.println("SuperSet Paragraph is " + superIndex);
}


Well, that's simple enough... how is it done?

Given a list of paragraph word sets, finding the paragraph that is a superset of all the others is easy:

private static int superSetParagraph(List<Set<String>> paras) {
    // get all words from all paragraphs.
    Set<String> allwords = paras.stream().flatMap(p -> p.stream()).collect(Collectors.toSet());
    // which paragraph has all words.
    for (int i = 0; i < paras.size(); i++) {
        if (paras.get(i).size() == allwords.size()) {
            return i + 1;
        }
    }
    return 0;
}
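
The size comparison above works because every paragraph set is, by construction, a subset of the set of all words, so equal sizes imply equal sets. As a rough sketch of the same idea spelled out with `Set.containsAll`, consider the following; the class name `SupersetCheckSketch` and the helper `firstSupersetIndex` are made up for illustration and are not part of the original answer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SupersetCheckSketch {

    // Hypothetical helper: returns the 1-based index of the first paragraph
    // whose word set contains every word of every paragraph, or 0 if none does.
    static int firstSupersetIndex(List<Set<String>> paras) {
        Set<String> allWords = new HashSet<>();
        for (Set<String> para : paras) {
            allWords.addAll(para);
        }
        for (int i = 0; i < paras.size(); i++) {
            // Explicit superset test instead of comparing sizes.
            if (paras.get(i).containsAll(allWords)) {
                return i + 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        List<Set<String>> paras = Arrays.asList(
                new HashSet<String>(Arrays.asList("a", "b")),
                new HashSet<String>(Arrays.asList("c")),
                new HashSet<String>(Arrays.asList("a", "b", "c")));
        System.out.println(firstSupersetIndex(paras)); // prints 3
    }
}
```

Both variants give the same answer; the size check is cheaper, while `containsAll` states the intent more directly.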


How do you get all paragraphs?

private static final Pattern SPACE = Pattern.compile("\\s+");

public static Set<String> words(String input) {
    return Arrays.stream(SPACE.split(input))
          .filter(word -> !word.isEmpty())
          .collect(Collectors.toSet());
}

private static List<Set<String>> composeParagraphs(Path path) throws IOException {
    Set<String> para = new HashSet<>();
    List<Set<String>> contents = new ArrayList<>();
    for (String line : Files.readAllLines(path)) {
        if (line.trim().isEmpty()) {
            // indicates a new paragraph....
            if (!para.isEmpty()) {
                contents.add(para);
                para = new HashSet<>();
            }
        } else {
            para.addAll(words(line));
        }
    }
    if (!para.isEmpty()) {
        contents.add(para);
    }
    return contents;
}
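
To try the pieces together, here is a self-contained sketch that writes the sample text from the question to a temporary file and runs the three functions on it. The class name `ParagraphSupersetDemo`, the helper `runOnSample`, and the use of a temp file are assumptions made for this example; the three core functions are copied from the answer with only their visibility changed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class ParagraphSupersetDemo {

    private static final Pattern SPACE = Pattern.compile("\\s+");

    // Splits a line on whitespace and returns its distinct, non-empty words.
    static Set<String> words(String input) {
        return Arrays.stream(SPACE.split(input))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toSet());
    }

    // Builds one word set per paragraph; blank lines separate paragraphs.
    static List<Set<String>> composeParagraphs(Path path) throws IOException {
        Set<String> para = new HashSet<>();
        List<Set<String>> contents = new ArrayList<>();
        for (String line : Files.readAllLines(path)) {
            if (line.trim().isEmpty()) {
                if (!para.isEmpty()) {
                    contents.add(para);
                    para = new HashSet<>();
                }
            } else {
                para.addAll(words(line));
            }
        }
        if (!para.isEmpty()) {
            contents.add(para);
        }
        return contents;
    }

    // Returns the 1-based number of the paragraph holding all words, or 0.
    static int superSetParagraph(List<Set<String>> paras) {
        Set<String> allwords = paras.stream().flatMap(p -> p.stream()).collect(Collectors.toSet());
        for (int i = 0; i < paras.size(); i++) {
            if (paras.get(i).size() == allwords.size()) {
                return i + 1;
            }
        }
        return 0;
    }

    // Hypothetical helper: runs the pipeline on the question's sample input.
    static int runOnSample() throws IOException {
        Path tmp = Files.createTempFile("java", ".txt");
        try {
            Files.write(tmp, Arrays.asList(
                    "What is a JVM?",
                    "",
                    "What is the most important feature of Java?",
                    "",
                    "Are JVM's platform independent?",
                    "",
                    "What do you mean by platform independence? What is the most important",
                    "feature of Java? What is a JVM? Are JVM's platform independent?"));
            return superSetParagraph(composeParagraphs(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("SuperSet Paragraph is " + runOnSample()); // prints "SuperSet Paragraph is 4"
    }
}
```

Note that words are compared exactly as they appear, so `JVM?` and `JVM's` count as different words; that matches the sample, where paragraph 4 repeats the earlier questions verbatim.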


Again, functional extraction makes a difference.
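
One practical payoff of the extraction is that a small function such as `words` can be exercised on its own. The sketch below does that with an input containing irregular whitespace; the class name `WordsSketch` and the sample string are invented for illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class WordsSketch {

    private static final Pattern SPACE = Pattern.compile("\\s+");

    // Same logic as the words(...) function in the answer, reproduced so
    // that this sketch compiles on its own.
    static Set<String> words(String input) {
        return Arrays.stream(SPACE.split(input))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Leading whitespace makes split() return an empty first token,
        // which the isEmpty() filter discards.
        Set<String> result = words("  What   is a\tJVM? ");
        // TreeSet is used only to print the words in a stable order.
        System.out.println(new TreeSet<>(result));
    }
}
```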

Code Snippets

public static void main(String[] args) throws IOException {
    List<Set<String>> paragraphs = composeParagraphs(Paths.get("java.txt"));
    int superIndex = superSetParagraph(paragraphs);
    System.out.println("SuperSet Paragraph is " + superIndex);
}
private static int superSetParagraph(List<Set<String>> paras) {
    // get all words from all paragraphs.
    Set<String> allwords = paras.stream().flatMap(p -> p.stream()).collect(Collectors.toSet());
    // which paragraph has all words.
    for (int i = 0; i < paras.size(); i++) {
        if (paras.get(i).size() == allwords.size()) {
            return i + 1;
        }
    }
    return 0;
}
private static final Pattern SPACE = Pattern.compile("\\s+");

public static Set<String> words(String input) {
    return Arrays.stream(SPACE.split(input))
          .filter(word -> !word.isEmpty())
          .collect(Collectors.toSet());
}

private static List<Set<String>> composeParagraphs(Path path) throws IOException {
    Set<String> para = new HashSet<>();
    List<Set<String>> contents = new ArrayList<>();
    for (String line : Files.readAllLines(path)) {
        if (line.trim().isEmpty()) {
            // indicates a new paragraph....
            if (!para.isEmpty()) {
                contents.add(para);
                para = new HashSet<>();
            }
        } else {
            para.addAll(words(line));
        }
    }
    if (!para.isEmpty()) {
        contents.add(para);
    }
    return contents;
}

Context

StackExchange Code Review Q#80030, answer score: 5
