HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Finding subsets from specified array, avoiding duplicates

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
arraysubsetsfindingfromavoidingduplicatesspecified

Problem

I am printing subsets from an array whose sum has been specified, while avoiding duplicates.

I think there may be improvements, potentially on how using map is bad idea, or how can I avoid map to filter the duplicates.

Additionally, in my code I want to reverse the sort order of an int array using streams which is asked over here, are there any alternatives rather than sorting and reversing the array?

public class FindSubSetArray {

    private static final int TARGET_SUM = 24;
    private static Map subSet = new HashMap<>();
    private static int count = 0;

    public static void main(String[] args) {
        int[] array= {2, 5, 1, 2, 4, 1, 6, 5, 2, 2};
        Arrays.sort(array);
        findSubset(array, 0, 0, "");
        subSet.keySet().stream().forEach(System.out::println);
    }

    public static void findSubset(int[] array, int index, int current, String subSetString) {

        if (current > TARGET_SUM || array.length < index) return;

        for (int i = index; i < array.length; i++) {
            int presentSum = current + array[i];
            if (presentSum == TARGET_SUM) {
     //         System.out.println(subSetString + " " + array[i]);
                subSet.put(subSetString + " " + array[i], count++);
            } else {
                findSubset(array, i + 1, presentSum, subSetString + " " + array[i]);
            }
        }
    }
}

Solution

To avoid the duplicates, you need to make the decision whether that element is already part of the subset. To do so, you need to do search in the already considered elements for the subset.

The challenge is performing this search optimally. Available options are to use a BST to store the subset elements which can be summed to the target sum. The searching operation will be efficient (log2H H -height of tree). This operation needs to be performed for every element, so it's better to use a BST.

Another alternative is to sort the given input, which is at the cost of \$\mathcal{O}(n \log n)\$. Then we can easily check the duplicates in the subset as they are appearing consecutively.

Let's compare the above approach:

BST has two costs:

  • Construction of BST for each subset and balancing it



  • Cearching (log2H)



Sorting approach needs: \$\mathcal{O}(n \log n)\$

The winning answer is to maintain the HashSet of the subset and contains can be performed in \$\mathcal{O}(1)\$ constant time.

Context

StackExchange Code Review Q#115727, answer score: 2

Revisions (0)

No revisions yet.