Hackerrank: Prefix neighbors

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·


Problem

Problem statement

Mark has a dictionary, S, containing n distinct strings. He defines the benefit value of a string
as the sum of the ASCII values of its characters.

Mark calls some string A and some string B prefix neighbors if both of the following conditions are satisfied:

- String A is a prefix of string B.
- No other string C exists in the dictionary that is both smaller in length than string B and has string A as a prefix.

For example, in S = {"AA", "AB", "ABD", "ABDE"}, the strings AB and ABD are prefix neighbors because AB is a prefix of ABD and no other string of length


  
The sum of the benefit value of the selected strings is maximal.


Given S, help Mark by finding and printing the maximum benefit value he can obtain from a subset of non-prefix-neighbors in S.
  
  
  
Input Format
  
  
The first line contains an integer denoting n (the number of strings in the dictionary).

The second line contains n space-separated strings.
  
  
Constraints
  
  
1 < n <= 4 * 10^5

1 <= length of each S_i <= 11
  
  
Each string contains only uppercase letters.
  
  
Output Format
  
  
Print the maximum benefit value that Mark can obtain from a subset of non-prefix-neighbors in S.
  
  
Sample Input 0
  
  
3
  
  
A B AE
  
  
Sample Output 0
  
  
200
  
  
Explanation 0
  
  
{"A", "B", "AE"}
  
  
Strings A and AE are prefix neighbors, so they cannot both be in Mark's subset of S. String B has no prefix neighbor, so we include it in Mark's subset.

To maximize the benefit value, we choose AE and B for our subset. We then calculate the following benefit values for the chosen subset:

- Benefit value of AE = 65 + 69 = 134
- Benefit value of B = 66

We then calculate and print the total benefit value of our subset, which is 134 + 66 = 200.
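
The arithmetic in this explanation can be checked with a short helper. This is only a sketch; the class name Benefit and the method name Of are made up for illustration and do not appear in the reviewed code.

```csharp
using System.Linq;

// Hypothetical helper: sums the character codes of a string.
// For uppercase ASCII letters this is the "benefit value" from the problem statement.
public static class Benefit
{
    public static int Of(string s)
    {
        return s.Sum(c => (int)c);
    }
}
```

With this helper, Benefit.Of("AE") gives 134 and Benefit.Of("B") gives 66, matching the sample.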

My introduction of the algorithm

This is the algorithm to apply trie knowledge and also use maximum w

Solution

Indentation

Consistent indentation helps readability. The indentation of this code seems to mix tabstops of 1, 2, 3, and 4 spaces.

Prefer interfaces to implementations

public Dictionary<char, Trie> Children { get; set; }

Is there any good reason why this should not be an IDictionary<char, Trie>?
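
As a minimal sketch of what coding to the interface buys you (leaving aside for the moment whether Children should be exposed at all): the Node class below is hypothetical and is not the reviewed Trie. Because callers only see IDictionary, the backing collection can be swapped without changing them; SortedDictionary is used here purely to show a swap.

```csharp
using System.Collections.Generic;

// Hypothetical node type for illustration; not the reviewed Trie class.
public sealed class Node
{
    // Callers depend only on the interface...
    public IDictionary<char, Node> Children { get; private set; }

    public Node()
    {
        // ...so this could later become a Dictionary<char, Node> or any other
        // IDictionary implementation without touching calling code.
        Children = new SortedDictionary<char, Node>();
    }
}
```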

Try to expose an API rather than the entire implementation

public Dictionary<char, Trie> Children { get; set; }
public bool   IsInDictionary    { get; set; }
public string LastVisitedWord   { get; set; }

Looking past the syntactic sugar, that's six public methods (three getters and three setters). How many of them should really be public? I think that at minimum the setters should all be private, and I see no good reason to expose LastVisitedWord at all or to allow an external class to get Children and then mutate it. I would prefer to publicly expose just the equivalent (see below...) of

public bool IsInDictionary { get; }
public IEnumerable<KeyValuePair<char, Trie>> Children { get; }

public Tuple<string, string> AddWordToTrieByOneCharATime(
    int    scanIndex, 
    char[] charArray, 
    string word, 
    string neighbour)

exposes far more than is necessary. If you make that private then a public method AddWord(string) can call it with the appropriate initial values.
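
A rough sketch of the narrower surface I have in mind follows. It is not the original class: the prefix-tracking logic is left out, IsWord stands in for IsInDictionary, and the private overload is a simplified stand-in for AddWordToTrieByOneCharATime.

```csharp
using System.Collections.Generic;

// Sketch only: a trie that exposes a read-only view plus a single AddWord entry point.
public sealed class Trie
{
    private readonly Dictionary<char, Trie> _children = new Dictionary<char, Trie>();

    // Readable by anyone, settable only from inside the class.
    public bool IsWord { get; private set; }

    // Enumeration only; callers get no Add/Remove through this type.
    public IEnumerable<KeyValuePair<char, Trie>> Children
    {
        get { return _children; }
    }

    // The public entry point supplies the initial scan index itself.
    public void AddWord(string word)
    {
        AddWord(word, 0);
    }

    // Private recursive worker: consumes one character per call.
    private void AddWord(string word, int index)
    {
        if (index == word.Length)
        {
            IsWord = true;
            return;
        }

        Trie child;
        if (!_children.TryGetValue(word[index], out child))
        {
            child = new Trie();
            _children.Add(word[index], child);
        }
        child.AddWord(word, index + 1);
    }
}
```

A determined caller could still downcast Children to Dictionary<char, Trie>; if that matters, return a copy or a read-only wrapper instead.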

Names

public bool   IsInDictionary    { get; set; }
public string LastVisitedWord   { get; set; }

The first of these names makes sense in the right context, but when you're reading the code you see it in the context of Dictionary<char, Trie> Children { get; set; }. How about IsWord?

The second makes sense only in the middle of a call to AddWordToTrieByOneCharATime. I think it should be LongestPrefix or something similar.

Separation of responsibilities

If the trie is intended for general use, I think that the way finding the prefix is integrated into it violates the principle of separation of responsibilities. If it's intended to be single-purpose and separation of responsibilities is intentionally sacrificed for better performance, it doesn't go far enough: the benefit calculation could be rolled into the Trie constructor to avoid recomputing it for each common prefix.

I think that you should refactor to separate trie building from prefix graph building for one simple reason: the documentation states that the complexity is O(N*max_length_of_string), but that's not true because the merging of responsibilities in trie building means that the strings have to be sorted before insertion, giving Theta(N lg N).
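
To illustrate the separation, here is a sketch under my own assumptions: TrieNode, PrefixQueries, Build and LongestProperPrefixWord are hypothetical names, and public fields are used only to keep it short. The trie is built first, in any order, and the prefix lookup is a separate pass over the finished trie, so no sorting is required. Whether the longest dictionary prefix is exactly the "prefix neighbor" relation depends on how you read the problem statement; the sketch only shows the two-phase structure.

```csharp
using System.Collections.Generic;

// Hypothetical bare-bones node; public fields for brevity only.
public sealed class TrieNode
{
    public readonly Dictionary<char, TrieNode> Children = new Dictionary<char, TrieNode>();
    public bool IsWord;
}

public static class PrefixQueries
{
    // Phase 1: build the trie. Insertion order is irrelevant, so the input
    // does not need to be sorted. Cost is proportional to the total input length.
    public static TrieNode Build(IEnumerable<string> words)
    {
        var root = new TrieNode();
        foreach (var word in words)
        {
            var node = root;
            foreach (var c in word)
            {
                TrieNode next;
                if (!node.Children.TryGetValue(c, out next))
                {
                    next = new TrieNode();
                    node.Children.Add(c, next);
                }
                node = next;
            }
            node.IsWord = true;
        }
        return root;
    }

    // Phase 2: query the finished trie. Returns the longest proper prefix of
    // `word` that is itself a dictionary word, or null if there is none.
    public static string LongestProperPrefixWord(TrieNode root, string word)
    {
        string best = null;
        var node = root;
        // Stop one character short of the end so the prefix is always proper.
        for (int i = 0; i < word.Length - 1; i++)
        {
            if (!node.Children.TryGetValue(word[i], out node)) break;
            if (node.IsWord) best = word.Substring(0, i + 1);
        }
        return best;
    }
}
```

For the dictionary from the problem statement, inserted deliberately out of length order as ABDE, AB, AA, ABD, the lookup for ABDE returns ABD, the lookup for ABD returns AB, and the lookup for AB returns null.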

KISS

var groupped = dict.GroupBy(x => x[0]);

// maximum 26 groups, 'A','B', ..., 'Z'
foreach (var group in groupped)   
{
    // sort by string's length in ascending order
    var sortedStrings = group.OrderBy(x => x.Length);

    var trie = new Trie();

Why? What's wrong with inserting them all into one trie?

Code Snippets

public Dictionary<char, Trie> Children { get; set; }

public Dictionary<char, Trie> Children { get; set; }
public bool   IsInDictionary    { get; set; }
public string LastVisitedWord   { get; set; }

public bool IsInDictionary { get; }
public IEnumerable<KeyValuePair<char, Trie>> Children { get; }

public Tuple<string, string> AddWordToTrieByOneCharATime(
    int    scanIndex, 
    char[] charArray, 
    string word, 
    string neighbour)

public bool   IsInDictionary    { get; set; }
public string LastVisitedWord   { get; set; }

Context

StackExchange Code Review Q#155307, answer score: 5
