HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Minimal substring with all characters contained in string

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
allwithsubstringcontainedcharactersminimalstring

Problem

I was given following task in the interview:


Given a string, find shortest substring in it that contains all of the different characters contained in the original string.

Here is my solution in emacs lisp:

```
(defun add-to-hash (c ht)
"Add an instance of character C to hash table HT"
(puthash c (1+ (gethash c ht 0)) ht))

(defun remove-from-hash (c ht &optional try)
"Remove an instance of character C from hash table HT
Return nil if no instances of C remain in HT"
(let ((old (gethash c ht 0)))
(if (and try ( 1 (puthash c (1- (gethash c ht 0)) ht))
(remhash c ht)
t))))

(defun find-min-substring (sstr)
"Find minimal substring that contains all the characters in a given string"
;; get all characters
(let* ((all-chars (make-hash-table))
(slen (length sstr))
(fcnt (progn
(mapc (lambda (c) (add-to-hash c all-chars)) sstr)
(hash-table-count all-chars)))
(beg 0) (end fcnt)
(res sstr))
(if (= end slen)
res
(let* ((cand (substring sstr beg end))
(cand-chars (make-hash-table))
(ccnt (progn
(mapc (lambda (c) (add-to-hash c cand-chars)) cand)
(hash-table-count cand-chars))))
;; find first candidate, that is a substring with all the characters
(while (< ccnt fcnt)
(add-to-hash (aref sstr end) cand-chars)
(setq end (1+ end))
(setq ccnt (hash-table-count cand-chars)))
(setq cand (substring sstr beg end))
;; shorten it as much as possible
(while (remove-from-hash (aref sstr beg) cand-chars t)
(setq beg (1+ beg)))
(setq cand (substring sstr beg end))
(setq res cand)
;; check other variants
(while (< end slen)
;; advance both ends
(add-to-hash (aref sstr end) cand-chars)
(setq end (1+ end))
(remove-from-hash (aref sstr beg) cand-chars)

Solution

I know this is old, but I will write some remarks anyways, hope you will find it useful.

  • This code lacks knowledge of usual language shortcuts, eg. (if x nil y) is equivalent to (unless x y). Similarly, (setq x (1+ x)) is the same as (incf x), (setq x y) (setq z q) is the same as (setq x y z q) inside (implicit) progn. (puthash c (1- (gethash c ht 0)) ht) is the same as (decf (gethash c ht 0)).



  • Language offers specialized data-structure for working with characters--char-table.



  • It is better to express algorithmic code in terms it is easy for humans to interpret (i.e. leave the low-level details of implementation to helper functions, while giving helpers meaningful names. For example add-to-hash isn't particularly better than puthash. Perhaps better partitioning of your code would involve functions like unique-characters-of, generate-candidate etc.



Finally, and most importantly, your code doesn't really solve the problem, try this for instance:

(find-min-substring "aaaaabacbaaaaac") ; => "acbaaaa"


Below is an alternative solution. It is very straight-forward (probably inefficient), but (hopefully) correct:

(defun string-cost (s)
  (cl-loop
   with mask = (make-vector 128 0)
   for c across s do
   (setf (aref mask c) 1)
   finally (cl-return (cl-reduce '+ mask))))

(defun generate-candidates (s)
  (list (substring s 0 (1- (length s)))
        (substring s 1)))

(defun choose-candidate (a b min-cost)
  (let ((a-candidate (min-common-substring a min-cost))
        (b-candidate (min-common-substring b min-cost)))
    (if (= a-cost min-cost)
             (>= b-cost min-cost))
        (choose-candidate a b min-cost))
       ((>= a-cost min-cost)
        (min-common-substring a min-cost))
       ((>= b-cost min-cost)
        (min-common-substring b min-cost))
       (t s)))))


Where more involved, but slightly more efficient version of string-cost could look like this:

(defun string-cost (s)
  (cl-loop
   with mask = (make-char-table t 0)
   for c across s do
   (set-char-table-range mask c 1)
   finally
   (cl-return
    (let ((cost 0))
      (map-char-table
       #'(lambda (k v)
           (when (and (/= v 0) (consp k))
             (setf v (- (cdr k) (1- (car k)))))
           (incf cost v))
       mask)
      cost))))

Code Snippets

(find-min-substring "aaaaabacbaaaaac") ; => "acbaaaa"
(defun string-cost (s)
  (cl-loop
   with mask = (make-vector 128 0)
   for c across s do
   (setf (aref mask c) 1)
   finally (cl-return (cl-reduce '+ mask))))

(defun generate-candidates (s)
  (list (substring s 0 (1- (length s)))
        (substring s 1)))

(defun choose-candidate (a b min-cost)
  (let ((a-candidate (min-common-substring a min-cost))
        (b-candidate (min-common-substring b min-cost)))
    (if (< (length a-candidate) (length b-candidate))
        a-candidate
      b-candidate)))

(defun min-common-substring (s &optional min-cost)
  (unless min-cost (setf min-cost (string-cost s)))
  (cl-destructuring-bind (a b)
      (generate-candidates s)
    (let ((a-cost (string-cost a))
          (b-cost (string-cost b)))
      (cond
       ((and (>= a-cost min-cost)
             (>= b-cost min-cost))
        (choose-candidate a b min-cost))
       ((>= a-cost min-cost)
        (min-common-substring a min-cost))
       ((>= b-cost min-cost)
        (min-common-substring b min-cost))
       (t s)))))
(defun string-cost (s)
  (cl-loop
   with mask = (make-char-table t 0)
   for c across s do
   (set-char-table-range mask c 1)
   finally
   (cl-return
    (let ((cost 0))
      (map-char-table
       #'(lambda (k v)
           (when (and (/= v 0) (consp k))
             (setf v (- (cdr k) (1- (car k)))))
           (incf cost v))
       mask)
      cost))))

Context

StackExchange Code Review Q#162974, answer score: 2

Revisions (0)

No revisions yet.