HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

split-string for R7RS

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
r7rssplitstringfor

Problem

A split-string function seems to be missing in R7RS.

I came up with the following implementation today:

(define (split-string delim str)
  (let ((add (lambda (current output)
               (cons (list->string (reverse current)) output))))
    (let loop ((input (string->list str))
               (output #f)
               (current #f))
      (if (null? input)
          (if output
              (reverse (if current 
                           (add current output)
                           output))
              '())
          (let ((char (car input))
                (input (cdr input)))
            (if (char=? char delim)
                (loop input (add current output) '())
                (loop input output (cons char current))))))))


Is there anything which can be optimized due to run time or memory usage?

This test case I have used:

```
(import (scheme write))

(define-syntax test
(syntax-rules ()
((test function index (predicate (argument ...) result))
(let ((value (function argument ...)))
(let ((test-result (predicate value result)))
(display 'function)
(if index
(begin
(display " ")
(display index)))
(display ": ")
(if test-result
(display "ok")
(begin
(display "FAILURE")
(newline)
(display " Expression: ")
(write '(function argument ...))
(newline)
(display " returns: ")
(display value)
(newline)
(display " expecting: ")
(write result)))
(newline)
test-result)))
((test function index ((argument ...) result))
(test function index (equal? (argument ...) result)))
((test function (predicate (argument ...) result))
(test function #f (predicate (argument ...) result)))
((test function ((argument ...) result))
(test function #f (equal

Solution

Converting strings into lists of characters, then reassembling strings, is quite wasteful of memory. The usual implementation technique for this function should be to scan for the character you're looking for, and then use substring to extract the substring.

However, as the OP correctly points out, in R7RS, string-ref should be avoided because it's O(n). The only sensible fast solutions involve string cursors (which are not portable) and string ports. The below implementation uses the latter:

(define (string-split str delim)
  (define in (open-input-string str))
  (let recur ((out (open-output-string)))
    (define c (read-char in))
    (cond ((eof-object? c)
           (list (get-output-string out)))
          ((char=? c delim)
           (cons (get-output-string out)
                 (recur (open-output-string))))
          (else
           (write-char c out)
           (recur out)))))


That's a left-unfolding (non-tail-recursive) solution. It's easy enough to adapt to a right-unfolding solution using an accumulator:

(define (string-split str delim)
  (define in (open-input-string str))
  (let loop ((rv '()) (out (open-output-string)))
    (define c (read-char in))
    (cond ((eof-object? c)
           (reverse (cons (get-output-string out) rv)))
          ((char=? c delim)
           (loop (cons (get-output-string out) rv)
                 (open-output-string)))
          (else
           (write-char c out)
           (loop rv out)))))


For reference, here's the "ideal" solution using Chibi's string cursor interface:

(import (chibi string))
(define (string-split str delim)
  (define start (string-cursor-start str))
  (let loop ((rv '())
             (cur (string-cursor-end str))
             (last (string-cursor-end str)))
    (if (string-cursor=? cur start)
        (cons (substring-cursor str cur last) rv)
        (let ((prev (string-cursor-prev str cur)))
          (if (char=? (string-cursor-ref str prev) delim)
              (loop (cons (substring-cursor str cur last) rv) prev prev)
              (loop rv prev last))))))


Notice that I iterate through the string backwards so that I can use a right-unfolding implementation (tail-recursive) that does not require reversing at the end. This is the ideal no-compromises implementation, but of course it's Chibi-specific.

Code Snippets

(define (string-split str delim)
  (define in (open-input-string str))
  (let recur ((out (open-output-string)))
    (define c (read-char in))
    (cond ((eof-object? c)
           (list (get-output-string out)))
          ((char=? c delim)
           (cons (get-output-string out)
                 (recur (open-output-string))))
          (else
           (write-char c out)
           (recur out)))))
(define (string-split str delim)
  (define in (open-input-string str))
  (let loop ((rv '()) (out (open-output-string)))
    (define c (read-char in))
    (cond ((eof-object? c)
           (reverse (cons (get-output-string out) rv)))
          ((char=? c delim)
           (loop (cons (get-output-string out) rv)
                 (open-output-string)))
          (else
           (write-char c out)
           (loop rv out)))))
(import (chibi string))
(define (string-split str delim)
  (define start (string-cursor-start str))
  (let loop ((rv '())
             (cur (string-cursor-end str))
             (last (string-cursor-end str)))
    (if (string-cursor=? cur start)
        (cons (substring-cursor str cur last) rv)
        (let ((prev (string-cursor-prev str cur)))
          (if (char=? (string-cursor-ref str prev) delim)
              (loop (cons (substring-cursor str cur last) rv) prev prev)
              (loop rv prev last))))))

Context

StackExchange Code Review Q#75172, answer score: 4

Revisions (0)

No revisions yet.