patternMinor
Netstring parser in common lisp
Viewed 0 times
netstringparsercommonlisp
Problem
Below is a netstring parser I wrote in Common Lisp. The docstring contains the usage and return.
(defun parse-netstring (netstring acc)
"recursively parses netstring of the form
size1:string1,size2:string2,size3:string3,
acc can be any list, or nil
returns -> (string3 string2 string1)"
(if (equalp netstring "")
acc
(let* ((netstring-split
(split-on-first-char #\: netstring))
(comma-index
(parse-integer (first netstring-split)))
(netstring-body
(second netstring-split)))
(if (not (char= #\, (aref netstring-body comma-index)))
(error 'malformed-netstring-error :netstring netstring)
(parse-netstring
(subseq netstring-body (1+ comma-index))
(cons (subseq netstring-body 0 comma-index) acc))))))Solution
First a few minor points:
-
You should give the definition of
-
You should use
-
The second parameter of
Then, let’s go to the main point: your function does not check many error conditions, and this is very bad for a function like this, that could get its input from the outside of the system (for instance, from a stream or a file). So, I think that you should do much more checks, for instance:
-
when the string does not start with a number,
-
when the first string part is less than the size,
-
when the size is greater than the whole string, etc.
The objective is to catch all the possible errors arising from calling primitive functions.
So I propose to rewrite the function in the following way, and in rewriting it I propose also a different approach to the recursion: use an auxiliary function, and recur only on the start index of the string, this make the code simpler and avoid a lot of costly
-
You should give the definition of
split-on-first-char since it is an important function in this context.-
You should use
equal instead of equalp to compare a value with a string (assuming that you cannot use string= to allow the first argument not to be a string).-
The second parameter of
parse-netstring could be defined as &optional, so that the function could be called with only the first parameter, and the second is automatically initialized with the empty list.Then, let’s go to the main point: your function does not check many error conditions, and this is very bad for a function like this, that could get its input from the outside of the system (for instance, from a stream or a file). So, I think that you should do much more checks, for instance:
-
when the string does not start with a number,
-
when the first string part is less than the size,
-
when the size is greater than the whole string, etc.
The objective is to catch all the possible errors arising from calling primitive functions.
So I propose to rewrite the function in the following way, and in rewriting it I propose also a different approach to the recursion: use an auxiliary function, and recur only on the start index of the string, this make the code simpler and avoid a lot of costly
subseq calls. Moreover, working with the indexes allow to get rid of the function split-on-first-char (again, with costly subseq calls). This is made possible also by the use of parse-integer with the junk-allowed parameter, which stops on the first non-numeric character, and by the fact that parse-integer returns two values, the number (which is NIL if it is not present), and the first position after the number. Here is the function:(defun parse-netstring (netstring &optional acc)
"recursively parses netstring of the form
size1:string1,size2:string2,size3:string3,
acc can be any list, or nil
returns -> (string3 string2 string1)"
(check-type netstring string "a string")
(check-type acc list "a list")
(let ((end (length netstring)))
(labels ((parse-aux (start acc)
(if (= start end)
acc
(multiple-value-bind (string-size semicolon-position)
(parse-integer netstring :start start :junk-allowed t)
(let ((comma-position
(when string-size (+ string-size semicolon-position 1))))
(unless (and string-size
(< comma-position end)
(<= string-size end)
(char= (char netstring semicolon-position) #\:)
(char= (char netstring comma-position) #\,))
(error 'malformed-netstring-error :netstring netstring))
(parse-aux
(1+ comma-position)
(cons (subseq netstring (1+ semicolon-position) comma-position) acc)))))))
(parse-aux 0 acc))))Code Snippets
(defun parse-netstring (netstring &optional acc)
"recursively parses netstring of the form
size1:string1,size2:string2,size3:string3,
acc can be any list, or nil
returns -> (string3 string2 string1)"
(check-type netstring string "a string")
(check-type acc list "a list")
(let ((end (length netstring)))
(labels ((parse-aux (start acc)
(if (= start end)
acc
(multiple-value-bind (string-size semicolon-position)
(parse-integer netstring :start start :junk-allowed t)
(let ((comma-position
(when string-size (+ string-size semicolon-position 1))))
(unless (and string-size
(< comma-position end)
(<= string-size end)
(char= (char netstring semicolon-position) #\:)
(char= (char netstring comma-position) #\,))
(error 'malformed-netstring-error :netstring netstring))
(parse-aux
(1+ comma-position)
(cons (subseq netstring (1+ semicolon-position) comma-position) acc)))))))
(parse-aux 0 acc))))Context
StackExchange Code Review Q#131261, answer score: 3
Revisions (0)
No revisions yet.