
String tokenization and replacing a numbered field

Submitted by: @import:stackexchange-codereview

Problem

I created several Lua functions for manipulating strings made up of tokens separated by a single ASCII character (for example a period, comma or semicolon). One of these is the function puttok():

local function tokenize(C, strng)
    local sInput = strng or ""
    local sChar = string.format('%c', C)
    local tReturn = {}
    for sWord in string.gmatch(sInput, "[^"..sChar.."]+") do
        table.insert(tReturn, tonumber(sWord) or sWord)
    end
    return tReturn
end

local function puttok(text,token,N,C)
    local char = string.format("%c", C)
    local n
    local result
    local tokens = tokenize(C, text)
    if (N == 0) or (N > #tokens) then 
        result = text
    end
    if (N) then
        n = (N > 0) and N or #tokens + N + 1
        table.remove(tokens,n)
        table.insert(tokens,n,token)
        result = table.concat(tokens,char)
    end
    return result
end


Where

  • text = the string to manipulate

  • token = the token (string) we want to insert

  • N = the position at which the token is inserted

  • C = the ASCII code of the token separator

This function replaces one token with another. Here's an example:

local text = "Violets.are.white"
text = puttok(text,"blue",3,46)
print(text)
--> Violets.are.blue


The token "blue" was inserted at position 3 in place of "white", while "Violets" and "are" are at positions 1 and 2 respectively. The tokens are separated by a dot (.), which has ASCII code 46.

Is there a way to improve this code?

Would metatables be useful in this case?

Solution

- Don't use the expensive string.format where string.char can do it all: string.char(C) yields the same one-character string as string.format("%c", C).

- You should accept a string (of length 1) as the separator in tokenize and puttok instead of an ASCII code, and give it a default.

- Swapping the middle two arguments of puttok allows you to default the replacement to "".

- Your second conditional is useless, as it will always be taken: every number, including 0, is truthy in Lua, and if N is not a number, the comparison N > #tokens in the first conditional already raises a runtime error. As a consequence, the result = text assigned in the first branch is never what gets returned.

I wonder why you don't use an early return there anyway?

- First removing an element and then inserting a different one at the same index is really inefficient: all higher-numbered elements have to be moved down and then back up again.

Just replace it already:

tokens[n] = token


- Try for better names, so you don't need to explain so much in comments:

tokenize(C, strng) => explode(delimiter, text)
puttok(text,token,N,C) => replace_token(text, which, replacement, delimiter)
-- Switched the middle two parameters of puttok/replace_token
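
Putting the points above together, the two functions might end up looking roughly like this. It is only a sketch: the default delimiter ".", the escaping of the delimiter inside the pattern, and returning the text unchanged for an out-of-range index are my assumptions, not something the original code specifies.

```lua
-- Sketch of explode/replace_token with the suggestions applied.
-- Assumptions (not in the original): the default delimiter is ".",
-- and an out-of-range index returns the text unchanged.

local function explode(delimiter, text)
    delimiter = delimiter or "."
    -- Escape non-alphanumeric delimiters so they stay literal inside the set.
    local escaped = delimiter:gsub("%W", "%%%0")
    local tokens = {}
    for word in string.gmatch(text or "", "[^" .. escaped .. "]+") do
        -- Kept from the original: numeric-looking tokens become numbers.
        tokens[#tokens + 1] = tonumber(word) or word
    end
    return tokens
end

local function replace_token(text, which, replacement, delimiter)
    delimiter = delimiter or "."
    replacement = replacement or ""
    local tokens = explode(delimiter, text)
    -- Negative indices count from the end, as in the original puttok.
    local n = (which > 0) and which or #tokens + which + 1
    -- Early return: nothing to replace when the index is out of range.
    if n < 1 or n > #tokens then
        return text
    end
    tokens[n] = replacement
    return table.concat(tokens, delimiter)
end

print(replace_token("Violets.are.white", 3, "blue"))  --> Violets.are.blue
```

With the delimiter as the last parameter, the common case needs only three arguments, and which = -1 addresses the last token without counting them first.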

Context

StackExchange Code Review Q#109774, answer score: 2
