
Was there an attempt to make reusable regular expressions?


Problem

In everyday practice I often encounter tasks that would benefit from being able to define aliases for chunks of regular expressions and reuse them later. Typical examples include parsing a floating-point number, a zip code, or a telephone number. (Zip code patterns can become particularly unwieldy.)

Suppose you could define a grammar for parsing floating-point numbers and give it an alias. Once defined, the alias could be reused in any regular expression you write afterwards. In some fictional language this could look like this:

float := /[+-]?(\d+|\d*\.\d+)(e\d+)?/
sum := /(?R)+(?R)/


In principle, this wouldn't change anything in terms of expressive power (unless you allow self-reference), but it seems like it could make practical uses of regular expressions more concise.

So, my question is: has any language implemented anything like this? If not, what would be the reason for not doing so?

Solution

Lex, the lexical analyser generator, supports named definitions of this kind. In the "definitions" section, you can give a pattern a name, and then refer to it as {NAME} in the "rules" section:

DIGIT    [0-9]

%%

{DIGIT}+"."{DIGIT}*  { return T_FNUM; }


Most lexer generators have something similar; ANTLR has "fragments", for example.
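
Outside lexer generators, the same idea can be approximated in a general-purpose language by expanding named fragments textually before compiling the pattern. The following is a minimal Python sketch of that approach, not a description of how Lex works internally. The names DEFINITIONS, expand, FLOAT and SUM are hypothetical, and the {NAME} reference syntax is simply borrowed from Lex for illustration. Because expansion is plain substitution, the result is an ordinary regular expression; self-reference is rejected, since it would go beyond regular languages.

```python
import re

# Hypothetical definitions table, playing the role of the "definitions"
# section of a Lex file. A definition may refer to earlier ones as {NAME}.
DEFINITIONS = {
    "DIGIT": r"[0-9]",
    "FLOAT": r"[+-]?(?:{DIGIT}+|{DIGIT}*\.{DIGIT}+)(?:e{DIGIT}+)?",
    "SUM": r"{FLOAT}\+{FLOAT}",
}

# Matches a {NAME} reference. Ordinary quantifiers such as {2,3} are not
# affected, because they contain no upper-case letters.
_REFERENCE = re.compile(r"\{([A-Z_]+)\}")


def expand(pattern, definitions, _seen=()):
    """Return `pattern` with every {NAME} replaced by its definition."""
    def substitute(match):
        name = match.group(1)
        if name in _seen:
            # A definition that mentions itself cannot be expanded to a
            # finite regular expression, so refuse it explicitly.
            raise ValueError(f"self-referential definition: {name}")
        body = expand(definitions[name], definitions, _seen + (name,))
        # Wrap in a non-capturing group so that a quantifier following
        # the reference applies to the whole fragment.
        return "(?:" + body + ")"

    return _REFERENCE.sub(substitute, pattern)


FLOAT = re.compile(expand("{FLOAT}", DEFINITIONS))
SUM = re.compile(expand("{SUM}", DEFINITIONS))

if __name__ == "__main__":
    for text in ["1.5+2e3", "-.5+10", "1.5+"]:
        print(text, bool(SUM.fullmatch(text)))
```

Wrapping each expansion in a non-capturing group is a design choice made here so that {DIGIT}+ behaves the same whether DIGIT is a single character class or a longer alternation.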

Context

StackExchange Computer Science Q#35557, answer score: 5
