patternModerate
Reverse Polish notation based compiler
Problem
Description
- Very small subset of Forth
- This is a proof of concept level compiler, no optimizations or over/underflow checking
- See the embedded POD for more information
- NASM is used as assembler
- gcc is used to link with glibc
- 32bit ELF Binary is generated
bhathiforth.pl
#!/usr/bin/perl
use strict;
use warnings;
use feature qw(say);
sub tokenize {
my $fullcode = shift;
if ( not defined $fullcode ) {
die "Invalid Arguments";
}
my @tokens;
while ( $fullcode =~ /([0-9]+|\+|\-|\*|\/|\.)/g ) {
push @tokens, $1;
}
return @tokens;
}
sub generate_assembly {
my @tokens = @{ $_[0] };
if ( not @tokens ) {
die "Invalid Arguments";
}
my $assembly = "section .text\nglobal main\nextern printf\nmain:\n";
say "Tokens";
say "==================";
foreach (@tokens) {
say "";
if ( $_ =~ /[0-9]+/ ) {
$assembly .= "push $_\n";
}
elsif ( $_ eq "+" ) {
$assembly .= "pop ebx\npop eax\nadd eax,ebx\npush eax\n";
}
elsif ( $_ eq "-" ) {
$assembly .= "pop ebx\npop eax\nsub eax,ebx\npush eax\n";
}
elsif ( $_ eq "/" ) {
$assembly .= "mov edx,0\npop ecx\npop eax\ndiv ecx\npush eax\n";
}
elsif ( $_ eq "*" ) {
$assembly .= "mov edx,0\npop ecx\npop eax\nmul ecx\npush eax\n";
}
elsif ( $_ eq "." ) {
$assembly .= "push message\ncall printf\nadd esp, 8\n";
}
}
$assembly .= "ret\nmessage db \"%d\", 10, 0;";
say "==================";
return $assembly;
}
my $version = "0.1";
say "Welcome to BhathiFoth compiler v$version";
say "========================================";
my $source = shift @ARGV;
my $output = shift @ARGV;
if ( not defined $source or not defined $output ) {
say
"Invalid Commandline arguments.\n\nUSAGE:\n% ./bhathiforth.pl ";
exit;
}
open my $CODE, " ) {
$fullcode .= $line;
}
close $
Solution
tokenize
The tokenize subroutine could be simplified:
sub tokenize {
    my ($code) = @_;
    die "Invalid Arguments" unless defined $code;
    return $code =~ m!\d+|[-+*/.]!g;
}
Changes include:
- Shorter parameter name
- One-line validation
- Use global match in list context to produce a list of all matches
- Simpler regex that avoids leaning toothpick syndrome
Note that any unrecognized token is treated as a comment, which is quite lenient.
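To make that leniency concrete, here is a minimal sketch of the simplified tokenize applied to input that contains an unrecognized word. The sample source string is made up for illustration, and tokenize_strict is a hypothetical variant (not part of the original script) showing one possible way to reject unknown tokens instead of skipping them.

```perl
use strict;
use warnings;

# Simplified tokenizer from the review: returns every number or operator found.
sub tokenize {
    my ($code) = @_;
    die "Invalid Arguments" unless defined $code;
    return $code =~ m!\d+|[-+*/.]!g;
}

# Hypothetical stricter variant: split on whitespace and reject unknown words.
sub tokenize_strict {
    my ($code) = @_;
    die "Invalid Arguments" unless defined $code;
    my @tokens = split ' ', $code;
    for my $token (@tokens) {
        die "Unrecognized token '$token'"
            unless $token =~ m!\A(?:\d+|[-+*/.])\z!;
    }
    return @tokens;
}

my @lenient = tokenize("2 3 + dup .");    # "dup" is silently skipped
print "@lenient\n";                        # prints "2 3 + ."
```

The strict variant assumes tokens are separated by whitespace, which the regex-based tokenizer does not require, so it is a trade-off rather than a drop-in replacement.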
generate_assembly
For readability, I would just pass the tokens as a list rather than as a reference to a list.
I don't recommend printing output as a side effect: it hinders code reuse.
The assembly code for the operators could be produced by a hash lookup.
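The three suggestions above could be combined roughly as follows. This is only a sketch: the hash name %ASM_FOR is an assumption, the assembly fragments are copied from the original if/elsif chain, and dying on an unknown token is an added choice that the original code does not make.

```perl
use strict;
use warnings;

# Assembly fragment for each operator, copied from the original if/elsif chain.
# The name %ASM_FOR is hypothetical.
my %ASM_FOR = (
    '+' => "pop ebx\npop eax\nadd eax,ebx\npush eax\n",
    '-' => "pop ebx\npop eax\nsub eax,ebx\npush eax\n",
    '/' => "mov edx,0\npop ecx\npop eax\ndiv ecx\npush eax\n",
    '*' => "mov edx,0\npop ecx\npop eax\nmul ecx\npush eax\n",
    '.' => "push message\ncall printf\nadd esp, 8\n",
);

# Takes the tokens as a plain list and returns the assembly; prints nothing.
sub generate_assembly {
    my @tokens = @_;
    die "Invalid Arguments" unless @tokens;

    my $assembly = "section .text\nglobal main\nextern printf\nmain:\n";
    for my $token (@tokens) {
        if ( $token =~ /\A[0-9]+\z/ ) {
            $assembly .= "push $token\n";
        }
        elsif ( exists $ASM_FOR{$token} ) {
            $assembly .= $ASM_FOR{$token};
        }
        else {
            # Assumption: fail loudly rather than ignore an unknown token.
            die "Unknown token '$token'";
        }
    }
    $assembly .= "ret\nmessage db \"%d\", 10, 0;";
    return $assembly;
}

# The caller decides whether and where to print the result.
print generate_assembly( 2, 3, '+', '.' ), "\n";
```

With this shape, supporting another operator means adding one hash entry, and any progress output can live in the main program instead of inside the generator.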
main
A convention for declaring version numbers is
our $VERSION = 0.1;
A double_underline() subroutine could be useful.
sub double_underline {
    my ($text) = @_;
    return $text . "\n" . ('=' x length($text));
}
say double_underline("Welcome to BhathiForth compiler v$VERSION");  # Fixed typo "Foth"
To read a file fully, you don't need a loop. Use "slurp mode":
local $/ = undef;
my $code = <$CODE>;
Code Snippets
sub tokenize {
    my ($code) = @_;
    die "Invalid Arguments" unless defined $code;
    return $code =~ m!\d+|[-+*/.]!g;
}

our $VERSION = 0.1;

sub double_underline {
    my ($text) = @_;
    return $text . "\n" . ('=' x length($text));
}
say double_underline("Welcome to BhathiForth compiler v$VERSION");  # Fixed typo "Foth"

local $/ = undef;
my $code = <$CODE>;
Context
StackExchange Code Review Q#67480, answer score: 11