patterncMinor

Encoding data into integer handles for language interpreter

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

interpreterhandlesintolanguageforencodingdatainteger

Problem

As an integral part (pun intended) of my APL interpreter (previous questions: 1 2 3 4), this suite of functions converts any data into integer handles. APL operates on arrays of data cells, so with this encoding my cells are all int but may semantically contain other types. This enables me to have heterogeneous arrays with a homogeneous representation. Elements can even be arrays nested to arbitrary depths.

common.h contains definitions that are needed all over the place in the source. Some are not strictly relevant here, but the master enum of tags and the various implicit pointers, ie. typedef struct verb *verb, are important to be aware of. Hiding pointers in typedefs is questionable, to be sure. But I hope that the very short list of them compensates somewhat. array, verb, xverb, symtab, and magic are all (the) implicit pointers-to-struct types. I omit the contents of ../ppnarg.h because it is not relevant here (but common enough to be common).

common.h:

``


/*
 *  The central concept of encoding data is the use of the basic

int

 type
   for "everything". We chop the 32 bits into an 8 bit tag[] and 24 bit value.
 *  So we can't deal with literal numbers that are larger than 16.7 million 
 *  or so. 
 *
 *  An

int

 which contains one of our encoded-integer values should be 
 *  declared

object` to convey this semantics to the reader.
* Conversely, having determined that an object's tag is LITERAL,
* code may feel free to treat it as a restricted-range integer value.
*
[] since we treat negative numbers as encoding to themselves, in essence
* we only have a 7bit tag to play with.
*/

#ifndef COMMON_H_
#define COMMON_H_

#include
#include
#include
#include "../ppnarg.h"

#define MODE1(x) (x+(1<<7)) //add hi bit of ascii char

typedef int object;

typedef union integer {
uint32_t uint32;
int32_t int32;
} integer;

enum tag {
LITERAL, / val is a 24-bit 2's comp integer /
CHAR, /* val is a 21-bit Unic

Solution

Any problem spots or infelicities? Any "too clever" spots?

Overall: Recommend to not causally mix int/unsigned, int32_t/uint32_t and size_t. Too many cases where code would fail by using both pairs. Suggest using int/unsigned instead of int32_t/uint32_t with a #if INT_MAX >= 2147483647 test. (or fix/test the casual conversions.)

-
Disagree with not detailing ../ppnarg.h. If the contents of ../ppnarg.h are common enough to be common, why is not common.h also considered common enough to be not posted? Lacking this tile, a review cannot completely compile the code. Without compilation, automated tools are prevented from use. Without automated tools, this review request becomes unnecessarily extra burdensome.

-
common.h is too common. Consider your successful code being used on a larger effort. Perhaps APL_common.h.?

-
What/where is #include "array.h"?

-
"retrived" --> "retrieved"

-
Unclear double fallthru with the comment /fallthru/ at the end of a switch statement. It appears to apply to the case PCHAR: which then falls though to the default: Strange to see a case after default. Code uses a /fallthru/ comment there but lacks 2 such comments after case 0x00d7: and case 0x00f7:. This inconsistent style confuses. Top level switch/case indents the case. Nested switch/case does not indent the case. Inconsistent style unnecessarily confuses the review. This implies OP is not using automated formatting. Coding styles that depend on manual formatting are high maintenance. Use an automated formatter.

switch(gettag(d)){
    case PCHAR:
        switch(getval(d)){
        case '+':
                          return 0;
        case 0x00d7:  // Times
        case 0x00f7:  // Divided-By
        case '*':
             return 1;
        } /*fallthru*/
    default:
    case LITERAL:

-
Incomplete prototype declaration. Using () allows following code to wrongly pass parameters and the compiler does not warn.

// void init_en();
void init_en(void);

-
Defeat of data abstraction. integer does not imply size, bit of some size. int32 implies valuable is 32-bit. I'd expect either 1) integer to be integer32 if it is always going to be 32-bit or 2) if integer is abstract, int32 have a name without an implied size.

int gettag(object d) {
  integer int32;
  int32.int32 = d;

-
Possible data loss. d is of type object (presently int) and int32.int32 is of type int32_t. This inconsistency should be resolved. Recommend typedef union integer { unsigned u32; int i32; } integer;

int32.int32 = d;  // Potential narrowing of type.

int getval(object d) {
  if (d < 0) return d;  // Potential narrowing of type.

-
Even discussing the type array and struct array is confusing in context with code that uses arrays. Consider a different name.

// typedef struct array *array;
// Suggest
typedef struct APL_array *APL_array;

-
Not a fan of using the same name for a type and struct/union. Although these exist in different name spaces in C, they readily collide in a reviewers understanding. Suggest typedef struct APL_array_S *APL_array; or drop the typedef and only use struct APL_array.

-
Pedantic: Not portable when int

-
Pedantic: Printing int values with %x outside the [0...INT_MAX] range is UB.

int x = ...
// DEBUG(3,"newdata %x(%d %d)\n", x, tag, val);
DEBUG(3,"newdata %x(%d %d)\n", (unsigned) x, tag, val);

-
Possible loss of info in changing types.

size_t *used
...
int z = (*used)++;

-
The type name object misleads. In C, an object can be a struct, union, complex double, etc. Much code depends on object being a real value.

void *getptr(object d) {
  if (d < 0) return NULL;

-
Adding common name to global space. These will certainly collide on a large project. Assume your good code will be re-used on larger applications. Suggest putting all the following in a struct variable, maybe called APL.

integer nulldata; // = { .data = { .tag = NULLOBJ, .val = 0 } };
object null /* = nulldata.int32 */;
integer markdata; // = { .data = { .tag = MARKOBJ, .val = 0 } };
object mark /* = markdata.int32 */;
object nil;
object blank;

Code Snippets

switch(gettag(d)){
    case PCHAR:
        switch(getval(d)){
        case '+':
                          return 0;
        case 0x00d7:  // Times
        case 0x00f7:  // Divided-By
        case '*':
             return 1;
        } /*fallthru*/
    default:
    case LITERAL:

// void init_en();
void init_en(void);

int gettag(object d) {
  integer int32;
  int32.int32 = d;

int32.int32 = d;  // Potential narrowing of type.


int getval(object d) {
  if (d < 0) return d;  // Potential narrowing of type.

// typedef struct array *array;
// Suggest
typedef struct APL_array *APL_array;

Context

StackExchange Code Review Q#140994, answer score: 4

Revisions (0)

No revisions yet.