Composing CPU instructions by merging four short hex strings


Problem

My code shares some values with its neighbor. The names "Instructions, Registers" can be ignored in the comments; just see them as "names". What's important is how they are shared, which can be seen in the calculations.

What I want is to improve this code, because it looks awful and one can barely understand what's going on.

string[] hexShared = { "80", "3E", "14" }; // lwz r31, -0x0018(r20)


//Merge Hex Values that are shared (Basically every other is shared with the next one except the Address)
    private static string mergeHex(string[] hexShared)
    {

        //[ ][ ][ ][ ]
        //[0  1][2  3]
        string s1 = hexShared[0]; //Instruction
        string s2 = hexShared[1]; //Register 1
        string s3 = hexShared[2]; //Register 2

        char c1 = s1[1]; //Instruction Shared with Register 1
        char c2 = s2[0]; //Register 1 Shared with Instruction
        char c3 = s2[1]; //Register 1 Shared with Register 2
        char c4 = s3[0]; //Register 2 Shared with Register 1
        char c5 = s3[1]; //Register 2

        string hex = AddHex(c1, c2);

        string hex2 = AddHex(c3, c4);

        hex = s1[0] + hex + hex2 + c5;

        return hex;
    }
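
The question does not show AddHex, so the method above cannot be run as posted. The following is a minimal sketch that assumes AddHex adds two hex digits and returns the sum as an uppercase hex string; that behavior and the class name MergeHexDemo are assumptions, not part of the original code.

```csharp
using System;

public static class MergeHexDemo
{
    // Hypothetical stand-in for the AddHex helper, which the question does not show.
    // Assumption: it adds two hex digits and returns the sum as an uppercase hex string.
    public static string AddHex(char a, char b)
    {
        int sum = Convert.ToInt32(a.ToString(), 16) + Convert.ToInt32(b.ToString(), 16);
        return sum.ToString("X");
    }

    // Same logic as the mergeHex method above, condensed.
    public static string MergeHex(string[] hexShared)
    {
        string high = AddHex(hexShared[0][1], hexShared[1][0]);
        string low = AddHex(hexShared[1][1], hexShared[2][0]);
        return hexShared[0][0] + high + low + hexShared[2][1];
    }

    public static void Main()
    {
        Console.WriteLine(MergeHex(new[] { "80", "3E", "14" })); // prints 83F4
    }
}
```

Under this assumption, a digit sum above F would produce two characters and lengthen the result, which is one reason the string-based approach is fragile.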


I will try to explain with an example (though I barely get it myself).

We have a hexadecimal string of 8 characters (that is always the structure).

83340247


Now we can split it up: the last 4 characters are the "Address", which can be taken out.
What remains are the hexadecimal digits that share values.

8334


Now for example the code:

lwz r0, 0x0000(r0)


will translate to: 80000000
So lwz == "8" here.

lwz r1, 0x0000(r0) == 80200000


So the first "r1" equals "2" right?

lwz r1, 0x0000(r1) == 80210000


It all looks fine, everything is separated, the other "r1" is simply "1".

Now here is the dilemma: what happens when they reach values higher than one hexadecimal digit can represent.

lwz r1, 0x0000(r31) == 803F0000
lwz r31, 0x0000(r31) == 83FF0000


So as you can see, when they increase size, they

Solution

Machine instructions are not built by string manipulation. Their fields are laid out at fixed bit positions, so adding hex digits character by character is not the best way to describe the encoding.

So your mergeHex method, which does god-knows-what with strings, should become a method that works with bit shifts and masks.
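
To illustrate what "bit positioning" means for the examples in the question, here is a rough sketch of the PowerPC D-form layout that lwz uses: a 6-bit opcode, two 5-bit register fields and a 16-bit displacement. The helper name EncodeDForm is hypothetical and only serves the illustration; it is not from the original post.

```csharp
using System;

public static class DFormSketch
{
    // PowerPC D-form layout (32 bits): opcode (6) | rD (5) | rA (5) | displacement (16).
    // EncodeDForm is a hypothetical helper name used only for this illustration.
    public static uint EncodeDForm(uint opcode, uint rD, uint rA, short displacement)
    {
        return ((opcode & 0x3Fu) << 26)
             | ((rD & 0x1Fu) << 21)
             | ((rA & 0x1Fu) << 16)
             | unchecked((ushort)displacement); // two's-complement low 16 bits
    }

    public static void Main()
    {
        const uint Lwz = 32; // primary opcode of lwz
        Console.WriteLine(EncodeDForm(Lwz, 1, 31, 0).ToString("X8"));      // prints 803F0000
        Console.WriteLine(EncodeDForm(Lwz, 31, 31, 0).ToString("X8"));     // prints 83FF0000
        Console.WriteLine(EncodeDForm(Lwz, 31, 20, -0x18).ToString("X8")); // prints 83F4FFE8
    }
}
```

Because the register fields are 5 bits wide, they do not line up with 4-bit hex digits, which is why neighboring digits appear to "share" values.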

I sampled the behavior of your mergeHex with two calls (which is not enough to know what it does for all your possible scenarios), but anyway I reached the following conclusion:

  • The output is a 16-bit (word) hex string.

  • The first 8 bits are given by hexShared[0] | ((hexShared[1] & 0xF0) >> 4)

  • The last 8 bits are given by ((hexShared[1] & 0x0F) << 4) | hexShared[2]
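
The byte-wise description in this list can be checked with a short sketch. The low-byte expression here is inferred from the shift-based ToWord implementation in this answer, and the name ToWordByBytes is hypothetical.

```csharp
using System;

public static class ByteSplitSketch
{
    // Builds the 16-bit word one byte at a time.
    // Assumes the first token has a zero low nibble (as "80" does), so OR-ing loses nothing.
    public static string ToWordByBytes(string[] hexShared)
    {
        int v0 = Convert.ToInt32(hexShared[0], 16);
        int v1 = Convert.ToInt32(hexShared[1], 16);
        int v2 = Convert.ToInt32(hexShared[2], 16);

        int highByte = v0 | ((v1 & 0xF0) >> 4);  // first 8 bits
        int lowByte = ((v1 & 0x0F) << 4) | v2;   // last 8 bits

        return highByte.ToString("X2") + lowByte.ToString("X2");
    }

    public static void Main()
    {
        Console.WriteLine(ToWordByBytes(new[] { "80", "3E", "14" })); // prints 83F4
    }
}
```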



Turning this into an algorithm now becomes trivial. Let me suggest an implementation with some simplifications:

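private static string ToWord(string[] hexTokens){
    // Parse each two-character hex token into its integer value.
    var values = hexTokens
        .Select(t => Convert.ToInt32(t, 16))
        .ToArray();

    // Overlapping nibbles are merged by shifting each value into position.
    var result = values[0] << 8 | values[1] << 4 | values[2];
    return result.ToString("X4"); // always four hex digits (16 bits)
}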
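
As a usage sketch, the word can be combined with the 16-bit address field to get the full 8-character instruction from the question. ToInstruction and the class name are hypothetical, and treating the address as a signed 16-bit value is an assumption based on the lwz r31, -0x0018(r20) example.

```csharp
using System;
using System.Linq;

public static class InstructionSketch
{
    // Shift-based merge as in ToWord above; Convert.ToInt32(t, 16) does the hex parsing.
    public static string ToWord(string[] hexTokens)
    {
        var values = hexTokens.Select(t => Convert.ToInt32(t, 16)).ToArray();
        var result = values[0] << 8 | values[1] << 4 | values[2];
        return result.ToString("X4");
    }

    // Hypothetical helper: appends the 16-bit address field to the merged word.
    public static string ToInstruction(string[] hexTokens, short address)
    {
        return ToWord(hexTokens) + unchecked((ushort)address).ToString("X4");
    }

    public static void Main()
    {
        Console.WriteLine(ToWord(new[] { "80", "3E", "14" }));               // prints 83F4
        Console.WriteLine(ToInstruction(new[] { "80", "3E", "14" }, -0x18)); // prints 83F4FFE8
    }
}
```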

Context

StackExchange Code Review Q#147660, answer score: 5
