HiveBrain v1.2.0
Get Started
← Back to all entries
patternrustMinor

Matasano cryptopals: Conversion to base 64 (with generic types)

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
conversionmatasanogenericcryptopalswithtypesbase

Problem

My main interest is in how I have handled the generics and if it's idiomatic. I want the user to be able to specify what kind of integer type they would like to use for the encoding (u8, u16, u32, u64).

I want to keep the hex string -> vector of numbers
and vector of numbers -> base 64 string conversions separate.

My first worry is around specifying the traits of my generic. I've done so with:



Perhaps I should of created my own custom type listing all the required traits explicitly and used that?

My second worry is how I interact with raw primitive types:

FromPrimitive::from_u64(0).unwrap();


and:

ToPrimitive::to_u8(&to_encode).unwrap()


It would obviously be nicer to not have this messy conversion in cases where I know it's OK:

let a:T = 0;//vs FromPrimitive::from_u64(0).unwrap();


Feedback on these issues is most important to me, but more general comments also welcome! I do wonder if in real life the use of generics here is even useful (violates KISS?).

http://cryptopals.com/sets/1/challenges/1/ is the specific challenge this code is for.

```
extern crate num;

use self::num::traits::PrimInt;
use self::num::{ToPrimitive, FromPrimitive};
use std::mem::size_of;
use std::char;

pub fn hex_decode(hex:&str) -> Vec{
let num_of_4_bits = (size_of::()*8) / 4;
let mut hex_bytes:Vec = Vec::new();
let zero:T = FromPrimitive::from_u64(0).unwrap();
hex_bytes.resize(hex.len()/num_of_4_bits, zero);
for (index, character) in hex.chars().enumerate(){
let modulus = index % num_of_4_bits;
let shift = 4 * ((num_of_4_bits - 1) - modulus);
let decoded_hex:T = FromPrimitive::from_u32(character.to_digit(16).unwrap()).unwrap();
let hex_index = index/num_of_4_bits;
hex_bytes[hex_index] = hex_bytes[hex_index] | (decoded_hex (raw_binary: &mut Vec){
while raw_binary.len() % 3 != 0{
raw_binary.push(FromPrimitive::from_u64(0).unwrap());
}
}

fn handle_carried_bits(carried_over

Solution


  • [Style] Space after :



  • [Style] Spaces around binary operators like / or *.



  • [Style] Only one space after a comma.



  • [Style] Space before the { that opens a block.



  • 4 bits is often referred to as a nybble (because it's half a byte and programmers think we are funny).



  • [Performance] Why clone the incoming Vec?? If the function is going to take ownership of the item anyway, it might as well make use of it and change it in place.



  • [Idiomatic] There are multiple places with redundant type annotations (let foo: SomeType). Type inference can handle most of those cases. Leave them off for easier reading and refactoring.



  • [Idiomatic, Performance] Should not iterate over the indices of an array when you can just iterate over each value.



  • [UX] Split production assert!s with && into multiple for better error reporting.



  • [UX] Use assert_eq! in production code for better error reporting.



  • [UX] Use the optional last argument to assert! / assert_eq! in production code to provide better error reporting.



  • [Idiomatic, Performance] Hoist constants out of loops.



  • [Style] I was happy to see smaller functions and multiple intermediate variables and I like that, but the bodies of some functions are still dense. Perhaps some spare newlines between lines would help.



  • [Typo] based_64_encode_test should be base_64_encode_test.



  • [Idiomatic] Don't use an explicit return at the end of a function.



  • [Tests] The tests have multiple assertions per test method. Since the first error stops the test, this forces more cycles of testing to see all errors. Also means programmer has to go to the line of code to know what failed instead of reading the name of the test.



  • [Idiomatic] Memorize Iterator methods for things like map and collect. Use them all the time unless there's a good reason not to. See hex_decode for examples.



  • [Style] Small helper functions give names to concepts like "number of bits" or "zero for this type". They can also consolidate error handling like unwrap.



  • [Idiomatic] String::new is more obvious (and slightly faster for now) than "".to_string.



  • [Tests] Why is the test output transformed with format!?



```
extern crate num;

use self::num::traits::PrimInt;
use self::num::{ToPrimitive, FromPrimitive};
use std::mem::size_of;
use std::char;

fn n_bits() -> usize { size_of::() * 8 }
fn n_nybbles() -> usize { n_bits::() / 4 }
fn zero() -> T { FromPrimitive::from_u64(0).unwrap() }

pub fn hex_decode(hex: &str) -> Vec {
let zero = zero();

hex.as_bytes().chunks(n_nybbles::()).map(|chunk| {
let mut v = zero;
for &byte in chunk {
v = v (raw_binary: &mut Vec) {
let zero = zero();
while raw_binary.len() % 3 != 0 {
raw_binary.push(zero);
}
}

fn handle_carried_bits(carried_over_bits: T, previous_bits_left: usize, binary_element: T) -> (u8, usize) {
let type_length = n_bits::();
let bits_left = type_length - (6 - previous_bits_left);
let bottom_bits = binary_element >> bits_left;
let to_encode = carried_over_bits | bottom_bits;
(ToPrimitive::to_u8(&to_encode).unwrap(), bits_left)
}

// we encode 6 bits at a time, thus we need a multiple of 6 to encode
// as long as T has 2^x bits, and vector has length 3*y:
// (2^x)3y = (2^(x-1))y6 will always be a multiple of 6 so check
// size of T is a power of 2, then ensure number of elements is a
// multiple of 3
pub fn encode_to_base64(mut local_raw_binary: Vec) -> String {
assert_eq!(size_of::().count_ones(), 1);
assert!(n_bits::() > 1);

make_length_multiple_of_3(&mut local_raw_binary);

let mut base_64 = String::new();
let mut carried_over_bits = zero();
let mut previous_bits_left = 0;
let six_bit_mask = T::from_u64(0b111111).unwrap();

for element in local_raw_binary {
let (to_encode, mut bits_left) = handle_carried_bits(carried_over_bits, previous_bits_left, element);
base_64.push(encode_to_char(ToPrimitive::to_u8(&to_encode).unwrap()));
while bits_left >= 6 {
let to_encode = six_bit_mask & (element >> (bits_left - 6));
bits_left -= 6;
base_64.push(encode_to_char(ToPrimitive::to_u8(&to_encode).unwrap()));
}
carried_over_bits = six_bit_mask & (element char {
match bits {
0 ... 25 => char::from_u32('A' as u32 + bits as u32).unwrap(),
26 ... 51 => char::from_u32('a' as u32 + (bits - 26) as u32).unwrap(),
52 ... 61 => char::from_u32('0' as u32 + (bits - 52) as u32).unwrap(),
62 => '+',
63 => '/',
_ => panic!("{} cannot be encoded to a base 64 character", bits),
}
}

#[test]
fn hex_decode_test() {
let hex = "49276d206b696c6c";
let u8_bin = vec![0b01001001,0b00100111,0b01101101,0b00100000,0b01101011,0b01101001,0b01101100,0b01101100];
assert_eq!(u8_bin, hex_decode::(hex));
let u16_bin = vec![0b0100100100100111,0b0110110100100000,0b0110101101101001,0b0110110001

Code Snippets

extern crate num;

use self::num::traits::PrimInt;
use self::num::{ToPrimitive, FromPrimitive};
use std::mem::size_of;
use std::char;

fn n_bits<T>() -> usize { size_of::<T>() * 8 }
fn n_nybbles<T>() -> usize { n_bits::<T>() / 4 }
fn zero<T: FromPrimitive>() -> T { FromPrimitive::from_u64(0).unwrap() }

pub fn hex_decode<T: PrimInt + FromPrimitive + ToPrimitive>(hex: &str) -> Vec<T> {
    let zero = zero();

    hex.as_bytes().chunks(n_nybbles::<T>()).map(|chunk| {
        let mut v = zero;
        for &byte in chunk {
            v = v << 4;
            let decoded = (byte as char).to_digit(16).unwrap();
            v = v | FromPrimitive::from_u32(decoded).unwrap();
        }
        v
    }).collect()
}

pub fn make_length_multiple_of_3<T: PrimInt + FromPrimitive>(raw_binary: &mut Vec<T>) {
    let zero = zero();
    while raw_binary.len() % 3 != 0 {
        raw_binary.push(zero);
    }
}

fn handle_carried_bits<T: PrimInt + FromPrimitive + ToPrimitive>(carried_over_bits: T, previous_bits_left: usize, binary_element: T) -> (u8, usize) {
    let type_length = n_bits::<T>();
    let bits_left = type_length - (6 - previous_bits_left);
    let bottom_bits = binary_element >> bits_left;
    let to_encode = carried_over_bits | bottom_bits;
    (ToPrimitive::to_u8(&to_encode).unwrap(), bits_left)
}

// we encode 6 bits at a time, thus we need a multiple of 6 to encode
// as long as T has 2^x bits, and vector has length 3*y:
// (2^x)*3*y = (2^(x-1))*y*6 will always be a multiple of 6 so check
// size of T is a power of 2, then ensure number of elements is a
// multiple of 3
pub fn encode_to_base64<T: PrimInt + FromPrimitive + ToPrimitive>(mut local_raw_binary: Vec<T>) -> String {
    assert_eq!(size_of::<T>().count_ones(), 1);
    assert!(n_bits::<T>() > 1);

    make_length_multiple_of_3(&mut local_raw_binary);

    let mut base_64 = String::new();
    let mut carried_over_bits = zero();
    let mut previous_bits_left = 0;
    let six_bit_mask = T::from_u64(0b111111).unwrap();

    for element in local_raw_binary {
        let (to_encode, mut bits_left) = handle_carried_bits(carried_over_bits, previous_bits_left, element);
        base_64.push(encode_to_char(ToPrimitive::to_u8(&to_encode).unwrap()));
        while bits_left >= 6  {
            let to_encode = six_bit_mask & (element >> (bits_left - 6));
            bits_left -= 6;
            base_64.push(encode_to_char(ToPrimitive::to_u8(&to_encode).unwrap()));
        }
        carried_over_bits = six_bit_mask & (element << (6 - bits_left));
        previous_bits_left = bits_left;
    }

    base_64
}

fn encode_to_char(bits: u8) -> char {
    match bits {
        0 ... 25 => char::from_u32('A' as u32 + bits as u32).unwrap(),
        26 ... 51 => char::from_u32('a' as u32 + (bits - 26) as u32).unwrap(),
        52 ... 61 => char::from_u32('0' as u32 + (bits - 52) as u32).unwrap(),
        62 => '+',
        63 => '/',
        _ => panic!("{} cannot be encoded to a base 64 character", bits),
    }
}
struct Constants<T> {
    zero: T,
    six: T,
    six_mask: T,
}

impl Constants<T>
    where T: FromPrimitive
{
    fn new_from_primitive() -> Option<Constants> {
         ...
    }
}
MyIterator::new(bytes).map(encode_to_char).collect()

Context

StackExchange Code Review Q#120692, answer score: 4

Revisions (0)

No revisions yet.