Conversion of Haskell CBOR decoding code into Rust
Problem
I was interested in converting the following Haskell code, which decodes CBOR, into Rust:

    module Main where

    import Data.Word
    import Data.Bits
    import Control.Monad.Trans.Except

    data Value = I Int deriving (Show)

    read_int xs size =
      f xs size 0
      where
        f _ 0 acc = return (I acc)
        f [] _ acc = throwE ()
        f (x:xs) n acc = f xs (n - 1) (acc `shiftL` 8 .|. fromIntegral x)

    cbor_decode (x:xs)
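      | x >= 0x00 && x <= 0x17 = return (I (fromIntegral x))
      | x >= 0x18 && x <= 0x1b = read_int xs (x - 0x17)
      | otherwise = throwE ()
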
I ended up with the following:

    use Value::*;

    #[derive(Debug)]
    enum Value {
        I(i64),
    }

    fn read_int(buf: &[u8], size: usize) -> Result<Value, ()> {
        fn f(buf: &[u8], size: usize, acc: i64) -> Result<Value, ()> {
            if size == 0 {
                Ok(I(acc))
            } else if buf.len() == 0 {
                Err(())
            } else {
                f(&buf[1..], size - 1, acc << 8 | buf[0] as i64)
            }
        }
        f(buf, size, 0)
    }

    fn cbor_decode(buf: &[u8]) -> Result<Value, ()> {
        match buf[0] {
            0x00..=0x17 => Ok(I(buf[0] as i64)),
            0x18..=0x1b => read_int(&buf[1..], (buf[0] - 0x17) as usize),
            _ => Err(()),
        }
    }

    fn main() {
        let x = vec!(0x1a, 1, 0, 0);
        let y = cbor_decode(&x);
        println!("{:?}", y);
    }

Is this considered good Rust?
Solution
- I'd expect some kind of error type to describe what the problem was. I see at least two kinds of error, but the caller cannot distinguish between them.
- Rust does not guarantee tail-call optimization, so it's usually better to write things iteratively, especially when there's no fixed bound on the recursion depth.
- Using is_empty is more immediately obvious as to the desired behavior than comparing len() to 0.
- It seems very strange to read integers of arbitrary length. Since size is a usize, you could ask for an integer that takes 4 billion bytes or more! This is especially strange considering you can only return an i64.
- I would encode the unit of size somehow. The easiest way is to add _in_bytes to the argument name.
- The vec! macro idiomatically uses square brackets, to look like an array literal.
- There's no reason to allocate a vector here anyway; an array works fine.

    use Value::*;

    #[derive(Debug)]
    enum Value {
        I(i64),
    }

    #[derive(Debug)]
    enum Error {
        NotEnoughData,
        InvalidInteger,
    }

    fn read_int(mut buf: &[u8], size_in_bytes: usize) -> Result<Value, Error> {
        let mut acc = 0;
        for _ in 0..size_in_bytes {
            if buf.is_empty() { return Err(Error::NotEnoughData) }
            acc <<= 8;
            acc |= buf[0] as i64;
            buf = &buf[1..];
        }
        Ok(I(acc))
    }

    fn cbor_decode(buf: &[u8]) -> Result<Value, Error> {
        match buf[0] {
            0x00..=0x17 => Ok(I(buf[0] as i64)),
            0x18..=0x1b => read_int(&buf[1..], (buf[0] - 0x17) as usize),
            _ => Err(Error::InvalidInteger),
        }
    }

    fn main() {
        let x = [0x1a, 1, 0, 0];
        let y = cbor_decode(&x);
        println!("{:?}", y);
    }

It is good practice not to reimplement things without a reason. To that end, there are already crates for reading numbers out of byte slices, such as byteorder:

    extern crate byteorder;

    use byteorder::{ByteOrder, BigEndian};

    fn read_int(buf: &[u8], size_in_bytes: usize) -> Result<Value, Error> {
        if buf.len() < size_in_bytes {
            Err(Error::NotEnoughData)
        } else {
            Ok(I(BigEndian::read_int(buf, size_in_bytes)))
        }
    }

Even better, there are already crates for reading and writing CBOR data, such as serde_cbor. It plugs into Serde, the de facto Rust serialization library, which lets you operate at a higher level by defining structs that can be transformed to and from CBOR automatically.
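
If you do want to keep a hand-written decoder, the points above about bounded lengths and distinguishable errors can be pushed a little further using only the standard library. The following is a minimal sketch, not part of the reviewed code. It assumes the widths defined by the CBOR specification for unsigned integers (1, 2, 4 and 8 bytes for initial bytes 0x18 through 0x1b) rather than the 1 to 4 bytes the code under review computes, and it reuses the Value and Error names from above. The name width_in_bytes is only illustrative.

```rust
#[derive(Debug, PartialEq)]
enum Value {
    I(i64),
}

#[derive(Debug, PartialEq)]
enum Error {
    NotEnoughData,
    InvalidInteger,
}

/// Decode one CBOR unsigned integer (major type 0) from the front of `buf`.
fn cbor_decode(buf: &[u8]) -> Result<Value, Error> {
    // split_first returns None for an empty slice, so there is no
    // out-of-bounds panic as there would be with buf[0].
    let (&first, rest) = buf.split_first().ok_or(Error::NotEnoughData)?;

    let width_in_bytes = match first {
        // Small values are stored directly in the initial byte.
        0x00..=0x17 => return Ok(Value::I(i64::from(first))),
        // Assumption: spec widths, i.e. 0x18 -> 1, 0x19 -> 2, 0x1a -> 4, 0x1b -> 8.
        0x18..=0x1b => 1usize << (first - 0x18),
        _ => return Err(Error::InvalidInteger),
    };

    // get(..n) returns None when fewer than n bytes remain.
    let bytes = rest.get(..width_in_bytes).ok_or(Error::NotEnoughData)?;

    // Fold the big-endian bytes into a u64; at most 8 bytes, so nothing is lost.
    let n = bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b));

    // Value::I holds an i64, so reject anything above i64::MAX.
    if n > i64::MAX as u64 {
        return Err(Error::InvalidInteger);
    }
    Ok(Value::I(n as i64))
}
```

With these widths, cbor_decode(&[0x1a, 1, 0, 0, 0]) returns Ok(I(16777216)), while the three-byte payload used in main above is reported as NotEnoughData, because 0x1a announces four bytes. An empty slice also yields NotEnoughData instead of panicking.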
Code Snippets

    use Value::*;

    #[derive(Debug)]
    enum Value {
        I(i64),
    }

    #[derive(Debug)]
    enum Error {
        NotEnoughData,
        InvalidInteger,
    }

    fn read_int(mut buf: &[u8], size_in_bytes: usize) -> Result<Value, Error> {
        let mut acc = 0;
        for _ in 0..size_in_bytes {
            if buf.is_empty() { return Err(Error::NotEnoughData) }
            acc <<= 8;
            acc |= buf[0] as i64;
            buf = &buf[1..];
        }
        Ok(I(acc))
    }

    fn cbor_decode(buf: &[u8]) -> Result<Value, Error> {
        match buf[0] {
            0x00..=0x17 => Ok(I(buf[0] as i64)),
            0x18..=0x1b => read_int(&buf[1..], (buf[0] - 0x17) as usize),
            _ => Err(Error::InvalidInteger),
        }
    }

    fn main() {
        let x = [0x1a, 1, 0, 0];
        let y = cbor_decode(&x);
        println!("{:?}", y);
    }

    extern crate byteorder;

    use byteorder::{ByteOrder, BigEndian};

    fn read_int(buf: &[u8], size_in_bytes: usize) -> Result<Value, Error> {
        if buf.len() < size_in_bytes {
            Err(Error::NotEnoughData)
        } else {
            Ok(I(BigEndian::read_int(buf, size_in_bytes)))
        }
    }

Context
StackExchange Code Review Q#158807, answer score: 2