Rust function to read the first line of a file, strip leading hashes and whitespace

Submitted by: @import:stackexchange-codereview

Problem

I’m writing a Rust function for getting a title based on the first line of a file.

The files are written in Markdown, and the first line should be a heading that starts with one or more hashes, followed by some text. Examples:

# This is a top-level heading
## This is a second-level heading
#### Let's jump straight to the fourth-level heading


I want to throw away the leading hashes, discard any leading/trailing whitespace, and return the remaining string. Example outputs:

"This is a top-level heading"
"This is a second-level heading"
"Let's jump straight to the fourth-level heading"


Assume that, for now, I’m not worried about edge cases like a first line that’s only whitespace and hashes, or a file whose first line is pathologically long.

This is the program I’ve written to do it:

use std::fs;
use std::io::{BufRead, BufReader};
use std::path::PathBuf;

/// Get the title of a Markdown file.
///
/// Reads the first line of a Markdown file, strips any hashes and
/// leading/trailing whitespace, and returns the title.    
fn title_string(path: PathBuf) -> String {

    // Read the first line of the file into `title`.
    let file = match fs::File::open(&path) {
        Ok(file) => file,
        Err(_) => panic!("Unable to read title from {:?}", &path),
    };
    let mut buffer = BufReader::new(file);
    let mut first_line = String::new();
    let _ = buffer.read_line(&mut first_line);

    // Where do the leading hashes stop?
    let mut last_hash = 0;
    for (idx, c) in first_line.chars().enumerate() {
        if c != '#' {
            last_hash = idx;
            break
        }
    }

    // Trim the leading hashes and any whitespace
    let first_line: String = first_line.drain(last_hash..).collect();
    let first_line = String::from(first_line.trim());

    first_line
}

fn main() {
    let title = title_string(PathBuf::from("./example.md"));
    println!("The title is '{}'", title);
}


I’m fairly new to Rust, and I’m sure I’m doing stuff th

Solution

- Clippy returns a helpful suggestion:

warning: returning the result of a let binding from a block.
         Consider returning the expression directly.
         #[warn(let_and_return)] on by default
   |>
   |>     first_line
   |>     ^^^^^^^^^^
note: this expression can be directly returned
   |>
   |>     let first_line = String::from(first_line.trim());
   |>                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


- Don't ignore a Result by binding it with let _ =! Either return it to the caller or handle it explicitly, for example with expect or unwrap when a panic is acceptable.

- The loop could be simplified by using take_while:

let last_hash = first_line.chars().take_while(|&c| c == '#').count();


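- However, treating a character index as a byte offset is a bad idea in general: Rust strings are UTF-8 encoded, which is a variable-length encoding, and both drain and slicing take byte offsets. It happens to work here because '#' occupies a single byte, but char_indices yields the byte offset directly, so you can use that instead.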

- It's slightly more efficient to take a slice of the string instead of using drain and collect here; it avoids one extra allocation.

- Change your function to accept any type that implements BufRead; this will make it easier to write unit tests, because a byte slice can stand in for a file.

- Actually add those unit tests!

- There's no need to create a PathBuf, since you aren't pushing path components onto it. You could just take a &Path, but most functions accept any type that can be converted to a Path (AsRef<Path>). &str implements that.

- There's no need to take a reference to something being passed to println! or panic!. These macros automatically take a reference.
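
To make the character-versus-byte point from the list concrete, here is a minimal sketch. The helper names `prefix_chars` and `prefix_len` and the `§` marker are made up for illustration and are not part of the reviewed code; the sketch assumes a marker that takes more than one byte in UTF-8, which is where a character count and a byte offset diverge.

```rust
/// Number of leading characters equal to `marker` (a character count).
/// Hypothetical helper, for illustration only.
fn prefix_chars(line: &str, marker: char) -> usize {
    line.chars().take_while(|&c| c == marker).count()
}

/// Byte offset of the first character that is not `marker`, or the
/// length of the string if every character matches.
/// Hypothetical helper, for illustration only.
fn prefix_len(line: &str, marker: char) -> usize {
    line.char_indices()
        .find(|&(_, c)| c != marker)
        .map_or(line.len(), |(idx, _)| idx)
}

// For "## Title" both helpers return 2, because '#' is one byte.
// For "§§ Title" the character count is 2 but the byte offset is 4,
// because '§' is encoded as two bytes. Only the byte offset is a
// correct index for slicing: &line[4..] is " Title".
```

With an ASCII marker such as '#' the two numbers always agree, which is why the take_while version still produces the right result for this particular task.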

use std::fs;
use std::io::BufReader;
use std::io::prelude::*;

/// Get the title of a Markdown file.
///
/// Reads the first line of a Markdown file, strips any hashes and
/// leading/trailing whitespace, and returns the title.
fn title_string<R>(mut rdr: R) -> String
    where R: BufRead,
{
    let mut first_line = String::new();

    rdr.read_line(&mut first_line).expect("Unable to read line");

    // Where do the leading hashes stop?
    let last_hash = first_line
        .char_indices()
        .find(|&(_, c)| c != '#')
        .map_or(0, |(idx, _)| idx);

    // Trim the leading hashes and any whitespace
    first_line[last_hash..].trim().into()
}

/// Open the file and print its title.
fn main() {
    let path = "./example.md";

    let file = match fs::File::open(path) {
        Ok(file) => file,
        Err(_) => panic!("Unable to read title from {}", path),
    };
    let buffer = BufReader::new(file);

    let title = title_string(buffer);

    println!("The title is '{}'", title);
}

#[cfg(test)]
mod test {
    use super::title_string;

    #[test]
    fn top_level_heading() {
        assert_eq!(title_string(b"# This is a top-level heading".as_ref()),
                   "This is a top-level heading");
    }

    #[test]
    fn second_level_heading() {
        assert_eq!(title_string(b"## This is a second-level heading".as_ref()),
                   "This is a second-level heading");
    }

    #[test]
    fn fourth_level_heading() {
        assert_eq!(title_string(b"#### Let's jump straight to the fourth-level heading".as_ref()),
                   "Let's jump straight to the fourth-level heading");
    }
}
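
The advice about returning Results can be taken one step further than expect. The following is only a sketch of one way to do that, not the answer's code: `try_title_string` and `title_from_path` are hypothetical names, and the hash stripping uses the standard `str::trim_start_matches` rather than the char_indices search above.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Hypothetical variant of `title_string` that reports I/O errors
/// to the caller instead of panicking.
fn try_title_string<R: BufRead>(mut rdr: R) -> io::Result<String> {
    let mut first_line = String::new();
    // `?` propagates a failed read (including invalid UTF-8) to the caller.
    rdr.read_line(&mut first_line)?;
    // Remove every leading '#', then surrounding whitespace,
    // which also drops the trailing newline.
    Ok(first_line.trim_start_matches('#').trim().to_string())
}

/// Hypothetical convenience wrapper: accepts anything that can be
/// viewed as a `Path`, such as `&str`, `String`, or `PathBuf`.
fn title_from_path<P: AsRef<Path>>(path: P) -> io::Result<String> {
    let file = File::open(path)?;
    try_title_string(BufReader::new(file))
}
```

A caller such as main can then decide what to do with a missing file, for example by matching on the result of `title_from_path("./example.md")` and printing the error instead of panicking.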


You should also investigate using a real Markdown parser to avoid nasty pitfalls.
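
As a rough illustration of such pitfalls, the sketch below condenses the hash-stripping approach into one hypothetical function, `naive_title`, and lists a few first lines that it handles differently from a parser following the CommonMark rules. The example inputs are made up, and the list is not exhaustive.

```rust
/// The approach from this review in one expression: strip leading '#'
/// characters, then surrounding whitespace.
/// Hypothetical helper, for illustration only.
fn naive_title(line: &str) -> &str {
    line.trim_start_matches('#').trim()
}

// Inputs where the result differs from what a Markdown parser would report:
//
// "#hashtag first"  -> "hashtag first"
//     Without a space after the '#', CommonMark treats the line as an
//     ordinary paragraph, not a heading.
//
// "## Title ##"     -> "Title ##"
//     A closing run of hashes is allowed and is not part of the heading text.
//
// "Title"           -> "Title"
//     This may be a Setext heading (underlined with '=' or '-' on the
//     next line) or plain text; the first line alone cannot tell.
```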

Context

StackExchange Code Review Q#135013, answer score: 6
