Rust function to read the first line of a file, strip leading hashes and whitespace
Problem
I’m writing a Rust function for getting a title based on the first line of a file.
The files are written in Markdown, and the first line should be a heading that starts with one or more hashes, followed by some text. Examples:
    # This is a top-level heading
    ## This is a second-level heading
    #### Let's jump straight to the fourth-level heading
I want to throw away the leading hashes, discard any leading/trailing whitespace, and return the remaining string. Example outputs:
    "This is a top-level heading"
    "This is a second-level heading"
    "Let's jump straight to the fourth-level heading"
Assume that, for now, I’m not worried about edge cases like a first line that’s only whitespace and hashes, or a file whose first line is pathologically long.
This is the program I’ve written to do it:
    use std::fs;
    use std::io::{BufRead, BufReader};
    use std::path::PathBuf;

    /// Get the title of a Markdown file.
    ///
    /// Reads the first line of a Markdown file, strips any hashes and
    /// leading/trailing whitespace, and returns the title.
    fn title_string(path: PathBuf) -> String {

        // Read the first line of the file into `title`.
        let file = match fs::File::open(&path) {
            Ok(file) => file,
            Err(_) => panic!("Unable to read title from {:?}", &path),
        };
        let mut buffer = BufReader::new(file);
        let mut first_line = String::new();
        let _ = buffer.read_line(&mut first_line);

        // Where do the leading hashes stop?
        let mut last_hash = 0;
        for (idx, c) in first_line.chars().enumerate() {
            if c != '#' {
                last_hash = idx;
                break
            }
        }

        // Trim the leading hashes and any whitespace
        let first_line: String = first_line.drain(last_hash..).collect();
        let first_line = String::from(first_line.trim());

        first_line
    }

    fn main() {
        let title = title_string(PathBuf::from("./example.md"));
        println!("The title is '{}'", title);
    }

I’m fairly new to Rust, and I’m sure I’m doing stuff th…
Solution
- Clippy returns a helpful suggestion:

        warning: returning the result of a let binding from a block.
        Consider returning the expression directly.
        #[warn(let_and_return)] on by default
          |>
          |>     first_line
          |>     ^^^^^^^^^^
        note: this expression can be directly returned
          |>
          |>     let first_line = String::from(first_line.trim());
          |>                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- Don't ignore `Result`s by using `let _ =`! You should always return them to the caller or handle them with `expect` or `unwrap`.

- The loop could be simplified by using `take_while`:

        let last_hash = first_line.chars().take_while(|&c| c == '#').count();

- However, treating one character as one byte is a bad idea because strings are UTF-8 encoded, and UTF-8 is a variable-length encoding. You can use `char_indices` instead.

- It's slightly more efficient to take a slice of the string instead of using `drain` and `collect` here. It avoids one extra allocation.

- Change your function to accept any type that implements `BufRead`; this will make it easier to write unit tests.

- Actually add those unit tests!

- There's no need to create a `PathBuf`; you aren't pushing path components on. You could just make a `&Path`, but most functions accept any type that can be converted to a `Path` (`AsRef<Path>`). `&str` implements that.

- There's no need to take a reference to something being passed to `println!` or `panic!`. These macros automatically take a reference.

    use std::fs;
    use std::io::BufReader;
    use std::io::prelude::*;

    /// Get the title of a Markdown file.
    ///
    /// Reads the first line of a Markdown file, strips any hashes and
    /// leading/trailing whitespace, and returns the title.
    fn title_string<R>(mut rdr: R) -> String
        where R: BufRead,
    {
        let mut first_line = String::new();

        rdr.read_line(&mut first_line).expect("Unable to read line");

        // Where do the leading hashes stop?
        let last_hash = first_line
            .char_indices()
            .skip_while(|&(_, c)| c == '#')
            .next()
            .map_or(0, |(idx, _)| idx);

        // Trim the leading hashes and any whitespace
        first_line[last_hash..].trim().into()
    }

    /// Read the first line of the file into `title`.
    fn main() {
        let path = "./example.md";

        let file = match fs::File::open(path) {
            Ok(file) => file,
            Err(_) => panic!("Unable to read title from {}", path),
        };
        let buffer = BufReader::new(file);
        let title = title_string(buffer);
        println!("The title is '{}'", title);
    }

    #[cfg(test)]
    mod test {
        use super::title_string;

        #[test]
        fn top_level_heading() {
            assert_eq!(title_string(b"# This is a top-level heading".as_ref()),
                       "This is a top-level heading");
        }

        #[test]
        fn second_level_heading() {
            assert_eq!(title_string(b"## This is a second-level heading".as_ref()),
                       "This is a second-level heading");
        }

        #[test]
        fn fourth_level_heading() {
            assert_eq!(title_string(b"#### Let's jump straight to the fourth-level heading".as_ref()),
                       "Let's jump straight to the fourth-level heading");
        }
    }

You should also investigate using a real Markdown parser to avoid nasty pitfalls.
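To make the point about characters versus bytes concrete, here is a small sketch. The helper names `first_hash_char_index` and `first_hash_byte_index` are made up for this illustration and are not part of the original code. For a run of leading `#` characters the two indices happen to agree, because `#` is a single byte in UTF-8, but they diverge as soon as a multi-byte character comes first:

```rust
/// Index of the first '#', counted in characters (Unicode scalar values).
/// Hypothetical helper for illustration only.
fn first_hash_char_index(s: &str) -> Option<usize> {
    s.chars().position(|c| c == '#')
}

/// Index of the first '#', counted in bytes, which is what slicing expects.
/// Hypothetical helper for illustration only.
fn first_hash_byte_index(s: &str) -> Option<usize> {
    s.char_indices().find(|&(_, c)| c == '#').map(|(idx, _)| idx)
}

fn main() {
    // 'é' (U+00E9) occupies two bytes in UTF-8.
    let line = "\u{e9}# heading";

    let char_idx = first_hash_char_index(line).unwrap();
    let byte_idx = first_hash_byte_index(line).unwrap();
    println!("char index = {}, byte index = {}", char_idx, byte_idx);

    // The byte index is a valid slice boundary.
    println!("{:?}", line.get(byte_idx..));

    // The character index lands inside 'é', so `get` returns None
    // (and `line[char_idx..]` would panic).
    println!("{:?}", line.get(char_idx..));
}
```

Using `str::get` here instead of indexing keeps the demonstration from panicking; the indexing form `line[char_idx..]` fails at runtime whenever the offset is not on a character boundary.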
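The advice about returning `Result`s and accepting `AsRef<Path>` can be combined into one variant. This is only a sketch under some assumptions: the function names `strip_heading`, `title_from_reader` and `title_from_path` are invented for this example, and it swaps the `char_indices` search for the standard library's `str::trim_start_matches`, which is a different technique from the one in the code above.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Strip leading hashes and surrounding whitespace from a heading line.
/// (Hypothetical helper; uses `trim_start_matches` rather than `char_indices`.)
fn strip_heading(line: &str) -> &str {
    line.trim_start_matches('#').trim()
}

/// Read the first line from any buffered reader and return its title.
/// I/O errors are passed back to the caller instead of panicking.
fn title_from_reader<R: BufRead>(mut rdr: R) -> io::Result<String> {
    let mut first_line = String::new();
    rdr.read_line(&mut first_line)?;
    Ok(strip_heading(&first_line).to_string())
}

/// Open the file at `path` and return its title.
/// Accepts `&str`, `String`, `&Path`, `PathBuf`, and anything else
/// that implements `AsRef<Path>`.
fn title_from_path<P: AsRef<Path>>(path: P) -> io::Result<String> {
    let file = File::open(path)?;
    title_from_reader(BufReader::new(file))
}

fn main() {
    // The caller decides how to react to a failure.
    match title_from_path("./example.md") {
        Ok(title) => println!("The title is '{}'", title),
        Err(err) => eprintln!("Unable to read title: {}", err),
    }
}
```

Keeping the string handling in its own function means it can be tested without any I/O, while the reader-based function can still be exercised with a byte slice such as `"# Title".as_bytes()`.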
Context
StackExchange Code Review Q#135013, answer score: 6