HiveBrain v1.2.0
Get Started
← Back to all entries
snippetcsharpModerate

Parsing CSV to specific format

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
csvparsingformatspecific

Problem

I've written this code to read CSV files written in a specific format. I would like to gather some feedback on where it could be improved. I'm trying to get into the Test-Driven-Development ideology so any way I could make the code play better with unit tests would be of particular interest to me.

An aspect I am not overly pleased with is the switch statement in the MatchHeaders function. I did try using enum representing the property names but I ran into an issue with not having constant value, and frankly it did not make the code read cleaner.

I am more interested in being pointed at what to look at and where, rather than being given code.

```
public class Contact
{
public string FirstName { get; set; }
public string LastName { get; set; }
public DateTime DateOfBirth { get; set; }
public string FullAddress { get; set; }
public string PostCode { get; set; }
public string TelephoneNumber { get; set; }
public string EmailAddress { get; set; }
}

public interface ICsvReader
{
List ParseCsv(string filePath);
}

public class Reader : ICsvReader
{
public List ParseCsv(string filePath)
{
List rows = GetRows(filePath);
string[] headers = rows[0];

return MakeContacts(headers, rows);
}

private List GetRows(string filePath)
{
List users = new List();
string[] lines = File.ReadAllLines(filePath);
var csv = from line in lines
select (line.Split(',')).ToArray();

return csv.ToList();
}

private List MakeContacts(string[] headers, List rows)
{
List contacts = new List();
// If we didn't know the headers we could contain values with their properties in a collection
// List>> csv = new List>>();
for (int i = 1; i < rows.Count; i++)
{
string[] row = rows[i];
Contact contact = new Contact();
for (int e = 0; e < row.Length; e++)
{
string

Solution

Few comments:

-
Make your code more abstract, use IEnumerable instead of List

public interface ICsvReader
 {
    IEnumerable ParseCsv(string filePath);
 }


-
Provide an asynchronous version

public interface ICsvReader
{
    IEnumerable ParseCsv(string filePath);
    Task> ParseCsvAsync(string filePath);
}


-
Use StreamReader instead of File.ReadAllLines, it has an async version that you can use in your ParseCsvAsync method

using(var stream = new StreamReader(filePath)){
 // read all lines
}


-
Try to use the var keyword for local variables, it makes refactoring easier

var rows = GetRows(filePath);


-
Use the beauty of deferred execution in C#

private IEnumerable MakeContacts(string[] headers, List rows)
{
    for (int i = 1; i < rows.Count; i++)
    {
        string[] row = rows[i];
        Contact contact = new Contact();
        for (int e = 0; e < row.Length; e++)
        {
            string header = headers[e];
            string value  = row[e];

            contact = MatchHeader(contact, header, value);
        }
        yield return contact;
     }
}


-
You can convert ToList immediately without calling ToArray

private IEnumerable> GetRows(string filePath)
  {
    // why you creating this List here???
    List users = new List();
    lines = File.ReadAllLines(filePath);
    var csv = (from line in lines
              select line.Split(',')).ToList();

    return csv;
  }


  • Consider @Snowbody suggestions about using a Dictionary instead of a massive switch.

Code Snippets

public interface ICsvReader
 {
    IEnumerable<Contact> ParseCsv(string filePath);
 }
public interface ICsvReader
{
    IEnumerable<Contact> ParseCsv(string filePath);
    Task<IEnumerable<Contact>> ParseCsvAsync(string filePath);
}
using(var stream = new StreamReader(filePath)){
 // read all lines
}
var rows = GetRows(filePath);
private IEnumerable<Contact> MakeContacts(string[] headers, List<string[]> rows)
{
    for (int i = 1; i < rows.Count; i++)
    {
        string[] row = rows[i];
        Contact contact = new Contact();
        for (int e = 0; e < row.Length; e++)
        {
            string header = headers[e];
            string value  = row[e];

            contact = MatchHeader(contact, header, value);
        }
        yield return contact;
     }
}

Context

StackExchange Code Review Q#55339, answer score: 14

Revisions (0)

No revisions yet.