debugcsharpMinor
Reading data from Excel sheet with ExcelDataReader
Viewed 0 times
readingexceldatareaderexcelwithsheetfromdata
Problem
Objective:
I want to import an Excel file, and read the rows of certain columns. For this, I use
Major concerns:
Because this is code for a company, I changed the names of a couple of classes, so I know they aren't the best of names.
The entry point of my code:
In this code I create a new
Concern here: is it okay to pass those things within that file?
The
```
public class ReadInData
{
private string path;
private string worksheetName;
public ReadInData(string path, string worksheetName)
{
this.
I want to import an Excel file, and read the rows of certain columns. For this, I use
ExcelDataReader. I've implemented a low-level class called ExcelData which uses the ExcelDataReader and does things like figuring out if it is an ".xls" of ".xslx" file (or maybe something completely unrelated!) etc. On top of that class I made a ReadInData class, which will get the specific columns I want from the Excel sheet.Major concerns:
- The list of catches in my main program
- Throwing of exceptions
- The overall construction of the code and code quality
- Should I encapsulate my
ExcelDataclass withinReadInData, or keep it like it is?
- Passing of the
isFirstRowAsColumnNamesparameter
Because this is code for a company, I changed the names of a couple of classes, so I know they aren't the best of names.
The entry point of my code:
class Program
{
static void Main(string[] args)
{
try
{
ReadInData readInData = new ReadInData(@"C:\SC.xlsx", "sc_2014");
IEnumerable recipients = readInData.GetData();
}
catch (FileNotFoundException ex)
{
Console.WriteLine(ex.Message);
}
catch (ArgumentException ex)
{
Console.WriteLine(ex.Message);
}
catch (WorksheetDoesNotExistException ex)
{
Console.WriteLine(ex.Message);
}
catch (FileToBeProcessedIsNotInTheCorrectFormatException ex)
{
Console.WriteLine(ex.Message);
}
Console.ReadLine();
}
}In this code I create a new
ReadInData class, to which I pass the path, the name of the file and the name of the Excel sheet that I want to read.Concern here: is it okay to pass those things within that file?
The
ReadInData class:```
public class ReadInData
{
private string path;
private string worksheetName;
public ReadInData(string path, string worksheetName)
{
this.
Solution
Simplify your catch-chain
I see no reason to have
You can do this same thing for your
If you don't want/need someone using your
As mentioned in the comments, you are not accounting for all Excel filetypes.
If it wasn't for the
static void Main(string[] args)
{
try
{
ReadInData readInData = new ReadInData(@"C:\SC.xlsx", "sc_2014");
IEnumerable recipients = readInData.GetData();
}
catch (Exception ex)
{
if(!(ex is FileNotFoundException || ex is ArgumentException || ex is FileToBeProcessedIsNotInTheCorrectFormatException))
throw;
Console.WriteLine(ex.Message);
}
Console.Write(Press any key to continue...);
Console.ReadKey(true);
}I see no reason to have
ReadInData be a non-static class. You are not taking advantage of the fact that you are remembering the path and worksheet name, and you aren't keeping some sort of open connection to the workbook. And always feel free to make your code look more complex by removing variables that are only being used once (optional). There is no reason to ToList() this, because you are returning an IEnumerable anyway.public static class ReadInData
{
public static IEnumerable GetData(string path, string worksheetName, bool isFirstRowAsColumnNames = true)
{
return new ExcelData(path).GetData(worksheetName, isFirstRowAsColumnNames)
.Select(dataRow => new Recipient()
{
Municipality = dataRow["Municipality"].ToString(),
Sexe = dataRow["Sexe"].ToString(),
LivingArea = dataRow["LivingArea"].ToString()
});
}
}You can do this same thing for your
ExcelData class, that is.. making these functions instead of methods on a class. You don't have to, but again, you aren't taking much advantage of the OO, so there is need for you to make it OO.If you don't want/need someone using your
ExcelData class, than encapsulate it. However I feel that you will at some point want to be able to re-use this excel reader, in which case I would move these methods into a Static ExcelHelperClass. And your first class ReadInData I would just make a method in your original program.private static IExcelDataReader GetExcelDataReader(string path, bool isFirstRowAsColumnNames)
{
using (FileStream fileStream = File.Open(path, FileMode.Open, FileAccess.Read))
{
IExcelDataReader dataReader;
if (path.EndsWith(".xls"))
dataReader = ExcelReaderFactory.CreateBinaryReader(fileStream);
else if (path.EndsWith(".xlsx"))
dataReader = ExcelReaderFactory.CreateOpenXmlReader(fileStream);
else
throw new FileToBeProcessedIsNotInTheCorrectFormatException("The file to be processed is not an Excel file");
dataReader.IsFirstRowAsColumnNames = isFirstRowAsColumnNames;
return dataReader;
}
}
private static DataSet GetExcelDataAsDataSet(string path, bool isFirstRowAsColumnNames)
{
return GetExcelDataReader(path, isFirstRowAsColumnNames).AsDataSet();
}
private static DataTable GetExcelWorkSheet(string path, string workSheetName, bool isFirstRowAsColumnNames)
{
DataTable workSheet = GetExcelDataAsDataSet(path, isFirstRowAsColumnNames).Tables[workSheetName];
if (workSheet == null)
throw new WorksheetDoesNotExistException(string.Format("The worksheet {0} does not exist, has an incorrect name, or does not have any data in the worksheet", workSheetName));
return workSheet;
}
private static IEnumerable GetData(string path, string workSheetName, bool isFirstRowAsColumnNames = true)
{
return from DataRow row in GetExcelWorkSheet(path, workSheetName, isFirstRowAsColumnNames).Rows select row;
}As mentioned in the comments, you are not accounting for all Excel filetypes.
If it wasn't for the
bool isFirstRowAsColumnNames I would suggest you actually go forward with the whole OOP thing, and have the workbook load when you construct it, and take advantage of OO design by actually having internal data besides just the path. I do think its is fine that you give them the option to specify isFirstRowAsColumnNames, that could be very handy.Code Snippets
static void Main(string[] args)
{
try
{
ReadInData readInData = new ReadInData(@"C:\SC.xlsx", "sc_2014");
IEnumerable<Recipient> recipients = readInData.GetData();
}
catch (Exception ex)
{
if(!(ex is FileNotFoundException || ex is ArgumentException || ex is FileToBeProcessedIsNotInTheCorrectFormatException))
throw;
Console.WriteLine(ex.Message);
}
Console.Write(Press any key to continue...);
Console.ReadKey(true);
}public static class ReadInData
{
public static IEnumerable<Recipient> GetData(string path, string worksheetName, bool isFirstRowAsColumnNames = true)
{
return new ExcelData(path).GetData(worksheetName, isFirstRowAsColumnNames)
.Select(dataRow => new Recipient()
{
Municipality = dataRow["Municipality"].ToString(),
Sexe = dataRow["Sexe"].ToString(),
LivingArea = dataRow["LivingArea"].ToString()
});
}
}private static IExcelDataReader GetExcelDataReader(string path, bool isFirstRowAsColumnNames)
{
using (FileStream fileStream = File.Open(path, FileMode.Open, FileAccess.Read))
{
IExcelDataReader dataReader;
if (path.EndsWith(".xls"))
dataReader = ExcelReaderFactory.CreateBinaryReader(fileStream);
else if (path.EndsWith(".xlsx"))
dataReader = ExcelReaderFactory.CreateOpenXmlReader(fileStream);
else
throw new FileToBeProcessedIsNotInTheCorrectFormatException("The file to be processed is not an Excel file");
dataReader.IsFirstRowAsColumnNames = isFirstRowAsColumnNames;
return dataReader;
}
}
private static DataSet GetExcelDataAsDataSet(string path, bool isFirstRowAsColumnNames)
{
return GetExcelDataReader(path, isFirstRowAsColumnNames).AsDataSet();
}
private static DataTable GetExcelWorkSheet(string path, string workSheetName, bool isFirstRowAsColumnNames)
{
DataTable workSheet = GetExcelDataAsDataSet(path, isFirstRowAsColumnNames).Tables[workSheetName];
if (workSheet == null)
throw new WorksheetDoesNotExistException(string.Format("The worksheet {0} does not exist, has an incorrect name, or does not have any data in the worksheet", workSheetName));
return workSheet;
}
private static IEnumerable<DataRow> GetData(string path, string workSheetName, bool isFirstRowAsColumnNames = true)
{
return from DataRow row in GetExcelWorkSheet(path, workSheetName, isFirstRowAsColumnNames).Rows select row;
}Context
StackExchange Code Review Q#47227, answer score: 8
Revisions (0)
No revisions yet.