HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

Extracting Excel data out of an existing Excel file

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
fileexcelextractingexistingdataout

Problem

I've made a method to extract Excel data out of an existing Excel file. It contains 3 tabs where info should be extracted from. The data should be stored to use local so it's always fast available. It is currently working but it seems like I have a lot duplicate code that could most likely be written a lot shorter.

I'm using 9 different Properties.settings (all of the type list), 3 for each column in each of the tabs. The tabs I'm using are MPU, AAUX and ACTRL. I'm thinking of making a single list, where type is a class that contains 9 properties, but how do I accomplice this? Or isn't it even a good idea to do that? All the tabs contain the same columns only with different values.

```
public void ExcelExtract(string filePath)
{
Microsoft.Office.Interop.Excel.Application excelApp = new Microsoft.Office.Interop.Excel.Application();
Excel.Workbook excelWorkbook = excelApp.Workbooks.Open(filePath);
excelSheets = excelWorkbook.Worksheets;
excelWorksheet = (Excel.Worksheet)excelSheets.get_Item("MPU");

Excel.Range eersteColumn = excelWorksheet.UsedRange.Columns[1];
Excel.Range derdeColumn = excelWorksheet.UsedRange.Columns[3];
Excel.Range vierdeColumn = excelWorksheet.UsedRange.Columns[4];
Excel.Range vijfdeColumn = excelWorksheet.UsedRange.Columns[5];

System.Array eersteColumnWaarden = (System.Array)eersteColumn.Cells.Value;
System.Array derdeColumnWaarden = (System.Array)derdeColumn.Cells.Value;
System.Array vierdeColumnWaarden = (System.Array)vierdeColumn.Cells.Value;
System.Array vijfdeColumnWaarden = (System.Array)vijfdeColumn.Cells.Value;

var MPU = new List();
foreach (var a in eersteColumnWaarden) { MPU.Add(a); }
Properties.Settings.Default.listFoutcodeMPU = MPU.Select(o => o == null ? String.Empty : o.ToString()

Solution

This section is duplicated the most:

var cellValues = new List();
foreach (var a in eersteColumnWaarden) { cellValues.Add(a); }
Properties.Settings.Default.listFoutcodeMPU = cellValues.Select(o => o == null ? String.Empty : o.ToString()).ToList();


I think all it is doing is getting the cells from the columns. You need to extract to it's own method:

public List GetCellValues(Array cells)
{
    var cellValues = new List();
    foreach (var a in cells)
    {
        cellValues.Add(a);
    }
    return cellValues.Select(o => o == null ? String.Empty : o.ToString()).ToList();
}


These two lines also get duplicated:

Range eersteColumn = excelWorksheet.UsedRange.Columns[1];
System.Array eersteColumnWaarden = (System.Array)eersteColumn.Cells.Value;


You can create another method for that like this:

public Array GetCells(Worksheet worksheet, int columnNumber)
{
    Range column = worksheet.UsedRange.Columns[columnNumber];
    return (Array)column.Cells.Value;
}


Actually, you can even make GetCellValues better by calling GetCells directly:

public List GetCellValues(Worksheet worksheet, int columnNumber)
{
    var cellValues = new List();
    foreach (var a in GetCells(worksheet, columnNumber))
    {
        cellValues.Add(a);
    }
    return cellValues.Select(o => o == null ? String.Empty : o.ToString()).ToList();
}


Now, you can simplify the ExcelExtract method so that instead of doing this:

var cellValues = new List();
foreach (var a in eersteColumnWaarden) { cellValues.Add(a); }
Properties.Settings.Default.listFoutcodeMPU = cellValues.Select(o => o == null ? String.Empty : o.ToString()).ToList();
Properties.Settings.Default.Save();


You just need this for each column:

Properties.Settings.Default.listFoutcodeMPU = GetCellValues(excelWorksheet, 1);
Properties.Settings.Default.Save();


This is how it should look like after all those refactorings:

class ExcelReader
{
    private List GetCellValues(Worksheet worksheet, int columnNumber)
    {
        var cellValues = new List();
        foreach (var a in GetCells(worksheet, columnNumber))
        {
            cellValues.Add(a);
        }
        return cellValues.Select(o => o == null ? String.Empty : o.ToString()).ToList();
    }

    private Array GetCells(Worksheet worksheet, int columnNumber)
    {
        Range column = worksheet.UsedRange.Columns[columnNumber];
        return (Array)column.Cells.Value;
    }

    public void ExcelExtract(string filePath)
    {
        var excelApp = new Application();
        try
        {
            var excelWorkbook = excelApp.Workbooks.Open(filePath);
            var excelSheets = excelWorkbook.Worksheets;

            var excelWorksheet = (Worksheet)excelSheets["MPU"];
            Properties.Settings.Default.listFoutcodeMPU = GetCellValues(excelWorksheet, 1);
            Properties.Settings.Default.Save();
            Properties.Settings.Default.listGraadMPU = GetCellValues(excelWorksheet, 3);
            Properties.Settings.Default.Save();
            Properties.Settings.Default.listProbleemMPU = GetCellValues(excelWorksheet, 4);
            Properties.Settings.Default.Save();
            Properties.Settings.Default.listOplossingMPU = GetCellValues(excelWorksheet, 5);
            Properties.Settings.Default.Save();

            excelWorksheet = (Worksheet) excelSheets["AAUX"];
            Properties.Settings.Default.listFoutcodeAAUX = GetCellValues(excelWorksheet, 1);
            Properties.Settings.Default.Save();
            Properties.Settings.Default.listGraadAAUX = GetCellValues(excelWorksheet, 3);
            Properties.Settings.Default.Save();
            Properties.Settings.Default.listProbleemAAUX = GetCellValues(excelWorksheet, 4);
            Properties.Settings.Default.Save();
            Properties.Settings.Default.listOplossingAAUX = GetCellValues(excelWorksheet, 5);
            Properties.Settings.Default.Save();

            excelWorksheet = (Worksheet) excelSheets["ACTRL"];
            Properties.Settings.Default.listFoutcodeCTRL = GetCellValues(excelWorksheet, 1);
            Properties.Settings.Default.Save();
            Properties.Settings.Default.listGraadCTRL = GetCellValues(excelWorksheet, 3);
            Properties.Settings.Default.Save();
            Properties.Settings.Default.listProbleemCTRL = GetCellValues(excelWorksheet, 4);
            Properties.Settings.Default.Save();
            Properties.Settings.Default.listOplossingCTRL = GetCellValues(excelWorksheet, 5);
            Properties.Settings.Default.Save();
        }
        finally 
        {
            excelApp.Quit();   
        }
    }
}


There are still a few issues that you need to fix:

  • Handle exceptions - like what will happen if the file doesn't exist? Or if the worksheet doesn't exist?



  • Why do you need to call Save() after setting the list? Can it be called only once at the end the method?

Code Snippets

var cellValues = new List<object>();
foreach (var a in eersteColumnWaarden) { cellValues.Add(a); }
Properties.Settings.Default.listFoutcodeMPU = cellValues.Select(o => o == null ? String.Empty : o.ToString()).ToList();
public List<string> GetCellValues(Array cells)
{
    var cellValues = new List<object>();
    foreach (var a in cells)
    {
        cellValues.Add(a);
    }
    return cellValues.Select(o => o == null ? String.Empty : o.ToString()).ToList();
}
Range eersteColumn = excelWorksheet.UsedRange.Columns[1];
System.Array eersteColumnWaarden = (System.Array)eersteColumn.Cells.Value;
public Array GetCells(Worksheet worksheet, int columnNumber)
{
    Range column = worksheet.UsedRange.Columns[columnNumber];
    return (Array)column.Cells.Value;
}
public List<string> GetCellValues(Worksheet worksheet, int columnNumber)
{
    var cellValues = new List<object>();
    foreach (var a in GetCells(worksheet, columnNumber))
    {
        cellValues.Add(a);
    }
    return cellValues.Select(o => o == null ? String.Empty : o.ToString()).ToList();
}

Context

StackExchange Code Review Q#122547, answer score: 2

Revisions (0)

No revisions yet.