patterncsharpMinor
Looping through an Excel document in C#
Viewed 0 times
loopingdocumentexcelthrough
Problem
Before looping through an Excel document using a library I found, and wanted to know how long it would take to loop through the whole thing.
There are 40k rows.
I looped through only the first 5k and it took just 2 minutes. So, I thought, that looping through 40k would take just 16 minutes total.
However, the program has been running for over an hour now, and it still hasn't completed.
What would be slowing this thing down? Granted, I didn't program it for high efficiency, but there must be something that is slowing this down... Why wasn't my initial estimate accurate?
Here is my method:
```
public static void Validate(ExcelConfiguration config)
{
var ctx = new IntakeEntities();
Stopwatch sw = new Stopwatch();
sw.Start();
HashSet missingEntities = new HashSet();
Console.WriteLine("Validating...");
using (ExcelPackage xlPackage = new ExcelPackage(config.file))
{
ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets[config.WorkSheet];
// loop through each row
for (int iRow = config.DataRowStart; iRow x.Name == homeCountryName).First().Id;
//Console.WriteLine(homeCountryId + " - " + homeCountryName);
}
catch (Exception ex)
{
//Console.WriteLine("No record found for " + homeCountryName);
missingEntities.Add(homeCountryName);
}
// get host country from database
string hostCountryName = worksheet.Cell(iRow, config.HostCountryColumnIndex).Value.Trim();
Guid? hostCountryId = null;
try
{
hostCountryId = ctx.Countries.AsNoTracking().Where(x => x.Name == hostCountryName).First().Id;
//Console.WriteLine(hostCountryId + " - " + hostCountryName);
}
catch (Exception ex)
{
//Console.WriteLine("No record found for " + hostCountryName);
missingEntities.Add(hostC
There are 40k rows.
I looped through only the first 5k and it took just 2 minutes. So, I thought, that looping through 40k would take just 16 minutes total.
However, the program has been running for over an hour now, and it still hasn't completed.
What would be slowing this thing down? Granted, I didn't program it for high efficiency, but there must be something that is slowing this down... Why wasn't my initial estimate accurate?
Here is my method:
```
public static void Validate(ExcelConfiguration config)
{
var ctx = new IntakeEntities();
Stopwatch sw = new Stopwatch();
sw.Start();
HashSet missingEntities = new HashSet();
Console.WriteLine("Validating...");
using (ExcelPackage xlPackage = new ExcelPackage(config.file))
{
ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets[config.WorkSheet];
// loop through each row
for (int iRow = config.DataRowStart; iRow x.Name == homeCountryName).First().Id;
//Console.WriteLine(homeCountryId + " - " + homeCountryName);
}
catch (Exception ex)
{
//Console.WriteLine("No record found for " + homeCountryName);
missingEntities.Add(homeCountryName);
}
// get host country from database
string hostCountryName = worksheet.Cell(iRow, config.HostCountryColumnIndex).Value.Trim();
Guid? hostCountryId = null;
try
{
hostCountryId = ctx.Countries.AsNoTracking().Where(x => x.Name == hostCountryName).First().Id;
//Console.WriteLine(hostCountryId + " - " + hostCountryName);
}
catch (Exception ex)
{
//Console.WriteLine("No record found for " + hostCountryName);
missingEntities.Add(hostC
Solution
- Have performance issues? Use a profiler. (There is a built-in in certain versions of Visual Studio and a bunch of third-party tools, with evaluation times). Find out what is taking time and we can give you an advice for optimization after that.
- As already mentioned, do not use exceptions for normal program flow. Use
FirstOrDefault()method instead ofFirst()so you don't get an exception for missing items.
- If you want a guess - you have 4 database queries per row => 160k queries. Either adjust database indexes, or load whole tables into
Dictionary<>if they are small enough (Countries table, for example, is probably small)
Context
StackExchange Code Review Q#47368, answer score: 6
Revisions (0)
No revisions yet.