
Optimizing FirstOrDefault

Submitted by: @import:stackexchange-codereview

Problem

I'm working on an application in which it takes quite a bit of time to initialize the data.

Some background: I'm creating a sort of pivot table in which I turn Figure 1 below into Figure 2. I've gone through the code with Stopwatch instances and have been able to isolate a single line of code that takes 2 seconds during program load (it is in a loop called for roughly 200-300 rows of data; 2 seconds is the total time across all iterations). That line is marked in the code example below, Figure 3.
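The kind of measurement described above, where one Stopwatch accumulates the time spent on a single line across all loop iterations, can be sketched roughly as follows. This is only an illustration: SuspectLine and MeasureSuspectLine are hypothetical names, and the spin wait stands in for the real line being measured.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class TimingSketch
{
    // Hypothetical stand-in for the line being measured.
    static void SuspectLine() => Thread.SpinWait(1000);

    // Returns the total time spent in SuspectLine across all iterations.
    public static TimeSpan MeasureSuspectLine(int iterations)
    {
        var watch = new Stopwatch();
        for (int i = 0; i < iterations; i++)
        {
            // ... other per-row work, deliberately not timed ...
            watch.Start();   // Start() resumes, so elapsed time accumulates
            SuspectLine();
            watch.Stop();
        }
        return watch.Elapsed;
    }

    static void Main()
    {
        Console.WriteLine($"Total: {MeasureSuspectLine(300).TotalMilliseconds} ms");
    }
}
```

Because Start() resumes rather than resets, Elapsed reports the sum over all iterations, which matches the "2 seconds in total" style of figure quoted here.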

My question is whether there is a better way to structure the example method so that I can avoid whatever overhead that line is causing. It is worth noting that when that line executes, mChars sometimes contains 0 elements, but up to around 5 or 1 (see edit below), so I can't imagine what would cause this to take so much time to execute.

Figure 1

SAMPLENUMBER | ANALYSIS_NAME | ANALYSIS_STATUS
----------------------------------------------
1234         | NO3           | I
1234         | SO4           | C
5678         | NO3           | C


Figure 2

SAMPLENUMBER | NO3 | SO4
------------------------
1234         | I   | C
5678         | C   |


Figure 3

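private List<string> GetPivotData(DataRow sourceRow, DataTable sourceTable)
{
    int pivotStartIndex = sourceTable.Columns.IndexOf(Cols.HotDate) + 1;
    var retList = new List<string>(sourceTable.Columns.Count - pivotStartIndex);
    for (int i = pivotStartIndex; i < sourceTable.Columns.Count; i++)
    {
        string[] elementTest = sourceTable.Columns[i].ColumnName.Split('-');
        var mChars = from s in mainData.AsEnumerable()
                     where s.Field<string>(Cols.Samplenumber) == sourceRow.Field<string>(Cols.Samplenumber)
                        && s.Field<string>(Cols.ElementName) == elementTest[0].Trim()
                        && s.Field<string>(Cols.AnalysisNumber) == elementTest[1].Trim()
                     select s.Field<string>(Cols.AnalysisStatus);
        retList.Add(mChars.FirstOrDefault()); // <-- the slow line
    }
    return retList;
}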

I've tried things like having retList as an array rather than a list, as well as calling .ToList() on my LINQ result and using .Find(x => true), and none of them has really had any effect. The method in Figure 3 is itself inside a for ...

Solution

The line with from only defines the query; it does not iterate over the rows or apply your where and select clauses. LINQ queries execute lazily, so in your case that work happens only when you call FirstOrDefault.
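The deferred execution described above can be seen in isolation with a small sketch. It is not taken from the original code: DeferredSketch, IsMatch, and the sample list are made up for illustration, and the counter exists only to show when the predicate runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredSketch
{
    // Counts how many times the where predicate has run.
    public static int PredicateCalls;

    static bool IsMatch(int value)
    {
        PredicateCalls++;
        return value == 3;
    }

    public static (int beforeEnumerating, int afterFirstOrDefault, int result) Run()
    {
        PredicateCalls = 0;
        var data = new List<int> { 1, 2, 3, 4, 5 };

        // Defining the query does not run the predicate.
        var query = from n in data
                    where IsMatch(n)
                    select n * 10;
        int before = PredicateCalls;

        // FirstOrDefault() enumerates until the first match is found.
        int result = query.FirstOrDefault();
        return (before, PredicateCalls, result);
    }

    static void Main()
    {
        var (before, after, result) = Run();
        Console.WriteLine($"{before} {after} {result}"); // prints "0 3 30"
    }
}
```

The predicate has not run at all when the query is defined, and it runs three times (for 1, 2, and 3) inside FirstOrDefault. The whole cost of the where clause is therefore charged to the FirstOrDefault line.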

Therefore, to speed up the call to FirstOrDefault, you need to speed up your where condition, namely this one:

s.Field<string>(Cols.Samplenumber) == sourceRow.Field<string>(Cols.Samplenumber)
      && s.Field<string>(Cols.ElementName) == elementTest[0].Trim()
      && s.Field<string>(Cols.AnalysisNumber) == elementTest[1].Trim()


The most expensive parts of this are probably the calls to s.Field<string>(...), followed by the Trim() calls. All of these can be cached: convert mainData into something that's cheap to access, and do the conversion only once, outside of the for loop. Something more or less like this (this won't run as is; use it only as inspiration):

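struct TempItem
{
    public string Samplenumber;
    public string ElementName;
    public string AnalysisNumber;
    public string AnalysisStatus;
}

string sourceSampleNumber = sourceRow.Field<string>(Cols.Samplenumber);
// Read the matching rows once, outside the loop, into cheap-to-access items.
var lookUpList = mainData.AsEnumerable()
    .Where(r => r.Field<string>(Cols.Samplenumber) == sourceSampleNumber)
    .Select(r => new TempItem {
        Samplenumber = r.Field<string>(Cols.Samplenumber),
        ElementName = r.Field<string>(Cols.ElementName),
        AnalysisNumber = r.Field<string>(Cols.AnalysisNumber),
        AnalysisStatus = r.Field<string>(Cols.AnalysisStatus)
    }).ToList(); // the ToList() part here forces application of the LINQ expressions

for (int i = pivotStartIndex; i < sourceTable.Columns.Count; i++)
{
    string[] elementTest = sourceTable.Columns[i].ColumnName.Split('-');
    // Trim once per column instead of once per row comparison.
    string elementName = elementTest[0].Trim();
    string analysisNumber = elementTest[1].Trim();
    var mChars = lookUpList.Where(t => t.ElementName == elementName && t.AnalysisNumber == analysisNumber).Select(t => t.AnalysisStatus);
    retList.Add(mChars.FirstOrDefault());
}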
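As a possible further step in the same direction (not part of the original answer), the cached rows could be put into a dictionary keyed by element name and analysis number, so each column lookup becomes a hash lookup rather than a scan. The sketch below is self-contained and rests on assumptions: the column names are hypothetical stand-ins for the Cols constants, and PivotLookupSketch, BuildLookup, StatusFor, and SampleTable are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class PivotLookupSketch
{
    // Hypothetical column names standing in for the Cols.* constants.
    const string Sample = "SAMPLENUMBER", Element = "ELEMENT_NAME",
                 Number = "ANALYSIS_NUMBER", Status = "ANALYSIS_STATUS";

    // One pass over the data: map (element name, analysis number) to status
    // for a single sample number.
    public static Dictionary<(string, string), string> BuildLookup(
        DataTable mainData, string sampleNumber)
    {
        var lookup = new Dictionary<(string, string), string>();
        foreach (DataRow row in mainData.Rows)
        {
            if (row.Field<string>(Sample) != sampleNumber) continue;
            var key = (row.Field<string>(Element), row.Field<string>(Number));
            // Keep the first status seen, mirroring FirstOrDefault().
            if (!lookup.ContainsKey(key)) lookup.Add(key, row.Field<string>(Status));
        }
        return lookup;
    }

    // Returns the status, or null when no row matches (like FirstOrDefault()).
    public static string StatusFor(
        Dictionary<(string, string), string> lookup, string element, string number)
        => lookup.TryGetValue((element, number), out var status) ? status : null;

    // Small in-memory table used for demonstration only.
    public static DataTable SampleTable()
    {
        var table = new DataTable();
        foreach (var name in new[] { Sample, Element, Number, Status })
            table.Columns.Add(name, typeof(string));
        table.Rows.Add("1234", "NO3", "1", "I");
        table.Rows.Add("1234", "SO4", "1", "C");
        table.Rows.Add("5678", "NO3", "1", "C");
        return table;
    }

    static void Main()
    {
        var lookup = BuildLookup(SampleTable(), "1234");
        Console.WriteLine(StatusFor(lookup, "NO3", "1"));            // I
        Console.WriteLine(StatusFor(lookup, "SO4", "1"));            // C
        Console.WriteLine(StatusFor(lookup, "CL", "1") ?? "(none)"); // (none)
    }
}
```

With only a handful of matching rows per sample the list scan is already cheap, so whether the dictionary is worth the extra code would need measuring on the real data.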

Code Snippets

s.Field<string>(Cols.Samplenumber) == sourceRow.Field<string>(Cols.Samplenumber)
      && s.Field<string>(Cols.ElementName) == elementTest[0].Trim()
      && s.Field<string>(Cols.AnalysisNumber) == elementTest[1].Trim()

struct TempItem
{
    public string Samplenumber;
    public string ElementName;
    public string AnalysisNumber;
    public string AnalysisStatus;
}
string sourceSampleNumber = sourceRow.Field<string>(Cols.Samplenumber);
var lookUpList = mainData.AsEnumerable().Where(r => r.Field<string>(Cols.Samplenumber) == sourceSampleNumber).Select(r => new TempItem {
        Samplenumber = r.Field<string>(Cols.Samplenumber),
        ElementName = r.Field<string>(Cols.ElementName),
        AnalysisNumber = r.Field<string>(Cols.AnalysisNumber),
        AnalysisStatus = r.Field<string>(Cols.AnalysisStatus)
    }).ToList(); // the ToList() part here forces application of the LINQ expressions

for (int i = pivotStartIndex; i < sourceTable.Columns.Count; i++)
{
    string[] elementTest = sourceTable.Columns[i].ColumnName.Split('-');
    string elementName = elementTest[0].Trim();
    string analysisNumber = elementTest[1].Trim();
    var mChars = lookUpList.Where(t => t.ElementName == elementName && t.AnalysisNumber == analysisNumber).Select(t => t.AnalysisStatus);
    retList.Add(mChars.FirstOrDefault()); 
}

Context

StackExchange Code Review Q#62751, answer score: 4
