patterncsharpMinor
Optimisation XML handling within vb.net application
Viewed 0 times
handlingapplicationoptimisationxmlwithinnet
Problem
I need to optimize this snippet to be faster:
This is the XML structure:
I'm asking how I can optimize my code because it takes a lot of time (50 sec) in the case where the list contains 20000 elements.
Dim lst = (From t In DocElet.ChildNodes Select ID = t.item("ID").outerxml).Distinct().ToList()
Parallel.For(0, lst.Count, Sub(i)
Dim P As XmlElement = GetElement(lst(i))
Dim ls = (From t In DocElet.ChildNodes Where t.item("ID").innerText = P.InnerText Select t)
Parallel.ForEach(ls, Sub(D)
Dim verif_date As String = D.Item("DAD").InnerText
Sej.ID = D.Item("ID").InnerText
End Sub)
End Sub)This is the XML structure:
I'm asking how I can optimize my code because it takes a lot of time (50 sec) in the case where the list contains 20000 elements.
Solution
The indentation makes your snippet quite hard to read - the VB.NET syntax for inline delegates is fairly bulky (at least compared to C#), I'd give it more air; being able to read the code is step one.
Ok. So you're running a
What's the purpose of
Now the outer loop is making yet another selection:
I'd try to flatten that. I'm not very familiar with the LINQ syntax in VB.NET, but what you want is a
If the XML is massive, it could be worth trying some PLINQ:
Now you can iterate
As an aside, I have to say your naming is inconsistent and hard to follow:
Dim lst = (From t In DocElet.ChildNodes
Select ID = t.item("ID").outerxml).Distinct()
.ToList()
Parallel.For(0, lst.Count,
Sub(i)
Dim P As XmlElement = GetElement(lst(i))
Dim ls = (From t In DocElet.ChildNodes
Where t.item("ID").innerText = P.InnerText
Select t)
Parallel.ForEach(ls,
Sub(D)
Dim verif_date As String = D.Item("DAD").InnerText
Sej.ID = D.Item("ID").InnerText
End Sub)
End Sub)Ok. So you're running a
Parallel.For, and inside that, you're running a Parallel.ForEach. That's quite a lot of overhead just there, it's likely that you're shooting yourself in the foot here, I'd first remove the inner Parallel.ForEach:Parallel.For(0, lst.Count,
Sub(i)
Dim P As XmlElement = GetElement(lst(i))
Dim ls = (From t In DocElet.ChildNodes
Where t.item("ID").innerText = P.InnerText
Select t)
For Each foo In ls
Dim verif_date As String = D.Item("DAD").InnerText
Sej.ID = D.Item("ID").InnerText
Next
End Sub)What's the purpose of
verif_date? If it's not used, remove it:For Each foo In ls
Sej.ID = D.Item("ID").InnerText
NextNow the outer loop is making yet another selection:
Dim ls = (From t In DocElet.ChildNodes
Where t.item("ID").innerText = P.InnerText
Select t)I'd try to flatten that. I'm not very familiar with the LINQ syntax in VB.NET, but what you want is a
SelectMany, where you'd select the child nodes where "ID"'s inner text matches the element's inner text for the node you're iterating - in other words the ls query gets outside the outer loop and into the lst query, so you only need a single loop (which can probably run in parallel).If the XML is massive, it could be worth trying some PLINQ:
Dim lst = (From t In DocElet.ChildNodes
Select ID = t.item("ID").outerxml)
.Distinct()
.AsParallel()
.SelectMany(...)
.ToList()Now you can iterate
lst with Parallel.ForEach (time it with a normal For Each first) and do your thing.As an aside, I have to say your naming is inconsistent and hard to follow:
Sub(i) looks good (Sub(index) would be better), Sub(D) doesn't. I realize VB.NET isn't case-sensitive, but if you're going to have a convention to make function parameters camelCase, by all means stick to it. lst would probably be better off as elements or whatever makes sense - use real, readable words for your identifiers, not just arbitrary abbreviations.Code Snippets
Dim lst = (From t In DocElet.ChildNodes
Select ID = t.item("ID").outerxml).Distinct()
.ToList()
Parallel.For(0, lst.Count,
Sub(i)
Dim P As XmlElement = GetElement(lst(i))
Dim ls = (From t In DocElet.ChildNodes
Where t.item("ID").innerText = P.InnerText
Select t)
Parallel.ForEach(ls,
Sub(D)
Dim verif_date As String = D.Item("DAD").InnerText
Sej.ID = D.Item("ID").InnerText
End Sub)
End Sub)Parallel.For(0, lst.Count,
Sub(i)
Dim P As XmlElement = GetElement(lst(i))
Dim ls = (From t In DocElet.ChildNodes
Where t.item("ID").innerText = P.InnerText
Select t)
For Each foo In ls
Dim verif_date As String = D.Item("DAD").InnerText
Sej.ID = D.Item("ID").InnerText
Next
End Sub)For Each foo In ls
Sej.ID = D.Item("ID").InnerText
NextDim ls = (From t In DocElet.ChildNodes
Where t.item("ID").innerText = P.InnerText
Select t)Dim lst = (From t In DocElet.ChildNodes
Select ID = t.item("ID").outerxml)
.Distinct()
.AsParallel()
.SelectMany(...)
.ToList()Context
StackExchange Code Review Q#55475, answer score: 2
Revisions (0)
No revisions yet.