patterncsharpMinor
Refactor C# code for creating XML to not use intermediate dictionary
Viewed 0 times
creatingxmlforintermediatecodedictionaryuserefactornot
Problem
If have the following code that via an SQL query retrieves data from a database in the form:
And I want to turn this into an XML like:
I do this by using a dictionary but it is the first time I'm using C# and I'm sure there must be a more efficient way of doing this in one go?
```
string subschemaSQL = "SELECT subschema.Id, subschema.name as subschemaName, locus.Name as locusName " +
"FROM subschema " +
"LEFT JOIN subschemamembers ON subschemamembers.SubSchemaID = subschema.PrimKey " +
"LEFT JOIN locus ON subschemamembers.LocusID = locus.ID " +
"WHERE subschema.OrganismID = ? " +
"ORDER BY subschema.Id, locusName";
XElement rootNode = new XElement("ResponseSubschemas",
new XElement("organism", organismID),
new XElement("Subschemas"));
XElement subschemasNode = rootNode.Element("Subschemas");
DbDataReader subschemaReader = conn.Query(subschemaSQL, organismID);
Dictionary> dictionary = new Dictionary>();
while (subschemaReader.Read())
{
string subschemaDbID = (string)subschemaReader["Id"];
XElement subschemaNode = new XElement("Subschema",
new XElement("id", subschemaDbID),
new XElement("name", subschemaReader["subschemaName"])
);
subschemasNode.Add(subschemaNode);
string locusName = (string)subschemaReader["locusName"];
if (dictionary.ContainsKey(subschemaDbID))
{
dictionary[subschemaDbID].Add(locusName);
}
else
{
dictionary.Add(
Id subschemaName locusName
MLST MLST LMO0558
MLST MLST LMO0563
MLVST MLVST LMO1305
MLVST MLVST LMO1089
And I want to turn this into an XML like:
LMO
MLST
MLST
LMO0558
LMO0563
MLVST
MLVST
LMO1305
LMO1089
I do this by using a dictionary but it is the first time I'm using C# and I'm sure there must be a more efficient way of doing this in one go?
```
string subschemaSQL = "SELECT subschema.Id, subschema.name as subschemaName, locus.Name as locusName " +
"FROM subschema " +
"LEFT JOIN subschemamembers ON subschemamembers.SubSchemaID = subschema.PrimKey " +
"LEFT JOIN locus ON subschemamembers.LocusID = locus.ID " +
"WHERE subschema.OrganismID = ? " +
"ORDER BY subschema.Id, locusName";
XElement rootNode = new XElement("ResponseSubschemas",
new XElement("organism", organismID),
new XElement("Subschemas"));
XElement subschemasNode = rootNode.Element("Subschemas");
DbDataReader subschemaReader = conn.Query(subschemaSQL, organismID);
Dictionary> dictionary = new Dictionary>();
while (subschemaReader.Read())
{
string subschemaDbID = (string)subschemaReader["Id"];
XElement subschemaNode = new XElement("Subschema",
new XElement("id", subschemaDbID),
new XElement("name", subschemaReader["subschemaName"])
);
subschemasNode.Add(subschemaNode);
string locusName = (string)subschemaReader["locusName"];
if (dictionary.ContainsKey(subschemaDbID))
{
dictionary[subschemaDbID].Add(locusName);
}
else
{
dictionary.Add(
Solution
Since your data is already sorted you definitely don't need to use a dictionary. You can just accumulate
It would be easier to figure out what's going on by separating those things out from each other. It might also make your life easier if you ever want to do anything more complex with this data than just dumping it to XML. So let's just do a little rewrite.
To start with, let's create a class to hold each of those
Now, let's write a function to parse the database records out into a list of
So now we have a nice, clean list of the data you need to output. Basically we've handled the first and second bullets from my list, so all that's left is turning that into XML. There's probably a cleaner way of doing this than using
Pretty straightforward here. We could've written this more succinctly, probably even in a (really big) one liner, but this strikes a reasonable balance with clarity, I think.
So, now we just have to put those two together...
That last function isn't really a thing of beauty either - it's got kind of a lot going on. But it's definitely better. I also added a
By the way, I put up a DotNetFiddle of the whole thing. It doesn't work because I don't have a database to use, but it might be easier to read there.
loci and then whenever the id or subschemaId rolls over you write them all out. That wouldn't be a huge change from what you've got. But honestly, what you've got is a bit confusing because you're trying to do several things at once. Namely:- Read from the database and parse the data
- Determine when you've read an entire set of data
- Write it all to XML
It would be easier to figure out what's going on by separating those things out from each other. It might also make your life easier if you ever want to do anything more complex with this data than just dumping it to XML. So let's just do a little rewrite.
To start with, let's create a class to hold each of those
Subschema elements. (Honestly Subschema seems like a confusing name to me, it seems like maybe OrganismLocations or OrganismSightings might be more descriptive, but let's just stick with Subschema.)public class Subschema{
public string Id {get;set;}
public string Name {get;set;}
public List Loci {get;set;}
public Subschema(){
Loci = new List();
}
}Now, let's write a function to parse the database records out into a list of
Subschema objects. This is basically doing what I said earlier - tracking loci until we hit a new id or subschemaName. We're using C# iterators (that's the yield return syntax) to make this a little bit cleaner.private IEnumerable ParseSubschemas(DbDataReader subschemaReader){
if(!subschemaReader.HasRows){
yield break;
}
Subschema subschema = null;
while(subschemaReader.Read()){
//If we've rolled over to a new ID or SubschemaName, start a new set of sightings
if(subschema != null && (subschema.Id != (string)subschemaReader["id"] || subschema.Name != (string)subschemaReader["subschemaName"])){
yield return subschema;
subschema = null;
}
if(subschema == null){
subschema = new Subschema(){
Id = (string)subschemaReader["id"],
Name = (string)subschemaReader["subschemaName"]
};
}
subschema.Loci.Add((string)subschemaReader["locusName"]);
}
//yield the last subschema
yield return subschema;
}So now we have a nice, clean list of the data you need to output. Basically we've handled the first and second bullets from my list, so all that's left is turning that into XML. There's probably a cleaner way of doing this than using
XElements directly, but let's just stick with it for the moment. But again, let's stick all the logic in a nice clean function.private XElement SerializeSubSchemasToXml(string organismId, IEnumerable sightings){
XElement rootNode = new XElement("ResponseSubschemas",
new XElement("organism", organismId),
new XElement("Subschemas"));
XElement subschemasNode = rootNode.Element("Subschemas");
//Turn each Sighting into an XElement, and append to the subschemas
foreach(var sighting in sightings){
var subschemaElem = new XElement("Subschema",
new XElement("id", sighting.Id),
new XElement("name", sighting.Name),
new XElement("Loci", sighting.Loci.Select(l => new XElement("locus", l)))
);
subschemasNode.Add(subschemaElem);
}
return rootNode;
}Pretty straightforward here. We could've written this more succinctly, probably even in a (really big) one liner, but this strikes a reasonable balance with clarity, I think.
So, now we just have to put those two together...
private string ReadSubschemasAsXml(){
string organismId = "FOO";
string subschemaSQL = "SELECT subschema.Id, subschema.name as subschemaName, locus.Name as locusName " +
"FROM subschema " +
"LEFT JOIN subschemamembers ON subschemamembers.SubSchemaID = subschema.PrimKey " +
"LEFT JOIN locus ON subschemamembers.LocusID = locus.ID " +
"WHERE subschema.OrganismID = ? " +
"ORDER BY subschema.Id, locusName";
using(DbDataReader subschemaReader = conn.Query(subschemaSQL, organismId)){
IEnumerable sightings = ParseSubschemas(subschemaReader);
XElement xelem = SerializeSubSchemasToXml(organismId, sightings);
return xelem.ToString();
}
}That last function isn't really a thing of beauty either - it's got kind of a lot going on. But it's definitely better. I also added a
using statement around your DbDataReader, which is just a handy way of making sure that it gets Dispose()d properly. By the way, I put up a DotNetFiddle of the whole thing. It doesn't work because I don't have a database to use, but it might be easier to read there.
Code Snippets
public class Subschema{
public string Id {get;set;}
public string Name {get;set;}
public List<string> Loci {get;set;}
public Subschema(){
Loci = new List<string>();
}
}private IEnumerable<Subschema> ParseSubschemas(DbDataReader subschemaReader){
if(!subschemaReader.HasRows){
yield break;
}
Subschema subschema = null;
while(subschemaReader.Read()){
//If we've rolled over to a new ID or SubschemaName, start a new set of sightings
if(subschema != null && (subschema.Id != (string)subschemaReader["id"] || subschema.Name != (string)subschemaReader["subschemaName"])){
yield return subschema;
subschema = null;
}
if(subschema == null){
subschema = new Subschema(){
Id = (string)subschemaReader["id"],
Name = (string)subschemaReader["subschemaName"]
};
}
subschema.Loci.Add((string)subschemaReader["locusName"]);
}
//yield the last subschema
yield return subschema;
}private XElement SerializeSubSchemasToXml(string organismId, IEnumerable<Subschema> sightings){
XElement rootNode = new XElement("ResponseSubschemas",
new XElement("organism", organismId),
new XElement("Subschemas"));
XElement subschemasNode = rootNode.Element("Subschemas");
//Turn each Sighting into an XElement, and append to the subschemas
foreach(var sighting in sightings){
var subschemaElem = new XElement("Subschema",
new XElement("id", sighting.Id),
new XElement("name", sighting.Name),
new XElement("Loci", sighting.Loci.Select(l => new XElement("locus", l)))
);
subschemasNode.Add(subschemaElem);
}
return rootNode;
}private string ReadSubschemasAsXml(){
string organismId = "FOO";
string subschemaSQL = "SELECT subschema.Id, subschema.name as subschemaName, locus.Name as locusName " +
"FROM subschema " +
"LEFT JOIN subschemamembers ON subschemamembers.SubSchemaID = subschema.PrimKey " +
"LEFT JOIN locus ON subschemamembers.LocusID = locus.ID " +
"WHERE subschema.OrganismID = ? " +
"ORDER BY subschema.Id, locusName";
using(DbDataReader subschemaReader = conn.Query(subschemaSQL, organismId)){
IEnumerable<Subschema> sightings = ParseSubschemas(subschemaReader);
XElement xelem = SerializeSubSchemasToXml(organismId, sightings);
return xelem.ToString();
}
}Context
StackExchange Code Review Q#40353, answer score: 5
Revisions (0)
No revisions yet.