
Transforming XML as it is being generated on a scanner

Submitted by: @import:stackexchange-codereview

Problem

At work I've recently been tasked with creating an XSLT stylesheet to transform some XML as it is being generated on a scanner. The goal is to discard pages that we are not interested in before further processing. This is what I've come up with:

[XSLT stylesheet not preserved: its markup was stripped during extraction.]


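It runs on XML files with the following simplified structure. The actual files typically contain around 100 to 3000 pages, with some 40 fields under `Fields`:

[XML markup not preserved. The surviving values, apparently the `Barcode` contents of three sample pages, are:]

|||||||||||
|RETURN|||||||||
||5454|||||||||
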

It works, but I'm a bit worried that the `Barcode` template runs too slowly, since it must be $O(n^2)$. A quick profiling run confirmed my concern.

[Figure: hot path during execution]

As this will be running on somewhat limited hardware, does anyone have any suggestions for improvements?

Solution

I'd go with an `<xsl:key>` index, like so (at top level):

<xsl:key name="contains_RETURN" match="Barcode" use="contains(text(), 'RETURN')"/>

and then rewrite your hotspot to:

<xsl:when test="count(key('contains_RETURN', 'true')) > 0">
  ...
</xsl:when>

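The idea is to partition the `Barcode` nodes into two sets: the ones that contain 'RETURN' and the ones that don't. The `key` function as shown then evaluates to a node-set containing only the `Barcode` nodes that contain 'RETURN', and hence `count` works as expected. Because a processor can build the key index once per document, each lookup avoids rescanning every `Barcode` node.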
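
To see the index in context, here is a minimal, self-contained sketch of a stylesheet built around it. It is not the asker's stylesheet: the text output and both messages are made up for illustration, and only the `contains_RETURN` key and the `xsl:when` test come from the answer. It assumes an XSLT 1.0 processor, where the boolean result of `contains()` is converted to the string 'true' or 'false' when the key is built.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Illustration only: emit plain text instead of transformed XML. -->
  <xsl:output method="text"/>

  <!-- Top-level index from the answer. Every Barcode node is filed under
       the string 'true' or 'false', depending on whether its first text
       node contains 'RETURN'. -->
  <xsl:key name="contains_RETURN" match="Barcode"
           use="contains(text(), 'RETURN')"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- Index lookup replaces a scan over all Barcode nodes. -->
      <xsl:when test="count(key('contains_RETURN', 'true')) > 0">
        <!-- Hypothetical message, not part of the original question. -->
        <xsl:text>RETURN barcodes found: </xsl:text>
        <xsl:value-of select="count(key('contains_RETURN', 'true'))"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- Hypothetical message, not part of the original question. -->
        <xsl:text>no RETURN barcodes found</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Note that `key()` searches the whole document that contains the context node, so this test asks whether any `Barcode` in the file contains 'RETURN', not whether the current page's `Barcode` does. If a per-page decision is needed, the key's `use` expression or the test would have to take the page into account.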

Code Snippets

<xsl:key name="contains_RETURN" match="Barcode" use="contains(text(), 'RETURN')"/>
<xsl:when test="count(key('contains_RETURN', 'true')) > 0">
  ...
</xsl:when>

Context

StackExchange Code Review Q#14715, answer score: 2
