
Using XSLT to turn content of WYSIWYG stored in XML into HTML


Problem

The CMS I use relies on the Xalan XSLT processor, which is an XSLT 1.0 processor. Editors who use WYSIWYG fields within the CMS can save content that looks like any of the following:


    Story Text which may have formatting.


or:

Story Text which may have formatting.


or even:


    Story Text which may have formatting.


In order to handle those varying cases I've been using:


 

    
        
            
            
                
                
            
            
            
                
                    
                    
                
            
        
    


Is there a better way of producing well-formed HTML from this data? Is there a way to simplify this?

Solution

You should change your mindset when working with XSLT: look at it as a functional language, where you declare rules for the transformation of elements. Think about it in terms of events: when a `p` element is found in the input, I expect this and that in the output.
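As a minimal sketch of that rule-per-element idea (an illustration only, not the stylesheet from the question), a single rule for `p` could look like this:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Rule: whenever a p element is matched in the input,
       emit a p in the output and let other rules handle its children. -->
  <xsl:template match="p">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

The rule says nothing about where the `p` sits in the document; the processor decides when it fires.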

First, regarding the definition of your data: you should ask for (or create) a more formal description of it. Several formats are available to describe the structure (grammar), DTD and XML Schema being the most common. There are simpler alternatives such as RELAX NG and Schematron.

My guess, from your stylesheet and your descriptions:

  • there is always a `system-index-block` element at the top

  • there is a list of `story` elements in `system-index-block`

  • there is mixed HTML content in `story`, with elements such as `p` and `div`, and text, allowed
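To make that guess concrete, here is a rough sketch of it as a DTD in an internal subset, followed by sample input covering the three cases. The content models for `p` and `div` are assumptions for illustration; real WYSIWYG output will allow more elements than this.

```xml
<?xml version="1.0"?>
<!DOCTYPE system-index-block [
  <!-- Top-level container holding any number of stories -->
  <!ELEMENT system-index-block (story*)>
  <!-- Mixed content: text, p and div in any order (assumed) -->
  <!ELEMENT story (#PCDATA | p | div)*>
  <!-- Simplified on purpose: no inline formatting declared -->
  <!ELEMENT p     (#PCDATA)>
  <!ELEMENT div   (#PCDATA | p)*>
]>
<system-index-block>
  <story><p>Story Text which may have formatting.</p></story>
  <story>Story Text which may have formatting.</story>
  <story><div>Story Text which may have formatting.</div></story>
</system-index-block>
```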

I am not sure about the last part: is it possible to have both text and a `div` or paragraph at the top level of a story?

    <story>
      Text before a <p>Paragraph</p> in between or <div>Div</div> even after ?
    </story>


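You should break your single template rule into several rules, one or more per element. Then, to direct the flow from rule to rule, use apply-templates rather than for-each:

    <xsl:template match="system-index-block">
      <!-- Note:
           You probably need to create the basic HTML structure here: html, head, body
      -->
      <xsl:apply-templates /> <!-- you can specify select="story" but it is not required -->
    </xsl:template>

    <xsl:template match="story[p|div]"> <!-- If wrapped in an element -->
       <xsl:apply-templates mode="copy" />
    </xsl:template>

    <xsl:template match="story"> <!-- wrap the content in a paragraph tag -->
      <p>
        <xsl:apply-templates mode="copy" />
      </p>
    </xsl:template>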

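Note that I used the mode "copy" above: you can now create different rules for the same nodes depending on the mode, for example:

    <xsl:template match="text()" /> <!-- ignore text in normal mode, probably whitespace -->

    <xsl:template match="text()" mode="copy">
        <xsl:copy />
    </xsl:template>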

Also, you will probably get into trouble because you only declared a namespace for the xsl prefix. You typically need a prefix/namespace declaration for the input and output formats. You will then need to use the prefix in match/select expressions and in the tag names for the output. For example, with the `input` prefix bound to your input namespace and `html` bound to the HTML namespace:

    <xsl:template match="input:story"> <!-- wrap the content in a paragraph tag -->
      <html:p>
        <xsl:apply-templates mode="copy" />
      </html:p>
    </xsl:template>
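For the prefixes to resolve, they have to be declared on the stylesheet element. The sketch below shows one way that might look. The input namespace URI is a made-up placeholder, and none of this is needed if your input documents declare no namespace at all.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:input="http://example.com/cms-input"
    xmlns:html="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="input">
  <!-- http://example.com/cms-input is hypothetical: use whatever URI
       your input documents actually declare. -->
  <xsl:output method="xml" indent="yes"/>

  <!-- Match story in the input namespace, emit p in the XHTML namespace -->
  <xsl:template match="input:story">
    <html:p>
      <xsl:apply-templates mode="copy"/>
    </html:p>
  </xsl:template>

  <xsl:template match="text()" mode="copy">
    <xsl:copy/>
  </xsl:template>
</xsl:stylesheet>
```

`exclude-result-prefixes` keeps the unused `input` declaration from being copied onto the output elements.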

To go further, here are pointers that you will probably find useful:

  • Namespaces and XSLT Stylesheets



  • Identity transformation
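The identity transformation matters here because the built-in template rules apply in every mode: without an explicit rule for elements in "copy" mode, the `p` and `div` tags inside a story would be dropped and only their text would survive. A minimal sketch of the whole stylesheet with an identity rule restricted to that mode, assuming the input has no namespace, might be:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- Only stories are processed, so stray whitespace is never visited -->
  <xsl:template match="system-index-block">
    <xsl:apply-templates select="story"/>
  </xsl:template>

  <!-- Content already wrapped in a block element: copy it as-is.
       The predicate gives this rule a higher default priority
       than the plain "story" rule below. -->
  <xsl:template match="story[p|div]">
    <xsl:apply-templates mode="copy"/>
  </xsl:template>

  <!-- Bare content: wrap it in a paragraph -->
  <xsl:template match="story">
    <p>
      <xsl:apply-templates mode="copy"/>
    </p>
  </xsl:template>

  <!-- Identity rule, limited to copy mode: reproduces elements,
       attributes, text, comments and processing instructions unchanged -->
  <xsl:template match="@*|node()" mode="copy">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="copy"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

A story that mixes loose text with `p` or `div` children matches the first `story` rule and is copied through as-is, so the loose text stays unwrapped. Handling that case would need an additional rule.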

Code Snippets

<story>
  Text before a <p>Paragraph</p> in between or <div>Div</div> even after ?
</story>

<xsl:template match="system-index-block">
  <!-- Note:
       You probably need to create the basic HTML structure here: html, head, body
  -->
  <xsl:apply-templates /> <!-- you can specify select="story" but it is not required -->
</xsl:template>

<xsl:template match="story[p|div]"> <!-- If wrapped in an element -->
   <xsl:apply-templates mode="copy" />
</xsl:template>

<xsl:template match="story"> <!-- wrap the content in a paragraph tag -->
  <p>
    <xsl:apply-templates mode="copy" />
  </p>
</xsl:template>

<xsl:template match="text()" /> <!-- ignore text in normal mode, probably whitespace -->

<xsl:template match="text()" mode="copy">
    <xsl:copy />
</xsl:template>

<xsl:template match="input:story"> <!-- wrap the content in a paragraph tag -->
  <html:p>
    <xsl:apply-templates mode="copy" />
  </html:p>
</xsl:template>

Context

StackExchange Code Review Q#735, answer score: 5
