PageBox: servlet running in sandbox on J2EE PageBox


Cuckoo customization guide

Table of Contents
Conventions
XML
XML
XHTML
XSL
Cuckoo model
Default xsl file
How to customize the layout
How to generate server pages
Batch generation
ASP generation
JSP
PHP
CSS
Implementation
Cuckoo object model
WordReactor
ParWrapper
XmlProcessor
TagStack
Sequence diagram
Glossary

Conventions

Cuckoo is designed to work without configuration. As a consequence:

  1. You are asked for a target HTML file name, say D:/mySite/myProduct/xxxx.html, and for a Style directory, say D:/myInstall.

  2. If you don't set the target HTML file name, Cuckoo derives it from the name of your Word file: for D:/myDir/xxxx.doc, the HTML file is D:/myDir/xxxx.html.

  3. Cuckoo expects to find an XSL file named cuckoo.xsl in your target directory, here in D:/mySite/myProduct/cuckoo.xsl. If Cuckoo doesn't find cuckoo.xsl there it copies cuckoo.xsl from the Style directory.

  4. Cuckoo also expects to find a CSS file named cuckoo.css in your target directory, here in D:/mySite/myProduct/cuckoo.css. If Cuckoo doesn't find cuckoo.css there it copies cuckoo.css from the Style directory.

  5. Cuckoo also expects to find a JavaScript file named cuckoo.js in your target directory, here in D:/mySite/myProduct/cuckoo.js. If Cuckoo doesn't find cuckoo.js there it copies cuckoo.js from the Style directory.

  6. Cuckoo generates an intermediate XML file named xxxxW.xml in your target directory, here in D:/mySite/myProduct/xxxxW.xml.

  7. Cuckoo also generates images in the target directory named xxxxi.gif or xxxxi.jpg where i is the image number in the document.
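The naming rules above can be sketched in a few lines of Python. This is only an illustration of the conventions, not Cuckoo's own code (Cuckoo is written in VBA). The function name cuckoo_paths and the assumption that image numbers start at 1 are ours.

```python
from pathlib import PurePosixPath


def cuckoo_paths(word_file, html_file=None, image_count=0):
    """Return the files Cuckoo is expected to read or write (hypothetical helper)."""
    word = PurePosixPath(word_file)
    # Rule 2: without an explicit target, reuse the Word file name with .html.
    html = PurePosixPath(html_file) if html_file else word.with_suffix(".html")
    target_dir, stem = html.parent, html.stem
    return {
        "html": str(html),
        # Rules 3 to 5: style files are looked up in the target directory.
        "xsl": str(target_dir / "cuckoo.xsl"),
        "css": str(target_dir / "cuckoo.css"),
        "js": str(target_dir / "cuckoo.js"),
        # Rule 6: the intermediate XML file is named after the HTML file.
        "xml": str(target_dir / (stem + "W.xml")),
        # Rule 7: the extension is .gif or .jpg, so only the stem is predictable.
        # Numbering from 1 is an assumption.
        "image_stems": [str(target_dir / (stem + str(i)))
                        for i in range(1, image_count + 1)],
    }


paths = cuckoo_paths("D:/myDir/xxxx.doc", "D:/mySite/myProduct/xxxx.html", image_count=2)
print(paths["xml"])
```

Calling cuckoo_paths("D:/myDir/xxxx.doc") without a target name exercises rule 2 and places every file next to the Word document.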

XML

If you already use XML and XSL you can skip this section. If you don't use them yet, we recommend that you start now.

XML

XML is just another markup language. What makes it really cool is the support of the industry that delivered excellent XML parsers such as Xerces or MSXML3 for free. However XML handling through DOM or SAX is still programming.

An XML document is mainly made of elements identified by starting and ending tags:

<tag1>element-content</tag1>

An element can have attributes:

<tag1 attribute1="xxxx">element-content</tag1>

You can also collapse empty elements like this:

<tag2/>
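To see how a parser reads these three forms, here is a small sketch using Python's standard xml.etree.ElementTree module. The <root> wrapper is only there because an XML document needs a single root element; it is not part of the examples above.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<root>"
    '<tag1 attribute1="xxxx">element-content</tag1>'
    "<tag2/>"
    "</root>"
)

tag1 = doc.find("tag1")
tag2 = doc.find("tag2")

print(tag1.text)               # text between the start and end tags
print(tag1.get("attribute1"))  # attribute value
print(tag2.text)               # a collapsed element has no content
```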

XHTML

XML looks like HTML because both derive from an older language, SGML.

However:

  1. HTML has a more relaxed syntax. You don't need to close elements such as <br> and <input>.

  2. HTML elements have a meaning, at least for browsers.

XHTML is an XML dialect with basically the same tags and meaning as HTML. Because browsers are mistake-tolerant, they also accept well-formed XHTML documents. And because XHTML is well-formed XML, you can manipulate and transform it just like any other XML.
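The difference between the relaxed HTML syntax and XHTML can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed is ours.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True when an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


html_style = "<p>first line<br>second line</p>"    # <br> is never closed
xhtml_style = "<p>first line<br/>second line</p>"  # <br/> is collapsed

print(is_well_formed(html_style))
print(is_well_formed(xhtml_style))
```

A browser displays both fragments, but only the second one can go through an XML parser and therefore through an XSLT transformation.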

Cuckoo generates files that combine XML and XHTML.

To create a site file, you just do the same. Here is an example:

<site>

<header>

<h1>This is my site!</h1>

</header>

<map>

<a href="ratata.html">My cat, Ratata</a><br/>

<a href="hobby.html">My hobbies</a><br/>

<a href="friends.html">My friends</a><br/>

</map>

<footer>

Alexis

</footer>

</site>

XSL

XSL (Extensible Stylesheet Language) is a way to manipulate XML with an XML flavor. More specifically, Cuckoo uses XSL Transformations (XSLT), a language for transforming XML documents into other XML documents. XSLT is not hard to code if you already have some knowledge of grammars. We quote the XSLT specification here: "A transformation expressed in XSLT describes rules for transforming a source tree into a result tree. The transformation is achieved by associating patterns with templates. A pattern is matched against elements in the source tree. A template is instantiated to create part of the result tree."

To summarize, in XSLT you write templates (<xsl:template>) that include <xsl:apply-templates> elements to invoke other templates.

Cuckoo model

Cuckoo generates an intermediate XML file containing three XHTML elements:

  1. <info> that contains the page title and meta tags

  2. <content> that contains the core of the Word conversion

  3. <toc> that contains a table of contents

[Figure: Construction of the document from a site XML file and from the document produced by Cuckoo]

Besides this content, the page should contain site parts, which are the same for all pages on your site, typically a header, a site map and a footer.

The default cuckoo.xsl assumes that your site file contains three XHTML elements:

  1. <header> that contains the header

  2. <map> that contains a site map and similar content

  3. <footer> that contains the footer

cuckoo.xsl merges the content and site data and structures the document in tables and other elements.
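The merge can be pictured with a small Python sketch that builds the same page skeleton as the default cuckoo.xsl: header on top, content and footer in the left cell, table of contents and map in the right cell. It is a stand-in for the XSLT, written with the standard xml.etree.ElementTree module. The sample documents and the names graft and merge are assumptions made for the illustration.

```python
import copy
import xml.etree.ElementTree as ET

CUCKOO_XML = (
    "<cuckoo>"
    "<info><title>My page</title></info>"
    "<content><h1>Hello</h1><p>Converted from Word.</p></content>"
    '<toc><a href="#hello">Hello</a></toc>'
    "</cuckoo>"
)
SITE_XML = (
    "<site>"
    "<header><h1>This is my site!</h1></header>"
    '<map><a href="hobby.html">My hobbies</a><br/></map>'
    "<footer>Alexis</footer>"
    "</site>"
)


def graft(parent, part):
    """Append a copy of part's text and child elements to parent."""
    if part.text and part.text.strip():
        if len(parent):
            last = parent[-1]
            last.tail = (last.tail or "") + part.text
        else:
            parent.text = (parent.text or "") + part.text
    for child in part:
        parent.append(copy.deepcopy(child))


def merge(cuckoo, site):
    """Combine the Cuckoo document and the site file into one page."""
    html = ET.Element("html")
    graft(ET.SubElement(html, "head"), cuckoo.find("info"))
    body = ET.SubElement(html, "body")
    graft(body, site.find("header"))
    row = ET.SubElement(ET.SubElement(body, "table"), "tr")
    left = ET.SubElement(row, "td")
    graft(left, cuckoo.find("content"))
    graft(ET.SubElement(left, "p"), site.find("footer"))
    right = ET.SubElement(row, "td")
    graft(right, cuckoo.find("toc"))
    ET.SubElement(right, "br")
    graft(right, site.find("map"))
    return html


page = merge(ET.fromstring(CUCKOO_XML), ET.fromstring(SITE_XML))
print(ET.tostring(page, encoding="unicode"))
```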

Default xsl file

<?xml version="1.0" encoding="windows-1252"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match="cuckoo">

<html>

<head>

<xsl:copy>

<xsl:apply-templates select="info"/>

</xsl:copy>

<link rel="stylesheet" href="cuckoo.css" type="text/css"/>

<script src="cuckoo.js"></script></head><body>

<div id="tooltip" style="position:absolute;visibility:hidden;border:1px solid black;font-size:14px;layer-background-color:lightyellow;background-color:lightyellow;padding:1px">

</div>

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/header"/>

</xsl:copy>

<table><tr><td valign="top">

<xsl:copy>

<xsl:apply-templates select="content"/>

</xsl:copy>

<p align="center">

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/footer"/>

</xsl:copy>

</p>

</td><td valign="top" align="left" width="250">

<xsl:copy>

<xsl:apply-templates select="toc"/>

</xsl:copy>

<br/>

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/map"/>

</xsl:copy>

</td></tr>

</table></body></html>

</xsl:template>

...

</xsl:stylesheet>

You can modify this file.

The part starting at <link and ending at /div> requires special care:

  1. You should include a Cascading Style Sheet file but you can give it a different name

  2. You must include a JavaScript part at least for mouse over handling but you can give it a different name

  3. You must include the <div> element for mouse over handling

How to customize the layout

Let's assume that you want to display pages like this:

[Figure: Alternate document format with three columns]

You simply need to add a new column:

<?xml version="1.0" encoding="windows-1252"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match="cuckoo">

<html>

<head>

<xsl:copy>

<xsl:apply-templates select="info"/>

</xsl:copy>

<link rel="stylesheet" href="cuckoo.css" type="text/css"/>

<script src="cuckoo.js"></script></head><body>

<div id="tooltip" style="position:absolute;visibility:hidden;border:1px solid black;font-size:14px;layer-background-color:lightyellow;background-color:lightyellow;padding:1px">

</div>

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/header"/>

</xsl:copy>

<table><tr><td valign="top">

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/map"/>

</xsl:copy>

<br/>

<xsl:copy>

<xsl:apply-templates select="document('cuckoo-news.xml')/cuckoo/content"/>

</xsl:copy>

</td><td valign="top">

<xsl:copy>

<xsl:apply-templates select="content"/>

</xsl:copy>

<p align="center">

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/footer"/>

</xsl:copy>

</p>

</td><td valign="top" align="left" width="250">

<xsl:copy>

<xsl:apply-templates select="toc"/>

</xsl:copy>

</td></tr>

</table></body></html>

</xsl:template>

...

</xsl:stylesheet>

We move the map (coming from pagebox.xml) and we add a news part (coming from cuckoo-news.xml) in the new column. cuckoo-news.xml has been authored in Word and saved as XML with Cuckoo. We can retrieve its content with document('cuckoo-news.xml')/cuckoo/content. As you can see, we can build an HTML page from two or more Word documents. You can use this feature to reuse content.

How to generate server pages

Suppose now that you want to support dynamic update.

It implies that your page includes another page. You can use frames, but they are often inconvenient for users:

  1. They cannot bookmark the page.

  2. They often don't know how to save your page.

Besides frames, you can use <iframe> with Internet Explorer, Netscape 6, Mozilla and Opera, or <layer> with Netscape 4. You can use both in your page, but it is almost impossible to get the same look and feel with all browsers. Try this to see if it fits your needs:

<ilayer><layer src="cuckoo-news.html"></layer></ilayer>

<iframe src="cuckoo-news.html" align="bottom"></iframe>

Most of the time it is better to use server page technology. We will show you now how to do that with ASP, JSP and PHP.

Our recommendation is to keep a static version for WYSIWYG display when you author the document.

Once you are ready to publish, run a batch generation with an XSL file specific to the server technology that you use.

We present three examples below, one for each of these technologies.

You can choose a more radical option and generate your pages at run time.

Pros:

Cons:

Batch generation

A batch generation script named cuckoo-gen.js is included in the deliveries.

It can also be used to regenerate a static site after a change of the site file.

[Figure: Batch generation using a WSH script, cuckoo-gen.js]

Usage

cuckoo-gen.js /dir:source-directory|/file:file [/toDir:target-directory|/toFile:target-file] [/xsl:xsl-file]

dir: directory of XML source files (names ending with W.xml)

file: XML source file name. It must end with W.xml when combined with the toDir option.

xsl: XSLT file

toDir: directory of target HTML files. If a source file is named xxxxW.xml, then the target HTML file is named xxxx.html.

toFile: target HTML file name

Example:

cuckoo-gen.js /dir:D:\cuckoo /toDir:D:\cuckoo /xsl:D:\cuckoo\cuckoo.xsl
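The naming rule used with the toDir option can be sketched as follows. This is a Python illustration of the rule only, not the code of cuckoo-gen.js; the function name target_for is ours.

```python
from pathlib import PureWindowsPath

SUFFIX = "W.xml"


def target_for(source_file, to_dir):
    """Map xxxxW.xml in any directory to xxxx.html in to_dir."""
    name = PureWindowsPath(source_file).name
    if not name.endswith(SUFFIX):
        raise ValueError("source file name must end with " + SUFFIX)
    return str(PureWindowsPath(to_dir) / (name[:-len(SUFFIX)] + ".html"))


print(target_for(r"D:\cuckoo\cuckoo-customW.xml", r"D:\cuckoo"))
```

With the toFile option no such rule applies, because the target name is given explicitly.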

ASP generation

In cuckoo-asp.xsl we use ASP #include to include a news file in the HTML page:

<?xml version="1.0" encoding="windows-1252"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match="cuckoo">

<html>

<head>

<xsl:copy>

<xsl:apply-templates select="info"/>

</xsl:copy>

<link rel="stylesheet" href="cuckoo.css" type="text/css"/>

<script src="cuckoo.js"></script></head><body>

<div id="tooltip" style="position:absolute;visibility:hidden;border:1px solid black;font-size:14px;layer-background-color:lightyellow;background-color:lightyellow;padding:1px">

</div>

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/header"/>

</xsl:copy>

<table><tr><td valign="top">

<xsl:copy>

<xsl:apply-templates select="content"/>

</xsl:copy>

<p align="center">

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/footer"/>

</xsl:copy>

</p>

</td><td valign="top" align="left" width="250">

<div style="background-color:#99ff99">

<xsl:comment>#include file="cuckoo-news.html"</xsl:comment>

</div>

<p> </p>

<xsl:copy>

<xsl:apply-templates select="toc"/>

</xsl:copy>

<br/>

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/map"/>

</xsl:copy>

<table><tr><td><img src="cuckoo.gif" alt="Cuckoo mark"/></td><td width="5"> </td>

<td width="120" style="font-family:Verdana;color:#ff8080;background-color:#99ff99;font-weight:bold;margin-top:5px;margin-bottom:5px">

Cuckoo generated</td></tr></table>

</td></tr>

</table></body></html>

</xsl:template>

We generated an ASP file with this command:

E:\cuckoo\cuckoo-gen.js /file:E:\cuckoo\cuckoo-customW.xml /toFile:E:\cuckoo\cuckoo-custom.asp /xsl:E:\cuckoo\cuckoo-asp.xsl

Because ASP does not process the included file, it has to be in HTML format.

We include in the deliveries a cuckoo-min.xsl file that converts the XML format to a minimal HTML translation where only the <content> element is used.
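The ASP #include directive is written as an HTML comment, which is why cuckoo-asp.xsl creates it with <xsl:comment>. The Python sketch below, based on the standard xml.etree.ElementTree module, shows the markup that such a comment node serializes to. It only illustrates the shape of the output and is not the XSLT processor used by cuckoo-gen.js.

```python
import xml.etree.ElementTree as ET

div = ET.Element("div", style="background-color:#99ff99")
# Same comment text as in the <xsl:comment> element of cuckoo-asp.xsl.
div.append(ET.Comment('#include file="cuckoo-news.html"'))

asp_fragment = ET.tostring(div, encoding="unicode")
print(asp_fragment)
```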

JSP

In cuckoo-jsp.xsl we use JSP <jsp:include> to include a news file in the HTML page:

<?xml version="1.0" encoding="windows-1252"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"

xmlns:jsp="http://java.sun.com/products/jsp/dtd/jsp_1_0.dtd">

<xsl:template match="cuckoo">

<html>

<head>

<xsl:copy>

<xsl:apply-templates select="info"/>

</xsl:copy>

<link rel="stylesheet" href="cuckoo.css" type="text/css"/>

<script src="cuckoo.js"></script></head><body>

<div id="tooltip" style="position:absolute;visibility:hidden;border:1px solid black;font-size:14px;layer-background-color:lightyellow;background-color:lightyellow;padding:1px">

</div>

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/header"/>

</xsl:copy>

<table><tr><td valign="top">

<xsl:copy>

<xsl:apply-templates select="content"/>

</xsl:copy>

<p align="center">

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/footer"/>

</xsl:copy>

</p>

</td><td valign="top" align="left" width="250">

<div style="background-color:#99ff99">

<xsl:element name="jsp:include">

<xsl:attribute name="page">cuckoo-news.html</xsl:attribute>

<xsl:attribute name="flush">true</xsl:attribute>

</xsl:element>

</div>

<p> </p>

<xsl:copy>

<xsl:apply-templates select="toc"/>

</xsl:copy>

<br/>

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/map"/>

</xsl:copy>

<table><tr><td><img src="cuckoo.gif" alt="Cuckoo mark"/></td><td width="5"> </td>

<td width="120" style="font-family:Verdana;color:#ff8080;background-color:#99ff99;font-weight:bold;margin-top:5px;margin-bottom:5px">

Cuckoo generated</td></tr></table>

</td></tr>

</table></body></html>

</xsl:template>

We generated a jsp file with this command:

E:\cuckoo\cuckoo-gen.js /file:E:\cuckoo\cuckoo-customW.xml /toFile:E:\cuckoo\cuckoo-custom.jsp /xsl:E:\cuckoo\cuckoo-jsp.xsl

This is slightly more complex than in ASP because the JSP include tag is itself XML. If you simply write <jsp:include page="cuckoo-news.html" flush="true"/>, the XSLT processor treats jsp as a namespace prefix that must be declared. To create the element in the output, we use the <xsl:element> instruction and declare the JSP namespace on <xsl:stylesheet>.

Note:

We tested with Tomcat 3.2.2. Depending on your application server, minor changes may be needed.
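The namespace issue is not specific to XSLT: any namespace-aware XML parser rejects a jsp: prefix that has not been declared. The following Python sketch, using the standard xml.etree.ElementTree module, shows both cases. The helper name parses is ours; the namespace URI is the one declared in cuckoo-jsp.xsl.

```python
import xml.etree.ElementTree as ET

JSP_NS = "http://java.sun.com/products/jsp/dtd/jsp_1_0.dtd"


def parses(markup):
    """Return True when a namespace-aware XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


undeclared = '<div><jsp:include page="cuckoo-news.html" flush="true"/></div>'
declared = (
    '<div xmlns:jsp="' + JSP_NS + '">'
    '<jsp:include page="cuckoo-news.html" flush="true"/>'
    "</div>"
)

print(parses(undeclared))
print(parses(declared))

# Once the prefix is declared, the parser resolves it to the namespace URI.
include = ET.fromstring(declared)[0]
print(include.tag)
```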

PHP

In cuckoo-php.xsl we use PHP <?php include(...); ?> to include a news file in the HTML page:

<?xml version="1.0" encoding="windows-1252"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match="cuckoo">

<html>

<head>

<xsl:copy>

<xsl:apply-templates select="info"/>

</xsl:copy>

<link rel="stylesheet" href="cuckoo.css" type="text/css"/>

<script src="cuckoo.js"></script></head><body>

<div id="tooltip" style="position:absolute;visibility:hidden;border:1px solid black;font-size:14px;layer-background-color:lightyellow;background-color:lightyellow;padding:1px">

</div>

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/header"/>

</xsl:copy>

<table><tr><td valign="top">

<xsl:copy>

<xsl:apply-templates select="content"/>

</xsl:copy>

<p align="center">

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/footer"/>

</xsl:copy>

</p>

</td><td valign="top" align="left" width="250">

<div style="background-color:#99ff99">

<xsl:processing-instruction name="php">

include("cuckoo-news.html"); ?</xsl:processing-instruction>

</div>

<p> </p>

<xsl:copy>

<xsl:apply-templates select="toc"/>

</xsl:copy>

<br/>

<xsl:copy>

<xsl:apply-templates select="document('pagebox.xml')/site/map"/>

</xsl:copy>

<table><tr><td><img src="cuckoo.gif" alt="Cuckoo mark"/></td><td width="5"> </td>

<td width="120" style="font-family:Verdana;color:#ff8080;background-color:#99ff99;font-weight:bold;margin-top:5px;margin-bottom:5px">

Cuckoo generated</td></tr></table>

</td></tr>

</table></body></html>

</xsl:template>

We generated a PHP file with this command:

E:\cuckoo\cuckoo-gen.js /file:E:\cuckoo\cuckoo-customW.xml /toFile:E:\cuckoo\cuckoo-custom.php /xsl:E:\cuckoo\cuckoo-php.xsl

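If you simply write <?php ... ?> in the stylesheet, the XML parser reads it as a processing instruction of the stylesheet itself, and the XSLT processor does not copy it to the output. To create a <?php ... ?> in the generated page, you need to use <xsl:processing-instruction>.

Note:

MSXML3 does not write the trailing question mark (the XSLT html output method ends a processing instruction with > rather than ?>), so we add it to the instruction.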
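For comparison, here is how a processing instruction node is created and serialized with Python's standard xml.etree.ElementTree module. This is a sketch of the concept, not of MSXML3: this serializer writes XML, so it adds the closing ?> itself and the extra question mark used in cuckoo-php.xsl is not needed here.

```python
import xml.etree.ElementTree as ET

div = ET.Element("div")
# The target ("php") and the instruction text are passed separately,
# just as with the name attribute and the body of <xsl:processing-instruction>.
div.append(ET.ProcessingInstruction("php", 'include("cuckoo-news.html");'))

php_fragment = ET.tostring(div, encoding="unicode")
print(php_fragment)
```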

CSS

Cascading Style Sheets (CSS) is a simple mechanism for adding style (e.g. fonts, colors, spacing) to Web documents. A CSS file is a set of style elements that define how to display HTML elements.

A style element can apply to an HTML tag, for instance:

UL {

padding-top: 1px;

padding-bottom: 1px;

margin-top: 1px;

margin-bottom: 1px;

}

applies to <ul> elements.

A style element can also apply to a user-defined class, for instance:

.title {

font-size: 20pt;

font-family: Arial, Helvetica;

color: #336699;

}

applies to all elements with a class="title" attribute: <font class="title">...</font>.

cuckoo.css includes both kinds of style elements:

| Style element | Use | Creation | Purpose |
| H1 | All <h1> elements. | | |
| H2 | All <h2> elements. | | |
| H3 | All <h3> elements. | | |
| OL | All <ol> elements. | | |
| P | All <p> elements. | | |
| UL | All <ul> elements. | | |
| .cuckoo-table | <table class="cuckoo-table"> | WordReactor VBA class | Word tables |
| .cuckoo-td | <td class="cuckoo-td"> | WordReactor VBA class | Word tables |
| .mouse-over | <a class="mouse-over"> | XmlProcessor VBA class | Mouse over placeholders |
| .toc-table | <table class="toc-table"> | XmlProcessor VBA class | Table of contents |
| .toc-title | <th class="toc-title"> | XmlProcessor VBA class | Table of contents |
| .toc-h1 to .toc-h6 | <a href="..." class="toc-h1"> | XmlProcessor VBA class | Table of contents |
| .map-table | <table class="map-table"> | cuckoo.xsl | Site map |
| .map-title | <th class="map-title"> | cuckoo.xsl | Site map |
| .ttip | | cuckoo.js | Mouse over text with Netscape |
| .test-style | <p class="test-style"> or <font class="test-style"> | XmlProcessor VBA class | Example |
| .test-style2 | <p class="test-style2"> or <font class="test-style2"> | XmlProcessor VBA class | Example |
| .test-style3 | <p class="test-style3"> or <font class="test-style3"> | XmlProcessor VBA class | Example |

We defined three styles, test-style, test-style2 and test-style3, in the cuckoo.dot Word template. Cuckoo therefore generates the corresponding class attributes, and we must define test-style, test-style2 and test-style3 in cuckoo.css.

As far as look and feel is concerned, you can change whatever you want. However, we recommend creating a separate CSS file: when you install a newer version of Cuckoo, you won't need to reapply your changes to the new cuckoo.css. You just need to reference your file in your cuckoo.xsl:

<link rel="stylesheet" href="cuckoo.css" type="text/css"/>

<link rel="stylesheet" href="my.css" type="text/css"/>

For more information about CSS you can visit http://www.blooberry.com/indexdot/css/index.html.

Implementation

Cuckoo object model

Cuckoo uses two main classes: WordReactor, which browses the Word document, and XmlProcessor, which writes the XML/XHTML data.

Therefore if you want to support another feature of Word, you must primarily update WordReactor, whereas if you want to generate different XML/XHTML data, you must primarily update XmlProcessor.

Let's see the different classes in more detail.

[Figure: Cuckoo has four classes: WordReactor, ParWrapper, TagStack and XmlProcessor]

WordReactor

A Word document is made of Paragraphs that can contain Tables made of Cells containing other Paragraphs.

A Paragraph can contain text, Comments, images and hyperlinks.

The main operation of WordReactor is named process. It browses the Word document and invokes processParagraph for every Paragraph that it finds. When the Paragraph contains a Table, processParagraph invokes processTable. Eventually processParagraph2 is invoked, either for a root Paragraph or for a Table Cell Paragraph.

processParagraph2 invokes checkForm to handle Form styles, processImage to handle images, processComment to handle Comments. It also invokes the processParagraph2 of XmlProcessor to write the parsed data.
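The traversal can be sketched in Python as follows. The real WordReactor is a VBA class working on the Word object model; the Paragraph and Table data classes, the WordReactorSketch name and the writer callback (standing in for the processParagraph2 method of XmlProcessor) are all assumptions made for the illustration. Form, image and Comment handling are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Paragraph:
    text: str = ""
    table: Optional["Table"] = None  # set when the Paragraph contains a Table


@dataclass
class Table:
    # One list of Paragraphs per Cell.
    cells: List[List[Paragraph]] = field(default_factory=list)


class WordReactorSketch:
    def __init__(self, writer):
        self.writer = writer  # stand-in for XmlProcessor.processParagraph2

    def process(self, paragraphs):
        for par in paragraphs:
            self.process_paragraph(par, in_cell=False)

    def process_paragraph(self, par, in_cell):
        if par.table is not None:
            self.process_table(par.table)
        else:
            self.process_paragraph2(par, in_cell)

    def process_table(self, table):
        for cell in table.cells:
            for par in cell:
                self.process_paragraph(par, in_cell=True)

    def process_paragraph2(self, par, in_cell):
        self.writer(par.text, in_cell)


written = []
document = [
    Paragraph("intro"),
    Paragraph(table=Table(cells=[[Paragraph("a")], [Paragraph("b")]])),
    Paragraph("end"),
]
WordReactorSketch(lambda text, in_cell: written.append((text, in_cell))).process(document)
print(written)
```

Because process_table calls process_paragraph again, a Table nested inside a Cell is handled by the same code path.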

ParWrapper

ParWrapper is a convenient way to put together all parameters passed by WordReactor to XmlProcessor.

XmlProcessor

XmlProcessor has three main methods:

  1. init that checks the environment, opens the site file (sname) and the target XML file (fnameW.xml)

  2. processParagraph2 invoked by WordReactor

  3. The class destructor (Class_Terminate in VBA) that invokes the XSLT transformation and builds the target HTML file (fname)

TagStack

The processParagraph2 method of XmlProcessor has to process bullets and numbers. Their implementation in HTML uses <ol> and <ul> and requires the use of a stack. push adds a new <ol> or <ul> on the stack and pop removes it.
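A minimal Python sketch of this idea follows. It is not the VBA TagStack class: the depth method, the render_list function and the (level, tag, text) input format are assumptions. The nesting is also simplified, since a nested list is written between <li> elements rather than inside one, and a change of list type at the same level is not handled.

```python
class TagStack:
    """Remembers which list tags are currently open."""

    def __init__(self):
        self._tags = []

    def push(self, tag):
        self._tags.append(tag)
        return "<%s>" % tag

    def pop(self):
        return "</%s>" % self._tags.pop()

    def depth(self):
        return len(self._tags)


def render_list(items):
    """items: (level, tag, text) tuples, level >= 1, tag 'ol' or 'ul'."""
    stack, out = TagStack(), []
    for level, tag, text in items:
        while stack.depth() > level:   # leaving a nested list
            out.append(stack.pop())
        while stack.depth() < level:   # entering a nested list
            out.append(stack.push(tag))
        out.append("<li>%s</li>" % text)
    while stack.depth():               # close whatever is still open
        out.append(stack.pop())
    return "".join(out)


markup = render_list([(1, "ol", "one"), (2, "ul", "nested"), (1, "ol", "two")])
print(markup)
```

The stack is what lets the closing tags come out in the right order without looking back at earlier paragraphs.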

Sequence diagram

[Figure: Sequence diagram. The toHTML macro creates a FileSelection form. FileSelection initializes an XmlProcessor and asks WordReactor to process the document.]

When you click on the toHTML button, the toHTML macro creates a FileSelection form where you are prompted for the target name of your HTML file and for your style directory. FileSelection creates and initializes an XmlProcessor and a WordReactor. Then FileSelection calls the process method of WordReactor with the active Word document as parameter.

WordReactor invokes the processParagraph2 method of XmlProcessor for each piece of content that it has to write.

Once process has completed its task, FileSelection terminates and XmlProcessor invokes the XSL processor and the default browser.

Glossary

| Name | Meaning |
| XML | Extensible Markup Language. |
| XHTML | Well-formed HTML. Tags must be closed, for instance <br/>, and properly nested: you cannot write <p><font>something</p></font> but must write <p><font>something</font></p>. Because XHTML is well formed, XML processors and XSLT processors can process it. |
| ASP | Active Server Pages. Microsoft technology running on Internet Information Server (IIS). |
| JSP | JavaServer Pages. Java technology running on Java application servers such as Tomcat and Resin. |
| PHP | PHP: Hypertext Preprocessor. Server-side, HTML-embedded scripting language. Just as Cuckoo needs Word, PHP needs a Web server (IIS, Apache, iPlanet...). |
| Meta tag | Field used to categorize a page. Used by search engines. |
| Title | Field only displayed in the page properties and used by search engines. |
| XSLT | XSL Transformations. An XML language for transforming an XML document into another document, typically XML or XHTML. |
| CSS | Cascading Style Sheets. Define how a document (HTML or XML) should be displayed. |

Contact: support@pagebox.net
©2001 Alexis Grandemange.