Using HTML templates in Web Applications: CHTML

This work was originally presented as a poster session by George J. Carrette and Robert Polansky at the Web Conference held in Boston during December of 1995. It describes techniques that were first used in early 1995. Chtml is a compound acronym which stands for Chunks of html.

Introduction
The authoring process
Server application Block diagram
Perl programming
Scheme programming
C programming
Examples
Source and Binary Code Availability
The generic Next script
Limitations and future directions, support
Credits
Copyright

Introduction

There are quite a few ways to develop web applications. They range from coding multiple print statements that output hypertext directly, through subroutine libraries that hide the hypertext issues in the same way that a print driver hides specific printer escape sequences, all the way to tools that behave like application source code generators. There are also specialized languages and/or techniques that embed general SQL or other programming features into what appears to be hypertext. Server-side includes, scripts, and active server pages are all names for this kind of thing.

This paper describes Chtml, which is a way to abstract html tricks from programming tricks by considering html files, or portions of html files, to be templates or chunks of html upon which simple substitutions are made. A programmer can use the techniques described here directly or through another layer of abstraction (such as a component doing result.write inside an active server page, or document.write inside a client-side scripting environment). Hence Chtml is not in opposition to those techniques and is instead potentially complementary.

In fact the methods here have to do with generating arbitrary text, as seen by the use of a template in the chtml compiler mode which generates C code that can be statically linked. By specifying a template in some other language, such as Java, Javascript, Visual Basic, or Perl, it becomes possible to create data structures suitable for use by runtime library components written in other languages.

The goals of Chtml are to:

prevent programs and libraries called by programs on the application server side from being forced to contain explicit <HTML> directives.
allow programmers, graphic artists and writers to work more independently and in parallel, so as to minimize time to market for a product.
allow development authors and editorial staff to have maximum control over visible document structure and content, within clear limits imposed by program semantics.
facilitate component reuse.
facilitate tweaking of application look and feel without risking code changes that trigger lengthier QA testing cycles.

CHTML works by transforming static HTML documents residing on the server into dynamic HTML documents that are actually seen by the user. This is similar to but different from server-side-include (e.g. shtml) or web-sql/gsql mechanisms in that it:

avoids embedding into HTML the kinds of directives which only programmers would be comfortable with.
avoids overly extending HTML semantics in a way that prevents the documents from being edited by wysiwyg tools.
only uses templates that are complete and valid stand-alone HTML documents. Therefore the documents can be verified as to HTML level conformance independently from running the scripts that use them.

An ultimate implementation method for this goal could be made using tools capable of full semantic parsing of a formal application of ISO Standard 8879:1986 - Standard Generalized Markup Language (http://www.sil.org/sgml/sgml.html), such as the proposed ISO/DIS 10179.2, Document Style Semantics and Specification Language, DSSSL (http://www.jclark.com/dsssl/), which provides a good candidate formalism with its STTP (SGML Tree Transformation Process).

However, the implementation of Chtml had to be designed to be simple enough to meet strict time-to-market goals, and the author could not assume that tools were available that could manipulate SGML, as only basic HTML tools were generally available at the time. By adhering to a minimal set of conventions, the abstractions needed to achieve the goals can be obtained through simple string substitution techniques which can be implemented quickly.

Note from 19-DEC-1997. A more likely future implementation for 1998 would be based on XML and an api into the Document Object Model. See http://www.w3.org/xml. However, it turns out that XML does not obsolete the utility of this package, which has now found use in making it easier to produce various XML output formats, and is especially helpful in that applications which were producing HTML can be converted to XML output with little effort. The formatting of email messages, both in plain text and in compound mime formats is also facilitated.

Note the capability for chtml templates to be compiled into data structures that are compiled by the C compiler and linked into efficient programs. It still remains a fact of life that embedding printf/output statements containing markup in C code is ugly and potentially extremely inefficient due to the need to compute and buffer all output before an accurate content-length can be calculated. With the chunked-encoding technique available in HTTP/1.1 the accurate content-length calculation is no longer needed to support persistent connections, but having the expected size of the page before downloading it is still a good user-interface feature in situations involving large files and/or slow network links or servers.

The Authoring Process

The authoring process of a brand new Chtml Web application most efficiently consists of the following steps:

Story Boarding/demonstration of the application, perhaps using a web containing static HTML links.
Creation of functional specification including a description of the required data flow between screens and modules, and the requirements for dynamic and/or idiomatic object treatment.
Creation of base template files, with Chtml Interface comments created using information from the functional specification.
Simultaneous release of Chtml templates to:
- the editorial and artistic staff.
- the applications programming staff.
Simultaneous development of CGI scripts and HTML content.
Integration and testing of the combined application.

The authoring process of derivative works differs slightly in that html-capture techniques (such as the personal proxy server described in siod) might be used in reverse engineering an existing application in order to obtain the raw materials for the new html templates.

Derivative works may also be readily synthesized from idioms and components created for previous applications. In particular the chtml-link technique is used to build up libraries of reusable objects. See the description under the heading CHTML Link Example.

CHTML INTERFACE

The CHTML-INTERFACE comment is the most important distinguishing feature of an HTML template document used in a Chtml application. Each line of it provides three items: an external name, an internal string (key) by which that name is presented in the HTML document, and an optional comment. The following example is an excerpt from the new user registration application:

<!--CHTML-INTERFACE
fname     ::   __fname_value__     :: user visible datum.
lname     ::   __lname_value__     :: user visible datum.
username  ::   __username_value__  :: user visible datum.
password  ::   __password_value__  :: passed along hidden input.
signature ::   __MD5_SIGNATURE__   :: hidden, verifies fields.
-->

The key strings are chosen so as to be easy to identify in an HTML document. The external names provide the handles by which the CGI scripts can operate on the document in the following ways:

Insert text or other objects into a document.
Cause a section of a document to be repeated an arbitrary number of times, including zero.
Extract a subset of a document, treating it as an object to be subject to further manipulations.

If the external name is empty, and the internal name is the string include, then the comment field is taken as the name of a file from which to read lines to incorporate into the interface specification. This is a bit of a kludge but has proven to be convenient when many files are to be parsed against a standard header.

<!--CHTML-INTERFACE
:: include :: standard-interface.txt
-->
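The three-field layout of an interface line can be illustrated with a small C sketch. This is not the parser used by the chtml distribution; the function names parse_interface_line and copy_trimmed are hypothetical, and the sketch assumes that fields are separated by "::" and that surrounding whitespace is not significant, which is what the examples above suggest.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the text between start and end into out, trimming whitespace
   on both sides and truncating to fit outlen bytes (including NUL). */
static void copy_trimmed(const char *start, const char *end,
                         char *out, size_t outlen)
{
    size_t n;

    while (start < end && isspace((unsigned char)*start)) start++;
    while (end > start && isspace((unsigned char)end[-1])) end--;
    n = (size_t)(end - start);
    if (n >= outlen) n = outlen - 1;
    memcpy(out, start, n);
    out[n] = '\0';
}

/* Split "name :: key :: comment" into its fields (hypothetical helper).
   Returns the number of fields found: 3, 2 (no comment), or 0 if the
   line contains no "::" separator at all. */
int parse_interface_line(const char *line, char *name, char *key,
                         char *comment, size_t fieldlen)
{
    const char *sep1, *sep2, *end;

    assert(line != NULL && name != NULL && key != NULL && comment != NULL);
    assert(fieldlen > 0);
    name[0] = key[0] = comment[0] = '\0';
    sep1 = strstr(line, "::");
    if (sep1 == NULL) return 0;
    end = line + strlen(line);
    copy_trimmed(line, sep1, name, fieldlen);
    sep2 = strstr(sep1 + 2, "::");
    if (sep2 == NULL) {
        copy_trimmed(sep1 + 2, end, key, fieldlen);
        return 2;
    }
    copy_trimmed(sep1 + 2, sep2, key, fieldlen);
    copy_trimmed(sep2 + 2, end, comment, fieldlen);
    return 3;
}
```

With this sketch an include line shows up as an empty name, the key "include", and the file name in the comment field, so a caller can test for that combination before treating the line as an ordinary binding.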

Objects Internal to a Document

A nestable pair of comments of the form BEGIN-REPEATING-OBJECT and END-REPEATING-OBJECT is used to identify those portions of a document which are to be treated as an object. The following example shows how this can be used to construct a table with an arbitrary number of columns not known until runtime:

<TABLE BORDER CELLSPACING=0 CELLPADDING=5>
<TR>
<!--BEGIN-REPEATING-OBJECT-__COLUMNCOUNT__-->
<TH ALIGN=TOP VALIGN=TOP>__COLUMN__</TH>
<!--END-REPEATING-OBJECT-->
</TR>
<!--BEGIN-REPEATING-OBJECT-__ROWCOUNT__-->
<TR ALIGN=TOP>
   <!--BEGIN-REPEATING-OBJECT-__COLUMNCOUNT__-->
   <TD VALIGN=TOP>__ITEM__</TD>
   <!--END-REPEATING-OBJECT-->
</TR>
<!--END-REPEATING-OBJECT-->
</TABLE>

An object can also be bracketed using the pair BEGIN-OBJECT and END-OBJECT. This is used in the linking feature. Outside of that feature it has the effect of a region of text with a repeat count of 0.

In order to save on screen space and eyestrain the less verbose words REPEAT, /REPEAT, and OBJECT, /OBJECT may be used in place of BEGIN-REPEATING-OBJECT, END-REPEATING-OBJECT, and BEGIN-OBJECT, END-OBJECT.

In order to make it easier to balance nested repeating objects you are allowed (in the C and Scheme versions) to specify an object name in the END-REPEATING-OBJECT call:

<!--BEGIN-REPEATING-OBJECT-__COLUMNCOUNT__-->
<!--END-REPEATING-OBJECT-__COLUMNCOUNT__-->

If you do not specify an ending object name then the most recently opened object will be closed. Otherwise the object name must match the currently opened object or an error will result. If you specify the verbose level of 2 to the chtml command then the line numbers of the various begin/end sections will be printed out.

$ chtml -v2 something.html
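The matching rule for begin and end markers amounts to a stack of open object names. The following C sketch shows one way such a check could be written. It is an illustration, not the code inside the chtml command: the name check_balance is hypothetical, only the verbose marker words are recognized, and each line is assumed to hold at most one marker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 16
#define MAX_NAME  64

static const char BEGIN_MARK[] = "<!--BEGIN-REPEATING-OBJECT-";
static const char END_MARK[]   = "<!--END-REPEATING-OBJECT";

/* Returns 0 if every repeating object is properly closed. Otherwise
   returns the 1-based number of the first offending line, or
   nlines + 1 if an object is still open at the end of the input. */
int check_balance(const char *const *lines, int nlines)
{
    char stack[MAX_DEPTH][MAX_NAME];   /* names of currently open objects */
    int depth = 0, i;

    assert(lines != NULL);
    for (i = 0; i < nlines; i++) {
        const char *p, *close;
        size_t len;

        if ((p = strstr(lines[i], BEGIN_MARK)) != NULL) {
            p += sizeof BEGIN_MARK - 1;
            close = strstr(p, "-->");
            if (close == NULL || depth == MAX_DEPTH) return i + 1;
            len = (size_t)(close - p);
            if (len >= MAX_NAME) return i + 1;
            memcpy(stack[depth], p, len);
            stack[depth][len] = '\0';
            depth++;
        } else if ((p = strstr(lines[i], END_MARK)) != NULL) {
            p += sizeof END_MARK - 1;
            close = strstr(p, "-->");
            if (close == NULL || depth == 0) return i + 1;
            if (close > p) {
                /* An explicit "-NAME" was given: it must match the
                   most recently opened object. */
                len = (size_t)(close - p) - 1;
                if (*p != '-' || strlen(stack[depth - 1]) != len ||
                    strncmp(stack[depth - 1], p + 1, len) != 0)
                    return i + 1;
            }
            depth--;   /* an unnamed end closes the innermost object */
        }
    }
    return depth == 0 ? 0 : nlines + 1;
}
```

Returning a line number rather than a plain error flag mirrors the usefulness of the verbose output mentioned above, since the position of the mismatch is usually what the template author needs.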

Block diagram

This diagram gives the logical organization of an Application Server Program. Arrows indicate the data flow after an HTTP request is made by the client web browser.


{Chtml Templates}------------> [Chtml Server Application]
                               [ such as:               ]       
{SQL style Servers} <--------> [  - registration/signup ]             
                               [  - polling, surveying  ]
{RPC style Servers} <--------> [  - role playing, games ]
                               [  - info search         ]
{Other Files} <--------------> [------------------------]
                                         ||
                                         ||
                                         \/
{HTML, GIF, etc Files} -----------> [HTTP SERVER]
                                         ||
                                         ||
                                         \/
                                 [WEB BROWSER CLIENT]

Application Server Programming in Perl.

The application server program is typically implemented using perl or sybperl, in the environment of a CGI script requiring the file chtml.pl. This program will in general process its query string and content and then decide to generate some output to be sent back to the client. The internal processing should result in the computation of the name of an $html_file and an associative array %assoc_array containing the bindings which specify the dynamic content and repeat count of objects in the html template. The actual output is generated by a call to the procedure &ChtmlFilterFile:


       &ChtmlFilterFile($html_file, %assoc_array);

Each key in the associative array is the name of an item specified on the left hand side of the CHTML-INTERFACE comment in the template file.

The value of each key in the associative array is the replacement text to be used in place of the internal key string from the template file (which may represent a repeating object count).
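The substitution step itself is plain string replacement. The C sketch below shows the idea for a single key on a single line. It is not a translation of chtml.pl; the function name substitute and its buffer-based interface are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy line into out, replacing every occurrence of the internal key
   string with value. Returns the length of the result, or -1 if out
   (outlen bytes, including the terminating NUL) is too small. */
long substitute(const char *line, const char *key, const char *value,
                char *out, size_t outlen)
{
    size_t klen, vlen, used = 0;

    assert(line != NULL && key != NULL && value != NULL && out != NULL);
    klen = strlen(key);
    vlen = strlen(value);
    while (*line != '\0') {
        if (klen > 0 && strncmp(line, key, klen) == 0) {
            /* Found the key: emit the replacement text instead. */
            if (used + vlen >= outlen) return -1;
            memcpy(out + used, value, vlen);
            used += vlen;
            line += klen;
        } else {
            if (used + 1 >= outlen) return -1;
            out[used++] = *line++;
        }
    }
    if (used >= outlen) return -1;
    out[used] = '\0';
    return (long)used;
}
```

A filter in this style would call the function once per binding for each template line, which is why key strings such as __sitename__ are chosen so that they cannot collide with ordinary document text.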

Note: When the value of a key (the replacement text) contains the ascii rubout character \177 (^?) ($chtml_rub) then it is specially handled during the filtering process. The value used in a single substitution will be the substring of everything from the beginning of the string up to the rubout character. The substring of everything after the rubout character is then stored back into the associative array as the value of the key to be used for the rest of the file processing. This mechanism is essential to the usefulness of the REPEATING-OBJECT construct, but can also be used in other situations.
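The rubout convention can be sketched in C as follows. The helper name take_next is hypothetical and the in-place update of the value stands in for storing the remainder back into the associative array; the Perl implementation may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RUBOUT '\177'   /* ascii rubout, the separator described above */

/* Copy the text for one substitution into out: everything in value up
   to the first rubout character. The remainder after the rubout is
   moved to the front of value, ready for the next substitution. A
   value without a rubout is used as is and left unchanged.
   Returns 0 on success, -1 if out (outlen bytes) is too small. */
int take_next(char *value, char *out, size_t outlen)
{
    char *rub;
    size_t headlen;

    assert(value != NULL && out != NULL);
    rub = strchr(value, RUBOUT);
    headlen = rub != NULL ? (size_t)(rub - value) : strlen(value);
    if (headlen >= outlen) return -1;
    memcpy(out, value, headlen);
    out[headlen] = '\0';
    if (rub != NULL)
        memmove(value, rub + 1, strlen(rub + 1) + 1);
    return 0;
}
```

Inside a repeating object each pass through the body consumes one element per key, so a value built by joining a list of user names with rubout characters yields one name per generated row.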

Application Server Programming in Scheme

Chtml documents to be manipulated by CGI scripts programmed in Scheme are first loaded into memory by a process of parsing, string searching and optimization into a compact representation:

(define *doc1* (load-chtml "filename1"))

The representation of an optimized document is inductively defined to be:

a symbol
a string
an array of optimized documents, the first element of which is a name or repeat count.

The CGI script can then pick out a subset of the documents identified by name by fetching it from a loaded document:

(define *chunk1* (chtml-object 'table *doc1*))

Or it can output a document to a stream:

(write-chtml stream hash-table *doc2*)

The hash-table plays the same role as the associative array does in the perl implementation. It gives a mapping between external module interface names and substitution text and/or object repeat counts. The value of a key in this mapping may be a string, a number, or a chtml object which is handled in a recursive manner, allowing objects to be substituted into other objects. The value can also be a list or a lexically-scoped procedure that can be called upon to generate a sequence of values. In this way a *chunk1* from *doc1* could be inserted into *doc2* to be used as an idiom once or multiple times, while the *chunk1* may itself have substitutions made into it from other bindings established in the hash table by the caller of write-chtml.

If the value of a key is a list then it is decremented as long as this would not make the list empty. An exception to the lookup convention is that if the key starts with the character "." then the rest of the name after the "." is used as the actual key and no list decrementing will take place.

As a consequence of the implementation in the lisp dialect SIOD there is available a canonical, fast loading, binary disk-file representation of compiled chtml, because any lisp object can be saved and restored using the fast-save and fast-load procedures. These are packed into a compilation command for Chtml:

(compile-chtml "filename" "output-filename")

The resulting output file may be loaded using:

(car (load "output-filename" t))

even in a SIOD environment without chtml.scm loaded. The swrite built-in function may be used instead of write-chtml.

The binary file format of data is based on one-byte opcodes followed by opcode-dependent arguments. Lengths are written as longs in the manner native to the cpu architecture. This format could be read by Chtml implementations in Perl and C/C++, in addition to being natural for scripts written in the SIOD dialect. Note that this binary format is not portable across different machine architectures.

opcode	type	format	Description
2	number	DATA	DATA is a double, read sizeof(double).
3	symbol	LENGTH STRING	Print name of symbol follows as LENGTH more bytes
13	string	LENGTH STRING	string follows as LENGTH more bytes
16	array	LENGTH *	What follows are LENGTH more objects.
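To make the record layout concrete, here is a C sketch that walks one object in this format and counts its leaf objects (numbers, symbols and strings). It is not the SIOD reader. The name count_leaves is hypothetical, the data is assumed to be already in memory, and the sketch assumes that each length or double follows its opcode byte directly with no padding, which the table implies but does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { OP_NUMBER = 2, OP_SYMBOL = 3, OP_STRING = 13, OP_ARRAY = 16 };

/* Read a native long at *pp and advance *pp past it.
   Returns 0 on success, -1 if fewer than sizeof(long) bytes remain. */
static int get_long(const unsigned char **pp, const unsigned char *end,
                    long *out)
{
    if ((size_t)(end - *pp) < sizeof *out) return -1;
    memcpy(out, *pp, sizeof *out);   /* memcpy avoids alignment traps */
    *pp += sizeof *out;
    return 0;
}

/* Walk one object starting at *pp, leaving *pp just past it.
   Returns the number of leaf objects it contains, or -1 if the data
   is truncated or uses an opcode that is not in the table. */
long count_leaves(const unsigned char **pp, const unsigned char *end)
{
    long len, i, n, total = 0;
    int opcode;

    assert(pp != NULL && *pp != NULL && end != NULL);
    if (*pp >= end) return -1;
    opcode = *(*pp)++;
    switch (opcode) {
    case OP_NUMBER:                  /* DATA is a double */
        if ((size_t)(end - *pp) < sizeof(double)) return -1;
        *pp += sizeof(double);
        return 1;
    case OP_SYMBOL:                  /* LENGTH, then LENGTH bytes */
    case OP_STRING:
        if (get_long(pp, end, &len) < 0 || len < 0 || len > end - *pp)
            return -1;
        *pp += len;
        return 1;
    case OP_ARRAY:                   /* LENGTH, then LENGTH objects */
        if (get_long(pp, end, &len) < 0 || len < 0) return -1;
        for (i = 0; i < len; i++) {
            if ((n = count_leaves(pp, end)) < 0) return -1;
            total += n;
        }
        return total;
    default:
        return -1;
    }
}
```

Because the longs and doubles are copied in native form, a reader like this only works on the architecture that wrote the file, which is the portability limitation noted above.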

A compiled file might contain the following data, displayed using the modern lisp printing conventions:


#(1 "
" "<P><INPUT TYPE=submit VALUE=\"OK" title "\"></Center>
" "
" #(res_count "<TR><TD><INPUT TYPE=RADIO NAME=\"prinum\" VALUE=\"" title "\">
" "<TD>" fname "</TD><TD>" lname "</TD>
" "<TD>(" fname ") 
") "</TABLE>
" "<P><INPUT TYPE=submit VALUE=\"OK" title "\"></Center>
" "
")

Note: The scheme and therefore the chtml C versions support nested repeating objects, and also allow lines with an object marker comment to contain markup before and after the object marker. The Perl version is more restricted. Marker comments should be on an isolated line and nested objects are not supported.

Application Programming in C

The C program must include chtml.h and be linked with -lchtml.

The html templates are compiled using the chtml command. Documentation on the arguments is available using the unix man command or in the file chtml.txt.

The preparsed templates are loaded into memory at runtime, manipulated, and finally freed. Or the templates may be output as C code data structures to be compiled with the C compiler and statically or dynamically linked into the program.

Fundamental to using these templates is the concept of tabular sequenced data. There are many ways to represent tables and sequences and do output in C programs. If your programming environment already has suitable libraries and/or C++ classes for handling this then you can use those ways, the chtml api is flexible enough to handle them. But if you don't have strong support for strings and tables in your environment you should seriously consider using the string item library provided by the stritem.c module, the functions for which are also declared in the chtml.h file.

In general using chtml for output allows a program to do all "data/business/logical computations" before getting into the actual final output phase.

Here are three advantages of doing all the data computations needed for output ahead of the actual output phase:

It allows for an efficient size_scan procedure to accurately compute the Content-length that will be output without having to allocate a buffer big enough to hold the entire content.
It provides for all possible error conditions to present themselves before actual output is generated, allowing for better presentation of errors to the user instead of having them embedded in partially constructed screens. For example, the error or logic conditions can be used to decide what template to use in generating output.
There is a potential for better locality of reference in an ultimately optimized program, especially one utilizing copying/compacting storage management or templates compiled by the C compiler.

On the other hand, if you do not want to do all the data computations before going to the output phase, for example, because you are implementing a time-consuming operation and you want to send the browser partial results; there are two obvious implementation possibilities:

Define your main template with sequential sections, such as {start, middle, end} and use chtml_object to find the sections and output them as required.
Program using POSIX pthread_create, with one or more threads computing/fetching data and another thread doing the actual output. Symbols embedded in chtml templates should evaluate to streams connected to the various threads. Object repeat counts should be streams returning a sequence of booleans (false meaning stop) rather than returning an integer count.

struct chtml *chtml_load(char *filename,char *path)

Searches path (NULL defaults to ".:/usr/local/lib/html") for filename and calls chtml_fload.

struct chtml *chtml_fload(FILE *f)

Reads data from the open file f, recursively allocating and returning the chtml structure. A reasonable optimization in an application would be to fopen a template file and then use fstat to determine if it has been modified since it was last loaded with chtml_fload. For ultimate efficiency an application might copy the struct chtml to a memory mapped file so that it can be shared by other programs. Note that the chtml compiler can also output a file of C code, which may be compiled and linked (statically or dynamically) with an application. This is another way of avoiding the overhead of chtml_fload at runtime. This is available when the chtml command is given the argument :p=c to specify c output format. The output file defines a procedure which returns a struct chtml *object.

void chtml_free(struct chtml *obj)

Recursively frees all data in the structure obj.

void chtml_debug_print(struct chtml *p,FILE *f)

Prints, recursively, the chtml structure for debugging purposes.

struct chtml *chtml_object(char *name,struct chtml *obj)

Finds the first object with the given name inside obj.

void chtml_do_write(char *(*get)(char *,void *), void *tbl, struct chtml *p, void (*fcn)(char *,void *), void *arg)

This does output by calling fcn on strings from the chunks of html object p, and values returned by the get procedure, as in chtml_size_write.

long chtml_size_write(char *(*get)(char *,void *),void *arg,struct chtml *p)

Compute the size of output, returning -1 if there is some structural problem with the chunks of html object p. The get procedure is called on all symbols encountered, receiving the name and the callback argument arg. A symbol evaluating to a repeat count should return the decimal string representation of the number.

long chtml_size_write_limit(char *(*get)(char *,void *),void *arg,struct chtml *p,long lim)

Like chtml_size_write but will return -1 immediately if the total size of data measured exceeds the limit. Useful in situations where you need to protect against problems caused by absurd object repeat counts.
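The division of labor between a get callback and an output callback can be illustrated without the library. The C sketch below walks a tiny template twice with the same traversal, first with a callback that only adds up lengths and then with one that stores the text. Everything in it is a stand-in: struct piece is not struct chtml, and emit, lookup, count_bytes and append are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A flattened stand-in for a parsed template: literal text pieces
   interleaved with symbols to be looked up at output time. */
struct piece { int is_symbol; const char *text; };

typedef const char *(*get_fn)(const char *name, void *tbl);
typedef void (*out_fn)(const char *s, void *arg);

/* Walk the template, handing each resulting string to fcn. */
void emit(const struct piece *p, size_t n, get_fn get, void *tbl,
          out_fn fcn, void *arg)
{
    size_t i;

    assert(p != NULL && get != NULL && fcn != NULL);
    for (i = 0; i < n; i++) {
        const char *s = p[i].is_symbol ? get(p[i].text, tbl) : p[i].text;
        if (s != NULL) fcn(s, arg);
    }
}

/* get callback: tbl is a NULL-terminated array of name, value pairs. */
const char *lookup(const char *name, void *tbl)
{
    const char **kv = tbl;

    for (; kv[0] != NULL; kv += 2)
        if (strcmp(kv[0], name) == 0) return kv[1];
    return NULL;
}

/* Output callback for the sizing pass: nothing is stored. */
void count_bytes(const char *s, void *arg)
{
    *(long *)arg += (long)strlen(s);
}

/* Output callback for the writing pass: append to a bounded buffer. */
struct outbuf { char *data; size_t cap, used; };

void append(const char *s, void *arg)
{
    struct outbuf *b = arg;
    size_t n = strlen(s);

    if (b->used + n < b->cap) {
        memcpy(b->data + b->used, s, n + 1);
        b->used += n;
    }
}
```

In a CGI program the first pass would supply the number for the Content-length header and the second pass would typically use an output callback that writes to stdout, so the full page never has to be held in a buffer.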

char *chtml_url_encode(char *str,char *end)

Provides the "foo bar" => "foo+bar" conversion from str to end, allocating a new string. If end is NULL it defaults to &str[strlen(str)].

char *chtml_url_decode(char *str,char *end)

The opposite conversion from url_encode. Returns NULL for illegal input formats.
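For readers unfamiliar with the encoding, the following C sketch shows the usual form-style rules: a space becomes "+", and other special characters become a percent sign followed by two hex digits. It is not the source of chtml_url_encode. The name url_encode_sketch is hypothetical, and the set of characters passed through unchanged (letters, digits, "-", "_" and ".") is an assumption, since the paper only documents the space conversion.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated, url encoded copy of str, or NULL if
   memory cannot be allocated. The caller frees the result. */
char *url_encode_sketch(const char *str)
{
    static const char hex[] = "0123456789ABCDEF";
    char *out, *w;

    assert(str != NULL);
    out = malloc(3 * strlen(str) + 1);   /* worst case: all %XX */
    if (out == NULL) return NULL;
    for (w = out; *str != '\0'; str++) {
        unsigned char c = (unsigned char)*str;

        if (c == ' ') {
            *w++ = '+';
        } else if (isalnum(c) || strchr("-_.", c) != NULL) {
            *w++ = (char)c;
        } else {
            *w++ = '%';
            *w++ = hex[c >> 4];
            *w++ = hex[c & 15];
        }
    }
    *w = '\0';
    return out;
}
```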

char *chtml_html_encode(char *str,char *end)

Does the "<foo> &" => "&lt;foo&gt; &amp;" conversion which is required when data is to be inserted into an html document, such as the VALUE field of an INPUT, or into an HREF. Doublequote is converted into &quot; but spaces are left untouched.
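The conversion is small enough to show in full. The C sketch below replaces the four characters that matter inside html text and attribute values and copies everything else unchanged. It is an illustration only; html_encode_sketch is a hypothetical name and not the library routine.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of str with <, >, & and doublequote
   replaced by character entities, or NULL if allocation fails.
   Spaces are left untouched. The caller frees the result. */
char *html_encode_sketch(const char *str)
{
    char *out, *w;

    assert(str != NULL);
    out = malloc(6 * strlen(str) + 1);   /* "&quot;" is the longest */
    if (out == NULL) return NULL;
    for (w = out; *str != '\0'; str++) {
        const char *rep = NULL;

        switch (*str) {
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        case '&': rep = "&amp;";  break;
        case '"': rep = "&quot;"; break;
        default:  break;
        }
        if (rep != NULL) {
            strcpy(w, rep);
            w += strlen(rep);
        } else {
            *w++ = *str;
        }
    }
    *w = '\0';
    return out;
}
```

Values placed into the table for substitution would normally pass through a routine of this kind first, so that user-supplied text cannot break out of the markup that surrounds the key string.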

char *chtml_url_encode_cb(char *str,char *end, void *(*fcn)(size_t len,void *),void *cb_arg)

Like chtml_url_encode, but the fcn serves as an allocation callback.

char *chtml_url_decode_cb(char *str,char *end, void *(*fcn)(size_t len,void *cb_arg),void *cb_arg)

Like chtml_url_decode but with allocator callback.

char *chtml_html_encode_cb(char *str,char *end, void *(*fcn)(size_t len,void *),void *cb_arg)

Like chtml_html_encode but with allocator callback.

CHTML_STRITEM chtml_stritem_init(char *name, CHTML_STRITEM *location,CHTML_STRITEM *table_location)

Initializes a string item location, usually a global variable. The item is linked up with the table_location.

void chtml_stritem_insert(CHTML_STRITEM obj,char *start,char *end)

Inserts a copy of the string from start to end as the last element of the sequence represented by the stritem object obj.

long chtml_stritem_len(CHTML_STRITEM obj)

Returns the number of items in the sequence obj.

void chtml_stritem_free(CHTML_STRITEM *table_location)

To be used on the table_location given as the argument in the calls to the init procedure. Frees all storage and resets the locations to NULL.

void chtml_stritem_rewind(CHTML_STRITEM)

Resets all lists to the beginning. Important when a chtml_size_write precedes the actual output produced by chtml_do_write.

char *chtml_stritem_get(char *key,CHTML_STRITEM table)

Looks up the key in the table and returns the first element of the corresponding sequence. The sequence is then advanced to the next element if one exists. If key starts with a "." then the rest of the key after the "." is used in the lookup and the sequence is not advanced.
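The lookup rule, including the "." prefix and the need to rewind between a sizing pass and a writing pass, can be sketched in C with a simple array-backed sequence. The struct seq type and the functions seq_get and seq_rewind are hypothetical stand-ins, not the CHTML_STRITEM implementation from stritem.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One named sequence of strings with a current position. */
struct seq {
    const char *key;
    const char **items;
    size_t len;
    size_t pos;
};

/* Return the current element stored under key and step to the next
   one if there is one. A key written with a leading "." reads the
   current element without stepping. Returns NULL for unknown keys
   and for empty sequences. */
const char *seq_get(struct seq *tbl, size_t ntbl, const char *key)
{
    int advance = 1;
    size_t i;

    assert(tbl != NULL && key != NULL);
    if (key[0] == '.') {
        key++;
        advance = 0;
    }
    for (i = 0; i < ntbl; i++) {
        struct seq *s = &tbl[i];
        const char *v;

        if (strcmp(s->key, key) != 0) continue;
        if (s->len == 0) return NULL;
        v = s->items[s->pos];
        if (advance && s->pos + 1 < s->len) s->pos++;
        return v;
    }
    return NULL;
}

/* Reset every sequence to its first element. */
void seq_rewind(struct seq *tbl, size_t ntbl)
{
    size_t i;

    assert(tbl != NULL);
    for (i = 0; i < ntbl; i++) tbl[i].pos = 0;
}
```

Because a plain lookup consumes an element, any traversal that is run twice over the same table, such as sizing followed by output, has to start from a rewound table or the second pass will see different values.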

CHTML_STRITEM chtml_stritem_getem(char *,CHTML_STRITEM)

Given a key and a table returns the string item structure.

void chtml_stritem_kinsert(CHTML_STRITEM *tbl,char *key,char *start,char *end)

This inserts into a table using a key, initializing the table if needed. Takes the place of using chtml_stritem_insert on a variable initialized using chtml_stritem_init. Start out with tbl = NULL. Then pass &tbl to chtml_stritem_kinsert.

void chtml_stritem_kupdate(CHTML_STRITEM *,char *,char *,char *)

Same as kinsert but updates the current string value instead of inserting a new value at the end of the list.

void chtml_stritem_kinsert_qs(CHTML_STRITEM *table,char *content)

The content is a query string or url encoded form content.

char *chtml_stritem_get_qs(CHTML_STRITEM table,char *buffer,size_t buflen, ...)

The ... argument is NULL or a list of keywords terminating in NULL. Returns the query string encoding of the values from the table. Returns a newly allocated string of up to size buflen or stores data into the buffer. If no keywords are specified then the entire table is encoded. If the buffer is too small then the function returns NULL.

void chtml_stritem_kinsertl(CHTML_STRITEM *,char *,long)

Same as kinsert but convenient for inserting a long value. Note that we would also have chtml_stritem_kprintf, but there is no vsnprintf generally available in unix, only the unsafe vsprintf.

int chtml_stritem_available(char *key,CHTML_STRITEM)

Returns 1 if key has a value in the table, otherwise 0.

int chtml_stritem_locate(char *value,CHTML_STRITEM list)

Gives the index of the value in the list, otherwise -1.

char *chtml_stritem_eval(char *key,CHTML_STRITEM table)

This is an alternative to chtml_stritem_get in calls to chtml_size_write or chtml_do_write. It allows the key to take on an expression syntax. The supported operators are:

!key to negate the value of the key, as a boolean.
'key to quote the key, using it as the value.
?key,value0,value1,value2... Evaluate key, then select (0-based) from the rest of the list, looking up the value in the table.
=key1,key2,value1,value2 If the value of key1 is equal to key2 then use value1, otherwise use value2.
|a,b,c is a logical or. Good for flags.
&a,b,c is a logical and, also good for flags.
@default,key,default_string if key is bound then return the value of the key otherwise return the default_string, which is not looked up in the table.
@debug_print,NNN return a string NNN long (default 1024) containing data like the chtml_debug_print procedure.
@url,value Lookup the value and url encode it.
@html,value Lookup the value and html encode it.
@length,key look up key and return the length of the list.
@dlsym,symbol if you compiled the chtml sources with -DUSE_DLSYM then the symbol is looked up in the dynamic runtime environment and invoked as a callback.

Obviously it is possible to extend this in arbitrary ways, but the negation and the quote are the most useful in practice. You might want to copy the source to chtml_stritem_eval and rename it for use in specific application purposes.
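As an illustration of how such an expression syntax can sit in front of a plain lookup, the C sketch below handles just two of the operators, the quote and @default, and passes every other key to the lookup unchanged. It is not the source of chtml_stritem_eval. The names eval_sketch and lookup are hypothetical, and the table is a simple array of name and value pairs rather than a CHTML_STRITEM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *(*get_fn)(const char *name, void *tbl);

/* Plain lookup: tbl is a NULL-terminated array of name, value pairs. */
const char *lookup(const char *name, void *tbl)
{
    const char **kv = tbl;

    for (; kv[0] != NULL; kv += 2)
        if (strcmp(kv[0], name) == 0) return kv[1];
    return NULL;
}

/* Evaluate expr, supporting only:
     'text                  the text itself, with no lookup
     @default,key,string    the value of key if bound, else string
   Any other expr is treated as an ordinary key. */
const char *eval_sketch(const char *expr, get_fn get, void *tbl)
{
    static const char DEFAULT_OP[] = "@default,";
    char key[64];
    const char *comma, *v;
    size_t n;

    assert(expr != NULL && get != NULL);
    if (expr[0] == '\'')
        return expr + 1;
    if (strncmp(expr, DEFAULT_OP, sizeof DEFAULT_OP - 1) == 0) {
        expr += sizeof DEFAULT_OP - 1;
        comma = strchr(expr, ',');
        if (comma == NULL) return get(expr, tbl);
        n = (size_t)(comma - expr);
        if (n >= sizeof key) return NULL;
        memcpy(key, expr, n);
        key[n] = '\0';
        v = get(key, tbl);
        /* The default string is returned as is, never looked up. */
        return v != NULL ? v : comma + 1;
    }
    return get(expr, tbl);
}
```

Since the operator is part of the external name in the CHTML-INTERFACE comment, decisions such as a default value stay in the template and the program only has to supply the raw bindings.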

In common applications we have found these idioms to be useful:

 <!--CHTML-INTERFACE
?COLOR,GREEN,RED,BLUE :: __COLOR__
=.ITEM,.SELECTED_ITEM,SELECTED, :: xxxSELECTEDxxx
'0 :: __DEBUG__ :: Set to 1 for debug output 0 otherwise.
@debug_print,3000 :: __debug_print__ 
-->
<!--BEGIN-REPEATING-OBJECT-__DEBUG__-->
<!--
__debug_print__
-->
<!--END-REPEATING-OBJECT-->

void chtml_stritem_copy(CHTML_STRITEM *target_table,CHTML_STRITEM source_table,char *prefix)

Copies the source_table structure into the target_table value. The prefix string is added to the front of all keys in the source_table.

void chtml_stritem_kinsertlr(CHTML_STRITEM *table,char *k,long start,long end)

Inserts a sequence of integers from start to end inclusive.

long chtml_stritem_available_len(char *key,CHTML_STRITEM table)

Returns the length of the currently available list of items stored under key in the table.

Examples

The source distribution includes a pair of moderately complex examples in database programming. This paper includes a simple example of producing a dynamic list of user home pages, utilizing a unix function getpwent for iterating through the user authentication file. The example is presented in two parts:

the chtml template.
the scheme code for the CGI script.

Obtain the source distribution for examples in C and Perl.

homes.html

<!doctype html public "-//IETF//DTD HTML//EN//2.0">

<!-- name:    homes.html
     purpose: example use of Chunks of HTML.
     author:  george j. carrette
     $Id: chtml.html,v 1.40 1998/06/19 16:28:38 gjc Exp $
-->

<!--CHTML-INTERFACE
 sitename   ::__sitename__  :: name of web site.
 usercount  ::__usercount__ :: number of users to list.
 username   ::__username__  :: a list of user names.
 fullname   ::__fullname__  :: a list of full names.
 urlname    ::__urlname__   :: a list of urls.
 querytime  ::__querytime__ :: for performance measurements.
-->

<html>
<head><title>Home Pages on __sitename__</title></head>
<body>

<CENTER><H1>Home Pages on __sitename__</H1></CENTER>

<P><CENTER><TABLE BORDER=1>
<TR><TH ALIGN=LEFT>username</TH>
    <TH ALIGN=LEFT>Full Name</TH></TR>
<!--BEGIN-REPEATING-OBJECT-__usercount__-->
<TR>
<TD><A href="__urlname__">__username__</A></TD>
<TD>__fullname__</TD>
</TR>
<!--END-REPEATING-OBJECT-->
</TABLE></CENTER>

<!-- query took __querytime__ seconds cpu time. -->

</BODY>
</HTML>

homes-scm.cgi

#!/usr/local/bin/siod -v0,-m3 -*-mode:lisp-*-

;; name:    homes-scm.cgi
;; purpose: illustrate chunks of html cgi application
;; author:  george j. carrette
;; $Id: chtml.html,v 1.40 1998/06/19 16:28:38 gjc Exp $

(require'chtml.scm)

(define (get-homes)
  (let ((item nil)
	(homes nil)
	(gecos nil))
    (while (set! item (getpwent))
      (if (not (access-problem? (string-append (cdr (assq 'dir item))
					       "/public_html")
				"r"))
	  (begin (set! gecos (cdr (assq 'gecos item)))
		 (set! homes (cons (list (cdr (assq 'name item))
					 (substring
					  gecos
					  0
					  (string-search "," gecos)))
				   homes)))))
    (qsort homes string-lessp car)))

(define (main)
  (let ((h (cons-array 10))
	(l (get-homes))
	(form (load-chtml "homes.html")))
    (hset h 'usercount (length l))
    (hset h 'username (mapcar car l))
    (hset h 'fullname (mapcar cadr l))
    (hset h 'urlname (mapcar (lambda (x)
			       (string-append "/~" (car x)))
			     l))
    (hset h 'querytime (car (runtime)))
    (hset h 'sitename (getenv "SERVER_NAME"))
    (writes nil "Content-type: text/html\n\n")
    (write-chtml nil h form)))

CHTML Link Example

The following commands parse three files, then link the first file against objects that are defined in the two other files, then display the result using the chtmlt command. Object linking is a matter of resolving references to interface defined symbols into objects that are defined elsewhere. You can think of it as a kind of partial evaluation. Some of the substitutions that could be made at runtime during the interpretation of a template are instead made earlier, during a linking phase. When you run the example note that linking is recursive.

chtml link.html        
chtml link1.html        
chtml link2.html        
chtml :action=link link.html-bin link1.html-bin link2.html-bin 
chtmlt link.html-bin-bin

link.html


<!--CHTML-INTERFACE
IDIOM1::__IDIOM1__
IDIOM2::__IDIOM2__
TITLE::__TITLE__
-->
<html>
<head>
<TITLE>Example link technique</TITLE>
</HEAD>
<BODY>

<H1>__TITLE__</H1>

<P>Here will pull in IDIOM1: __IDIOM1__

<P>Here will pull in IDIOM2: __IDIOM2__

<P>That is all.

</BODY>
</HTML>

link1.html


<!--CHTML-INTERFACE
IDIOM1::$$IDIOM1$$ :: we define this
IDIOM3::$$IDIOM3$$ :: we reference this
X::__X__
Y::__Y__
-->
<html>
<head>
<TITLE>Define Some Idioms</TITLE>
</HEAD>
<BODY>

<H1>Define Some Idioms</H1>

<!--BEGIN-OBJECT-$$IDIOM1$$-->
[Actually, we expand into references to __X__ and __Y__
and also expect IDIOM3 = $$IDIOM3$$ to be available.]
<!--END-OBJECT-->

</BODY>
</HTML>

link2.html


<!--CHTML-INTERFACE
IDIOM2::$$IDIOM2$$ :: we define this
IDIOM3::$$IDIOM3$$ :: we define this too
A::__A__
B::__B__
-->
<html>
<head>
<TITLE>Define Some Idioms</TITLE>
</HEAD>
<BODY>

<H1>Define Some Idioms</H1>
<P>

<!--BEGIN-OBJECT-$$IDIOM2$$-->
[Idiom 2 expands into references to __A__ and __B__]
<!--END-OBJECT-->

<P>
<!--BEGIN-OBJECT-$$IDIOM3$$-->
[Idiom 3 is just this chunk of text]
<!--END-OBJECT-->

</BODY>
</HTML>

Credits

The following employees of the MCI/News Corp. Internet Venture were responsible for the Perl implementation and first use of the subject matter of this paper:

Rob Polansky (string-substitution, CHTML-INTERFACE syntax).
Darren Dupre (repeating object implementation).
Mark Lavi (first non-implementor user).

The author would also like to thank Joan O'Brien and Laird Popkin for their helpful comments.

The C implementation of the CHTML api saw first use at Information Access Company, where Evan Morton, Lynne Dao, Tom Vancor, Leonid Gernovski, and Tim Strickland provided particularly good suggestions from the point of view of serious application writers under considerable time pressure to get new products out the door. The linking and inclusion features responded to some of their needs as application templates became larger and more complex over time.

The Generic Next Script

The generic next script, next.cgi, can be used to link up a storyboard before a real application is available. It takes all cgi environment values plus form input values and query string values and makes them available as substitution sequences for the template file, which it obtains from the part of the url following its own cgi script name. For example (this will only work when the web server from which you have fetched this document has next.cgi enabled as a cgi script in the same directory, so it won't work on people.delphi.com):

<FORM action="next.cgi/homes.html-bin?fullname=somebody" METHOD="POST">
<INPUT TYPE="HIDDEN" NAME="sitename" VALUE="__SITE__">
<INPUT TYPE="HIDDEN" NAME="usercount" VALUE="2">
User1: <INPUT TYPE=TEXT NAME="username" VALUE="user1">
User2: <INPUT TYPE=TEXT NAME="username" VALUE="user2">
<INPUT TYPE="SUBMIT" VALUE="Click Here">
</FORM>

With a bit of javascript and java you could get a lot of real application work done using next.cgi, even if that and perhaps a simple mailer are the only cgi scripts your web hosting service makes available to you.

The next.cgi script can also be used to interactively test the output of a template given a set of inputs. For example, this tests the sgml validity of a result:

#!/bin/sh
QUERY_STRING="sitename=TEST_SITENAME&usercount=3&\
username=u1&username=u2&username=u3&fullname=f1&fullname=f2&\
fullname=f3&querytime=100"
REQUEST_METHOD="GET"
PATH_INFO="/homes.html-bin"
NEXT_NOTYPE=1
export QUERY_STRING REQUEST_METHOD PATH_INFO NEXT_NOTYPE
./next.cgi | nsgmls -s

Todo: Release netscape server nsapi and apache versions of next.

Also, document the other scripts provided with this distribution: include, sql_sybase.cgi, cookie.cgi, and sp_help.cgi.

Limitations and Future Directions, Support

The CHTML technique tends to be oriented towards development environments involving hand-edited, or at least hand-tweaked, HTML. With some WYSIWYG and/or structured editors one may find it difficult to insert the interface and object delineation comments exactly where needed, for example when a row in a table is to be repeated. In fact I have found that both Microsoft Frontpage and Netscape Navigator Gold like to move html comments around at will when double nesting of repeated objects is present on input. What does MSWORD do? Hotmetal is fine. Creating pseudo-markup or special tags helps a little.

If the runtime overhead of page creation is a concern, then simply use the techniques in make files and batch processes instead of using them in dynamic server applications.

It is important to use subroutines to hide the awkwardness of establishing sequences for substitution into the default selection values for <SELECT> style and other inputs, or to utilize the "?" construct with chtml_stritem_eval.

Finally, the support for nested substitution of chunks of html is not fully developed.

This is unsupported software but if you have ideas or comments feel free to send email to gjc@delphi.com.

Source Code Availability

The sources include library support for Perl, Scheme, and C languages, plus documentation and examples.

The distribution is in the form of a compressed tar file: chtml.tgz. You must have the gnu gunzip to decompress and some unix compatible tar utility to extract the individual files. Unix sources for gzip and tar may be obtained from gatekeeper.dec.com under pub/GNU.

An INFO-ZIP archive of sources chtml.zip is provided too.

To use the template compiler for the C version you need SIOD. The chtml.sh script sometimes helps people deal with the inflexibilities of Unix directory structures for installing SIOD in nonstandard ways.

The file chtml_i386.zip contains binaries which may be used with windows SIOD dll's. Note that the command to create the chtml.exe is:

> csiod :o=chtml.exe chtml-cmp.scm chtml.scm

The existence of the binaries in chtml_i386.zip is mostly just proof of the ability to run this stuff under windows with the VC++ compiler, because anyone using it with the VC++ compiler would obviously be able to recompile from sources at will, and the use of the perl and scheme versions is a portable no-brainer. Note that the WIN32 port has shown some problems with the code, particularly when it is run in debug mode with the standard VC++ debug assertions.

Master location for all of these is under http://people.delphi.com/gjc.

Copyright


ENHANCEMENTS COPYRIGHT (c) 1997 BY INFORMATION ACCESS COMPANY.
ALL RIGHTS RESERVED. 

COPYRIGHT (c) 1995-1996 BY NEWS INTERNET SERVICES
ALL RIGHTS RESERVED.

Permission to use, copy, modify, distribute and sell this software and
its documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both the copyright notice and this permission notice appear in
supporting documentation, and that the name of News Internet Services
not be used in advertising or publicity pertaining to distribution of
the software without specific, written prior permission.

THIS SOFTWARE IS MADE AVAILABLE WITHOUT CHARGE, AS-IS.  NEWS 
INTERNET SERVICES DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS 
SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND 
FITNESS, IN NO EVENT SHALL NEWS INTERNET BE LIABLE FOR ANY SPECIAL, 
INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER 
RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION
OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Contents

Introduction

The Authoring Process

CHTML INTERFACE

Objects Internal to a Document

Block diagram

Application Server Programming in Perl.

Application Server Programming in Scheme

Application Programming in C

struct chtml *chtml_load(char *filename,char *path)

struct chtml *chtml_fload(FILE *f)

void chtml_free(struct chtml *obj)

void chtml_debug_print(struct chtml *p,FILE *f)

struct chtml *chtml_object(char *name,struct chtml *obj)

void chtml_do_write(char *(*get)(char *,void *), void *tbl, struct chtml *p, void (*fcn)(char *,void *), void *arg)

long chtml_size_write(char *(*get)(char *,void *),void *arg,struct chtml *p)

long chtml_size_write_limit(char *(*get)(char *,void *),void *arg,struct chtml *p,long lim)

char *chtml_url_encode(char *str,char *end)

char *chtml_url_decode(char *str,char *end)

char *chtml_html_encode(char *str,char *end)

char *chtml_url_encode_cb(char *str,char *end, void *(*fcn)(size_t len,void *),void *cb_arg)

char *chtml_url_decode_cb(char *str,char *end, void *(*fcn)(size_t len,void *cb_arg),void *cb_arg)

char *chtml_html_encode_cb(char *str,char *end, void *(*fcn)(size_t len,void *),void *cb_arg)

CHTML_STRITEM chtml_stritem_init(char *name, CHTML_STRITEM *location,CHTML_STRITEM *table_location)

void chtml_stritem_insert(CHTML_STRITEM obj,char *start,char *end)