I finally bit the bullet! I swear I tried! I pestered people! I failed. That is, I failed to like any of the "major" HTML-producing libraries "out there", and so I rolled my own. In the process I (re)learned a few things, and I believe I made good use of parts of the language that are usually overlooked.
My problems with the other libraries
This section must start with an apology.
All the libraries I tried are very fine and sophisticated pieces of software that do solve problems. Alas, being a rotten Lisper, I found that I "needed something different" (read: "something I wrote"). Therefore, the comments you'll read below are not meant as general statements about these libraries, but only as testimony of my whims.
The libraries I looked at are CL-HTTP, CL-WHO, and variations of TFEB's htout and Franz's htmlgen, especially the XHTML-GENERATOR version that comes with CXML.
As I said, my idiosyncrasies about the whole business of CL programming found problems with each of these otherwise fine libraries. More specifically, I found CL-HTTP too heavy to use just to generate HTML. One gripe I had with CL-WHO is that it does not handle pretty printing of HTML well (indentation is off in "recursive" use); more or less the same can be said of htout and htmlgen. CXML's XHTML-GENERATOR is essentially a "round-trip" utility, and it makes your life quite unhappy if you are trying to use simple HTML entities like (surprise!) &lambda; and &Lambda;.
CL-WHO, htout, htmlgen and XHTML-GENERATOR all take the approach summarized as: I will compile a SExp representing "HTML" and will generate, inline, a set of specialized writing calls (yes: mostly WRITE and WRITE-STRING). (Cf. the examples in the CL-WHO documentation.)
There is nothing wrong with this approach, but it makes the resulting library and overall implementation more monolithic and it does not leverage some of the bells and whistles that you have available in CL.
Thus I rolled my own (and I called it XHTMΛ).
Yet another HTML generation library
My approach to HTML (or XML) generation is the following:
- HTML (or XML!) elements need not be "lists" or "conses"; they can be bona fide objects, i.e., structures.
- print-object and, above all, the pretty printer are my friends.
- *print-pretty*, *print-readably*, etc., are more than useful.
There are a few consequences of these choices, and they should be spelled out. Before doing that, let's see what happens in the basic case.
The basic definition in the implementation of XHTMΛ is the representation of an HTML (or XML!) "element". It is very simple, and it does accommodate the HTML5 bits and pieces.
(defstruct (element (:constructor %element))
  (tag nil :type symbol)
  (attributes () :type list)
  (content () :type list))
tag is ... the tag, attributes is a p-list, and content is a possibly empty list of other elements (or, as the examples below show, strings).
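As a quick illustration of the structure, here is a minimal sketch that builds a small tree directly with the low-level %element constructor and inspects it with the accessors that defstruct generates. The definition is repeated so that the snippet stands alone. The helper count-elements and the variable *sample* are hypothetical names that are not part of XHTMΛ, and real code would normally go through the tag macros rather than call %element by hand.

```lisp
;; Same structure definition as above, repeated so the snippet is self-contained.
(defstruct (element (:constructor %element))
  (tag nil :type symbol)
  (attributes () :type list)
  (content () :type list))

;; Hypothetical helper (not part of XHTMΛ): count the ELEMENT nodes in a tree.
(defun count-elements (node)
  (if (element-p node)
      (1+ (reduce #'+ (mapcar #'count-elements (element-content node))))
      0))                               ; strings and other leaves do not count

;; A <p> element with one attribute, one string and one empty child element.
(defparameter *sample*
  (%element :tag 'p
            :attributes '(:class "intro")
            :content (list "Some text here"
                           (%element :tag 'br))))
```

With these definitions, (element-tag *sample*) is the symbol P, the attribute value can be fetched with getf on the p-list, and (count-elements *sample*) counts the two element nodes while skipping the string.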
"Printing" an element
Let's forget about the constructor for a minute and concentrate instead on the element "printing" process. The main entry point is a print-object method.
(defmethod print-object ((e element) (s stream))
  (let ((tag (element-tag e))
        (attributes (element-attributes e))
        (content (element-content e)))
    (cond (*print-pretty*
           (pprint-xhtml s e))
          (*print-readably*
           (format s "#S(~S :TAG ~S :ATTRIBUTES ~S :CONTENT ~S)"
                   (type-of e)
                   tag
                   attributes
                   content))
          (t
           ;; Format string showing-off!!!!
           (format s "<~A~{ ~A=\"~A\"~}~:[ />~;>~:*~{~S~^ ~}</~3:*~A>~]"
                   (string-downcase tag)
                   attributes
                   content)))))
The method is rather straightforward, apart from the last format string, which does many things at once: (1) it writes the attributes; (2) it checks whether there is content and, if not, closes the tag, otherwise it backs up to print the content; and (3) it finally backs up again to the tag to print the proper closing tag. Note that, in order to print the element properly and nicely, the function pprint-xhtml is called whenever *print-pretty* is non-NIL.
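To see the three steps of that format string in isolation, the following sketch applies the same control string to plain arguments, outside of any print-object method. The names *element-control-string* and render-flat are hypothetical and exist only for this illustration; the tag is passed as an already downcased string.

```lisp
;; The control string from the non-pretty branch above, applied to plain arguments.
(defparameter *element-control-string*  ; hypothetical name, for illustration only
  "<~A~{ ~A=\"~A\"~}~:[ />~;>~:*~{~S~^ ~}</~3:*~A>~]")

(defun render-flat (tag attributes content)
  "Hypothetical helper: return the one-line rendering as a string."
  (format nil *element-control-string* tag attributes content))

;; No content: ~:[ selects the first clause and the tag is closed at once.
(render-flat "br" '() '())
;; => "<br />"

;; With content: ~:* backs up one argument to print the content list,
;; then ~3:* backs up three arguments to reach the tag again.
(render-flat "p" '("class" "note") '("hi" 42))
;; => "<p class=\"note\">\"hi\" 42</p>"
```

Note that the content is printed with ~S, which is why the string "hi" keeps its double quotes here, just as in the return values shown in the transcripts further down.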
Using the pretty printer
It may be just me, but I believe that the pretty printer is an under-used part of the CL standard. Therefore, I set out to use it heavily in order to get "properly indented" (meaning, the way I like it) (X)HTML. The function pprint-xhtml does that.
(defun pprint-xhtml (s xhtml-element)
  (declare (type stream s)
           (type element xhtml-element))
  (let ((tag (string-downcase (element-tag xhtml-element)))
        (attrs (element-attributes xhtml-element))
        (content (element-content xhtml-element)))
    (pprint-logical-block (s content)   ; (1) outer block: the whole element
      (pprint-logical-block (s content) ; (2) inner block: opening tag and content
        (format s "<~A~@<~{~^ ~A=\"~S\"~^~_~}~:>" tag attrs) ; (3) attributes block
        (when content
          (write-char #\> s)
          (pprint-newline :mandatory s)
          (format s "~{~4,0:T ~:W~_~}" content)))
      (if content
          (format s "~0I</~A>" tag)
          (format s " />")))))
The function requires a few explanations (of course, if you are a "pretty printer black-belt" this may be a bit boring). First of all, a display of what I want to obtain.
<body style="color: red">
    <p>
        Some text here
        <ul>
            <li>
                Line 1
            </li>
        </ul>
    </p>
</body>
This indentation may not be the best possible and there are some pitfalls, but it is better than what you get with the other libraries. But how does the function pprint-xhtml achieve this result while interacting with the pretty printing machinery?
The function pprint-xhtml uses three logical blocks: two for the element and a third for the attributes. The logical block for the attributes is introduced in the format string by the ~@< ... ~:> directive. Note also the conditional newline ~_ in the list iteration construct ~{ ... ~}. The other two pprint-logical-block forms establish the fence for the whole element and for its "inside". The outer pprint-logical-block serves essentially to print the closing tag (if needed) correctly indented. The inner pprint-logical-block just serves to provide the correct indentation for the tag and the actual element content. The pprint-newline call and the indentation directive in the format string do the rest.
Once you wrap your head around it (it did take me some time!) it is very straightforward, and very powerful.
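For readers who have never touched this machinery, here is a much reduced sketch of the same ingredients: one logical block, mandatory newlines, and a change of indentation before the closing tag. It is not the XHTMΛ code. It uses the functions pprint-indent and pprint-newline instead of the equivalent format directives, it handles only a flat list of strings, and print-tagged-block is a hypothetical name.

```lisp
(defun print-tagged-block (stream tag lines)
  "Hypothetical helper: print LINES (strings) between <TAG> and </TAG>,
one per line, indented by four columns relative to the opening tag."
  (let ((*print-pretty* t))             ; logical blocks only act when this is true
    (pprint-logical-block (stream nil)  ; NIL: no list is traversed with PPRINT-POP
      (format stream "<~A>" tag)
      ;; Indentation changes take effect at the next line break.
      (pprint-indent :block 4 stream)
      (dolist (line lines)
        (pprint-newline :mandatory stream)
        (write-string line stream))
      ;; Back to the column where the block started, for the closing tag.
      (pprint-indent :block 0 stream)
      (pprint-newline :mandatory stream)
      (format stream "</~A>" tag))))

;; Usage: collect the output in a string.
(with-output-to-string (s)
  (print-tagged-block s "ul" '("one" "two")))
```

The point of the sketch is that the indentation is expressed relative to the enclosing logical block, so the function never has to keep track of an absolute column by itself.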
Bells and Whistles
The pretty printing machinery offers more control than I have used so far. For the time being my code uses just one simple hook into the pretty printer dispatch table in order to write strings "unquoted" but, potentially, this is the machinery that can provide fancier element layout.
The actual "printing" of an element is controlled by a specialized macro, (provisionally) called with-html-syntax, which calls write with an appropriately set up :pprint-dispatch argument.
The variable *xhtml-pd* holds the modified pretty print dispatch table, which is initialized as follows (at a minimum):
(set-pprint-dispatch 'element
                     'pprint-xhtml
                     0
                     *xhtml-pd*)
(set-pprint-dispatch 'string
                     (lambda (s xhtml-string)
                       (write-string xhtml-string s))
                     0
                     *xhtml-pd*)
This is the result:
XHTMLAMBDA 29 > (with-html-syntax (*standard-output* :print-pretty t)
                    (body (:style "color: red")
                        (p ()
                           "Some text here"
                           (ul ()
                               (li () "Line 1")))))
<body style="color: red">
    <p>
        Some text here
        <ul>
            <li>
                Line 1
            </li>
        </ul>
    </p>
</body>
<body style="color: red"><p>"Some text here" <ul><li>"Line 1"</li></ul></p></body> ; This is the value returned!
XHTMLAMBDA 30 >
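The string hook can be tried on its own, without any of the XHTMΛ definitions. The sketch below is one plausible way to set up such a table: it copies the initial dispatch table, installs the same kind of string printer, and passes the table to write. The names *unquoted-strings-pd* and write-with-table are hypothetical; how *xhtml-pd* is actually created is not shown in this post.

```lisp
;; Start from a copy of the initial dispatch table so the global one is untouched.
(defparameter *unquoted-strings-pd*     ; hypothetical name
  (copy-pprint-dispatch nil))

;; Strings are written as-is, without the surrounding double quotes.
(set-pprint-dispatch 'string
                     (lambda (stream string)
                       (write-string string stream))
                     0
                     *unquoted-strings-pd*)

(defun write-with-table (object stream)
  "Hypothetical helper: pretty-print OBJECT to STREAM using the table above."
  (write object
         :stream stream
         :pretty t
         :pprint-dispatch *unquoted-strings-pd*))

;; Usage: the string comes out unquoted even though WRITE would normally escape it.
(with-output-to-string (s)
  (write-with-table "Line 1" s))
;; => "Line 1"
```

Since the dispatch table is consulted only when *print-pretty* is true, the same string printed without pretty printing keeps its quotes, which matches the difference between the printed output and the returned value in the transcript above.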
XHTMΛ Syntax
As you may have noted in the previous example, the syntax of an XHTMΛ element is
(tag attributes . content)
where each tag is implemented as a macro, which is essentially in charge of delaying the evaluation of the content, plus some other massaging, mostly flattening of the content lists. This is achieved by having each macro call a first parsing step, which generates an "intermediate" form that eventually calls the element function (see below). The following example shows a pretty standard trick:
XHTMLAMBDA 33 > (with-html-syntax (*standard-output* :print-pretty t)
                    (body (:style "color: red")
                        (p ()
                           "Some text here"
                           (ul () (loop for i below 5
                                        collect (li () (format nil "Line ~D" i)))))))
<body style="color: red">
    <p>
        Some text here
        <ul>
            <li>
                Line 0
            </li>
            <li>
                Line 1
            </li>
            <li>
                Line 2
            </li>
            <li>
                Line 3
            </li>
            <li>
                Line 4
            </li>
        </ul>
    </p>
</body>
<body style="color: red"><p>"Some text here" <ul><li>"Line 0"</li> <li>"Line 1"</li> <li>"Line 2"</li> <li>"Line 3"</li> <li>"Line 4"</li></ul></p></body>
XHTMLAMBDA 34 >
Thus XHTMΛ is unlike most other libraries, which just discriminate on the first element of a SExp, usually a keyword. XHTMΛ wants more structure, and it strives to be more easily extensible through "standard" and low-level machinery, cf. the pretty printing machinery and CLOS. As an aside, the %element constructor is there just to be called by a "factory" generic function called (you guessed it) element.
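To make the tag-macro idea concrete, here is a minimal sketch of how such macros could be written. It is an assumption about the general shape, not the actual XHTMΛ source: build-element is a hypothetical stand-in for the element factory function, and flatten-content, tag-form and *list-element* are hypothetical names as well. Only two tags are defined.

```lisp
(defstruct (element (:constructor %element))
  (tag nil :type symbol)
  (attributes () :type list)
  (content () :type list))

(defun flatten-content (items)
  "Hypothetical helper: splice nested lists in ITEMS into one flat list."
  (loop for item in items
        if (listp item)
          append (flatten-content item)
        else
          collect item))

(defun build-element (tag attributes &rest content)
  "Hypothetical stand-in for the ELEMENT factory function."
  (%element :tag tag
            :attributes attributes
            :content (flatten-content content)))

;; The helper must exist at macroexpansion time, hence the EVAL-WHEN.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun tag-form (tag attributes content)
    "Build the call a tag macro expands into; CONTENT forms are not evaluated here."
    `(build-element ',tag (list ,@attributes) ,@content)))

(defmacro ul (attributes &body content) (tag-form 'ul attributes content))
(defmacro li (attributes &body content) (tag-form 'li attributes content))

;; The LOOP returns a list of LI elements, which is flattened into the UL content.
(defparameter *list-element*
  (ul (:class "lines")
    (loop for i below 3
          collect (li () (format nil "Line ~D" i)))))
```

Because the content forms are spliced into an ordinary function call, any Lisp expression can appear among them, and the flattening step is what lets an expression return a whole list of children at once.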
Other Syntaxes and the HTMLIZE Macro
Yet there is value in the widely used alternative SExp syntax for HTML (and XML):
(tag . content)
or
((tag . attributes) . content)
In order to accommodate such syntax (and also a "keyword-based" one), XHTMΛ provides an htmlize macro, which does some more rewriting from the syntax just above (termed :compact) to the "operator-and-attributes" syntax (termed :standard).
XHTMLAMBDA 39 > (htmlize
                 ((body :style "color: red")
                  (p "Some text here"
                     (ul (loop for i below 5
                               collect (li () (format nil "Line ~D" i)))))))
<body style="color: red"><p>"Some text here" <ul><li>"Line 0"</li> <li>"Line 1"</li> <li>"Line 2"</li> <li>"Line 3"</li> <li>"Line 4"</li></ul></p></body>
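The rewriting from the :compact to the :standard syntax can be pictured as a small source-to-source function. The sketch below is only an assumption about how such a step might look, not the htmlize implementation: compact->standard is a hypothetical name, the set of tags is passed in explicitly, ordinary Lisp forms such as loop are left untouched, and no attempt is made to recognize forms that are already in the standard syntax.

```lisp
(defun compact->standard (form tags)
  "Hypothetical rewriter: turn a :compact form into the :standard syntax.
TAGS is the list of symbols to treat as element tags."
  (flet ((rewrite-all (forms)
           (mapcar (lambda (f) (compact->standard f tags)) forms)))
    (cond ((atom form) form)
          ;; ((tag . attributes) . content)
          ((and (consp (first form))
                (member (first (first form)) tags))
           (destructuring-bind ((tag . attributes) . content) form
             `(,tag ,attributes ,@(rewrite-all content))))
          ;; (tag . content), with no attributes
          ((member (first form) tags)
           `(,(first form) () ,@(rewrite-all (rest form))))
          ;; Anything else is ordinary Lisp code and is left alone.
          (t form))))

;; Usage: the attribute list moves out of the head and becomes the second element.
(compact->standard '((body :style "color: red")
                     (p "Some text here"))
                   '(body p ul li))
;; => (BODY (:STYLE "color: red") (P NIL "Some text here"))
```

A macro in the style of htmlize could then apply a function like this to its argument at macroexpansion time and return the rewritten form for evaluation.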
Availability
The XHTMΛ library will be available "very soon"™ in common-lisp.net. Stay tuned!