F# Advent Day of Code: Taking advantage of the F# Type-System
Today is my date for F# Advent, and I want to discuss a pretty large topic:
domain modeling with F# types, and how to model the types to prevent bugs.
One of the running jokes in the F# community is that "if it compiles, it
works and is bug free." This is obviously not entirely true: there's always a
possibility that it won't work properly, but it holds up fairly often in
practice. Part of this is because of F#'s type system: it is strongly typed,
and it almost never converts between types implicitly. The
problem, and often the source of bugs that do manifest, is that without a
little thinking, it's hard to build a type-structure that creates this
"bug free" scenario. It can be done, but you have to forget about how you
usually think about types.
In this lesson we're going to take a library I'm preparing to open-source,
and discuss how to build a type-system that is mostly flawless, and doesn't
allow us to model incorrect structure or state.
Step 1: same as usual, identify the problem
The first step of any program is to identify the problem. In our situation,
the problem I'm proposing to solve is the enormous problem of generating
HTML code from a non-HTML language, and not writing it by hand. More often
than not I'm forced to look up intricacies of HTML, and subtleties that
really ought to be a lot easier. We're also going to see how we can generate
HTML that is 100% conformant to the HTML specification, and how we can even
embed compiler warnings into our system, such that if it compiles, it's
valid HTML.
Remember that F# is a functional language, and uses an algebraic
type-system. The language itself is designed to support these types of ideas
natively and out of the box, so we're going to go over the basics of
building this system, and we'll cover what the type-system gives us that
enables compile-time safety of a generated system.
Step 2: solve the problem
This is the last heading, and it's going to be long, so buckle-up and
enjoy the ride.
HTML is a language that is used predominantly to generate markup for
websites and web-based applications. It's used to define a structure which a
browser or device can use to render and organize content. A basic, perfectly
valid, HTML page might look as follows:
<!DOCTYPE html>
<html>
<head>
<title>A Page Title</title>
</head>
<body>
</body>
</html>
To a normal developer like myself, there's nothing to this. We define the
"DOCTYPE" first, then define the HTML content. This is a basic HTML5
compliant page, that only has a title and no content. HTML does not require
content, but it does, interestingly, require the <title> element.
Now from this we can begin extracting some types. We can start to define how
we want to model the type-system to make sure that we always generate a
perfectly compliant page. We see that HTML has a few characteristics:
- A DOCTYPE defining the document;
- A <head> including the meta-data of the document;
- A <title> in the head defining the title of the document;
- A <body> including the content of the document;
With this, we'll actually build a quick type:
type DocType_V1 = | Html5
type Head_V1 = { Title : string }
type Document_V1 = { Doctype : DocType_V1; Head : Head_V1; Body : string }
let html5Document_v1 = { Document_V1.Doctype = Html5; Head = { Head_V1.Title = "A Test Document" }; Body = "" }
While this is a perfectly valid type, this doesn't solve our problem yet.
All we have accomplished here is defining the "basics" of HTML; we can
still have a completely invalid <body> at this point. We also have none
of the additional features available in the <head> element, such as
<meta> tags, style-sheets, JavaScript, etc. Because I'm pragmatic, we're
going to work from top-to-bottom, so we'll start with defining a better
document.
The biggest issue here is that there are 7 major document types in modern
HTML:
- HTML 5 (the newest);
- HTML 4.01 Frameset (very loose definition);
- HTML 4.01 Transitional (less loose, but still not very strict);
- HTML 4.01 Strict (very strict, compliant definition);
- XHTML 1.0 Frameset (a loose XHTML definition);
- XHTML 1.0 Transitional (less loose, but still not very strict XHTML
definition);
- XHTML 1.0 Strict (very strict, compliant XHTML definition);
Knowing this, there are actually very different aspects to each document type.
Therefore, it doesn't make sense to use an enum for it; it instead makes
sense to use an outer union for it:
type Head_V2 = { Title : string }
type StrictDocument_V2 = { Head : Head_V2; Body : string }
type TransitionalDocument_V2 = { Head : Head_V2; Body : string }
type FramesetDocument_V2 = { Head : Head_V2; Body : string }
type DocumentForm_V2 = | Strict of StrictDocument_V2 | Transitional of TransitionalDocument_V2 | Frameset of FramesetDocument_V2
type Document_V2 = | Html5Document of StrictDocument_V2 | Html4Document of DocumentForm_V2 | XhtmlDocument of DocumentForm_V2
let html5Document_v2 = Document_V2.Html5Document { StrictDocument_V2.Head = { Head_V2.Title = "A Test Document" }; Body = "" }
What just became quite interesting is that it's actually impossible to
model an unsupported document type now. Previously, we would actually hit an
issue where it would become possible to model an HTML5 document that was not a
StrictDocument, which would create an invalid HTML document.
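To make that claim concrete, here is a minimal sketch using simplified stand-in types rather than the `_V2` definitions above (the names `Form`, `Doc`, `describe`, and `all` are hypothetical). Because the HTML5 case carries no form, there is no way to write down an "HTML5 Transitional" document, and the match has to cover every remaining combination.

```fsharp
// Simplified stand-ins for the article's types; names are illustrative only.
type Form =
    | Strict
    | Transitional
    | Frameset

type Doc =
    | Html5            // HTML5 has exactly one form, so it carries no Form value
    | Html4 of Form
    | Xhtml of Form

// The compiler warns about an incomplete match if any case is left out here.
let describe doc =
    match doc with
    | Html5 -> "HTML 5"
    | Html4 Strict -> "HTML 4.01 Strict"
    | Html4 Transitional -> "HTML 4.01 Transitional"
    | Html4 Frameset -> "HTML 4.01 Frameset"
    | Xhtml Strict -> "XHTML 1.0 Strict"
    | Xhtml Transitional -> "XHTML 1.0 Transitional"
    | Xhtml Frameset -> "XHTML 1.0 Frameset"

// The seven supported document types, and nothing else, can be constructed.
let all =
    [ Html5; Html4 Strict; Html4 Transitional; Html4 Frameset;
      Xhtml Strict; Xhtml Transitional; Xhtml Frameset ]
```

Something like `Html5 Transitional` is simply a type error, which is the whole point of nesting the unions this way.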
The next step is to model some of the <meta> tags that are often placed
inside a <head> element. These are things like the document character-set,
linked-resources, language, description, keywords, etc. The <meta> tags
can also be free-form, meaning they can be a tag to support things like
Facebook descriptions and identifiers which help with creating pages that
are more friendly to the social-network, etc. To support this we'll actually
build multiple meta-tag models.
To model this, we'll start with the "standard" metadata names to be supported.
This is easily modeled following the requirements in the HTML5 specification
which is freely (and easily) accessible:
type StandardMeta_V3 = | ApplicationName of Lang : string option | Author | Description | Generator | Keywords | Referrer
The interesting tag here is ApplicationName, which allows you to specify
between zero and one, inclusive, tags with the same "language". These would
be included in the document as follows:
<meta name="application-name" lang="en" content="Test Application" />
The kicker is that lang is optional, and there may be only one of each
lang value in a document.
Another important aspect of a <head> element is the charset, or
character-set. Every HTML page is allowed to declare its character encoding
in a <meta> tag (either the charset attribute or the
http-equiv="Content-Type" form) at most once, and this charset should be the
encoding used to send the page from server to browser. We'll actually define
an enum-like union type for this:
type Charset_V3 = | Utf8
And we'll include this and the standard <meta> tags in the new Head
element:
type Head_V3 = { Title : string; Charset : Charset_V3 option; StandardMetas : Map<StandardMeta_V3, string> }
Still quite simple, and we'll use our previous model with a few minor
modifications:
type StrictDocument_V3 = { Head : Head_V3; Body : string option }
type TransitionalDocument_V3 = { Head : Head_V3; Body : string option }
type FramesetDocument_V3 = { Head : Head_V3; Body : string option }
type DocumentForm_V3 = | Strict of StrictDocument_V3 | Transitional of TransitionalDocument_V3 | Frameset of FramesetDocument_V3
type Document_V3 = | Html5Document of StrictDocument_V3 | Html4Document of DocumentForm_V3 | XhtmlDocument of DocumentForm_V3
let html5Document_v3 =
    Document_V3.Html5Document
        { StrictDocument_V3.Head =
            { Head_V3.Title = "A Test Document"
              Charset = Some Charset_V3.Utf8
              StandardMetas = [(StandardMeta_V3.ApplicationName (Some "en"), "Test Application"); (StandardMeta_V3.Description, "A test application.")] |> Map.ofList }
          Body = None }
Side note: we used Map, but as there's currently no syntax to construct a
map natively, we build it from a list with Map.ofList. That makes it possible
to create a situation where the compiler will not stop us from creating an
invalid list (one that repeats a key, for example). This is something the F#
team has on the list, and hopefully it's resolved sooner rather than later.
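As a small illustration of that gap, the sketch below shows `Map.ofList` quietly keeping the last value for a repeated key, followed by one possible guard. The helper `tryMapOfUnique` is hypothetical and not part of the library being described; it only shows how the check could be moved to a single place at runtime, since the compiler won't do it for us.

```fsharp
// Map.ofList keeps the LAST value for a repeated key and reports nothing.
let metas =
    [ ("description", "first"); ("description", "second") ]
    |> Map.ofList

// Hypothetical helper: build the map only when every key is unique,
// otherwise return the keys that were repeated.
let tryMapOfUnique pairs =
    let duplicates =
        pairs
        |> List.countBy fst
        |> List.filter (fun (_, n) -> n > 1)
        |> List.map fst
    match duplicates with
    | [] -> Ok (Map.ofList pairs)
    | ds -> Error ds
```

Here `metas` ends up with a single entry, and `tryMapOfUnique [ ("a", 1); ("a", 2) ]` returns an `Error` carrying the repeated key instead of silently dropping a value.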
So now that we have a document created and modeled, it's time to generate.
Fortunately, generating HTML is actually really easy if you follow the
rules / guidelines. For XHTML that means always lower-casing the attribute
and tag names; regular HTML is not as strict, but still has rules.
The first rule of HTML and XHTML alike is the DOCTYPE, which is
(fortunately) a static list. For us, we care about the following:
let getDoctype_v3 document =
    match document with
    | Document_V3.XhtmlDocument (Strict _) -> "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
    | XhtmlDocument (Transitional _) -> "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
    | XhtmlDocument (Frameset _) -> "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Frameset//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd\">"
    | Html4Document (Strict _) -> "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\">"
    | Html4Document (Transitional _) -> "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/html4/loose.dtd\">"
    | Html4Document (Frameset _) -> "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Frameset//EN\" \"http://www.w3.org/TR/html4/frameset.dtd\">"
    | Html5Document _ -> "<!DOCTYPE HTML>"
With this single function, we can get every DOCTYPE we need for all our
supported models.
let html5Document_v3_doctype = html5Document_v3 |> getDoctype_v3
This ought to return <!DOCTYPE HTML>. You'll notice that for all three of
the XHTML DOCTYPE results, html is lower-case. That's part of the spec I
mentioned.
As usual, I want to declare a couple functions we'll use for the whole
project:
let joinStrings (sep : string) (strs : string array) = System.String.Join(sep, strs)
let prependIfText sep str = if str |> Seq.isEmpty |> not then sprintf "%s%s" sep str else ""
They just make life easier.
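For a feel of what the two helpers do, here is a small self-contained sketch. The definitions are repeated (with explicit `string` annotations added, which is my assumption about the intended types) so the block runs on its own; the sample values are made up for illustration.

```fsharp
// Restated helpers, annotated so the block stands alone.
let joinStrings (sep : string) (strs : string array) = System.String.Join(sep, strs)
let prependIfText (sep : string) (str : string) =
    if str |> Seq.isEmpty |> not then sprintf "%s%s" sep str else ""

// Joining rendered attributes with a single space between them.
let joined = joinStrings " " [| "lang=\"en\""; "name=\"author\"" |]

// A leading separator is only added when there is something to separate.
let withAttrs = prependIfText " " joined
let withoutAttrs = prependIfText " " ""
```

The second helper is what keeps a tag with no attributes rendering as `<tag` rather than `<tag ` with a stray space.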
Next, we need to think about our document. Our document is made up of
"elements" which are represented as "tags", and tags have a couple rules
about them:
- Some tags are allowed to self-close, or <tag /> vs <tag></tag> (the <script> tag for example cannot self-close);
- Some tags are "inline" tags, that is, we don't want to break lines inside the tag (like <title>, we want to do <title>...</title>, without a line-break);
- Some tags have attributes, and some do not;
- Some tags will have content, and some will not;
- All tags have a name;
With this, we can begin to define a function that will fulfill these
requirements. We'll always know ahead-of-time whether the tag will allow
self-closing, and whether or not we want to include line-breaks, so we want
to start defining an API:
let renderTag_v3_1 (selfClose : bool) (includeBreaks : bool) (tagName : string) = ()
Obviously we need to add a body to the function, but this starts us off. We
can then define functions which will alias to this function. The next item
should probably be attributes, because sometimes we'll have those and
sometimes we won't.
We probably want to include content as well, so our final API might look
like:
let renderTag_v3_2 (selfClose : bool) (includeBreaks : bool) (tagName : string) (attrs : Map<string, string>) (content : string) = ()
Finally, we want to define a body. The first part of any tag is starting it,
or the <tagName bit. (Notice I excluded the closing bracket: this is going
to be dealt with momentarily.) The <tagName bit should also have
attributes if there are any, so we'll want <tagName attr="value". To do
this, we can actually map our attributes to an array, then map them to
key=\"value\", then join them on a space, then add a leading space if there
are any, then finally we can print the starting tag:
let renderTag_v3_3 (selfClose : bool) (includeBreaks : bool) (tagName : string) (attrs : Map<string, string>) (content : string) =
    let tagStart =
        attrs
        |> Map.toArray
        // Lower-case the key; escape quotes in the value as &quot; (a backslash does not escape in HTML)
        |> Array.map (fun ((k : string), (v : string)) -> sprintf "%s=\"%s\"" (k.ToLower()) (v.Replace("\"", "&quot;")))
        |> joinStrings " "
        |> prependIfText " "
        |> sprintf "<%s%s" tagName
    tagStart
Once again, we're making decent progress. We've followed the XHTML
convention, of lower-casing the key, which is also entirely valid for HTML.
(Conveniently.)
Next, if the content is empty, we want to either self-close or explicitly
close the tag. This is accomplished by a simple if.
let renderTag_v3_4 (selfClose : bool) (includeBreaks : bool) (tagName : string) (attrs : Map<string, string>) (content : string) =
    let tagStart =
        attrs
        |> Map.toArray
        // Lower-case the key; escape quotes in the value as &quot; (a backslash does not escape in HTML)
        |> Array.map (fun ((k : string), (v : string)) -> sprintf "%s=\"%s\"" (k.ToLower()) (v.Replace("\"", "&quot;")))
        |> joinStrings " "
        |> prependIfText " "
        |> sprintf "<%s%s" tagName
    if content |> System.String.IsNullOrEmpty then
        if selfClose then sprintf "%s />" tagStart
        else sprintf "%s></%s>" tagStart tagName
    else
        tagStart
In our else branch, when there is content, we check whether line-breaks are
wanted and, if so, include them:
let renderTag_v3_5 selfClose includeBreaks tagName attrs content =
    let tagStart =
        attrs
        |> Map.toArray
        // Lower-case the key; escape quotes in the value as &quot; (a backslash does not escape in HTML)
        |> Array.map (fun ((k : string), (v : string)) -> sprintf "%s=\"%s\"" (k.ToLower()) (v.Replace("\"", "&quot;")))
        |> joinStrings " "
        |> prependIfText " "
        |> sprintf "<%s%s" tagName
    if content |> System.String.IsNullOrEmpty then
        if selfClose then sprintf "%s />" tagStart
        else sprintf "%s></%s>" tagStart tagName
    else
        let tagStart = sprintf "%s>" tagStart
        let breakV = if includeBreaks then System.Environment.NewLine else ""
        let tagEnd = sprintf "</%s>" tagName
        sprintf "%s%s%s%s%s%s" tagStart breakV content breakV tagEnd breakV
So now that gets our main tag-rendering started. The next bit is to actually
build prototypes for each main tag:
let renderTitle_v3 = renderTag_v3_5 false false "title" Map.empty<string, string>
let renderHead_v3 = renderTag_v3_5 false true "head" Map.empty<string, string>
let renderBody_v3 = renderTag_v3_5 false true "body"
As you can see, each tag has specific expectations. Our <title> tag has no
line-breaks, and does not allow attributes, whereas our <head> tag has
line-breaks, and does not allow attributes. Our <body> tag has both.
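To see the three outcomes side by side, here is a self-contained sketch. `renderTag` below is my condensed restatement of the tag renderer (it assumes quotes in attribute values are escaped as `&quot;`), not a replacement for it, and the sample tags are illustrative only.

```fsharp
// Condensed restatement of the tag renderer so this block runs on its own.
let renderTag selfClose includeBreaks (tagName : string) (attrs : Map<string, string>) (content : string) =
    let tagStart =
        attrs
        |> Map.toList
        |> List.map (fun (k, v) -> sprintf " %s=\"%s\"" (k.ToLower()) (v.Replace("\"", "&quot;")))
        |> String.concat ""
        |> sprintf "<%s%s" tagName
    if System.String.IsNullOrEmpty content then
        if selfClose then sprintf "%s />" tagStart
        else sprintf "%s></%s>" tagStart tagName
    else
        let br = if includeBreaks then System.Environment.NewLine else ""
        sprintf "%s>%s%s%s</%s>%s" tagStart br content br tagName br

// Content, no breaks: <title>A Page Title</title>
let title = renderTag false false "title" Map.empty "A Page Title"

// No content, self-closing allowed: <meta content="Me" name="author" />
// (Map iterates its keys in sorted order, so "content" comes before "name".)
let meta = renderTag true false "meta" (Map.ofList [ ("name", "author"); ("content", "Me") ]) ""

// No content, self-closing not allowed: <script src="file.js"></script>
let script = renderTag false false "script" (Map.ofList [ ("src", "file.js") ]) ""
```

One side effect worth knowing about: because attributes live in a `Map`, they always render in key order rather than in the order they were written.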
We have another utility method to define:
let stringOrBlank = function | Some s -> s | None -> ""
Which will make sense in a moment.
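As a quick preview of why it helps, the sketch below shows the helper turning an optional body into a plain string. The `stringOrBlankAlt` variant is an assumed-equivalent alternative built on `Option.defaultValue` from FSharp.Core, and the sample values are made up.

```fsharp
// The helper from above: unwrap an optional string, falling back to "".
let stringOrBlank = function | Some s -> s | None -> ""

// Assumed-equivalent alternative using the core library.
let stringOrBlankAlt (s : string option) = Option.defaultValue "" s

// An absent body becomes an empty string, which the tag renderer
// treats as "no content".
let emptyBody = stringOrBlank None
let someBody = stringOrBlank (Some "<p>Hello</p>")
```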
The next step is to render the actual document. This is relatively easy:
render the <head>, render the <body>, and render the <!DOCTYPE>.
Because we have 3 major HTML document-types, and each has different rules,
we'll define a function for each:
let getStrictBody_v3 (document : StrictDocument_V3) =
    let sb = System.Text.StringBuilder()
    document.Head.Title |> renderTitle_v3 |> renderHead_v3 |> sb.Append |> ignore
    document.Body |> stringOrBlank |> renderBody_v3 ([] |> Map.ofList) |> sb.Append |> ignore
    sb.ToString()
let getTransitionalBody_v3 (document : TransitionalDocument_V3) =
    let sb = System.Text.StringBuilder()
    document.Head.Title |> renderTitle_v3 |> renderHead_v3 |> sb.Append |> ignore
    document.Body |> stringOrBlank |> renderBody_v3 ([] |> Map.ofList) |> sb.Append |> ignore
    sb.ToString()
let getFramesetBody_v3 (document : FramesetDocument_V3) =
    let sb = System.Text.StringBuilder()
    document.Head.Title |> renderTitle_v3 |> renderHead_v3 |> sb.Append |> ignore
    document.Body |> stringOrBlank |> renderBody_v3 ([] |> Map.ofList) |> sb.Append |> ignore
    sb.ToString()
Now we see how things start to fit together, and we still have to build one
more, larger piece for the full <html> bit:
let getBody_v3 document =
    match document with
    | XhtmlDocument (Strict d) | Html4Document (Strict d) | Html5Document d -> d |> getStrictBody_v3
    | XhtmlDocument (Transitional d) | Html4Document (Transitional d) -> d |> getTransitionalBody_v3
    | XhtmlDocument (Frameset d) | Html4Document (Frameset d) -> d |> getFramesetBody_v3
Of course, it's obvious this isn't handling the <html> bit; this just
calls the next renderer.
let printDocument_v3 document =
    let doctype = document |> getDoctype_v3
    let sb = System.Text.StringBuilder()
    doctype |> sb.Append |> ignore
    System.Environment.NewLine |> sb.Append |> ignore
    let attrs =
        match document with
        | Html4Document _ | Html5Document _ -> [] |> Map.ofList
        | XhtmlDocument _ -> [("xmlns", "http://www.w3.org/1999/xhtml")] |> Map.ofList
    document |> getBody_v3 |> renderTag_v3_5 false true "html" attrs |> sb.Append |> ignore
    sb.ToString()
While this is a great start, we haven't fully supported our model. We don't
render the <meta> tags I promised. To do that, we have to add a little
more code (though not much). For these we'll abstract up our <head>
rendering another level, because we need to render sequences in the <head>
tag.
let renderMetaTag_v4 = renderTag_v3_5 true false "meta"
let getCharset_v4 c =
    match c with
    | Charset_V3.Utf8 -> "utf-8"
let getStandardMetaAttrs_v4 m =
    let result s = [("name", s)]
    match m with
    | StandardMeta_V3.ApplicationName lang -> lang |> (function | Some l -> [("lang", l)] | _ -> []) |> List.append [("name", "application-name")]
    | Author -> "author" |> result
    | Description -> "description" |> result
    | Generator -> "generator" |> result
    | Keywords -> "keywords" |> result
    | Referrer -> "referrer" |> result
let getHead_v4 head =
    let sb = System.Text.StringBuilder()
    head.Title |> renderTitle_v3 |> sb.Append |> ignore
    match head.Charset with
    | Some c ->
        let charset = c |> getCharset_v4
        renderMetaTag_v4 ([("content", (sprintf "text/html; charset=%s" charset)); ("http-equiv", "Content-Type")] |> Map.ofList) "" |> sb.Append |> ignore
    | None -> ()
    head.StandardMetas |> Map.toArray |> Array.map (fun (k, v) -> renderMetaTag_v4 ([("content", v)] |> List.append (k |> getStandardMetaAttrs_v4) |> Map.ofList) "") |> Array.iter (sb.Append >> ignore)
    sb.ToString() |> renderHead_v3
let getStrictBody_v4 (document : StrictDocument_V3) =
    let sb = System.Text.StringBuilder()
    document.Head |> getHead_v4 |> sb.Append |> ignore
    document.Body |> stringOrBlank |> renderBody_v3 ([] |> Map.ofList) |> sb.Append |> ignore
    sb.ToString()
let getTransitionalBody_v4 (document : TransitionalDocument_V3) =
    let sb = System.Text.StringBuilder()
    document.Head |> getHead_v4 |> sb.Append |> ignore
    document.Body |> stringOrBlank |> renderBody_v3 ([] |> Map.ofList) |> sb.Append |> ignore
    sb.ToString()
let getFramesetBody_v4 (document : FramesetDocument_V3) =
    let sb = System.Text.StringBuilder()
    document.Head |> getHead_v4 |> sb.Append |> ignore
    document.Body |> stringOrBlank |> renderBody_v3 ([] |> Map.ofList) |> sb.Append |> ignore
    sb.ToString()
With all that done, we should get the appropriate result with our "standard"
meta-tags returned in the result output. This demonstrates how truly easy
it is to modify this system to add features.
Now we haven't even come close to finishing with the <head> section, nor
even close to fulfilling even the most common elements of it. For that,
we'll actually add a couple more properties to our record type.
A <head> section in an (X)HTML page often consists of stylesheets and
JavaScript as well as the items we've already included. Fortunately,
including support for those two sections is bewilderingly easy. We'll just
create two new properties on our Head type, and we'll also add a new
Union type to indicate if they're a file, or the actual script.
There are two ways to include stylesheets and JavaScript: you can include a
reference to a file that holds the information, or you can include the raw
information. Supporting both here is extremely easy:
type Resource_V5 = | File of Name : string | Data of RawData : string
So now, we could actually add a JavaScript section to our Head:
type Head_V5 = { Title : string; Charset : Charset_V3 option; StandardMetas : Map<StandardMeta_V3, string>; Scripts : Resource_V5 list; Stylesheets : Resource_V5 list }
That's easy enough. I also included the stylesheets, because both will use
the same union type, with different values. Now we can create a new document
with these specified:
type StrictDocument_V5 = { Head : Head_V5; Body : string option }
type TransitionalDocument_V5 = { Head : Head_V5; Body : string option }
type FramesetDocument_V5 = { Head : Head_V5; Body : string option }
type DocumentForm_V5 = | Strict of StrictDocument_V5 | Transitional of TransitionalDocument_V5 | Frameset of FramesetDocument_V5
type Document_V5 = | Html5Document of StrictDocument_V5 | Html4Document of DocumentForm_V5 | XhtmlDocument of DocumentForm_V5
let html5Document_v5 =
    Document_V5.Html5Document
        { StrictDocument_V5.Head =
            { Head_V5.Title = "A Test Document"
              Charset = Some Charset_V3.Utf8
              StandardMetas = [(StandardMeta_V3.ApplicationName (Some "en"), "Test Application"); (StandardMeta_V3.Description, "A test application.")] |> Map.ofList
              Scripts = [Resource_V5.File "somepath/file.js"; Resource_V5.Data "alert(\"Test script!\");"]
              Stylesheets = [] }
          Body = None }
Now we just have to render these, which is only a slight change to our
getHead function:
let renderScript_v5 s =
    match s with
    | Resource_V5.File f -> renderTag_v3_5 false false "script" ([("type", "text/javascript"); ("src", f)] |> Map.ofList) ""
    | Data d -> renderTag_v3_5 false true "script" ([("type", "text/javascript")] |> Map.ofList) d
let renderStyle_v5 s =
    match s with
    | Resource_V5.File f -> renderTag_v3_5 true false "link" ([("rel", "stylesheet"); ("type", "text/css"); ("href", f)] |> Map.ofList) ""
    // Like <script>, a <style> tag must not self-close, even when its content is empty
    | Data d -> renderTag_v3_5 false true "style" ([("type", "text/css")] |> Map.ofList) d
let getHead_v5 head =
    let sb = System.Text.StringBuilder()
    let render (s : string) =
        s |> sb.Append |> ignore
        System.Environment.NewLine |> sb.Append |> ignore
    head.Title |> renderTitle_v3 |> render
    match head.Charset with
    | Some c ->
        let charset = c |> getCharset_v4
        "" |> render
        renderMetaTag_v4 ([("content", (sprintf "text/html; charset=%s" charset)); ("http-equiv", "Content-Type")] |> Map.ofList) "" |> sb.Append |> ignore
    | None -> ()
    match head.StandardMetas with
    | m when m.IsEmpty -> ()
    | m ->
        "" |> render
        m |> Map.toArray |> Array.map (fun (k, v) -> renderMetaTag_v4 ([("content", v)] |> List.append (k |> getStandardMetaAttrs_v4) |> Map.ofList) "") |> Array.iter render
    match head.Stylesheets with
    | [] -> ()
    | s ->
        "" |> render
        s |> List.map renderStyle_v5 |> List.iter render
    match head.Scripts with
    | [] -> ()
    | s ->
        "" |> render
        s |> List.map renderScript_v5 |> List.iter render
    sb.ToString() |> renderHead_v3
let getStrictBody_v5 (document : StrictDocument_V5) =
    let sb = System.Text.StringBuilder()
    document.Head |> getHead_v5 |> sb.Append |> ignore
    document.Body |> stringOrBlank |> renderBody_v3 ([] |> Map.ofList) |> sb.Append |> ignore
    sb.ToString()
let getTransitionalBody_v5 (document : TransitionalDocument_V5) =
    let sb = System.Text.StringBuilder()
    document.Head |> getHead_v5 |> sb.Append |> ignore
    document.Body |> stringOrBlank |> renderBody_v3 ([] |> Map.ofList) |> sb.Append |> ignore
    sb.ToString()
let getFramesetBody_v5 (document : FramesetDocument_V5) =
    let sb = System.Text.StringBuilder()
    document.Head |> getHead_v5 |> sb.Append |> ignore
    document.Body |> stringOrBlank |> renderBody_v3 ([] |> Map.ofList) |> sb.Append |> ignore
    sb.ToString()
let getBody_v5 document =
    match document with
    | XhtmlDocument (Strict d) | Html4Document (Strict d) | Html5Document d -> d |> getStrictBody_v5
    | XhtmlDocument (Transitional d) | Html4Document (Transitional d) -> d |> getTransitionalBody_v5
    | XhtmlDocument (Frameset d) | Html4Document (Frameset d) -> d |> getFramesetBody_v5
let getDoctype_v5 document =
    match document with
    | Document_V5.XhtmlDocument (Strict _) -> "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
    | XhtmlDocument (Transitional _) -> "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
    | XhtmlDocument (Frameset _) -> "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Frameset//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd\">"
    | Html4Document (Strict _) -> "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\">"
    | Html4Document (Transitional _) -> "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/html4/loose.dtd\">"
    | Html4Document (Frameset _) -> "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Frameset//EN\" \"http://www.w3.org/TR/html4/frameset.dtd\">"
    | Html5Document _ -> "<!DOCTYPE HTML>"
let printDocument_v5 (document : Document_V5) =
    let doctype = document |> getDoctype_v5
    let sb = System.Text.StringBuilder()
    doctype |> sb.Append |> ignore
    System.Environment.NewLine |> sb.Append |> ignore
    let attrs =
        match document with
        | Html4Document _ | Html5Document _ -> [] |> Map.ofList
        | XhtmlDocument _ -> [("xmlns", "http://www.w3.org/1999/xhtml")] |> Map.ofList
    document |> getBody_v5 |> renderTag_v3_5 false true "html" attrs |> sb.Append |> ignore
    sb.ToString()
let writeFile f c = System.IO.File.WriteAllText(System.IO.Path.Combine(__SOURCE_DIRECTORY__, f), c)
html5Document_v5 |> printDocument_v5 |> writeFile "TestFile.html"
With all of this, our work starts to come together nicely. We want to test
it with an XHTML document as well, which is just as easy:
let xhtml1Document_v5 =
    Document_V5.XhtmlDocument
        (DocumentForm_V5.Strict
            { StrictDocument_V5.Head =
                { Head_V5.Title = "A Test Document"
                  Charset = Some Charset_V3.Utf8
                  StandardMetas = [(StandardMeta_V3.ApplicationName (Some "en"), "Test Application"); (StandardMeta_V3.Description, "A test application.")] |> Map.ofList
                  Scripts = [Resource_V5.File "somepath/file.js"; Resource_V5.Data "alert(\"Test script!\");"]
                  Stylesheets = [] }
              Body = None } )
Next, we want to add support for non-standard <meta> tags, which is
extremely easy:
type Head_V6 = { Title : string; Charset : Charset_V3 option; StandardMetas : Map<StandardMeta_V3, string>; AdditionalMetas : Map<string, string>; Scripts : Resource_V5 list; Stylesheets : Resource_V5 list }
let getHead_v6 head =
    let sb = System.Text.StringBuilder()
    let render (s : string) =
        s |> sb.Append |> ignore
        System.Environment.NewLine |> sb.Append |> ignore
    head.Title |> renderTitle_v3 |> render
    match head.Charset with
    | Some c ->
        let charset = c |> getCharset_v4
        "" |> render
        renderMetaTag_v4 ([("content", (sprintf "text/html; charset=%s" charset)); ("http-equiv", "Content-Type")] |> Map.ofList) "" |> sb.Append |> ignore
    | None -> ()
    match head.StandardMetas with
    | m when m.IsEmpty -> ()
    | m ->
        "" |> render
        m |> Map.toArray |> Array.map (fun (k, v) -> renderMetaTag_v4 ([("content", v)] |> List.append (head.AdditionalMetas |> Map.toList) |> List.append (k |> getStandardMetaAttrs_v4) |> Map.ofList) "") |> Array.iter render
    match head.Stylesheets with
    | [] -> ()
    | s ->
        "" |> render
        s |> List.map renderStyle_v5 |> List.iter render
    match head.Scripts with
    | [] -> ()
    | s ->
        "" |> render
        s |> List.map renderScript_v5 |> List.iter render
    sb.ToString() |> renderHead_v3
You'll notice the only thing I changed is to add a
List.append (head.AdditionalMetas |> Map.toList) in our standard meta
printing, because the standard metas are, at that point, string * string
pairs. This means we can guarantee it works because we used the exact same
code to do both. By taking the combined list back to a Map, we guarantee that
each key is specified exactly once even among the combined lists.
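One thing to watch: as written, the append places the AdditionalMetas pairs into the attribute list of every standard <meta> tag, and nothing is emitted for them when StandardMetas is empty. If the intent is one tag per non-standard entry, a hedged sketch could look like the following. Both `renderMeta` and `renderAdditionalMetas` are hypothetical names with a simplified renderer, not part of the code above, and the sample entries are made up.

```fsharp
// Simplified stand-in for a <meta> renderer; illustrative only.
let renderMeta (attrs : (string * string) list) =
    attrs
    |> List.map (fun (k, v) -> sprintf "%s=\"%s\"" k (v.Replace("\"", "&quot;")))
    |> String.concat " "
    |> sprintf "<meta %s />"

// Render each free-form entry as its own <meta name="..." content="..." /> tag.
let renderAdditionalMetas (metas : Map<string, string>) =
    metas
    |> Map.toList
    |> List.map (fun (name, content) -> renderMeta [ ("name", name); ("content", content) ])

let tags =
    Map.ofList [ ("twitter:card", "summary"); ("theme-color", "#ffffff") ]
    |> renderAdditionalMetas
```

Keeping the entries in a `Map` still gives the "each name at most once" guarantee; the difference is only in how the pairs are turned into tags.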
We have one more part of the <head> tag I want to go over, and that's the
presence of a <base> tag. This is where we'll see a divide between the
XHTML and HTML specifications, as they have different allowable options.
For XHTML 1.0, the <base> tag is allowed to have an href attribute, which
specifies what the base URL is for relative navigation. (I.e. if you provide
a link to somepage.html, the <base href="http://www.example.com/"> tag
will make that a link to http://www.example.com/somepage.html.)
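The resolution rule can be approximated outside a browser with `System.Uri`, which combines a base URL and a relative reference in much the same way. This is only a sketch to illustrate the behavior; the function name `resolve` is hypothetical.

```fsharp
open System

// Resolve a relative link against a base URL, roughly what <base href="..."> does.
let resolve (baseHref : string) (relative : string) =
    Uri(Uri(baseHref), relative).AbsoluteUri

// Resolves to http://www.example.com/somepage.html
let resolved = resolve "http://www.example.com/" "somepage.html"
```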
In HTML, the following unions would satisfy our <base> tag requirements:
type HrefTarget_V7 = | SameContextWindow | NewCleanContext | ParentContext | TopMostContext
type HtmlBaseElement_V7 = | Href of string | Target of HrefTarget_V7 | HrefTarget of string * HrefTarget_V7
In XHTML, it's only allowed to be an Href or not specified. For that, we'll
define some new <head> models:
type HtmlHead_V7 = {
    Title : string
    Base : HtmlBaseElement_V7 option
    Charset : Charset_V3 option
    StandardMetas : Map<StandardMeta_V3, string>
    AdditionalMetas : Map<string, string>
    Stylesheets : Resource_V5 list
    Scripts : Resource_V5 list }
type XhtmlHead_V7 = {
    Title : string
    Base : string option
    Charset : Charset_V3 option
    StandardMetas : Map<StandardMeta_V3, string>
    AdditionalMetas : Map<string, string>
    Stylesheets : Resource_V5 list
    Scripts : Resource_V5 list }
Obviously this means we will now have two different head models, which means
we will need two different renderers, and two different sets of document
types, etc. While we're at it, we may as well start defining the Body
type, which will hold our main content.
type Body_V7 = { Attributes : Map<string, string>; Children : string option }
type HtmlFramesetDocument_V7 = { Head : HtmlHead_V7; Body : Body_V7 }
type HtmlTransitionalDocument_V7 = { Head : HtmlHead_V7; Body : Body_V7 }
type HtmlStrictDocument_V7 = { Head : HtmlHead_V7; Body : Body_V7 }
type XhtmlFramesetDocument_V7 = { Head : XhtmlHead_V7; Body : Body_V7 }
type XhtmlTransitionalDocument_V7 = { Head : XhtmlHead_V7; Body : Body_V7 }
type XhtmlStrictDocument_V7 = { Head : XhtmlHead_V7; Body : Body_V7 }
type HtmlDocumentForm_V7 = | Frameset of HtmlFramesetDocument_V7 | Transitional of HtmlTransitionalDocument_V7 | Strict of HtmlStrictDocument_V7
type XhtmlDocumentForm_V7 = | Frameset of XhtmlFramesetDocument_V7 | Transitional of XhtmlTransitionalDocument_V7 | Strict of XhtmlStrictDocument_V7
type WebDocument_V7 = | Html5Document of HtmlStrictDocument_V7 | Html4Document of HtmlDocumentForm_V7 | XhtmlDocument of XhtmlDocumentForm_V7
let getDoctype_v7 document =
    match document with
    | XhtmlDocument (XhtmlDocumentForm_V7.Strict _) -> "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
    | XhtmlDocument (XhtmlDocumentForm_V7.Transitional _) -> "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
    | XhtmlDocument (XhtmlDocumentForm_V7.Frameset _) -> "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Frameset//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd\">"
    | Html4Document (HtmlDocumentForm_V7.Strict _) -> "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\">"
    | Html4Document (HtmlDocumentForm_V7.Transitional _) -> "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/html4/loose.dtd\">"
    | Html4Document (HtmlDocumentForm_V7.Frameset _) -> "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Frameset//EN\" \"http://www.w3.org/TR/html4/frameset.dtd\">"
    | Html5Document _ -> "<!DOCTYPE HTML>"
As is obvious here, we're forcing the difference between the document types
to be more explicit. This is mostly due to our <head> tag issue, but we
would have needed to do this anyway (and will do so for the <body>
eventually). Now, fortunately, most of our boiler-plate will stay the same,
except for the first couple steps of rendering:
let getTarget_v7 = function
    | SameContextWindow -> "_self"
    | NewCleanContext -> "_blank"
    | ParentContext -> "_parent"
    | TopMostContext -> "_top"

let getHtmlHead (head : HtmlHead_V7) =
    let sb = System.Text.StringBuilder()
    // Append a string followed by a newline to the buffer.
    let render (s : string) =
        s |> sb.Append |> ignore
        System.Environment.NewLine |> sb.Append |> ignore
    head.Title |> renderTitle_v3 |> render
    match head.Charset with
    | Some c ->
        "" |> render
        renderMetaTag_v4 ([("content", c |> getCharset_v4 |> sprintf "text/html; charset=%s"); ("http-equiv", "Content-Type")] |> Map.ofList) "" |> render
    | None -> ()
    match head.Base with
    | Some b ->
        let attrs =
            let href =
                match b with
                | Href h | HrefTarget (h, _) -> [("href", h)]
                | _ -> []
            let target =
                match b with
                | Target t | HrefTarget (_, t) -> [("target", t |> getTarget_v7)]
                | _ -> []
            target |> List.append href |> Map.ofList
        "" |> render
        renderTag_v3_5 true false "base" attrs "" |> render
    | None -> ()
    match head.StandardMetas with
    | m when m.IsEmpty -> ()
    | m ->
        "" |> render
        m |> Map.toArray |> Array.map (fun (k, v) -> renderMetaTag_v4 ([("content", v)] |> List.append (head.AdditionalMetas |> Map.toList) |> List.append (k |> getStandardMetaAttrs_v4) |> Map.ofList) "") |> Array.iter render
    match head.Stylesheets with
    | [] -> ()
    | s ->
        "" |> render
        s |> List.map renderStyle_v5 |> List.iter render
    match head.Scripts with
    | [] -> ()
    | s ->
        "" |> render
        s |> List.map renderScript_v5 |> List.iter render
    sb.ToString() |> renderHead_v3
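The two matches on head.Base only ever have to deal with valid combinations, because the base type has no case for "neither an href nor a target". Here is a minimal, self-contained sketch of that idea. The names BaseTargetSketch, BaseTagSketch, targetName and baseAttrs are hypothetical stand-ins for illustration, not the definitions used earlier in this post:

```fsharp
// Hypothetical stand-ins for the target and base types used above.
type BaseTargetSketch =
    | SameContextWindow
    | NewCleanContext
    | ParentContext
    | TopMostContext

// A <base> tag carries an href, a target, or both; "neither" cannot be built.
type BaseTagSketch =
    | Href of string
    | Target of BaseTargetSketch
    | HrefTarget of string * BaseTargetSketch

let targetName target =
    match target with
    | SameContextWindow -> "_self"
    | NewCleanContext -> "_blank"
    | ParentContext -> "_parent"
    | TopMostContext -> "_top"

// Collect whichever attributes the chosen case carries.
let baseAttrs (b : BaseTagSketch) : Map<string, string> =
    let pairs =
        match b with
        | Href h -> [ ("href", h) ]
        | Target t -> [ ("target", targetName t) ]
        | HrefTarget (h, t) -> [ ("href", h); ("target", targetName t) ]
    Map.ofList pairs

let example = baseAttrs (HrefTarget ("http://example.com/", NewCleanContext))
```

Because the match in baseAttrs names every case, adding a fourth case to BaseTagSketch later would make the compiler flag this function as incomplete.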
Now, interestingly, we can make XHTML head rendering really easy by
converting it to an HTML head, at least for now.
let getXhtmlHead (head : XhtmlHead_V7) =
    { HtmlHead_V7.Title = head.Title
      Charset = head.Charset
      AdditionalMetas = head.AdditionalMetas
      Base = head.Base |> Option.map Href
      StandardMetas = head.StandardMetas
      Stylesheets = head.Stylesheets
      Scripts = head.Scripts }
    |> getHtmlHead
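The Base = head.Base |> Option.map Href line works because a union case can be used as an ordinary function, so Option.map can lift an optional plain URL into the richer base type while leaving None alone. A small sketch with cut-down record types follows. BaseSketch, XhtmlHeadSketch, HtmlHeadSketch and toHtmlHead are hypothetical names, not the types from this post:

```fsharp
// Hypothetical, simplified base type: targets are plain strings here.
type BaseSketch =
    | Href of string
    | Target of string
    | HrefTarget of string * string

// The XHTML-style head only knows about a plain URL.
type XhtmlHeadSketch = { XTitle : string; XBase : string option }

// The HTML-style head uses the richer union.
type HtmlHeadSketch = { Title : string; Base : BaseSketch option }

let toHtmlHead (head : XhtmlHeadSketch) : HtmlHeadSketch =
    { Title = head.XTitle
      // Href has type string -> BaseSketch, so it can be passed to Option.map.
      Base = head.XBase |> Option.map Href }

let withBase = toHtmlHead { XTitle = "Test"; XBase = Some "http://example.com/" }
let withoutBase = toHtmlHead { XTitle = "Test"; XBase = None }
```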
Then we need to define some new get___Body and such functions, which are
going to be just as trivial as before.
let getHtmlStrictBody_v7 (document : HtmlStrictDocument_V7) =
    let sb = System.Text.StringBuilder()
    document.Head |> getHtmlHead |> sb.Append |> ignore
    document.Body.Children |> stringOrBlank |> sprintf "<div>%s</div>" |> renderBody_v3 document.Body.Attributes |> sb.Append |> ignore
    sb.ToString()

let getHtmlTransitionalBody_v7 (document : HtmlTransitionalDocument_V7) =
    let sb = System.Text.StringBuilder()
    document.Head |> getHtmlHead |> sb.Append |> ignore
    document.Body.Children |> stringOrBlank |> sprintf "<div>%s</div>" |> renderBody_v3 document.Body.Attributes |> sb.Append |> ignore
    sb.ToString()

let getHtmlFramesetBody_v7 (document : HtmlFramesetDocument_V7) =
    let sb = System.Text.StringBuilder()
    document.Head |> getHtmlHead |> sb.Append |> ignore
    document.Body.Children |> stringOrBlank |> sprintf "<div>%s</div>" |> renderBody_v3 document.Body.Attributes |> sb.Append |> ignore
    sb.ToString()

let getXhtmlStrictBody_v7 (document : XhtmlStrictDocument_V7) =
    let sb = System.Text.StringBuilder()
    document.Head |> getXhtmlHead |> sb.Append |> ignore
    document.Body.Children |> stringOrBlank |> sprintf "<div>%s</div>" |> renderBody_v3 document.Body.Attributes |> sb.Append |> ignore
    sb.ToString()

let getXhtmlTransitionalBody_v7 (document : XhtmlTransitionalDocument_V7) =
    let sb = System.Text.StringBuilder()
    document.Head |> getXhtmlHead |> sb.Append |> ignore
    document.Body.Children |> stringOrBlank |> sprintf "<div>%s</div>" |> renderBody_v3 document.Body.Attributes |> sb.Append |> ignore
    sb.ToString()

let getXhtmlFramesetBody_v7 (document : XhtmlFramesetDocument_V7) =
    let sb = System.Text.StringBuilder()
    document.Head |> getXhtmlHead |> sb.Append |> ignore
    document.Body.Children |> stringOrBlank |> sprintf "<div>%s</div>" |> renderBody_v3 document.Body.Attributes |> sb.Append |> ignore
    sb.ToString()
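Since these six functions differ only in which head renderer they call, one possible way to collapse them later is a single function that takes the head renderer as an argument. This is only a sketch of that idea under simplified assumptions, not how the functions above are written. BodySketch, renderDocumentBody and toyHead are hypothetical names:

```fsharp
// Hypothetical cut-down body type, for illustration only.
type BodySketch = { Children : string option }

// The head renderer is passed in, so one function serves every head type.
let renderDocumentBody (renderHead : 'head -> string) (head : 'head) (body : BodySketch) =
    let sb = System.Text.StringBuilder()
    sb.Append(renderHead head) |> ignore
    // A missing body renders as an empty <div>.
    let children = defaultArg body.Children ""
    sb.Append(sprintf "<body><div>%s</div></body>" children) |> ignore
    sb.ToString()

// A toy head renderer standing in for getHtmlHead / getXhtmlHead.
let toyHead (title : string) = sprintf "<head><title>%s</title></head>" title

let rendered = renderDocumentBody toyHead "Test" { Children = Some "Hello" }
```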
We also added a sprintf "<div>%s</div>" to maintain (X)HTML compliance in
the <body> section while we develop (the strict HTML 4.01 and XHTML 1.0
doctypes don't allow bare text directly inside <body>). Then getBody
becomes pretty simple:
let getBody_v7 document =
    match document with
    | Html4Document (HtmlDocumentForm_V7.Strict d) | Html5Document d -> d |> getHtmlStrictBody_v7
    | XhtmlDocument (XhtmlDocumentForm_V7.Strict d) -> d |> getXhtmlStrictBody_v7
    | Html4Document (HtmlDocumentForm_V7.Transitional d) -> d |> getHtmlTransitionalBody_v7
    | XhtmlDocument (XhtmlDocumentForm_V7.Transitional d) -> d |> getXhtmlTransitionalBody_v7
    | Html4Document (HtmlDocumentForm_V7.Frameset d) -> d |> getHtmlFramesetBody_v7
    | XhtmlDocument (XhtmlDocumentForm_V7.Frameset d) -> d |> getXhtmlFramesetBody_v7
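The first branch of getBody_v7 can share one handler between HTML 4 Strict and HTML5 because an or-pattern is allowed whenever every alternative binds the same names at the same types. A self-contained sketch of that rule, using hypothetical cut-down types (none of these names come from the library):

```fsharp
// Hypothetical, simplified document types.
type StrictDocSketch = { StrictBody : string }
type LooseDocSketch = { LooseBody : string }

type Html4FormSketch =
    | Strict of StrictDocSketch
    | Transitional of LooseDocSketch

type DocSketch =
    | Html5 of StrictDocSketch
    | Html4 of Html4FormSketch

let renderStrict (d : StrictDocSketch) = sprintf "<body>%s</body>" d.StrictBody
let renderLoose (d : LooseDocSketch) = sprintf "<body>%s</body>" d.LooseBody

let renderSketch doc =
    match doc with
    // Both alternatives bind d as a StrictDocSketch, so they share a branch.
    | Html4 (Strict d) | Html5 d -> renderStrict d
    // Here d is a LooseDocSketch, so it needs its own branch.
    | Html4 (Transitional d) -> renderLoose d
```

Trying to fold the Transitional case into the first branch would fail to compile, since d would have two different types.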
But our printDocument doesn't change:
let printDocument_v7 document =
    let doctype = document |> getDoctype_v7
    let sb = System.Text.StringBuilder()
    doctype |> sb.Append |> ignore
    System.Environment.NewLine |> sb.Append |> ignore
    let attrs =
        match document with
        | Html4Document _ | Html5Document _ -> [] |> Map.ofList
        | XhtmlDocument _ -> [("xmlns", "http://www.w3.org/1999/xhtml")] |> Map.ofList
    document |> getBody_v7 |> renderTag_v3_5 false true "html" attrs |> sb.Append |> ignore
    sb.ToString()

let xhtmlDocument_v7 =
    XhtmlDocument (
        Strict {
            XhtmlStrictDocument_V7.Head =
                { Title = "Test"
                  Base = Some "http://example.com/"
                  Charset = Some Utf8
                  StandardMetas = [(ApplicationName None, "Test Application"); (Description, "A test application."); (Keywords, "test, application")] |> Map.ofList
                  AdditionalMetas = [] |> Map.ofList
                  Scripts = [File "test.js"]
                  Stylesheets = [File "style.css"] }
            Body = { Attributes = [] |> Map.ofList; Children = Some "Test" }
        })

let html5Document_v7 =
    Html5Document {
        HtmlStrictDocument_V7.Head =
            { Title = "Test"
              Base = NewCleanContext |> Target |> Some
              Charset = Some Utf8
              StandardMetas = [(ApplicationName None, "Test Application"); (Description, "A test application."); (Keywords, "test, application")] |> Map.ofList
              AdditionalMetas = [] |> Map.ofList
              Scripts = [File "test.js"]
              Stylesheets = [File "style.css"] }
        Body = { Attributes = [] |> Map.ofList; Children = None }
    }

[xhtmlDocument_v7; html5Document_v7] |> List.map printDocument_v7 |> List.iter (writeFile "TestFile.html")
Ok, ok, so I've gone through a whole bunch of stuff here, but haven't really
described what it all means. Here's the thing: the purpose of this post is
to show you exactly what is possible with the F# type system. It's
designed to show you that we can actually define and use types which create
a safe environment to generate a completely different language. We described
the basic types and processes which are used to generate HTML (at least in
my solution) from F#. I'll be making this whole library (which is far better
written than my poor examples here) open source in the coming weeks, but the
basic idea is something I've been meaning to describe for some time. With
the definitions we have, the next step would be to define parts that can be
assembled, so that we could take an html5Document and add the appropriate
master style-sheets or templating to it. We could take a document and
convert it to a different type, with minimal issue. The goal of this library
was to generate HTML, and offer compile-time safety of that generated HTML,
and as you can see we are already accomplishing that for our <head>.
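Adding master style-sheets to an existing head, as described above, could be sketched with F#'s copy-and-update record expression. The types StyleSourceSketch and HeadSketch and the function withMasterStyles are hypothetical, cut-down stand-ins, not the library's actual API:

```fsharp
// Hypothetical, simplified versions of the style-sheet and head types.
type StyleSourceSketch =
    | File of string
    | Inline of string

type HeadSketch =
    { Title : string
      Stylesheets : StyleSourceSketch list }

// Returns a new head with the master sheets placed first; the input is untouched.
let withMasterStyles (master : StyleSourceSketch list) (head : HeadSketch) =
    { head with Stylesheets = master @ head.Stylesheets }

let page = { Title = "Test"; Stylesheets = [ File "style.css" ] }
let themed = page |> withMasterStyles [ File "master.css" ]
```

Because records are immutable, page still has its single style-sheet after this, which is what makes assembling partial templates in later stages safe.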
You also might be asking yourself what the advantage to something like this
really is. "Elliott, it's not all that hard to write HTML, really." And
you're absolutely correct, HTML is a pretty easy markup language to write,
and browsers are quite forgiving, which means that even when you mess up it
still works out mostly-OK, but why don't we try to do better? I use this
system to generate my HTML because it gives me three major benefits:
- Generate partial templates to be combined in later stages with ease;
- Ensure compliance with the HTML specification, so that we don't forget to
close the appropriate tags;
- Allow layout swaps and transitions with much greater ease.
And I'm not sure about you, but these are the same reasons I use F# when I
can: it gives me these same benefits.
Now I could continue to go on about how we would design this templating
system, but I won't. It would be an excessive amount of explanation for not
a lot of benefit. Instead, I'll conclude our discussion here, and when this
is released I'll go through more of how to use it and why you might
want to.
Until then, farewell, and I hope everyone enjoys the holiday season!
☃☃☃