= Components Of A Hypermedia System
:chapter: 02
:url: /hypermedia-components/

A _hypermedia system_ consists of a number of components, including:

* A hypermedia, such as HTML.
* A network protocol, such as HTTP.
* A server that presents a hypermedia API responding to network requests with hypermedia responses.
* A client that properly interprets those responses.

In this chapter we will look at these components and their implementation in the context of the web.

Once we have reviewed the major components of the web as a hypermedia system, we will look at some key ideas behind this system -- especially as developed by Roy
Fielding in his dissertation, "`Architectural Styles and the Design of Network-based Software Architectures.`" We will see where the
terms REpresentational State Transfer (REST), RESTful and Hypermedia As The Engine Of Application State (HATEOAS) come from,
and we will analyze these terms in the context of the web.

This should give you a stronger understanding of the theoretical basis of the web as a hypermedia system, how it is
supposed to fit together, and why Hypermedia-Driven Applications are RESTful, whereas JSON APIs -- despite the way the
term REST is currently used in the industry -- are not.

== Components Of A Hypermedia System

=== The Hypermedia

The fundamental technology of a hypermedia system is a hypermedia that allows a
client and server to communicate with one another in a dynamic, non-linear fashion. Again, what makes a hypermedia
a hypermedia is the presence of _hypermedia controls_: elements that allow users to select
non-linear actions within the hypermedia. Users can _interact_ with the media in a manner beyond
simply reading from start to end.

We have already mentioned the two primary hypermedia controls in HTML, anchors and forms, which allow a browser to
present links and operations to a user.

((("Uniform Resource Locator (URL)")))
In the case of HTML, these links and forms typically specify the target of their operations using [.dfn]_Uniform Resource
Locators (URLs)_:

Uniform Resource Locator:: A uniform resource locator is a textual string that refers to, or _points to_, a location
on a network from which a _resource_ can be retrieved, as well as the mechanism by which the resource can be retrieved.

A URL is a string consisting of various subcomponents:

.URL Components
----
[scheme]://[userinfo]@[host]:[port][path]?[query]#[fragment]
----

Many of these subcomponents are not required, and are often omitted.

A typical URL might look like this:

.A simple URL
----
https://hypermedia.systems/book/contents/
----

This particular URL is made up of the following components:

* A protocol or scheme (in this case, `https`)
* A domain (in this case, `hypermedia.systems`)
* A path (in this case, `/book/contents/`)
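
As a quick check of the decomposition above, here is a minimal Python sketch using the standard library's `urllib.parse.urlsplit`. The variable names are ours, chosen only for illustration.

```python
from urllib.parse import urlsplit

url = "https://hypermedia.systems/book/contents/"
parts = urlsplit(url)

# The three components named in the list above.
print(parts.scheme)    # https
print(parts.hostname)  # hypermedia.systems
print(parts.path)      # /book/contents/

# Components that this URL omits are reported as empty (or None for the port).
print(repr(parts.query), repr(parts.fragment), parts.port)
```

Subcomponents that are absent from the URL come back as empty strings, or as `None` in the case of the port, which matches the observation that most of them are optional.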

This URL uniquely identifies a retrievable _resource_ on the internet, to which an _HTTP Request_ can be issued by
a hypermedia client that "`speaks`" HTTPS, such as a web browser. If this URL is found as the reference of a
hypermedia control within an HTML document, it implies that there is a _hypermedia server_ on the other side of the
network that understands HTTPS as well, and that can respond to this request with a _representation_ of the given
resource (or redirect you to another location, etc.).

Note that URLs are often not written out entirely within HTML. It is very common to see anchor tags that look like this,
for example:

.A Simple Link
[source, html]
----
<a href="/book/contents/">Table Of Contents</a>
----

Here we have a _relative_ hypermedia reference, where the protocol, host and port are _implied_ to be those of the "`current
document,`" that is, the same protocol and server that were used to retrieve the current HTML page. So, if this
link was found in an HTML document retrieved from `https://hypermedia.systems/`, then the implied URL for this anchor
would be `https://hypermedia.systems/book/contents/`.
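
The resolution rule just described can be tried out with Python's `urllib.parse.urljoin`, which follows the standard algorithm for resolving relative references. This is only a sketch: the `resolve` helper is a hypothetical name of ours, and a real browser also takes things like a `<base>` element into account.

```python
from urllib.parse import urljoin


def resolve(current_document: str, href: str) -> str:
    """Resolve an href found in a document against that document's URL."""
    return urljoin(current_document, href)


# The link from the example above, found on the site's home page.
print(resolve("https://hypermedia.systems/", "/book/contents/"))

# An absolute href is left as it is.
print(resolve("https://hypermedia.systems/", "https://example.com/"))
```

The first call yields `https://hypermedia.systems/book/contents/`: the scheme and host are taken from the current document, and only the path comes from the link.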

=== Hypermedia Protocols

The hypermedia control (link) above tells a browser: "`When a user clicks on this text, issue a request to
`https://hypermedia.systems/book/contents/` using the Hypertext Transfer Protocol,`" or HTTP.

HTTP is the _protocol_ used to transfer HTML (hypermedia) between browsers (hypermedia clients) and servers (hypermedia
servers) and, as such, is the key network technology that binds the distributed hypermedia system of the web together.

HTTP version 1.1 is a relatively simple network protocol, so let's take a look at what the `GET` request triggered by the anchor
tag would look like. This is the request that would be sent to the server found at `hypermedia.systems`, on port `443`
by default (the default for `https` URLs; plain `http` uses port `80`):

[source, http]
----
GET /book/contents/ HTTP/1.1
Accept: text/html,*/*
Host: hypermedia.systems
----

The first line specifies that this is an HTTP `GET` request. It then specifies the path of the resource being
requested. Finally, it contains the HTTP version for this request.

After that comes a series of HTTP _request headers_: individual lines of name/value pairs separated by a colon. The request headers provide
_metadata_ that can be used by the server to determine exactly how to respond to the client request. In this case,
with the `Accept` header, the browser is saying it would prefer HTML as a response format, but will accept any server response.

Next, it has a `Host` header that specifies which server the request has been sent to. This is useful when multiple
domains are served from the same machine.
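
To make the structure of this message concrete, here is a rough Python sketch that splits the request text shown above into its parts. The `parse_request` helper is hypothetical and written only for this illustration; it skips the validation that a real HTTP server performs.

```python
# The request from the example above, with the CRLF line endings HTTP/1.1 uses.
RAW_REQUEST = (
    "GET /book/contents/ HTTP/1.1\r\n"
    "Accept: text/html,*/*\r\n"
    "Host: hypermedia.systems\r\n"
    "\r\n"
)


def parse_request(raw: str):
    """Split a raw HTTP/1.1 request into method, path, version and headers."""
    head, _, _body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    # First line: method, then path, then protocol version.
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        # Each header is a name/value pair separated by a colon.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


method, path, version, headers = parse_request(RAW_REQUEST)
print(method, path, version)
print(headers)
```

Header names are lower-cased here because HTTP treats them as case-insensitive.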

An HTTP response from a server to this request might look something like this:

[source, http]
----
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 870
Server: Werkzeug/2.0.2 Python/3.8.10
Date: Sat, 23 Apr 2022 18:27:55 GMT

<html lang="en">
<body>
  <header>
    <h1>HYPERMEDIA SYSTEMS</h1>
  </header>
  ...
</body>
</html>
----

In the first line, the HTTP Response specifies the HTTP version being used, followed by a _response code_ of `200`,
indicating that the given resource was found and that the request succeeded. This is followed by a string, `OK`, that
corresponds to the response code. (The actual string doesn't matter; it is the response code that tells the client
the result of a request, as we will discuss in more detail below.)

After the first line of the response, as with the HTTP Request, we see a series of _response headers_ that provide
metadata to the client to assist in displaying the _representation_ of the resource correctly.

Finally, we see some new HTML content. This content is the HTML _representation_ of the requested resource, in this
case a table of contents of a book. The browser will use this HTML to replace the entire content in its display window,
showing the user this new page, and updating the address bar to reflect the new URL.
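
The same kind of sketch works for the response. The Python below pulls apart an abbreviated copy of the response above; `parse_response` is again a hypothetical helper of ours, not a library function. Note that the client's decision rests on the numeric code rather than on the accompanying string.

```python
# An abbreviated copy of the response above, with CRLF line endings.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "\r\n"
    "<html lang=\"en\">...</html>"
)


def parse_response(raw: str):
    """Split a raw HTTP/1.1 response into version, code, reason, headers, body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The reason string may contain spaces, so split at most twice.
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


version, code, reason, headers, body = parse_response(RAW_RESPONSE)

# Success is judged from the numeric code alone; the reason string is ignored.
succeeded = 200 <= code <= 299
print(version, code, succeeded, headers["content-type"])
```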

==== HTTP methods

((("HTTP methods")))
((("HTTP methods", GET)))
((("HTTP methods", POST)))
((("HTTP methods", PUT)))
((("HTTP methods", PATCH)))
((("HTTP methods", DELETE)))

The anchor tag above issued an HTTP `GET`, where `GET` is the _method_ of the request. The particular method
being used in an HTTP request is perhaps the most important piece of information about it, after the actual resource that
the request is directed at.

There are many methods available in HTTP; the ones of most practical importance to developers are the following:

`GET`::
A GET request retrieves the representation of the specified resource. GET requests should not mutate data.

`POST`::
A POST request submits data to the specified resource. This will often result in a mutation of state on the server.

`PUT`::
A PUT request replaces the data of the specified resource. This results in a mutation of state on the server.

`PATCH`::
A PATCH request applies partial modifications to the data of the specified resource. This results in a mutation of state on the server.

`DELETE`::
A DELETE request deletes the specified resource. This results in a mutation of state on the server.

(((CRUD)))
These methods _roughly_ line up with the "`Create/Read/Update/Delete`" or CRUD pattern found in many applications:

* `POST` corresponds with Creating a resource.
* `GET` corresponds with Reading a resource.
* `PUT` and `PATCH` correspond with Updating a resource.
* `DELETE` corresponds, well, with Deleting a resource.

.Put vs. Post
****
While HTTP methods correspond roughly to CRUD, they are not the same. The technical specifications for these methods make
no such connection, and are often somewhat difficult to read. Here, for example, is the documentation
on the distinction between a `POST` and a `PUT` from https://www.rfc-editor.org/rfc/rfc7231[RFC-7231].

[quote, RFC-7231, https://www.rfc-editor.org/rfc/rfc7231#section-4.3.4]
____
The target resource in a POST request is intended to handle the enclosed representation according to the
resource's own semantics, whereas the enclosed representation in a PUT request is defined as replacing the state of the
target resource. Hence, the intent of PUT is idempotent and visible to intermediaries, even though the exact
effect is only known by the origin server.
____

In plain terms, a `POST` can be handled by a server pretty much however it likes, whereas a `PUT` should be handled
as a "`replacement`" of the resource, although the language, once again, allows the server to do pretty much whatever it
would like within the constraint of being https://developer.mozilla.org/en-US/docs/Glossary/Idempotent[_idempotent_].
****
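
One way to picture the difference is with a small in-memory sketch in Python. The `contacts` dictionary and the `put` and `post` functions are hypothetical stand-ins for a server's request handlers, not part of any framework. In particular, having `post` create a new contact on every call is just one of the many behaviors a server is free to choose.

```python
# Hypothetical in-memory "server state": contact id -> contact data.
contacts = {}


def put(contact_id, data):
    """Replace the state of the resource identified by contact_id."""
    contacts[contact_id] = dict(data)


def post(data):
    """Handle the data per the collection's own semantics: add a new contact."""
    new_id = max(contacts, default=0) + 1
    contacts[new_id] = dict(data)
    return new_id


# Repeating a PUT leaves the server in the same state: it is idempotent.
put(42, {"name": "Joe"})
put(42, {"name": "Joe"})
count_after_puts = len(contacts)

# Repeating this POST changes the state again each time: it is not idempotent.
post({"name": "Joe"})
post({"name": "Joe"})
count_after_posts = len(contacts)

print(count_after_puts, count_after_posts)
```

This is why a client or intermediary can safely retry a `PUT` after a network failure, whereas retrying a `POST` may duplicate its effect.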

In a properly structured HTML-based hypermedia system you would use an appropriate HTTP method for the operation a
particular hypermedia control performs. For example, if a hypermedia control such as a button _deletes_ a resource,
ideally it should issue an HTTP `DELETE` request to do so.

A strange thing about HTML, though, is that the native hypermedia controls can only issue HTTP `GET` and `POST` requests.

Anchor tags always issue a `GET` request.

Forms can issue either a `GET` or `POST` using the `method` attribute.

Even though HTML -- the world's most popular hypermedia -- was designed alongside
HTTP (which is the Hypertext Transfer Protocol, after all!), if you wish to issue `PUT`, `PATCH` or `DELETE` requests
you currently _have to_ resort to JavaScript to do so. Since a `POST` can do almost anything, it ends up being used for
any mutation on the server, and `PUT`, `PATCH` and `DELETE` are left aside in plain HTML-based
applications.

This is an obvious shortcoming of HTML as a hypermedia; it would be wonderful to see this fixed in the
HTML specification. For now, in Chapter 4, we'll discuss ways to get around this.

==== HTTP response codes

HTTP request methods allow a client to tell a server _what_ to do to a given resource. HTTP responses contain
_response codes_, which tell a client what the result of the request was. HTTP response codes are numeric
values that are embedded in the HTTP response, as we saw above.

The most familiar response code for web developers is probably `404`, which stands for "`Not Found.`" This
is the response code that is returned by web servers when a resource that does not exist is requested from them.

((("HTTP response", codes)))
HTTP breaks response codes up into various categories:

`100`-`199`::
Informational responses that provide information about how the server is processing the request.

`200`-`299`::
Successful responses indicating that the request succeeded.

`300`-`399`::
Redirection responses indicating that the request should be sent to some other URL.

`400`-`499`::
Client error responses indicating that the client made some sort of bad request (e.g., asking for something that didn't
exist in the case of `404` errors).

`500`-`599`::
Server error responses indicating that the server encountered an error internally as it attempted to respond to the request.
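
Because the category is carried by the first digit of the code, a client can classify a response it has never seen before. A minimal Python sketch of that idea follows; the `category` function and the `CATEGORIES` table are our own illustrative names, while `http.HTTPStatus` is the standard library's table of registered codes.

```python
from http import HTTPStatus

# The five ranges listed above, keyed by the first digit of the code.
CATEGORIES = {
    1: "informational",
    2: "successful",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def category(code: int) -> str:
    """Return the category of an HTTP response code based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP response code: {code}")
    return CATEGORIES[code // 100]


for code in (200, 404):
    # HTTPStatus supplies the conventional reason phrase for a known code.
    print(code, HTTPStatus(code).phrase, "->", category(code))

# A code without a well-known phrase can still be classified by its range.
print(499, "->", category(499))
```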

Within each of these categories there are multiple response codes for specific situations.

Here are some of the more common or interesting ones:

`200 OK`::
The HTTP request succeeded.

`301 Moved Permanently`::
The URL for the requested resource has moved to a new location permanently, and the new URL will be provided in
the `Location` response header.

`302 Found`::
The URL for the requested resource has moved to a new location temporarily, and the new URL will be provided in
the `Location` response header.

`303 See Other`::
The URL for the requested resource has moved to a new location, and the new URL will be provided in
the `Location` response header. Additionally, this new URL should be retrieved with a `GET` request.

`401 Unauthorized`::
The client is not yet authenticated (yes, authenticated, despite the name) and must be authenticated
to retrieve the given resource.

`403 Forbidden`::
The client does not have access to this resource.

`404 Not Found`::
The server cannot find the requested resource.

`500 Internal Server Error`::
The server encountered an error when attempting to process the request.

There are some fairly subtle differences between HTTP response codes (and, to be honest, some ambiguities between them).
The difference between a `302` redirect and a `303` redirect, for example, is that the former was originally specified to
re-issue the request to the new URL using the same HTTP method as the initial request (although, in practice, many clients
change a redirected `POST` into a `GET`), whereas the latter will always use a `GET`. This is a small
but often crucial difference, as we will see later in the book.

A well crafted Hypermedia-Driven Application will take advantage of both HTTP methods and HTTP response codes to create
a sensible hypermedia API. You do not want to build a Hypermedia-Driven Application that uses a `POST` method for all
requests and responds with `200 OK` for every response, for example. (Some JSON Data APIs built on top of HTTP do exactly
this!)

When building a Hypermedia-Driven Application, you want, instead, to go "`with the grain`" of the web and use HTTP methods
and response codes as they were designed to be used.

==== Caching HTTP responses

((("HTTP response", caching)))
A constraint of REST (and, therefore, a feature of HTTP) is the notion of caching responses: a server can indicate to
a client (as well as intermediary HTTP servers) that a given response can be cached for future requests to the same
URL.

((("HTTP response header","Cache-Control")))
The cache behavior of an HTTP response from a server can be indicated with the `Cache-Control` response header. This
header can have a number of different values indicating the cacheability of a given response. If, for example, the header
contains the value `max-age=60`, this indicates that a client may cache this response for 60 seconds, and need not issue
another HTTP request for that resource until that time limit has expired.
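
The `max-age` rule can be sketched in a few lines of Python. This is a deliberate simplification under stated assumptions: the `max_age` and `is_fresh` helpers are hypothetical, and they look only at `max-age`, ignoring the other `Cache-Control` directives that a real cache must honor.

```python
def max_age(cache_control: str):
    """Return the max-age value in seconds, or None if the header has none."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """True while a cached response may be reused without a new request."""
    limit = max_age(cache_control)
    return limit is not None and age_seconds < limit


print(is_fresh("max-age=60", 30))  # cached half a minute ago
print(is_fresh("max-age=60", 61))  # the time limit has expired
```

The first call reports that the cached copy can be reused; the second reports that the client needs to issue another request.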

((("HTTP response header", Vary)))
Another important caching-related response header is `Vary`. This response header can be used to indicate exactly what
headers in an HTTP Request form the unique identifier for a cached result. This becomes important to allow the browser
to correctly cache content in situations where a particular header affects the form of the server response.

((("HTTP response header", custom)))
(((HX-Request, about)))
A common pattern in htmx-powered applications, for example, is to use a custom header set by htmx, `HX-Request`, to
differentiate between "`normal`" web requests and requests submitted by htmx. To properly cache the response to these
requests, the `HX-Request` request header must be listed in the `Vary` response header.
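
To see why this matters, consider how a cache might build its lookup key. The Python below is an illustrative sketch, not how any particular browser implements its cache; the `cache_key` function and the `/contacts` URL are assumptions made for the example.

```python
def cache_key(method, url, request_headers, vary):
    """Build a cache key from the method, the URL and the headers named by Vary."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    varied_names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    # Only the request headers listed in Vary take part in the key.
    return (method, url, tuple((n, lowered.get(n)) for n in varied_names))


normal_request = {"Accept": "text/html"}
htmx_request = {"Accept": "text/html", "HX-Request": "true"}

# With "Vary: HX-Request" the two kinds of request are cached separately.
separate = (
    cache_key("GET", "/contacts", normal_request, "HX-Request")
    != cache_key("GET", "/contacts", htmx_request, "HX-Request")
)

# Without it, both requests map to the same cache entry.
shared = (
    cache_key("GET", "/contacts", normal_request, "")
    == cache_key("GET", "/contacts", htmx_request, "")
)

print(separate, shared)
```

In the second case a partial response produced for htmx could be served from the cache for a normal page load of the same URL, or the other way around.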

A full discussion of caching HTTP responses is beyond the scope of this chapter; see
the https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching[MDN Article on HTTP Caching] if you would like to know more
on the topic.

=== Hypermedia Servers

A hypermedia server is any server that can respond to an HTTP request with an HTTP response. Because HTTP is so simple,
this means that nearly any programming language can be used to build a hypermedia server. There are a vast number of
libraries available for building HTTP-based hypermedia servers in nearly every programming language imaginable.

This turns out to be one of the best aspects of adopting hypermedia as your primary technology for building a web application: it removes
the pressure to adopt JavaScript as a backend technology. If you use a JavaScript-heavy Single Page Application-based
front end, and you use JSON Data APIs, you are going to feel significant pressure to deploy JavaScript on the back end as
well.

In this latter situation, you already have a ton of code written in JavaScript. Why maintain two separate code bases in
two different languages? Why not create reusable domain logic on the client-side as well as the server-side? Now that
JavaScript has excellent server-side technologies available like Node and Deno, why not just use a single language for
everything?

In contrast, building a Hypermedia-Driven Application gives you a lot more freedom in picking the back end technology you want
to use. Your decision can be based on the domain of your application, what languages and server software you are familiar
with or are passionate about, or just what you feel like trying out.

You certainly aren't writing your server-side logic in HTML! And every major programming language has at least one good
web framework and templating library that can be used to handle HTTP requests cleanly.

If you are doing something in big data, perhaps you'd like to use Python, which has tremendous support for that
domain.

If you are doing AI work, perhaps you'd like to use Lisp, leaning on a language with a long history in that area of research.

Maybe you are a functional programming enthusiast and want to use OCaml or Haskell. Perhaps you just really like Julia or Nim.

These are all perfectly valid reasons for choosing a particular server-side technology!

By using hypermedia as your system architecture, you are freed up to adopt any of these choices. There simply isn't a
large JavaScript code base on the front end pressuring you to adopt JavaScript on the back end.

.Hypermedia On Whatever you'd Like (HOWL)
****
In the htmx community we call this (with tongue in cheek) the HOWL stack: Hypermedia On Whatever you'd Like. The htmx community
is multi-language and multi-framework: there are rubyists as well as pythonistas, lispers as well as haskellers. There
are even JavaScript enthusiasts! All these languages and frameworks are able to adopt hypermedia, and are still able to
share techniques and offer support to one another because they share a common underlying architecture: they are all using
the web as a hypermedia system.

Hypermedia, in this sense, provides a "`universal language`" for the web that we can all use.
****

=== Hypermedia Clients

((("web browsers")))
We now come to the final major component in a hypermedia system: the hypermedia client. Hypermedia _clients_ are pieces of software
that understand how to interpret a particular hypermedia, and the hypermedia controls within it, properly. The canonical
example, of course, is the web browser, which understands HTML and can present it to a user to interact with. Web browsers
are incredibly sophisticated pieces of software. (So sophisticated, in fact, that they are often re-purposed away from
being a hypermedia client, to being a sort of cross-platform virtual machine for launching Single Page Applications.)

Browsers aren't the only hypermedia clients out there, however. In the last section of this book we will look at
Hyperview, a mobile-oriented hypermedia. One of the outstanding features of Hyperview is that it doesn't simply provide
a hypermedia, HXML, but also provides a _working hypermedia client_ for that hypermedia. This makes building a proper
Hypermedia-Driven Application with Hyperview extremely easy.

A crucial feature of a hypermedia system is what is known as _the uniform interface_. We discuss this concept in depth
in the next section on REST. What is often ignored in discussions about hypermedia is how important the hypermedia
client is in taking advantage of this uniform interface. A hypermedia client must know how to properly interpret and
present hypermedia controls found in a hypermedia response from a hypermedia server for the whole hypermedia system
to hang together. Without a sophisticated client that can do this, hypermedia controls and a hypermedia-based API are
much less useful.

This is one reason why JSON APIs have rarely adopted hypermedia controls successfully: JSON APIs are typically consumed
by code that is expecting a fixed format and that isn't designed to be a hypermedia client. This is totally understandable:
building a good hypermedia client is hard! For JSON API clients like this, the
power of hypermedia controls embedded within an API response is irrelevant and often simply annoying:

[quote, Freddie Karlbom,https://techblog.commercetools.com/graphql-and-rest-level-3-hateoas-70904ff1f9cf]
____
The short answer to this question is that HATEOAS isn’t a good fit for most modern use cases for APIs. That is why
after almost 20 years, HATEOAS still hasn’t gained wide adoption among developers. GraphQL on the other hand is spreading
like wildfire because it solves real-world problems.
____

HATEOAS will be described in more detail below, but the takeaway here is that a good hypermedia client is a necessary
component within a larger hypermedia system.

== REST

Now that we have reviewed the major components of a hypermedia system, it's time to look more deeply into the concept of
REST. The term "`REST`" comes from Roy Fielding's PhD dissertation on the architecture
of the web. Fielding wrote his dissertation at U.C. Irvine, after having helped build much of the infrastructure of the early
web, including the Apache web server. Roy was attempting to formalize and describe the novel distributed computing system
that he had helped to build.

We are going to focus on what we feel is the most important section of Fielding's writing, from a web development
perspective: Section 5.1. This section contains the core concepts (Fielding calls them _constraints_) of Representational
State Transfer, or REST.

Before we get into the muck, however, it is important to understand that Fielding discusses REST as a _network architecture_,
that is, as an entirely different way to architect a distributed system. And, further, as a novel network
architecture that should be _contrasted_ with earlier approaches to distributed systems.

It is also important to emphasize that, at the time Fielding wrote his dissertation, JSON APIs and AJAX did not exist.
He was describing the early web, with HTML being transferred over HTTP by early browsers, as a hypermedia system.

Today, in a strange turn of events, the term "`REST`" is mainly associated with JSON Data APIs, rather than with HTML
and hypermedia. This is extremely funny once you realize that the vast majority of JSON Data APIs aren't
RESTful, in the original sense, and, in fact, _can't_ be RESTful, since they aren't using a natural hypermedia format.

To re-emphasize: REST, as coined by Fielding, describes _the pre-API web_, and letting go of the current, common
usage of the term REST to simply mean "`a JSON API`" is necessary to develop a proper understanding of the idea.

=== The "`Constraints`" of REST

((("Fielding, Roy")))
(((REST, constraints)))
In his dissertation, Fielding defines various "`constraints`" to describe how a RESTful system must behave. This approach
can feel a little round-about and difficult to follow for many people, but it is an appropriate approach for an academic
document. Given a bit of time thinking about the constraints he outlines and some concrete examples of those
constraints, it will become easy to assess whether a given system actually satisfies the architectural requirements of
REST or not.

Here are the constraints of REST Fielding outlines:

* It is a client-server architecture (section 5.1.2).
* It must be stateless (section 5.1.3); that is, every request contains all information necessary to respond to that request.
* It must allow for caching (section 5.1.4).
* It must have a _uniform interface_ (section 5.1.5).
* It is a layered system (section 5.1.6).
* Optionally, it can allow for Code-On-Demand (section 5.1.7), that is, scripting.

Let's go through each of these constraints in turn and discuss them in detail, looking at how (and to what extent) the web
satisfies each of them.

=== The Client-Server Constraint

See https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_1_2[Section 5.1.2] for the
Client-Server constraint.

The REST model Fielding was describing involved both _clients_ (browsers, in the case of the web) and _servers_ (such
as the Apache Web Server he had been working on) communicating via a network connection. This was the context of his
work: he was describing the network architecture of the World Wide Web, and contrasting it with earlier architectures,
notably thick-client networking models such as the Common Object Request Broker Architecture (CORBA).

It should be obvious that any web application, regardless of how it is designed, will satisfy this requirement.

=== The Statelessness Constraint

See https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_1_3[Section 5.1.3] for the Stateless constraint.

As described by Fielding, a RESTful system is stateless: every request should encapsulate all information necessary to
respond to that request, with no side state or context stored on the server (any session state is kept entirely on the client).

In practice, for many web applications today, we actually violate this constraint: it is common to establish a
_session cookie_ that acts as a unique identifier for a given user and that is sent along with every request. While this
session cookie is, by itself, not stateful (it is sent with every request), it is typically
used as a key to look up information stored on the server, in what is usually termed "`the session.`"

This session information is typically stored in some sort of shared storage across multiple web servers, holding things
like the current user's email or id, their roles, partially created domain objects, caches, and so forth.

This violation of the Statelessness REST architectural constraint has proven to be useful for building web applications
and does not appear to have had a major impact on the overall flexibility of the web. But it is worth bearing in mind that
even Web 1.0 applications often violate the purity of REST in the interest of pragmatic trade-offs.

And it must be said that sessions _do_ cause additional operational complexity headaches when deploying hypermedia
servers; these may need shared access to session state information stored across an entire cluster. So
Fielding was correct in pointing out that an ideal RESTful system, one that did not violate this constraint, would be simpler and therefore more robust.
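
The session pattern described in this section can be sketched as follows. This is a toy illustration in Python under loud assumptions: `SESSIONS`, `log_in`, `handle_request` and the `session_id` cookie name are all hypothetical, and a real application would rely on its web framework's session support and a shared store rather than a module-level dictionary.

```python
import secrets

# Hypothetical server-side session store: session id -> session data.
# This dictionary is exactly the "side state" that the constraint rules out.
SESSIONS = {}


def log_in(email: str) -> str:
    """Create a session and return the id the browser will send back as a cookie."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"email": email}
    return session_id


def handle_request(cookies: dict) -> str:
    """Answer a request using state that is NOT contained in the request itself."""
    session = SESSIONS.get(cookies.get("session_id"))
    if session is None:
        return "401 Unauthorized"
    return f"200 OK for {session['email']}"


cookie_value = log_in("joe@example.com")
print(handle_request({"session_id": cookie_value}))
print(handle_request({}))
```

The request carries only an opaque identifier, so any server that handles it needs access to `SESSIONS`. That is the operational cost noted above once more than one server is involved.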

=== The Caching Constraint

See https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_1_4[Section 5.1.4] for the Caching constraint.

This constraint states that a RESTful system should support the notion of caching, with explicit information on the
cache-ability of responses for future requests of the same resource. This allows both clients and intermediary
servers between a given client and final server to cache the results of a given request.

As we discussed earlier, HTTP has a sophisticated caching mechanism via response headers that is often overlooked or
underutilized when building hypermedia applications. Given the existence of this functionality, however, it is
easy to see how this constraint is satisfied by the web.

=== The Uniform Interface Constraint

Now we come to the most interesting and, in our opinion, most innovative constraint in REST: that of the _uniform interface_.

This constraint is the source of much of the _flexibility_ and _simplicity_ of a hypermedia system, so we are going to
spend some time on it.

See https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_1_5[Section 5.1.5] for the Uniform Interface
constraint.

In this section, Fielding says:

[quote, Roy Fielding, Architectural Styles and the Design of Network-based Software Architectures]
____
The central feature that distinguishes the REST architectural style from other network-based styles is its emphasis on
a uniform interface between components... In order to obtain a uniform interface, multiple architectural constraints
are needed to guide the behavior of components. REST is defined by four interface constraints: identification of
resources; manipulation of resources through representations; self-descriptive messages; and, hypermedia as the engine
of application state.
____

So we have four sub-constraints that, taken together, form the Uniform Interface constraint.

==== Identification of resources

In a RESTful system, resources should have a unique identifier. Today the concept of Uniform Resource Locators (URLs) is
common, but at the time of Fielding's writing they were still relatively new and novel.

What might be more interesting today is the notion of a _resource_ being identified in this way: in a RESTful system, _any_ sort of
data that can be referenced, that is, the target of a hypermedia reference, is considered a resource. URLs, though common
enough today, end up solving the very complex problem of uniquely identifying any and every resource on the internet.

==== Manipulation of resources through representations

In a RESTful system, _representations_ of the resource are transferred between clients and servers. These
representations can contain both data and metadata about the request (such as "`control data`" like an HTTP
method or response code). A particular data format or _media type_ may be used to present a given resource to a client,
and that media type can be negotiated between the client and the server.

We saw this latter aspect of the uniform interface in the `Accept` header in the requests above.
|
||
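To make media type negotiation a little more concrete, here is a minimal Python sketch of how a server might pick a representation from an `Accept` header. The helper names `parse_accept` and `choose_media_type` are hypothetical, and the logic is deliberately simplified: it honors `q` values and wildcards but ignores the finer precedence rules of real HTTP content negotiation.

```python
from __future__ import annotations


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_type, quality) pairs, best first."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_type, *params = [p.strip() for p in part.split(";")]
        quality = 1.0  # a missing q parameter means full preference
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        entries.append((media_type.lower(), quality))
    # sorted() is stable, so entries with equal quality keep header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_media_type(accept_header: str, available: list[str]) -> str | None:
    """Return the first available media type the client accepts, or None.

    `available` is assumed to be non-empty and ordered by server preference.
    """
    if not accept_header.strip():
        return available[0]  # no Accept header: any representation will do
    for media_type, quality in parse_accept(accept_header):
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        if media_type == "*/*":
            return available[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # "text/*" -> "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
        elif media_type in available:
            return media_type
    return None
```

With a server that can produce both `text/html` and `application/json` for the same resource, a browser-style `Accept` header selects the HTML representation, while a client that asks only for `application/json` receives JSON. The resource and its URL stay the same; only the representation differs.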
|
||
==== Self-descriptive messages

((("self-descriptive messages")))
The Self-Descriptive Messages constraint, combined with the next one, HATEOAS, forms what we consider to be the core of
the Uniform Interface and of REST, and is the reason hypermedia provides such a powerful system architecture.

The Self-Descriptive Messages constraint requires that, in a RESTful system, messages must be _self-describing_.

This means that _all information_ necessary to both display _and also operate_ on the data being represented must be
present in the response. In a properly RESTful system, there can be no additional "`side`" information necessary for a
client to transform a response from a server into a useful user interface. Everything must "`be in`" the message itself,
in the form of hypermedia controls.

This might sound a little abstract, so let's look at a concrete example.

Consider two different potential responses from an HTTP server for the URL `\https://example.com/contacts/42`.

Both responses return information about a contact, but they take very different forms.

The first implementation returns an HTML representation:

[source,html]
----
<html lang="en">
<body>
<h1>Joe Smith</h1>
<div>
    <div>Email: joe@example.org</div>
    <div>Status: Active</div>
</div>
<p>
    <a href="/contacts/42/archive">Archive</a>
</p>
</body>
</html>
----

The second implementation returns a JSON representation:

[source,json]
----
{
  "name": "Joe Smith",
  "email": "joe@example.org",
  "status": "Active"
}
----

What can we say about the differences between these two responses?

One thing that may initially jump out at you is that the JSON representation is smaller than the HTML
representation. Fielding notes exactly this trade-off when using a RESTful architecture:

[quote, Roy Fielding, Architectural Styles and the Design of Network-based Software Architectures]
____
The trade-off, though, is that a uniform interface degrades efficiency, since information is transferred in a
standardized form rather than one which is specific to an application's needs.
____

So REST _trades off_ representational efficiency for other goals.

To understand these other goals, first notice that the HTML representation has a hyperlink in it to navigate to a page
to archive the contact. The JSON representation, in contrast, does not have this link.

What are the ramifications of this fact for a _client_ of the JSON API?

((("JSON API", "vs. HTML")))
What this means is that the JSON API client must know _in advance_ exactly what other URLs (and request methods) are
available for working with the contact information. If the JSON client is able to update this contact in some way, it
must know how to do so from some source of information _external_ to the JSON message. If the contact has a different
status, say "`Archived`", does this change the allowable actions? If so, what are the new allowable actions?

The source of all this information might be API documentation, word of mouth or, if the developer controls both the server
and the client, internal knowledge. But this information is implicit and _outside_ the response.
|
||
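To see what this external knowledge looks like in practice, consider a minimal Python sketch of a JSON API client. The `ACTIONS_BY_STATUS` table, the `actions_for` helper and the use of `POST` are all hypothetical: they stand in for whatever a developer would copy by hand out of API documentation, since none of it appears in the JSON message.

```python
import json

# Out-of-band knowledge: nothing in the JSON response says any of this.
# The labels, methods and URL templates are assumptions for illustration,
# as if transcribed from (hypothetical) API documentation.
ACTIONS_BY_STATUS = {
    "Active": [("Archive", "POST", "/contacts/{id}/archive")],
    "Archived": [("Unarchive", "POST", "/contacts/{id}/unarchive")],
}


def actions_for(contact_id: int, response_body: str) -> list:
    """Work out which actions to offer for a contact returned as JSON."""
    contact = json.loads(response_body)
    # The client has to interpret the "status" field itself.
    templates = ACTIONS_BY_STATUS.get(contact["status"], [])
    return [
        (label, method, path.format(id=contact_id))
        for label, method, path in templates
    ]
```

If the server adds a status, renames a URL or changes a request method, this table is silently out of date until someone edits and redeploys the client.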
|
||
Contrast this with the hypermedia (HTML) response. In this case, the hypermedia client (that is, the browser) needs
only to know how to render the given HTML. It doesn't need to understand what actions are available for this contact:
they are simply encoded _within_ the HTML response itself as hypermedia controls. It doesn't need to understand what
the status field means. In fact, the client doesn't even know what a contact is!

The browser, our hypermedia client, simply renders the HTML and allows the user, who presumably understands the concept
of a Contact, to make a decision on what action to pursue from the actions made available in the representation.

This difference between the two responses demonstrates the crux of REST and hypermedia, what makes them so powerful
and flexible: clients (again, web browsers) don't need to understand _anything_ about the underlying resources being
represented.

Browsers only (only! As if it is easy!) need to understand how to interpret and display hypermedia, in this case HTML. This
gives hypermedia-based systems unprecedented flexibility in dealing with changes to both the backing representations and
to the system itself.


==== Hypermedia As The Engine of Application State (HATEOAS)

(((HATEOAS)))
The final sub-constraint on the Uniform Interface is that, in a RESTful system, hypermedia should be "`the engine of
application state.`" This is sometimes abbreviated as "`HATEOAS`", although Fielding prefers to use the terminology
"`the hypermedia constraint`" when discussing it.

This constraint is closely related to the previous self-descriptive messages constraint. Let us consider again the two different
implementations of the endpoint `/contacts/42`, one returning HTML and one returning JSON. Let's update the situation
such that the contact identified by this URL has now been archived.

What do our responses look like?

The first implementation returns the following HTML:

[source,html]
----
<html lang="en">
<body>
<h1>Joe Smith</h1>
<div>
    <div>Email: joe@example.org</div>
    <div>Status: Archived</div>
</div>
<p>
    <a href="/contacts/42/unarchive">Unarchive</a>
</p>
</body>
</html>
----

The second implementation returns the following JSON representation:

[source,json]
----
{
  "name": "Joe Smith",
  "email": "joe@example.org",
  "status": "Archived"
}
----

The important point to notice here is that, by virtue of being a self-describing message, the HTML response now shows that
the "`Archive`" operation is no longer available, and a new "`Unarchive`" operation has become available. The HTML representation
of the contact _encodes_ the state of the application; it encodes exactly what can and cannot be done with this particular
representation, in a way that the JSON representation does not.

A client interpreting the JSON response must, again, understand not only the general concept of a Contact,
but also specifically what the "`status`" field with the value "`Archived`" means. It must know exactly what operations
are available on an "`Archived`" contact, to appropriately display them to an end user. The state of the application is
not encoded in the response, but rather conveyed through a mix of raw data and side channel information such as
API documentation.

Furthermore, in the majority of front-end SPA frameworks today, this contact information would live _in memory_ in a
JavaScript object representing a model of the contact, while the page data is held in the browser's https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model[Document Object Model] (DOM). The DOM would be updated based on changes to this model, that
is, the DOM would "`react`" to changes to this backing JavaScript model.

This approach is certainly _not_ using Hypermedia As The Engine Of Application State: rather, it is using a JavaScript
model as the engine of application state, and synchronizing that model with a server and with the browser.

With the HTML approach, the Hypermedia is, indeed, The Engine Of Application State: there is no additional model on the
client side, and all state is expressed directly in the hypermedia, in this case HTML. As state changes on the server,
it is reflected in the representation (that is, HTML) sent back to the client. The hypermedia client (a browser) doesn't know
anything about contacts, what the concept of "`Archiving`" is, or anything else about the particular domain model for this
response: it simply knows how to render HTML.

Because a hypermedia client doesn't need to know anything about the server model beyond how to render hypermedia to
a user, it is incredibly flexible with respect to the representations it receives and displays to users.
|
||
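The server side of this arrangement can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the `render_contact` function and the shape of the `contact` dictionary are assumptions, and a real application would normally use a templating library. The point is that the decision about which operations are available is made on the server and shipped to the client as hypermedia controls.

```python
from html import escape


def render_contact(contact: dict) -> str:
    """Render a contact as HTML, including only the controls valid for its state."""
    # The server, which owns the domain model, decides what can be done next.
    if contact["status"] == "Archived":
        action, label = "unarchive", "Unarchive"
    else:
        action, label = "archive", "Archive"
    base = f"/contacts/{contact['id']}"
    return "\n".join([
        '<html lang="en">',
        "<body>",
        f"<h1>{escape(contact['name'])}</h1>",
        "<div>",
        f"    <div>Email: {escape(contact['email'])}</div>",
        f"    <div>Status: {escape(contact['status'])}</div>",
        "</div>",
        "<p>",
        f'    <a href="{base}/{action}">{label}</a>',
        "</p>",
        "</body>",
        "</html>",
    ])
```

Calling `render_contact` with an active contact produces the "`Archive`" link, and calling it with an archived contact produces the "`Unarchive`" link. The browser runs the same code in both cases: it renders whatever HTML it is given.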
|
||
==== HATEOAS & API churn

This last point is critical to understanding the flexibility of hypermedia, so let's look
at a practical example of it in action. Consider a situation where a new feature has been added to the web application with these
two endpoints. This feature allows you to send a message to a given Contact.

How would this change each of the two responses (HTML and JSON) from the server?

The HTML representation might now look like this:

[source,html]
----
<html lang="en">
<body>
<h1>Joe Smith</h1>
<div>
    <div>Email: joe@example.org</div>
    <div>Status: Active</div>
</div>
<p>
    <a href="/contacts/42/archive">Archive</a>
    <a href="/contacts/42/message">Message</a>
</p>
</body>
</html>
----

The JSON representation, on the other hand, might look like this:

[source,json]
----
{
  "name": "Joe Smith",
  "email": "joe@example.org",
  "status": "Active"
}
----

Note that, once again, the JSON representation is unchanged. There is no indication of this new functionality. Instead,
a client must _know_ about this change, presumably via some shared documentation between the client and the server.

Contrast this with the HTML response. Because of the uniform interface of the RESTful model and, in particular,
because we are using Hypermedia As The Engine Of Application State, no such exchange of documentation is necessary! Instead,
the client (a browser) simply renders the new HTML with this operation in it, making this operation available for the end user
without any additional coding changes.

A pretty neat trick!

Now, in this case, if the JSON client is not properly updated, the error state is relatively benign: a new bit of functionality
is simply not made available to users. But consider a more severe change to the API: what if the archive functionality
was removed? Or what if the URLs or the HTTP methods for these operations changed in some way?

In this case, the JSON client may be broken in a much more serious manner.

The HTML response, however, would simply be updated to exclude the removed options or to update the URLs used for them. Clients
would see the new HTML, display it properly, and allow users to select whatever the new set of operations happens to be. Once
again, the uniform interface of REST has proven to be extremely flexible: despite a potentially radically new layout
for our hypermedia API, clients continue to work.
|
||
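A browser is far too large to reproduce here, but the idea of a client that discovers operations from the response can be approximated with a small Python sketch. The `LinkCollector` class and `available_actions` function are hypothetical names; they use the standard library's `html.parser` module to list the links found in a response, with no knowledge of contacts, archiving or messaging.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (label, href) pairs for every <a href="..."> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the anchor currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def available_actions(html_text: str) -> list:
    """Return the operations a response offers, as found in its links."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


OLD_RESPONSE = '<p><a href="/contacts/42/archive">Archive</a></p>'
NEW_RESPONSE = (
    '<p><a href="/contacts/42/archive">Archive</a>'
    '<a href="/contacts/42/message">Message</a></p>'
)
```

Feeding `NEW_RESPONSE` to `available_actions` yields the additional "`Message`" operation without any change to the client code, and a response that dropped or renamed the archive link would be handled the same way.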
|
||
An important fact emerges from this: due to this flexibility, hypermedia APIs _do not have the versioning headaches
that JSON Data APIs do_.

Once a Hypermedia-Driven Application has been "`entered into`" (that is, loaded through some entry point URL), all functionality
and resources are surfaced through self-describing messages. Therefore, there is no need to exchange documentation with
the client: the client simply renders the hypermedia (in this case HTML) and everything works out. When a change occurs,
there is no need to create a new version of the API: clients simply retrieve updated hypermedia, which encodes the new
operations and resources in it, and display it to users to work with.

=== Layered System

The final "`required`" constraint on a RESTful system that we will consider is The Layered System constraint. This constraint can be found in https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_1_6[Section 5.1.6] of Fielding's dissertation.

To be frank, after the excitement of the uniform interface constraint, the "`layered system`" constraint is a bit of a
letdown. But it is still worth understanding, and it is actually utilized effectively by the web. The constraint
requires that a RESTful architecture be "`layered,`" allowing for multiple servers to act as intermediaries between
a client and the eventual "`source of truth`" server.

These intermediary servers can act as proxies, transform requests and responses in transit, and so forth.

A common modern example of this layering feature of REST is the use of Content Delivery Networks (CDNs) to deliver unchanging
static assets to clients more quickly, by storing the response from the origin server in intermediate servers more
closely located to the client making a request.

This allows content to be delivered more quickly to the end user and reduces load on the origin server.
|
||
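The role of such an intermediary can be sketched in Python. This is a rough illustration under several assumptions: the `CachingIntermediary` class, the `parse_max_age` helper and the `(status, headers, body)` response tuple are all hypothetical, and the only caching rules modeled are the `max-age`, `no-store` and `private` directives of the `Cache-Control` response header. Real CDNs and proxies implement far more of HTTP caching than this.

```python
import time


def parse_max_age(cache_control: str) -> int:
    """Return the max-age in seconds, or 0 if the response should not be stored."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives or "private" in directives:
        return 0  # a shared intermediary must not keep this response
    for directive in directives:
        name, _, value = directive.partition("=")
        if name.strip() == "max-age" and value.strip().isdigit():
            return int(value.strip())
    return 0


class CachingIntermediary:
    """A layer between clients and an origin server that reuses fresh responses."""

    def __init__(self, origin, clock=time.monotonic):
        self._origin = origin  # callable: url -> (status, headers, body)
        self._clock = clock
        self._cache = {}  # url -> (expires_at, response)

    def get(self, url: str):
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and cached[0] > now:
            return cached[1]  # still fresh: the origin is never contacted
        response = self._origin(url)
        max_age = parse_max_age(response[1].get("Cache-Control", ""))
        if max_age > 0:
            self._cache[url] = (now + max_age, response)
        return response
```

The client calls `get` exactly as it would call the origin, and the origin answers exactly as it would answer a client. Neither needs to know the layer exists, which is what allows layers to be added or removed without changing either end.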
|
||
Not as exciting for web application developers as the uniform interface, at least in our opinion, but useful
nonetheless.

=== An Optional Constraint: Code-On-Demand

We called The Layered System constraint the final "`required`" constraint because
Fielding mentions one additional constraint on a RESTful system. This Code On Demand constraint is somewhat awkwardly
described as "`optional`" (Section 5.1.7).

In this section, Fielding says:

[quote, Roy Fielding, Architectural Styles and the Design of Network-based Software Architectures]
____
REST allows client functionality to be extended by downloading and executing code in the form of applets or scripts. This
simplifies clients by reducing the number of features required to be pre-implemented. Allowing features to be downloaded
after deployment improves system extensibility. However, it also reduces visibility, and thus is only an optional constraint
within REST.
____

So, scripting was and is a native aspect of the original RESTful model of the web, and thus
should of course be allowed in a Hypermedia-Driven Application.

However, in a Hypermedia-Driven Application the presence of scripting should _not_ change the fundamental networking
model: hypermedia should continue to be the engine of application state, server communication should still consist of
hypermedia exchanges rather than, for example, JSON data exchanges, and so on. (JSON Data APIs certainly have their
place; in Chapter 10 we'll discuss when and how to use them.)

Today, unfortunately, the scripting layer of the web, JavaScript, is quite often used to _replace_, rather than augment,
the hypermedia model. In a later chapter we will look at what scripting that augments, rather than replaces, the underlying
hypermedia system of the web looks like.

== Conclusion

After this deep dive into the components and concepts behind hypermedia systems, including Roy Fielding's insights into their operation, we hope you have a much better understanding of REST,
and in particular, of the uniform interface and HATEOAS. We hope you can see _why_ these characteristics make hypermedia
systems so flexible.

If you were not aware of the full significance of REST and HATEOAS before now, don't feel bad: it took some of us over a decade of
working in web development, and building a hypermedia-oriented library to boot, to understand the
special nature of HTML, hypermedia and the web!


:sectnums!:

[.html-note]
== HTML Notes: HTML5 Soup

[quote,Confucius]
The beginning of wisdom is to call things by their right names.

Elements like `<section>`, `<article>`, `<nav>`, `<header>`, `<footer>`, `<figure>` have become a sort of shorthand for HTML.

By using these elements, a page can make false promises, like `<article>` elements being self-contained, reusable entities, to clients like browsers, search engines and scrapers that can't know better. To avoid this:

* Make sure that the element you're using fits your use case. Check the HTML spec.
* Don't try to be specific when you can't or don't need to.
Sometimes, `<div>` is fine.

(((HTML, spec)))
The most authoritative resource for learning about HTML is the HTML specification.
The current specification lives on link:https://html.spec.whatwg.org/multipage[].footnote:[
The single-page version is too slow to load and render on most computers.
There's also a developers' edition at /dev, but the standard version has nicer styling.]
There's no need to rely on hearsay to keep up with developments in HTML.

Section 4 of the spec features a list of all available elements,
including what they represent, where they can occur, and what they are allowed to contain.
It even tells you when you're allowed to leave out closing tags!
|