Recently, I've been asked more than once to explain a position I have concerning the decomposition of application functionality into WSGI middleware. This blog post actually started out as an email, but since I'll probably need to make this case again, I figure I should just write it up more generally so I can point to it later instead of rethinking and retyping it. Although I mention Repoze below, this is not a Repoze-specific issue; it's an issue with any WSGI-based system.
It's my opinion that the current Paste "pipeline" exposed under Repoze should be advertised as the domain of integrators, who should feel free to remove and add middleware that tweaks behavior in ways that don't "significantly change" the result of visiting most resources. In the common case, I would define "significantly change" as a situation where both the content body and the content-type of "most" resources returned by the stack will be different when the middleware is absent than when the middleware is present.
This assertion is in tension with ideas that various folks have expressed about using middleware filters to "theme" otherwise presentation-free semantic XML. For instance, folks have proposed that it would be beneficial to write WSGI applications that tend to return XML something like (only an example):
<?xml version="1.0"?>
<resource id="foo" type="form">
<element name="State" type="select" source="states"/>
<element name="City" type="text"/>
<target href="http://www.example.com/formtarget"/>
</resource>
<resource id="states" type="enum">
<value id="NJ">New Jersey</value>
<value id="MA">Massachusetts</value>
</resource>
<context>
<user id="fred">Fred</user>
<session id="abc123"/>
</context>
Within an "upstream" WSGI middleware filter, the semantics of this XML would be divined, and code in the middleware would be able to produce some styled HTML that represented a form. While producing the form, a developer could select elements from the semantic XML and use them to make decisions about what to display and how to display it. He would then pass the HTML he created back upstream.
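To make the proposal concrete, here is a minimal sketch of what such a "theming" filter might look like. It illustrates the idea under discussion and is not a recommendation. The names `xml_app`, `render_form`, `ThemeMiddleware` and `call` are hypothetical, the XML is cut down to a single `resource` element so that it parses as one document, and every element is rendered as a plain text input for brevity.

```python
import xml.etree.ElementTree as ET
from html import escape
from wsgiref.util import setup_testing_defaults

# Simplified, single-root version of the semantic XML (an assumption).
SEMANTIC_XML = (
    b'<resource id="foo" type="form">'
    b'<element name="State" type="select" source="states"/>'
    b'<element name="City" type="text"/>'
    b'</resource>'
)


def xml_app(environ, start_response):
    """Hypothetical application that returns presentation-free XML."""
    start_response("200 OK", [("Content-Type", "application/xml"),
                              ("Content-Length", str(len(SEMANTIC_XML)))])
    return [SEMANTIC_XML]


def render_form(resource):
    """Turn a <resource type="form"> element into a very plain HTML form."""
    rows = []
    for element in resource.findall("element"):
        name = escape(element.get("name", ""), quote=True)
        rows.append('<label>%s <input name="%s"/></label>' % (name, name))
    return "<form>%s</form>" % "".join(rows)


class ThemeMiddleware:
    """Buffers the wrapped app's XML body and replaces it with HTML."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold the status and headers until the body has been rewritten.
            captured["status"] = status
            captured["headers"] = list(headers)
            return lambda data: None  # the legacy write() callable is ignored

        chunks = self.wrapped(environ, capture)
        try:
            xml_body = b"".join(chunks)
        finally:
            if hasattr(chunks, "close"):
                chunks.close()

        html = render_form(ET.fromstring(xml_body)).encode("utf-8")
        headers = [(k, v) for k, v in captured["headers"]
                   if k.lower() not in ("content-type", "content-length")]
        headers += [("Content-Type", "text/html; charset=utf-8"),
                    ("Content-Length", str(len(html)))]
        start_response(captured["status"], headers)
        return [html]


def call(stack):
    """Invoke a WSGI stack once and return (captured response info, body)."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(stack(environ, start_response))
    return captured, body
```

Calling `call(xml_app)` yields XML while `call(ThemeMiddleware(xml_app))` yields HTML: both the body and the content type depend on whether the filter is present.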
I tend to disagree with using WSGI for this purpose. The reason this is in tension with my "don't significantly change" assertion is that inserting or pulling the "transform" middleware out of the WSGI middleware stack has the potential to completely change the result of calling in to the application for most resources. At least in the arguments I've heard about this, most folks who want to do this want to do it for every resource. They intend to create an application which generates semantic XML on the right hand side, and create middleware which embeds application policy to interpret and transform that XML into something renderable by a browser. In effect, they want fundamental behavior of the "application" to live in two places: in the WSGI application, and within one or more pieces of WSGI middleware.
This is understandable. People want to do this using WSGI because they sense that a Paste "pipeline" is a form of functional composition where the output of one function becomes the input of another, and programming using functional composition fits their brains. It is often easier to understand and debug programs that are written this way because you often only need to understand one small piece of code in the pipeline in order to get the results you want, and the lines of demarcation between pipeline elements are consistent and explicit. Arbitrary applications don't have such well-defined lines of demarcation, so it's often difficult to find out where to jam in some code to get the result you want. Fans of the functional composition development model also like the ability to divide the problem space along lines of responsibility with a well-defined interface between them. In a functionally decomposed application like this, on a project divided between developers along "back end" and "front end" lines, it's clear who's at fault if the semantic XML is broken (the "back end" folks), and it's just as clear who's at fault if the rendering isn't right (the "front end" folks). In some cases, this is a strategic benefit on a larger project, because it allows for clearer divisions of responsibility.
ASIDE: It's often tempting to say "WSGI pipeline" (indeed I do it all the time) but there actually is no such thing as a "WSGI pipeline". Paste calls it a pipeline, perhaps mistakenly. In any case, WSGI is actually composed of two pipelines: the ingress pipeline and the egress pipeline. On ingress, the request is passed through middleware until it gets to "an application". On egress, a response is passed back up the same set of middleware until it gets to "a server", where it is sent back to the requestor's browser. Sorry, I thought this would be an appropriate place to mention this.
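The two directions can be seen in a small sketch. Everything here is hypothetical (the `demo.trace` environ key, the `X-Egress` header, the `TraceMiddleware`, `app` and `run` names); the point is only that each filter touches the request on the way in and the response on the way out, in opposite orders.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Innermost application: reports the order in which filters saw the request."""
    body = ("ingress order: %s" % environ.get("demo.trace", [])).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


class TraceMiddleware:
    """Records its name on ingress and adds a header on egress."""

    def __init__(self, wrapped, name):
        self.wrapped = wrapped
        self.name = name

    def __call__(self, environ, start_response):
        # Ingress: runs before the wrapped callable sees the request.
        environ.setdefault("demo.trace", []).append(self.name)

        def egress_start_response(status, headers, exc_info=None):
            # Egress: runs when the response headers travel back up.
            headers = list(headers) + [("X-Egress", self.name)]
            return start_response(status, headers, exc_info)

        return self.wrapped(environ, egress_start_response)


def run(stack):
    """Call a WSGI stack once with a recording start_response."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(stack(environ, start_response))
    return captured, body, environ


stack = TraceMiddleware(TraceMiddleware(app, "inner"), "outer")
```

With this stack, `run(stack)` records the ingress order as outer then inner, while the `X-Egress` headers are appended in the order inner then outer.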
But despite the benefits of using functional composition as a development model, I posit that if a WSGI application which returns semantic XML or some other serialization of structured data doesn't actually exist yet, there is no purpose in making the rendering part of the application into WSGI middleware. Instead, it should just live in the application itself, because it actually is part of the application. Likewise, if a piece of WSGI middleware can't be dropped into an arbitrary Paste pipeline and potentially do something useful without requiring radical changes to the application being served, it's not actually "middleware". It's just a piece of code dropped into the call chain because it was convenient to put it there.
Although it's reasonable to be attracted to functional composition as a development model, and you may think of a stack of WSGI middleware as a call chain, my assertion is that the particular call chain exposed by WSGI isn't always the appropriate place to do application development.
All that said, this explanation makes a lot of people scratch their heads and wonder what I'm smoking, because the allure of programming within a functional call chain is so strong.
Given that people really like the idea of using functional composition as a development model because it so closely fits their brains, we've been considering adding an additional set of pipelines to Repoze. The existing Paste pipeline would not need to be controlled by application developers. Instead, sysadmins and integrators would be free to add and remove WSGI middleware, change server port numbers, etc. there to their heart's content.
The additional pipelines would be the domain of programmers. Programmers would plug code in to some point within a separate functional call chain. The composition would be fixed for a particular application deployment, and would not be meant to be changed by integrators. The composition may not be a Paste "pipeline", because WSGI 1.0 is actually not a great model for straight functional composition: it's engineered for various HTTP corner cases. Indeed, it may not even have a declarative composition syntax like Paste provides for WSGI 1.0. Instead, the protocol itself might be WSGI 2.0 or perhaps something even simpler which exposes separate ingress and egress filter chains, and uses plain-old-Python to compose each.
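As a rough sketch of what "plain-old-Python" composition of separate ingress and egress chains could look like: nothing here is an existing Repoze or WSGI API. The `compose` function, the dict-based request and response, and the example filters are all assumptions made for illustration.

```python
def compose(ingress, application, egress):
    """Return one callable: ingress filters, then the application, then egress filters."""

    def call_chain(request):
        for ingress_filter in ingress:
            request = ingress_filter(request)
        response = application(request)
        for egress_filter in egress:
            response = egress_filter(request, response)
        return response

    return call_chain


def add_user(request):
    """Hypothetical ingress filter: attach a user to the request."""
    return dict(request, user="fred")


def application(request):
    """Hypothetical application: build a response from the filtered request."""
    return {"body": "Hello, %s" % request["user"], "headers": {}}


def add_content_type(request, response):
    """Hypothetical egress filter: decorate the response on the way out."""
    headers = dict(response["headers"])
    headers["Content-Type"] = "text/plain"
    return dict(response, headers=headers)


# The composition is fixed in code by the programmer, not edited by integrators.
handle = compose([add_user], application, [add_content_type])
```

Because each filter is an ordinary function taking and returning plain values, there is no `start_response` callable or iterable body to manage, which is part of what makes this shape simpler than WSGI 1.0 for straight functional composition.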
Applications with candidate functions to inject into the plug points might be packaged as plain old Python eggs. These eggs would have entry points which essentially stated that they had functions which were "pipeline candidates", e.g.:
[repoze.callchain.ingress]
myingress = foo.bar:baz
[repoze.callchain.egress]
myegress = foo.bar:buz
Then, in a separate bit of configuration, perhaps in a separate config file, you'd compose all of these composables into pipelines:
[pipeline:ingress]
pipeline = egg#mypackage:myingress egg#anotherpackage:thatingress
[pipeline:egress]
pipeline = egg#mypackage:myegress egg#anotherpackage:thategress
When Repoze (or whatever contained this framework) started up, it would look for that config file, and arrange the call chains. You might package up that configuration in a top-level egg and call that "an application", because that's exactly what it would be.
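A minimal sketch of the startup step, assuming the hypothetical config syntax shown above: it reads the `[pipeline:*]` sections and splits each `egg#package:name` spec into a package name and an entry point name. The `parse_spec` and `read_pipelines` helpers are invented for illustration. Resolving each pair to a real function (for instance through the package's entry points in the `repoze.callchain.ingress` or `repoze.callchain.egress` groups) is deliberately left out.

```python
import configparser

CONFIG_TEXT = """
[pipeline:ingress]
pipeline = egg#mypackage:myingress egg#anotherpackage:thatingress
[pipeline:egress]
pipeline = egg#mypackage:myegress egg#anotherpackage:thategress
"""


def parse_spec(spec):
    """Split 'egg#package:name' into (package, entry point name)."""
    scheme, _, rest = spec.partition("#")
    package, _, name = rest.partition(":")
    if scheme != "egg" or not package or not name:
        raise ValueError("unsupported pipeline spec: %r" % spec)
    return package, name


def read_pipelines(text):
    """Map each pipeline direction ('ingress', 'egress') to its ordered specs."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    pipelines = {}
    for section in parser.sections():
        kind, _, direction = section.partition(":")
        if kind != "pipeline":
            continue  # ignore unrelated sections
        specs = parser.get(section, "pipeline").split()
        pipelines[direction] = [parse_spec(spec) for spec in specs]
    return pipelines
```

The order of the specs within each `pipeline` line is preserved, since that order is what defines the call chain.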
Another mechanism that could be used for the same configuration purpose is ZCML and the Zope component architecture. Insert cheer or jeer here depending on which way you swing. ;-)
I'd love to hear any countervailing opinions. Likewise, if you agree strongly, I'd appreciate it if you said why. I'd be particularly interested in what Ian and Phillip have to say on the topic {hint, hint}.
I disagree with the conclusion that it should, thus, live in each of the N application frameworks out there. From your perspective, it's a feature to do it in some application framework. However, from my perspective, there are some sharp edges to that alternative.
I would like to at least see if the itch can be scratched in new ways. If nobody wants it, then it will die of its own accord. If people *do* like the idea though, more than the alternative, then it was a good idea. [wink]
I guess your proposal is a way to keep the chocolate out of the peanut butter, regarding the flaky WSGI UI ideas?