Peppercorn: A Simpler Way to Decode Form Submission Data

The way form data submission is handled in web apps has irritated me for many years now. Thankfully, I think we may be able to make it better.

As you probably already know, there are two ways that HTTP clients can send data when an HTML form is submitted to a server: via a POST request and via a GET request. The method chosen is dependent upon the method attribute of the HTML form tag associated with the form data.

When a browser sends data from an HTML form in a POST request, it needs to inform the server of the encoding which the data it's sending is using. To do so, the encoding type is sent along with the encoded payload. Two POST encoding types are widely supported: "application/x-www-form-urlencoded" and "multipart/form-data". These are usually specified via the enctype attribute of the associated form tag.

The default encoding for form submission data is "application/x-www-form-urlencoded". This encoding mechanism is described briefly in the HTML 4 spec:

  The control names/values are listed in the order they appear in the
  document. The name is separated from the value by `=' and name/value
  pairs are separated from each other by `&'.

For example, the payload of a form submission that uses the "application/x-www-form-urlencoded" encoding might look like the below:

  Content-Type: application/x-www-form-urlencoded

  a=1&b=2&c=3

An alternate encoding type understood by every client and server these days is "multipart/form-data". This encoding type is used when you specify enctype="multipart/form-data" as an attribute of your HTML form tag. It is also described briefly in the HTML 4 spec:

  A "multipart/form-data" message contains a series of parts, each
  representing a successful control. The parts are sent to the
  processing agent in the same order the corresponding controls appear
  in the document stream.

The payload of a form submission that uses the multipart/form-data encoding might look like the below:

  Content-Type: multipart/form-data; boundary=AaB03x

   --AaB03x
   Content-Disposition: form-data; name="a"

   1
   --AaB03x
   Content-Disposition: form-data; name="b"

   2
   --AaB03x
   Content-Disposition: form-data; name="c"

   3
   --AaB03x--
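
As a rough sketch of how a server might read such a payload back as ordered name/value pairs, here is a version that leans on the MIME parser in Python's standard library. The names RAW and multipart_pairs are made up for this illustration, the payload uses plain newlines where a real HTTP body would use CRLF, and real frameworks use their own parsers, so treat this as one possible approach rather than how any particular server does it:

```python
from email import message_from_string

# The three-part example payload. Only the final boundary carries the
# trailing "--". Plain "\n" is used here for brevity (assumption: the
# parser accepts either line ending; real HTTP bodies use CRLF).
RAW = (
    'Content-Type: multipart/form-data; boundary=AaB03x\n'
    '\n'
    '--AaB03x\n'
    'Content-Disposition: form-data; name="a"\n'
    '\n'
    '1\n'
    '--AaB03x\n'
    'Content-Disposition: form-data; name="b"\n'
    '\n'
    '2\n'
    '--AaB03x\n'
    'Content-Disposition: form-data; name="c"\n'
    '\n'
    '3\n'
    '--AaB03x--\n'
)


def multipart_pairs(raw):
    """Return (name, value) pairs in the order the parts appear."""
    message = message_from_string(raw)
    pairs = []
    for part in message.get_payload():  # one sub-message per part
        name = part.get_param('name', header='content-disposition')
        pairs.append((name, part.get_payload()))
    return pairs


if __name__ == '__main__':
    print(multipart_pairs(RAW))
```

The parts come back in the same order they were written, which is the property the rest of this article depends on.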

On the other hand, when a browser sends a GET request, the form data is encoded into the query string sent to the server in a format identical to that described in the application/x-www-form-urlencoded bit of the HTML 4 spec. GET requests are always assumed to produce form data in the query string in this encoding, no matter what the enctype of the form tag might say.

For example, this URL might be requested when a form is submitted using the GET method:

  http://example.com/?a=1&b=2&c=3

Whether GET or POST is used, and no matter the form data encoding, the data sent to the server is, in practice, always considered to be a flat mapping of key to value. This is because, in the early days of the web, form submission data was supported only via a query string attached to a GET request, and it was awkward to try to carry around any structured data in the URL other than a flat mapping.

And though the spec upon which the multipart/form-data encoding is based (the MIME spec) allows for nested parts, nothing in the stack, neither on the browser side nor on the server side, actually supports this feature. Form submission data is unconditionally represented as a flat mapping of key to value, where the key is a string, and the value is either a string or some representation of file upload data.

In the meantime, the data we manipulate on the server side in response to a form data submission is only very rarely a single flat mapping of key to value. We obviously often need to mutate complex structures as a result of a form data submission. For example:

  • I often need to extract a sequence of 0 to N files from form submission data and attach those files to objects in my application. This is often an "upload some number of files widget". In a non-web data structure, you'd probably just represent this as a sequence of files. In a form post, you need to represent it as a set of file upload controls, each with a potentially different name.
  • Sometimes there's a structure in application space that needs to be represented by form data that is naturally a sub-mapping. Maybe you want to be able to adjust the name of a single user in a form representing a sequence of users. In a non-web data structure, you'd probably represent the data structure as a sequence of mappings, where each mapping represents the data of a single user.

If you've done any amount of web programming, I'm sure you can come up with many other examples. So, basically we have an impedance mismatch. A single form submission always wants to give us a single, flat mapping of keys to values. Our application domain consists of arbitrary data structures, almost never completely adequately represented as a single, flat mapping of keys to values. So, to work with the data in a format more natural to the application, we need to translate each form submission from a single flat mapping into another data structure.

Innumerable systems have been built to convert the flat key/value mapping of form submission data into friendlier data structures for developer consumption: Plone's Archetypes has a subsystem for doing this, as does Zope via zope.formlib and z3c.form. Formish is another such system, as is WTForms. There are hundreds, perhaps thousands of such systems available today; it would be pointless to try to enumerate them all.

In the mean-mean time, the ascendance of AJAX and data interchange formats like JSON means that web applications are often written with an API that accepts highly structured data (more than a single mapping of key to value), where this API is meant to be consumed by AJAX clients via JavaScript, or by REST clients via a scripting language. Often people make essentially two APIs for interacting with their systems: one for web form submissions from browsers, and another for XML-RPC or "REST" interaction by programmatic clients.

Lately, as I pondered why this all sucked, I remembered a long-buried feature of Zope 2 (actually Bobo, released in either 1996 or 1997): The "publisher" portion of Zope had a concept of "records". Here's a passage describing this feature stolen from the Zope docs:

  A more complex type of form conversion is to convert a series of
  inputs into *records.* Records are structures that have
  attributes. Using records, you can combine a number of form inputs
  into one variable with attributes.  

  Here is an example form that uses records::

  <form action="." method="POST">

    <p>Please, enter information about one or more of your next of
    kin.</p>

    <p>
      First Name <input type="text" name="people.fname:records" />
      Last Name <input type="text" name="people.lname:records" />
    </p>

    <p>
      First Name <input type="text" name="people.fname:records" />
      Last Name <input type="text" name="people.lname:records" />
    </p>

    <input type="submit" />

  </form>    

Wait, what? This can't possibly work, right? We have four input fields above. Two have the name "people.fname:records", and another two have the name "people.lname:records". So if we received a submission of this form on a server that uses WebOb and asked WebOb for request.POST.getall("people.fname:records"), we'd get a list of two strings. That's not very helpful. Likewise, if we asked for request.POST.getall("people.lname:records"), we'd get another list of two strings. That doesn't help us at all, does it, because we can't know which "people.lname:records" goes with which "people.fname:records". If the form had been filled out for Chris McDonough and Tres Seaver, we couldn't know whether Tres' last name is "Seaver" or "McDonough". So it's useless. Or wait.. is it really?

Let's take a look at the descriptions of the various form encodings again.

application/x-www-form-urlencoded:

  The control names/values are listed in the order they appear in the
  document. The name is separated from the value by `=' and name/value
  pairs are separated from each other by `&'.

multipart/form-data:

  A "multipart/form-data" message contains a series of parts, each
  representing a successful control. The parts are sent to the
  processing agent in the same order the corresponding controls appear
  in the document stream.

One interesting feature of both of the descriptions above is the word "order". In particular, no matter which encoding you use, the spec says that the client should send you the encoded key/value pairs in the same order that they're defined in the document. Consider the records form again:

  <form action="." method="POST">

    <p>
      First Name <input type="text" name="people.fname:records" />
      Last Name <input type="text" name="people.lname:records" />
    </p>

    <p>
      First Name <input type="text" name="people.fname:records" />
      Last Name <input type="text" name="people.lname:records" />
    </p>

    <input type="submit" />

  </form>    

Suppose it's filled out like so and submitted:

   First Name: Chris
   Last Name: McDonough

   First Name: Tres
   Last Name: Seaver

   [submit]

The data sent to the server by any well-behaved client will be ordered in document order, so we'll always get this payload:

  Content-Type: application/x-www-form-urlencoded

  people.fname:records=Chris&people.lname:records=McDonough&people.fname:records=Tres&people.lname:records=Seaver

We will, however, never get the keys and values in some mixed-up order in the encoded form submission if we use a well-behaved client. We'll never, in other words, get this:

  Content-Type: application/x-www-form-urlencoded

  people.lname:records=McDonough&people.lname:records=Seaver&people.fname:records=Chris&people.fname:records=Tres
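
To make the ordering point concrete, here is a small sketch using Python's standard library instead of WebOb (that substitution is mine; BODY is just the well-ordered payload from the Chris/Tres example). parse_qs collapses the body into a mapping of key to list of values, which loses the pairing between first and last names, while parse_qsl keeps the pairs in the order they were sent:

```python
from urllib.parse import parse_qs, parse_qsl

# The well-ordered payload a well-behaved client sends for the
# two-person records form.
BODY = ('people.fname:records=Chris&people.lname:records=McDonough'
        '&people.fname:records=Tres&people.lname:records=Seaver')

# Mapping view: values are grouped per key, so nothing says which
# last name belongs to which first name.
as_mapping = parse_qs(BODY)

# Stream view: a list of (key, value) tuples in submission order.
as_stream = parse_qsl(BODY)

if __name__ == '__main__':
    print(as_mapping)
    print(as_stream)
```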

We can take advantage of this fact by treating the data we get back from a form submission as a stream of key/value pairs rather than a mapping of key/value pairs. And this is exactly what Bobo did in 1997 with its "record" and "records" form element name modifiers. Logic on the server side treated the form data as a stream, so that when it saw the following sequence of key/value pairs in the stream:

   people.fname:records                    Chris
   people.lname:records                    McDonough
   people.fname:records                    Tres
   people.lname:records                    Seaver

Bobo knew enough to know that elements that had the modifier ":records" as a name suffix which shared the same prefix (people in this case) should be grouped into a record. Furthermore, it knew that when it encountered a key it had already seen, a new record should be started, a feature totally reliant on treating the form submission data as an ordered stream.

The result was such that when such a form was submitted, the resulting request object had an attribute named people on it, which was a sequence in this style:

   [{'fname':'Chris', 'lname':'McDonough'}, 
    {'fname':'Tres', 'lname':'Seaver'}]
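
The grouping rule just described can be sketched in a few lines. To be clear, this is my own reconstruction of the idea for illustration, not Zope's or Bobo's actual code; the function name group_records is made up, and it only understands the ":records" modifier with a single "prefix.attribute" level:

```python
def group_records(pairs):
    """Group an ordered stream of (key, value) pairs into records.

    Keys shaped like 'prefix.attr:records' are collected into a list of
    dicts under 'prefix'. Seeing an attribute that the current record
    already has starts a new record. Other keys are copied through.
    """
    result = {}
    for key, value in pairs:
        name, _, modifier = key.partition(':')
        if modifier != 'records':
            result[key] = value
            continue
        prefix, _, attr = name.partition('.')
        records = result.setdefault(prefix, [])
        if not records or attr in records[-1]:
            records.append({})  # repeated attribute: begin a new record
        records[-1][attr] = value
    return result


if __name__ == '__main__':
    stream = [
        ('people.fname:records', 'Chris'),
        ('people.lname:records', 'McDonough'),
        ('people.fname:records', 'Tres'),
        ('people.lname:records', 'Seaver'),
    ]
    print(group_records(stream))
```

Note that the "start a new record" decision is only possible because the pairs arrive in document order; hand the same function a mapping and it has nothing to go on.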

I made heavy use of this feature while making Zope applications over the years, but its full significance never really dawned on me until recently. It just seemed like a cool gimmick.

But as I created yet another web app which treats form data submissions as a mapping of primitive data that needs to be demunged, I began to ask myself a question: instead of creating ad-hoc logic to transform the flat key/value pair composition of a form post into more natural application data, why not always treat the submission data as a stream instead of a mapping, and let some general purpose component translate the stream into a more suitable data structure so I don't have to?

Streams are interesting input sources. You can compose arbitrarily complex data structures out of streams quite easily. For example, let's consider a "schema" in this form:

   {'name':'Fred', 'phones':[{'location':'home', 'number':'555-1212'},
                             {'location':'work', 'number':'555-3434'}]}

It's perfectly possible to convert this data structure to a flat mapping if you munge the keys:

  {'name':'Fred',
   'phones.0.location':'home',
   'phones.0.location':'home',
   'phones.0.number':'555-1212',
   'phones.1.location':'work',
   'phones.1.number':'555-3434'}

To impose hierarchy into a flat mapping structure, you've embedded meaning into the keys for the sake of being able to express the hierarchy. Formish takes this approach in order to represent hierarchy in form submission data.
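
For the curious, the key-munging direction can be sketched like this. The flatten helper is hypothetical and written only for this illustration; it is not how Formish or any other library does it, and it only handles dicts, lists, and plain values:

```python
def flatten(value, prefix=''):
    """Flatten nested dicts and lists into a single dotted-key mapping."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)  # list positions become key segments
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        name = '%s.%s' % (prefix, key) if prefix else str(key)
        flat.update(flatten(child, name))
    return flat


if __name__ == '__main__':
    schema = {'name': 'Fred',
              'phones': [{'location': 'home', 'number': '555-1212'},
                         {'location': 'work', 'number': '555-3434'}]}
    print(flatten(schema))
```

The positional indexes baked into the keys are the painful part: add or remove a phone in the middle of the list and every later key has to be renamed.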

But writing and using the code to munge and demunge these keys is not much fun. It would be easier to decode, in general, if you used a stream in the form of a sequence of key/value pairs instead of a mapping. For example:

   ('name', 'Fred')
   ('__start__', 'phones:sequence')
   ('__start__', ':mapping')
   ('location', 'home')
   ('number', '555-1212')
   ('__end__', ':mapping')
   ('__start__', ':mapping')
   ('location', 'work')
   ('number', '555-3434')
   ('__end__', ':mapping')
   ('__end__', 'phones:sequence')

The sequence of instructions embedded in this stream can be used to recompose the original mapping.
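
Going the other way, from a nested structure to such a stream, is also short. The encode_fields function below is a hypothetical helper I'm sketching for illustration (it relies on Python 3.7+ dicts preserving insertion order); it is not part of any released API:

```python
def _encode(name, value):
    """Yield stream tokens for one named value (helper for this sketch)."""
    if isinstance(value, dict):
        marker = name + ':mapping'
        yield ('__start__', marker)
        for key, child in value.items():
            for token in _encode(key, child):
                yield token
        yield ('__end__', marker)
    elif isinstance(value, list):
        marker = name + ':sequence'
        yield ('__start__', marker)
        for child in value:
            # Items inside a sequence carry no name of their own.
            for token in _encode('', child):
                yield token
        yield ('__end__', marker)
    else:
        yield (name, value)


def encode_fields(mapping):
    """Turn a top-level mapping into an ordered list of stream tokens."""
    tokens = []
    for key, value in mapping.items():
        tokens.extend(_encode(key, value))
    return tokens


if __name__ == '__main__':
    schema = {'name': 'Fred',
              'phones': [{'location': 'home', 'number': '555-1212'},
                         {'location': 'work', 'number': '555-3434'}]}
    for token in encode_fields(schema):
        print(token)
```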

In fact, completely arbitrary structures composed of mixes of sequences, mappings, and strings can be handled in less than 50 lines of code (at least if you have help from the effbot):

  def data_type(value):
      if ':' in value:
          return [ x.strip() for x in value.rsplit(':', 1) ]
      return ('', value.strip())

  START = '__start__'
  END = '__end__'
  SEQUENCE = 'sequence'
  MAPPING = 'mapping'

  def stream(next, token):
      """
      thanks to the effbot for
      http://effbot.org/zone/simple-iterator-parser.htm
      """
      op, data = token
      if op == START:
          name, typ = data_type(data)
          if typ in (SEQUENCE, MAPPING):
              if typ == SEQUENCE:
                  out = []
                  add = lambda x, y: out.append(y)
              else:
                  out = {}
                  add = out.__setitem__
              token = next()
              op, data = token
              while op != END:
                  key, val = stream(next, token)
                  add(key, val)
                  token = next()
                  op, data = token
              return name, out
          else:
              raise ValueError('Unknown stream start marker %s' % repr(token))
      else:
          return op, data

  def parse(fields):
      """ Infer a data structure from the ordered set of fields and
      return it."""
      fields = [(START, MAPPING)] + list(fields) + [(END,'')]
      src = iter(fields)
      # the next() builtin works on Python 2.6+ and 3, unlike src.next
      result = stream(lambda: next(src), next(src))[1]
      return result

  if __name__ == '__main__':
      fields = [ ('name', 'Fred'),
                 ('__start__', 'phones:sequence'),
                 ('__start__', ':mapping'),
                 ('location', 'home'),
                 ('number', '555-1212'),
                 ('__end__', ':mapping'),
                 ('__start__', ':mapping'),
                 ('location', 'work'),
                 ('number', '555-3434'),
                 ('__end__', ':mapping'),
                 ('__end__', 'phones:sequence'), ]

      print(parse(fields))

Thus was born Peppercorn. Peppercorn is a system that transforms streams of key/value pairs into arbitrary data structures composed of mappings, sequences, and strings. There's no reason to go look at the code for Peppercorn somewhere else; all its code is printed in the example above (literally). It has one API named "parse", and that's it.

Because systems like WebOb retain ordering information, it's easy to use Peppercorn to decode form post data:

   fields = request.POST.items() # (key, value) retaining ordering
   structure = peppercorn.parse(fields)

If the POST you're using above comes from the following HTML form:

   <form action="." method="POST">

     Name: <input type="text" name="name"/>
     <div>
     <div>Phones</div>
     <input type="hidden" name="__start__" value="phones:sequence"/>
     <div>
     <input type="hidden" name="__start__" value=":mapping"/>
     Location: <input type="text" name="location"/>
     Number: <input type="text" name="number"/>
     <input type="hidden" name="__end__" value=":mapping"/>
     </div>
     <div>
     <input type="hidden" name="__start__" value=":mapping"/>
     Location: <input type="text" name="location"/>
     Number: <input type="text" name="number"/>
     <input type="hidden" name="__end__" value=":mapping"/>
     </div>
     <input type="hidden" name="__end__" value="phones:sequence"/>
     </div>

     <input type="submit"/>

  </form>

The form rendering might look like so, when filled out:

     Name: Fred

     Phones
     ------

     Location: home        Number: 555-1212
     Location: work        Number: 555-3434

     [submit]

When submitted to a system that ran WebOb, if you ran this code:

   fields = request.POST.items()
   structure = peppercorn.parse(fields)

You would wind up with this data structure as 'structure':

   {'name':'Fred', 'phones':[{'location':'home', 'number':'555-1212'},
                             {'location':'work', 'number':'555-3434'}]}
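
If you'd like to try the round trip without WebOb, here is a self-contained sketch. It is a compact Python 3 rewrite of the same idea, not the released Peppercorn code: it swaps request.POST.items() for urllib.parse.parse_qsl from the standard library, and BODY is what I'd expect a browser to send for the filled-out phones form (with ":" percent-encoded as %3A):

```python
from urllib.parse import parse_qsl

START, END = '__start__', '__end__'


def parse(fields):
    """Rebuild nested mappings and sequences from an ordered token stream."""
    tokens = iter(fields)  # one shared iterator for every nesting level

    def build(kind):
        out = [] if kind == 'sequence' else {}
        for op, data in tokens:
            if op == END:
                return out
            if op == START:
                name, _, child_kind = data.rpartition(':')
                if child_kind not in ('sequence', 'mapping'):
                    raise ValueError('Unknown stream start marker %r' % data)
                key, val = name, build(child_kind)
            else:
                key, val = op, data
            if kind == 'sequence':
                out.append(val)  # names are ignored inside sequences
            else:
                out[key] = val
        return out  # the implicit top-level mapping ends with the stream

    return build('mapping')


# Assumed request body for the phones form.
BODY = ('name=Fred'
        '&__start__=phones%3Asequence'
        '&__start__=%3Amapping&location=home&number=555-1212'
        '&__end__=%3Amapping'
        '&__start__=%3Amapping&location=work&number=555-3434'
        '&__end__=%3Amapping'
        '&__end__=phones%3Asequence')

# keep_blank_values=True so that empty text fields are not dropped.
fields = parse_qsl(BODY, keep_blank_values=True)
structure = parse(fields)

if __name__ == '__main__':
    print(structure)
```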

I'm currently in the process of trying to build a higher-level form system that exploits Peppercorn (the encode portion of the form submission problem). In the meantime, let me know if you find Peppercorn or the stream concept upon which it's built useful. It's released to PyPI, and can also be had via SVN.

Created by chrism
Last modified 2010-03-22 11:29 PM

HTML 5

FWIW HTML 5 repeating form elements are based basically on ordering. Specifically there's no hook to rename form fields when ordering is changed, you only can accomplish it by looking at the field submission order (well, one other way, but I'll ignore it). Hidden fields would work well enough I suppose, though it neither suggests nor disallows them. How important are those markers? You could infer start/end by names, after all. If writing variabledecode again (same basic scope of work) I'd probably do something generally like that, though it had not occurred to me to move up to the most root key necessary to make something unique.

1269295836

In this model, obviously the control structures as hidden fields are the only thing that give the parser enough clues to reconstruct the data structure. You could encode the structure in the field names like variabledecode does, but then you need to rename fields when things are added to the structure and removed from it. But you knew that, because you said so in the lead-in, so I'm not sure how to answer.

I was thinking of getting stupid clever and making it look more like this:

<form action="." method="POST">

Name: <input type="text" name="name"/>
<div>
<div>Phones</div>
<input type="hidden" name="[" value="phones"/>
<div>
<input type="hidden" name="{"/>
Location: <input type="text" name="location"/>
Number: <input type="text" name="number"/>
<input type="hidden" name="}"/>
</div>
<div>
<input type="hidden" name="{"/>
Location: <input type="text" name="location"/>
Number: <input type="text" name="number"/>
<input type="hidden" name="}"/>
<input type="hidden" name="]"/>
</div>
</div>

<input type="submit"/>

</form>

Seems too clever tho.

peppercorn encode

Been fooling around a bit with peppercorn and pylons and figured I'd share this ( http://gist.github.com/396568 ) if it might help anyone interested in using formencode and peppercorn together. Included is a function for reencoding peppercorn style a mapping structure, a hacked version of pylons.decorators.validate which can use peppercorn.parse and a wrapper around multidict which helps htmlfill do the right thing wrt multiple fields of the same name.