James Gardner: Home > Work > Code > Wizard > 0.1.0 > Manual

Wizard v0.1.0 documentation

Manual

There are two ways of dealing with long and complex forms:

  • All in one page
  • In small chunks with one sub-form per page

Presenting lots of data in one form, where some parts depend on parts completed earlier, results in a form which is very unfriendly for the user, so it is often best to split the form into multiple pages.

If you can perform actions based on each step of the form, such as sending emails, storing information in a database or creating user accounts, then your life as a developer is easy. You can just process each sub-form as the user completes it and implement code to redirect them to the current step if they use the back button to try to re-complete a sub-form they have already completed. In reality, cases where this is possible are few and far between. Much more common is the case where the actions from each step should only be applied if the user finishes all the steps. In this case the data from each step needs to be stored in some temporary location. There are a few different places the data could be stored:

  • In hidden fields in each step

    The advantage of this is that the user can use the back button as much as they like because all the data is stored client side, so re-submitting it doesn’t cause any problems. It is also relatively easy to implement: forms just need to be modified to contain the hidden fields.

    There are some disadvantages though. Firstly, the data from all previous steps needs to be validated each time a form is submitted, to avoid the possibility that an attacker changes data in hidden fields that were validated in an earlier step. Secondly, if the wizard needs some sort of out-of-band authentication, such as a user clicking on a link in an email to confirm their details, all data will be lost because the hidden fields can’t easily be embedded in a link. Another problem is that one user could choose a username and, before they get to the end of the wizard, another user could choose the same username. Because the data is stored client side the application cannot see the clash until the end: whoever finishes first gets the username, and the other user gets an error saying that it is already taken and has to start the wizard all over again. The solution is to store data server-side and only store an ID client side.

  • Server-side in a wizard-specific session store or database

    In this approach a wizard ID is stored in either a cookie, a session or the URL and on each request the submitted data is stored in the database in a way that associates the data with the wizard ID. On each request, submitted data from previous steps can be re-validated and the user can click the back button as much as they like because re-submitting a form simply clears the existing data from subsequent steps and adds the new data. Because the data is stored in a database, the application has an opportunity to search it so that when a second person chooses the same username the wizard data can be checked and an error raised. Also, because the wizard only relies on an ID it can easily be passed around in an email or other forms for non-HTTP steps such as confirming an email or being redirected to a third party payment system.

    This approach has some disadvantages too though. Anyone guessing the wizard ID could potentially pretend to be the user who is really using that wizard and so could change the data. This can be solved by always including a hash of the ID with a secret key so that even if an attacker guesses the ID they can’t access the wizard. Because not everyone who starts a wizard will finish it your database could become clogged up with old form data so the wizard tool needs a way of purging old data. The Wizard software described here has solutions to both these problems.
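The idea of pairing the wizard ID with a hash that depends on a secret key can be sketched with Python’s standard hmac module. This is only an illustration and not necessarily the algorithm Wizard itself uses: the helper names sign_wizard_id and check_wizard_hash, the choice of SHA-256 and the truncation length are all assumptions made for the example.

```python
import hashlib
import hmac


def sign_wizard_id(secret, wizard_id, length=16):
    """Return a keyed hash of the wizard ID (hypothetical helper)."""
    digest = hmac.new(
        secret.encode("utf-8"),
        str(wizard_id).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # Truncating keeps URLs short but makes guessing easier, so choose
    # the length to suit your application.
    return digest[:length]


def check_wizard_hash(secret, wizard_id, supplied_hash):
    """Compare in constant time so timing does not leak the expected hash."""
    expected = sign_wizard_id(secret, wizard_id)
    return hmac.compare_digest(
        expected.encode("utf-8"), supplied_hash.encode("utf-8")
    )


token = sign_wizard_id("Some secret string", 1)
print(check_wizard_hash("Some secret string", 1, token))  # True
print(check_wizard_hash("Some secret string", 2, token))  # False
```

Without the secret an attacker who guesses a valid wizard ID still cannot produce the matching hash, which is the property the paragraph above relies on.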

Configuration Options

The Wizard requires the following configuration options:

secret
A secret string or phrase, which you should keep private, used to make the hash generated from the Wizard ID hard to guess.
expire_probability
Should be a number from 0 to 1 specifying the likelihood of the code which removes expired wizards being run on any particular request. If you set it to 0, expired wizard data will never be removed. If you set it to 1, expired wizards will be removed each time the wizard is accessed, but there is no need to run the code that often. A value of 0.01 or less would be fine; 0.01 runs the code to remove expired wizards roughly once every 100 times the wizard is accessed.
allowed_failures
This is the number of times an attempt to access the same wizard ID with an invalid hash should be tolerated. There is very little chance a hash will be wrong legitimately, so setting this to a low number such as 2 or 3 should be fine. Bear in mind though that a malicious user could then force a lot of wizards to be expired (possibly while a user is still using them) just by trying lots of wizard IDs with incorrect hashes. For this reason it is recommended you use a captcha after the allowed number of failures and before the wizard code deletes the wizard as a precaution.
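To make the meaning of expire_probability and allowed_failures concrete, here is a minimal sketch of how such options could be interpreted. The functions should_purge and needs_captcha are hypothetical names used for illustration; they are not part of the Wizard API.

```python
import random


def should_purge(expire_probability, rng=random):
    """Decide whether this request should also purge expired wizards."""
    # rng.random() returns a float in [0.0, 1.0), so a probability of 0
    # never purges and a probability of 1 always purges.
    return rng.random() < expire_probability


def needs_captcha(failed_attempts, allowed_failures):
    """True once the failure count has reached the configured limit."""
    return failed_attempts >= allowed_failures


rng = random.Random(0)  # seeded only to make the demonstration repeatable
runs = sum(should_purge(0.01, rng) for _ in range(100000))
print(runs)  # close to 1000, roughly one purge per 100 requests
```

Making the purge probabilistic spreads the cleanup cost across requests without needing a separate scheduled job.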

Getting the Configuration Options to the Wizard

All the wizard functions take a state argument and they expect to be able to access the configuration options under the attribute state.config.wizard. As an example, the secret option would be accessed as state.config.wizard.secret. It is up to you to provide this state object and the state.config.wizard object.

Here we’ll set up a suitable state object:

>>> class EmptyState(object): pass
>>> state = EmptyState()
>>> state.config = EmptyState()
>>> state.config.wizard = EmptyState()
>>> state.config.wizard.secret = 'Some secret string'
>>> state.config.wizard.expire_probability = 0.01
>>> state.config.wizard.allowed_failures = 3

Just to show it works:

>>> print state.config.wizard.expire_probability
0.01

Of course if you are using PowerPack and ConfigConvert you can just add the wizard options to your usual configuration file like this and they will be automatically set up in the correct way anyway:

wizard.secret = Some secret string
wizard.expire_probability = 0.01
wizard.allowed_failures = 3

Then in your config.py file you can use this:

# Wizard Options
wizard_converters={
    'secret': unicodeToUnicode(),
    'expire_probability': StringToFloat(min=0, max=1),
    'allowed_failures': StringToFloat(min=1, max=10),
}
wizard_converter = toRecord(
    missing_error = "The required option 'wizard.%(key)s' is missing",
    empty_error = "The option 'wizard.%(key)s' cannot be empty",
    converters=wizard_converters,
)
config['wizard'] = Conversion(setting_parts['wizard']).perform(wizard_converter).result

Writing Validators

The idea of Wizard is that it works together with ConversionKit and FormConvert or FormEncode to validate each step. For example, imagine you have a wizard for registering with a website with the following steps:

Step 1
Create an account
Step 2
Set up your preferences
Step 3
Invite friends

You would then write 3 ConversionKit converters, one to validate and convert the submitted data from each step.

Note

Wizard supports FormEncode too if you don’t want to use ConversionKit and FormConvert but you have to look at the source to see how to use it.

Writing converters is an involved business and you can read the ConversionKit and FormConvert manuals for all the details but a basic validator/converter for creating an account might look like this:

>>> from recordconvert import toRecord
>>> from stringconvert import unicodeToUnicode
>>> account_form_to_account_record = toRecord(
...     converters = dict(
...         username = unicodeToUnicode(),
...         password = unicodeToUnicode(),
...         confirm_password = unicodeToUnicode(),
...     )
... )

We are using the unicodeToUnicode() converter to ensure that the data is a Unicode string. In real life you would need the username converter to be able to check the database to ensure the username was available. To make this a realistic example, let’s set up an SQLite in-memory database, create a users table and add an existing user called Ian:

>>> import sqlite3
>>> connection = sqlite3.connect(':memory:')
>>> cursor = connection.cursor()
>>> cursor.execute('CREATE TABLE users (username VARCHAR(20))')
<sqlite3.Cursor object at ...>
>>> cursor.execute('INSERT INTO users (username) VALUES (?)', ('ian',))
<sqlite3.Cursor object at ...>
>>> cursor.close()

Here’s the converter we need:

>>> def UsernameAvailable():
...     def username_available_converter(conversion, state):
...         connection = state.database.connect()
...         cursor = connection.cursor()
...         cursor.execute(
...             'SELECT 1 FROM users WHERE username=?',
...             (conversion.value,)
...         )
...         rows = cursor.fetchall()
...         if len(rows) and rows[0][0] == 1:
...             conversion.error = "The username %r is already taken"%conversion.value
...         else:
...             conversion.result = conversion.value
...         cursor.close()
...     return username_available_converter

To use this converter you need a state object with a .database attribute which has a connect() method that connects to the database. Once again this is easy to set up if you are using the PowerPack but if you want to do it manually you can do so like this:

>>> state.database = EmptyState()
>>> def connect():
...     # Return the global connection created above (you'd do something
...     # more sophisticated in your own code)
...     return connection
>>> state.database.connect = connect

With these pieces in place let’s look at the converter for the first step. It now looks like this:

>>> account_form_to_account_record = toRecord(
...     converters = dict(
...         username = UsernameAvailable(),
...         password = unicodeToUnicode(),
...         confirm_password = unicodeToUnicode(),
...     )
... )

To show that it works, here’s an example. In real wizard code you’d use account_form_to_account_record slightly differently. Notice that the state object is passed as the second argument to the conversion’s perform() method so that it gets passed as the second argument to the username_available_converter() inner function when the conversion is performed.

>>> from conversionkit import Conversion
>>>
>>> valid_form_data = dict(username='james', password='123', confirm_password='123')
>>> Conversion('james').perform(UsernameAvailable(), state).result
'james'
>>> invalid_form_data = dict(username='ian', password='123', confirm_password='123')
>>> Conversion('ian').perform(UsernameAvailable(), state).error
"The username 'ian' is already taken"

Now that the converter for a single form is working, let’s create converters for steps 2 and 3. They look like this:

>>> from stringconvert.email import unicodeToEmail
>>> from stringconvert import unicodeToBoolean
>>>
>>> prefs_form_to_prefs_record = toRecord(
...     missing_defaults= dict(include_me_on_marketing_list='No'),
...     converters = dict(
...         include_me_on_marketing_list = unicodeToBoolean(),
...     )
... )
>>>
>>> invite_form_to_invite_record = toRecord(
...     converters = dict(
...         invite_friend = unicodeToEmail(),
...     )
... )

Now with all this infrastructure in place we can get onto the clever part. Next you create a converter which is capable of converting all the steps in one go. It might look like this:

>>> from nestedrecord import decodeNestedRecord
>>> from conversionkit import chainConverters
>>>
>>> wizard = chainConverters(
...     decodeNestedRecord(depth=1),
...     toRecord(
...         converters = dict(
...             step1 = account_form_to_account_record,
...             step2 = prefs_form_to_prefs_record,
...             step3 = invite_form_to_invite_record,
...         )
...     )
... )

You don’t have to name the keys step1, step2 etc. but I’ve found it more useful than naming each key after the form in that step, because if you refactor the wizard as you develop it the contents of the steps might change.

With the three form converters and the overall wizard converter in place you can now validate the form from each step with its own validator, save the data in the wizard and then in the next step, pass the whole lot through the wizard converter to re-convert and validate all previous steps.
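The flow described above can be sketched without ConversionKit at all. The example below is a simplified stand-in written in plain Python: validate_account, validate_prefs, STEP_VALIDATORS and revalidate_previous_steps are hypothetical names, and the validators are far cruder than real converters would be.

```python
def validate_account(data):
    """Hypothetical validator for step 1."""
    if not data.get("username"):
        raise ValueError("A username is required")
    if data.get("password") != data.get("confirm_password"):
        raise ValueError("The passwords do not match")
    return data


def validate_prefs(data):
    """Hypothetical validator for step 2, defaulting a missing value."""
    return {
        "include_me_on_marketing_list": data.get(
            "include_me_on_marketing_list", "No"
        )
    }


STEP_VALIDATORS = [("step1", validate_account), ("step2", validate_prefs)]


def revalidate_previous_steps(stored, current_step):
    """Re-run the validators for every step before current_step."""
    results = {}
    for name, validator in STEP_VALIDATORS:
        if name == current_step:
            break
        if name not in stored:
            raise ValueError("Step %s has not been completed" % name)
        results[name] = validator(stored[name])
    return results


stored = {
    "step1": {"username": "james", "password": "123", "confirm_password": "123"}
}
print(revalidate_previous_steps(stored, "step2"))
```

Re-running every earlier validator on each request means a step that was valid when it was submitted is checked again against the current state of the application.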

Before we do we need to create the tables the Wizard needs.

Setting up the Required Tables

This is best done manually with whatever tool you are using to set up the rest of your database. Here is an example which sets up the tables for SQLite and has some comments about what the fields are used for:

>>> cursor = connection.cursor()
>>> cursor.execute('''
...     CREATE TABLE wizard (
...       wizard_id                 serial not null,
...       accessed                  timestamp default now() NOT NULL,
...       name                      varchar(20), -- The string name which will prefix all keys for this wizard
...       step_completed            varchar(20), -- A custom string containing the name of the last step that
...                                              -- has been completed. This always forms the second part of the key.
...       status                    smallint default 1,
...       expires                   integer default 432000, -- The default expire time is 5 days
...       hash                      char(6),     -- The hash does not need to be unique and is actually populated
...                                              -- after the row is created because it uses the ID
...                                              -- (I know we could use the serial table)
...       failed_attempts           smallint default 0 NOT NULL,
...       primary key (wizard_id)
...     ) ;
...
...     CREATE TABLE wizard_field (
...       wizard_id                 integer NOT NULL,
...       type                      smallint default 0 NOT NULL,
...       key                       varchar(60) NOT NULL,
...       value                     varchar(1023) NOT NULL
...     );
...
... ''')
>>> cursor.close()
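The schema above uses serial and now(), which are PostgreSQL features, and sqlite3’s execute() accepts only one statement per call. If you want to experiment in SQLite, a variant along the following lines should work. It is an adaptation made for illustration, assuming the same column names; it has not been checked against what the Wizard code itself expects.

```python
import sqlite3

SCHEMA = """
CREATE TABLE wizard (
    wizard_id        INTEGER PRIMARY KEY AUTOINCREMENT,
    accessed         TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    name             VARCHAR(20),
    step_completed   VARCHAR(20),
    status           SMALLINT DEFAULT 1,
    expires          INTEGER DEFAULT 432000,
    hash             CHAR(6),
    failed_attempts  SMALLINT DEFAULT 0 NOT NULL
);

CREATE TABLE wizard_field (
    wizard_id  INTEGER NOT NULL,
    type       SMALLINT DEFAULT 0 NOT NULL,
    key        VARCHAR(60) NOT NULL,
    value      VARCHAR(1023) NOT NULL
);
"""

demo_connection = sqlite3.connect(":memory:")
# executescript() is used because execute() runs a single statement only.
demo_connection.executescript(SCHEMA)
demo_connection.execute(
    "INSERT INTO wizard (name) VALUES (?)", ("user_registration",)
)
row = demo_connection.execute(
    "SELECT wizard_id, status, expires, failed_attempts FROM wizard"
).fetchone()
print(row)  # (1, 1, 432000, 0)
```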

Creating a Wizard

To create a wizard use the model_wizard_new() function:

>>> from wizard import model_wizard_new
>>>
>>> wizard = model_wizard_new(state, 'user_registration')
>>> print wizard['wizard_id']
1
>>> print wizard['hash']
asdad

The returned wizard carries the following fields:

    wizard_id, hash, name, expires, status, accessed, failed_attempts

Each wizard has a name. In this case it is called user_registration. The wizard ID and hash are generated when you create the wizard.

If you are interested in what is going on behind the scenes the code just adds this to the wizard table:

>>> cursor = connection.cursor()
>>> cursor.execute("SELECT * FROM wizard")
<sqlite3.Cursor object at ...>
>>> cursor.close()

Storing and Retrieving the Wizard ID and Hash

Next you need to pass the wizard ID and hash back to the user so that when they submit the first stage of the form the submitted data can be linked with the correct wizard.

Store the Wizard ID and Hash using Hidden Fields and Retrieve from request.params

One way of doing this is to pass the wizard ID and hash back to the user as hidden fields in the next stage of a form. If you are using PowerPack and FormBuild this would look like this:

values = {
    'wizard_id': wizard['wizard_id'],
    'hash': wizard['hash'],
}
return view(
    state,
    'create_pod.index/create.html',
    dict(
        form=FormA(values=values),
        current_step=1,
    ),
)

The template could then contain:

${form.hidden('wizard_id')}
${form.hidden('hash')}

When the form is submitted to the step 1 action you can use the @wizard_from_params decorator to extract the wizard ID and hash from the request POST data:

>>> from wizard import wizard_from_params
>>>
>>> @wizard_from_params('wizard_id', 'hash', remove=True)
... def action_step1(state):
...     # state.wizard provides all the wizard information.
...     print state.wizard.wizard_id
...     print state.wizard.hash

Here the wizard is automatically checked based on the hidden variables from the form submission. The remove=True argument is supposed to remove these variables from the state.params dictionary so they don’t interfere with any other processing you might wish to carry out, but due to a “feature” of that object this isn’t easily possible. The decorator handles all failure cases or attacks, redirecting to a different URL if there are too many failures.

If everything is OK the state.wizard attribute is populated. There are similar decorators for extracting the variables from the session or the URL.

Let’s test this by simulating a request which includes the wizard_id and hash. First install WebOb:

$ easy_install WebOb

Then let’s create a dummy request using the WebOb package and add it to the state:

>>> from webob import Request
>>> fake_environ = {
...     'QUERY_STRING': 'wizard_id=1&hash=asdad',
... }
>>> state.request = Request(fake_environ)

Now when we call the action you’ll see the wizard_id and hash are extracted by @wizard_from_params and printed in the action:

>>> action_step1(state)

As you can see this works!

Before we go on to how to access the wizard data let’s look at other ways of storing and extracting the wizard information.

Store and Retrieve the Wizard ID and Hash in a Session

Similarly, you can set the Wizard ID and Hash on a session store (such as a Beaker session) and then retrieve them with the @wizard_from_session decorator.

The session store must be available as state.session.

Caution

If your session store happens to be memory-based you might find that the wizard data gets lost every time you restart the server. This can make debugging complex wizards rather tiresome so it is recommended you use a cookie for debugging in such circumstances.

Store the Wizard ID and Hash as Part of a URL

If you are using a system which makes use of the unofficial WSGI convention of putting routing args in the wsgiorg.urlvars key in the environ dictionary and you have a WebOb Request instance accessible as state.request, you can use the @wizard_from_urlvars decorator to extract the wizard ID and hash from the routing variables.

Creating a Custom Decorator for Storing the Wizard ID and Hash

Creating your own decorator is really easy. You just create a get_keys function which gets the wizard ID and hash in whichever way is appropriate for your application and then return wizard_decorator(get_keys). Here’s an example which gets the wizard ID and hash from variables parsed from the URL itself. The variables are specified as the wizard_key and hash_key arguments to the outer function:

from wizard import wizard_decorator
from wizard import NoWizardInformation  # assumed to be exported by wizard too

def wizard_from_urlvars(wizard_key, hash_key):
    '''\
    A decorator which takes a wizard ID and a hash, extracts them from the URL
    and ensures they are both valid
    '''
    def get_keys(state):
        if not (wizard_key in state.request.urlvars and
                hash_key in state.request.urlvars):
            raise NoWizardInformation(None)
        return (
            state.request.urlvars[wizard_key],
            state.request.urlvars[hash_key],
        )
    return wizard_decorator(get_keys)

This function is actually implemented in wizard if you want it.
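If you are curious how a get_keys-based decorator factory can be put together, here is a self-contained sketch of the pattern. It is a stand-in and not the real wizard_decorator(): make_wizard_decorator, load_wizard, DemoState and keys_from_dict are hypothetical names, and the real function also performs the validity checks described in the next section.

```python
import functools


class NoWizardInformation(Exception):
    """Stand-in for the exception of the same name used above."""


def make_wizard_decorator(get_keys, load_wizard):
    """Build a decorator that loads a wizard before the action runs.

    get_keys(state) returns (wizard_id, hash). load_wizard(wizard_id, hash)
    is a hypothetical callable that validates the pair and returns the
    wizard record.
    """
    def decorator(action):
        @functools.wraps(action)
        def wrapper(state, *args, **kwargs):
            wizard_id, wizard_hash = get_keys(state)
            state.wizard = load_wizard(wizard_id, wizard_hash)
            return action(state, *args, **kwargs)
        return wrapper
    return decorator


class DemoState(object):
    pass


def keys_from_dict(state):
    if "wizard_id" not in state.params or "hash" not in state.params:
        raise NoWizardInformation(None)
    return state.params["wizard_id"], state.params["hash"]


@make_wizard_decorator(
    keys_from_dict, lambda wid, h: {"wizard_id": wid, "hash": h}
)
def demo_action(state):
    return state.wizard["wizard_id"]


demo_state = DemoState()
demo_state.params = {"wizard_id": "1", "hash": "abc123"}
print(demo_action(demo_state))  # 1
```

Keeping the lookup of the ID and hash in get_keys means the same checking code can serve hidden fields, sessions and URLs alike.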

Checking the Wizard is Valid

All of the decorators which extract the wizard ID and hash from a particular place themselves rely on the wizard_decorator() function. It is this function which checks that the wizard is valid and aborts the request if it isn’t which means that if the request processing proceeds to your action, the wizard is valid.

# XXX Is the aborting correct?

Here are some of the things which can go wrong and a description of how they are dealt with:

  • If the wizard has expired but hasn’t been purged yet, the number of failed attempts is incremented and an Expired exception is raised.
  • If the number of failed attempts is greater than or equal to the allowed number specified in the config file, the number of failed attempts is incremented and a WizardCaptchaNeeded exception is raised.
  • If the wizard has a status other than an active one, or if the hash is invalid, the number of failed attempts is incremented but no exception is raised.

The decorator validates the wizard ID and hash (which it obtains itself from the state) and, when they are valid, populates the state.wizard attribute.
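The checks described in this section could be sketched like this. The exception names come from the text, but everything else is an assumption made for illustration: check_wizard is a hypothetical function, the wizard is represented as a plain dictionary, accessed is treated as a Unix timestamp, and a wizard is taken to be expired once accessed plus expires lies in the past.

```python
import time


class Expired(Exception):
    pass


class WizardCaptchaNeeded(Exception):
    pass


ACTIVE = 1  # assumed to match the schema default for status


def check_wizard(record, supplied_hash, allowed_failures, now=None):
    """Return True if the wizard may be used, False otherwise."""
    now = time.time() if now is None else now
    if now > record["accessed"] + record["expires"]:
        record["failed_attempts"] += 1
        raise Expired(record["wizard_id"])
    if record["failed_attempts"] >= allowed_failures:
        record["failed_attempts"] += 1
        raise WizardCaptchaNeeded(record["wizard_id"])
    if record["status"] != ACTIVE or record["hash"] != supplied_hash:
        # Counted as a failure, but no exception is raised.
        record["failed_attempts"] += 1
        return False
    return True


record = dict(
    wizard_id=1, hash="abc123", status=ACTIVE,
    accessed=1000.0, expires=432000, failed_attempts=0,
)
print(check_wizard(record, "abc123", 3, now=2000.0))  # True
print(check_wizard(record, "wrong", 3, now=2000.0))   # False
print(record["failed_attempts"])                      # 1
```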

Saving a Form Step in the Wizard Store

You would handle the form submission from each step in the wizard exactly as you normally would. If the form is invalid it should be redisplayed with error messages. If it is valid the original form parameters from state.params should be stored in the wizard store. You use the original ones so that they can be re-validated on subsequent steps without having to be converted back from the validated values.

Before saving the data it is also wise to remove any data which might already exist in the same wizard from the current step, perhaps because a user clicked back and is now re-submitting the data. Here’s some code to process the form, remove any existing data and add new data. Notice we are using the account_details converter:

XXX Does Wizard really remove data from subsequent steps or just the current one?

Using the state argument in this way lets you encapsulate some quite complex logic in ConversionKit converters and then use them in the same way as any other ConversionKit converter.

from wizard import model_wizard_delete
from wizard import model_wizard_insert_public

params = Conversion(state.request.POST).perform(MultiDictToDict()).result
conversion = Conversion(params).perform(account_details)
if not conversion.successful:
    form = FormA(values=conversion.value, errors=encode_error(conversion))
    # Redisplay form
    ...
else:
    # Remove any existing data from this step and subsequent steps
    model_wizard_delete(
        state,
        state.wizard.wizard_id,
        ['step1', 'step2', 'step3'],
    )
    # Insert the new data
    model_wizard_insert_public(
        state,
        state.wizard.wizard_id,
        # Save the original data, not the result. In case a badly written
        # validator has modified it by mistake, get the originals again:
        encode_partially_encoded_data({
            'step1': conversion.value
        })
    )

Notice that we don’t use the result of the conversion anywhere because it isn’t needed. The conversion is performed purely for validation, to check that the data can be converted.
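To show what deleting and re-inserting step data amounts to at the database level, here is a sketch against a wizard_field table in SQLite. The save_step function and the key format ("step1.username") are assumptions made for the example; the real model_wizard_delete() and model_wizard_insert_public() functions may store keys differently.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE wizard_field ("
    " wizard_id INTEGER NOT NULL, type SMALLINT DEFAULT 0 NOT NULL,"
    " key VARCHAR(60) NOT NULL, value VARCHAR(1023) NOT NULL)"
)


def save_step(db, wizard_id, step, later_steps, fields):
    """Replace the stored data for step and clear the given later steps."""
    for name in [step] + list(later_steps):
        # Keys are assumed to look like "step1.username". The trailing
        # dot stops "step1" from also matching "step10".
        db.execute(
            "DELETE FROM wizard_field WHERE wizard_id=? AND key LIKE ?",
            (wizard_id, name + ".%"),
        )
    db.executemany(
        "INSERT INTO wizard_field (wizard_id, key, value) VALUES (?, ?, ?)",
        [(wizard_id, "%s.%s" % (step, k), v) for k, v in sorted(fields.items())],
    )


save_step(db, 1, "step1", ["step2", "step3"], {"username": "james"})
save_step(db, 1, "step2", ["step3"], {"include_me_on_marketing_list": "Yes"})
# The user clicks back and re-submits step 1 with a different username.
save_step(db, 1, "step1", ["step2", "step3"], {"username": "jim"})
print(db.execute("SELECT key, value FROM wizard_field").fetchall())
# [('step1.username', 'jim')]
```

Clearing the later steps as well means data entered on the basis of the old step 1 values cannot survive a change to step 1.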

Inserting Private Data

The wizard supports both public and private fields. By inserting data from the browser as public you can always be sure it won’t overwrite private data you’ve set manually.

You can insert private data in the same way as public data but using the model_wizard_insert_private() function.

You cannot insert the same key as both public and private data. Keys must be unique regardless of whether they are public or private.
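The uniqueness rule can be illustrated with a small in-memory stand-in for the field store. FieldStore and the PUBLIC and PRIVATE constants are hypothetical and only demonstrate the rule; they do not show how Wizard implements it.

```python
PUBLIC, PRIVATE = 0, 1  # hypothetical values for the "type" column


class FieldStore(object):
    """In-memory stand-in for the wizard_field table."""

    def __init__(self):
        self._fields = {}  # key -> (type, value)

    def insert(self, key, value, field_type):
        existing = self._fields.get(key)
        if existing is not None and existing[0] != field_type:
            raise KeyError("%r already exists with a different type" % key)
        self._fields[key] = (field_type, value)

    def get(self, key):
        return self._fields[key][1]


store = FieldStore()
store.insert("step1.account_id", "42", PRIVATE)
try:
    # Data arriving from the browser is always inserted as public, so it
    # cannot replace the private value set by the application.
    store.insert("step1.account_id", "666", PUBLIC)
except KeyError:
    print("rejected")
print(store.get("step1.account_id"))  # 42
```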

Accessing the Field Values and Results

After the first step in the wizard is completed you might want to access the results in the second step. To do so you would use the @wizard_validate_steps decorator after whichever decorator you have used to load the wizard ID and hash from a cookie, session or URL.

from wizard import wizard_validate_steps

@wizard_validate_steps(
    ['core_info', 'comms_prefs', 'invite_contacts', 'payment_options'],
    create_community,
    dict(controller='create_pod', action='start'),
)

Theming Wizard Errors

You might want to theme the various error pages the Wizard can produce or trigger a redirect back to the start of the wizard if there is a security error. Wizard allows you to do this by setting the last argument to @wizard_validate_steps.

Completing a Wizard

Once a user has completed a wizard you will finally want to access state.wizard.result to get the combined and converted data from every step of the wizard. I like to create a separate function for performing the actions necessary and I pass it the state.wizard.result object as its value.

After all the necessary changes are made you can complete the wizard, which removes all the data which has been cached in the wizard. This step is optional but if you don’t do it there is a chance someone might go back and complete the wizard again, causing the actions performed after the wizard is completed to be run again, and this probably isn’t what you want.

>>> from wizard import model_wizard_complete
>>>
>>> model_wizard_complete(state, state.wizard.wizard_id)

If you are storing the wizard ID in a cookie or session you probably want to delete that too:

>>> state.response.delete_cookie('wizard')
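The reason completing the wizard prevents the final actions from running twice can be shown with a toy store. The finish_wizard function and the WizardNotFound exception are hypothetical names, and the dictionary stands in for the wizard tables.

```python
class WizardNotFound(Exception):
    """Hypothetical error for a wizard that is missing or already completed."""


def finish_wizard(store, wizard_id, apply_actions):
    """Apply the final actions once, then discard the cached wizard data."""
    if wizard_id not in store:
        raise WizardNotFound(wizard_id)
    apply_actions(store[wizard_id])
    # Only forget the data after the actions succeeded, so a failure
    # leaves the wizard intact for another attempt.
    del store[wizard_id]


created = []
wizards = {1: {"step1": {"username": "james"}}}
finish_wizard(
    wizards, 1, lambda result: created.append(result["step1"]["username"])
)
try:
    # Going back and submitting the last step again finds nothing to apply.
    finish_wizard(wizards, 1, lambda result: created.append("again"))
except WizardNotFound:
    pass
print(created)  # ['james']
```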

Wizard Security

There are lots of security features in the Wizard which haven’t been explained in detail yet. Topics which the documentation will eventually cover include:

  • Choosing a good hash
  • Removing subsequent steps
  • Always re-validating steps
  • Using a re-captcha to prevent brute-force attacks
  • Using a base64 encoding

Summary

This section contains out of date information I need to merge into the main documentation.

Here is the strategy for field labelling and storage:

  • Field names are labelled without their step
  • Step names are added when saving to the wizard
  • When using the wizard validate decorator the whole schema is validated, but the step name is removed before each sub-schema is validated. The results are then optionally merged.

This allows:

  • HTML forms to be constructed without any data about which step they are in which makes it easier to refactor fields into different steps.
  • Actions to use ordinary validators, regardless of which step they appear in
  • The validated data to be in the same format whether produced from the validation of a single step from a form submission or from revalidation of data from the wizard
  • Stored wizard data from each step to be validated separately without any risk of values from one step interfering with validation from another.
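The labelling strategy above can be sketched in a few lines. The helpers add_step and split_steps are hypothetical, and the dotted key format is an assumption chosen for the example.

```python
def add_step(step, fields):
    """Prefix every field name with its step before storing it."""
    return dict(("%s.%s" % (step, key), value) for key, value in fields.items())


def split_steps(stored):
    """Group stored keys back into one plain dictionary per step."""
    steps = {}
    for full_key, value in stored.items():
        step, key = full_key.split(".", 1)
        steps.setdefault(step, {})[key] = value
    return steps


stored = {}
stored.update(add_step("step1", {"username": "james"}))
stored.update(add_step("step2", {"include_me_on_marketing_list": "Yes"}))
print(split_steps(stored) == {
    "step1": {"username": "james"},
    "step2": {"include_me_on_marketing_list": "Yes"},
})  # True
```

Because the step name is stripped again before validation, each step’s validator sees exactly the field names the HTML form used.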

XXX Check this is correct, does delete delete them both? XXX TODO: Write one which uses a different hash algorithm

Provides a Wizard class to store the state of variables between wizard states.

Setting up a wizard involves the following steps:

  • Setting up the FormEncode validators for each of the data structures you want to end up with
  • Creating a compound validator representing each screen of form fields you wish to display
  • Creating a Wizard with one compound validator for each screen

Optionally you can also:

  • Create a merging validator so that the results aren’t step specific.

Then you need to set up the actions for each step. Usually the wizard starts with a start() action which sets up a new wizard and then does one of the following:

  • Sets the wizard ID as a session variable and redirects to the first form action. Any subsequent actions will be able to access the wizard data based on the ID in the session.
  • Displays the form with hidden fields containing the wizard ID and hash. Any future forms will be able to access the wizard data from their own hidden field submissions.
  • Redirects to a new URL which contains the wizard ID and hash as part of the URL. Future actions can access the wizard data from the wizard ID and hash contained in the URL

Now let’s use the request in a pre-validator:

>>> from formconvert import multiDictToDict
>>> from conversionkit import toDictionary
>>>
>>> contact_to_dictionary = chainConverters(
...     multiDictToDict(),
...     toDictionary(
...         converters={
...             'firstname': unicodeToUnicode(),
...             'lastname': unicodeToUnicode(),
...             'email': StringToEmail(),
...         }
...     )
... )

Let’s pass the entire request as the argument to convert:

>>> print Conversion(request.GET).perform(contact_to_dictionary).result
{'lastname': 'Garnder', 'firstname': 'James', 'email': u'james@example.com'}

As you can see, the MultiDictToDict() pre-converter is called to produce a dictionary from the request before the SplitName('name') converter is called.
