URLConvert v0.2.0 documentation

Junk

Note

Of course in a real web application the environ dictionary would either be the WSGI environment or, in the case of a running CGI script, the os.environ dictionary.

Astute readers will notice that when used in a CGI environment the variable wsgi.url_scheme will not be available. In such cases the scheme is inferred from the port. Even more astute readers will realise that the hostname and port that should be used when applying the rules are not necessarily the same as SERVER_NAME and SERVER_PORT. URLConvert will use X_HOST_FORWARDED_FOR and X_PORT_FORWARDED_FOR in preference if they are available so that the rules work correctly even when behind a proxy server. The SCRIPT_NAME is automatically removed from the URL so that the path_info rules will work even when the application is mounted at a URL location other than /.
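
The inference described above can be sketched roughly as follows. This is only an illustration and not URLConvert’s actual code: the function name current_url_parts is hypothetical, the header keys are used exactly as they are named in the text, and treating port 443 as https is an assumption about how the scheme might be inferred.

```python
def current_url_parts(environ):
    """Work out scheme, host, port and path_info from a WSGI or CGI environ.

    Hypothetical helper for illustration only; not part of URLConvert.
    """
    # Prefer the forwarded values (key names taken from the text above)
    # so that rules still work behind a proxy server.
    host = environ.get('X_HOST_FORWARDED_FOR') or environ.get('SERVER_NAME', '')
    port = str(environ.get('X_PORT_FORWARDED_FOR') or environ.get('SERVER_PORT', '80'))
    # CGI has no wsgi.url_scheme, so fall back to guessing from the port.
    # Assumption: port 443 means https, anything else means http.
    scheme = environ.get('wsgi.url_scheme')
    if scheme is None:
        scheme = 'https' if port == '443' else 'http'
    # SCRIPT_NAME is kept separate so that rules only ever see PATH_INFO.
    return {
        'scheme': scheme,
        'host': host,
        'port': port,
        'script_name': environ.get('SCRIPT_NAME', ''),
        'path_info': environ.get('PATH_INFO', ''),
    }
```

Passing os.environ from a CGI script or the environ argument of a WSGI callable should both work with a helper of this shape.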

XXX Should it use HTTP_HOST in preference to SERVER_NAME?

To convert the dictionary back to a URL, URLConverter needs to know what the current URL is because it will use it as a basis for any parts of the URL which aren’t set by the rule. URLConvert obtains the information it needs from the environment. The key information URLConvert requires is this:

What is a useful URL?

Any URL which a user can type into a browser to access a web application.

What are the useful features of useful URLs?

Pitfalls for web developers:

  • Different URLs can point to the same resource
  • There’s no need for path params or query strings
  • The same application can be mounted at different URLs but still has to function.

File handling.

Before we start looking at URL convert

URL Generation

You can generate a URL using the same rules we have just seen for matching. For example, to generate the URL http://www.example.com/signin you could write this:

build(controller='account', action='signin', subdomain='www')

If a human was asked to generate this URL they would work through all the rules they knew about until they found one that would re-generate the same variables with the same values if it were re-parsed. The process might look like this:

  • Look at the variable names in the URL and the add dictionary. If the variable names specified to build() aren’t the same as the combination of the variables in the URL and those in the add dictionary the rule doesn’t match.
  • If the variable names do match, compare the values in the add dictionary with those specified to build(). If the values aren’t the same the rule doesn’t match.
  • Otherwise, if all the variable names match and the values in the add dictionary match the values specified then the rule does match.

Once again the order in which the rules are searched is important. When the generated URL is later matched the rules are going to be checked from top to bottom so if two rules could match the URL, the one at the top would match first. For this reason it makes sense to test the rules from top to bottom when generating a URL so that the same rule used to match a URL will also be used to generate one. This is what happens.
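
The selection process described above can be sketched in a few lines. This is a simplified illustration and not URLConvert’s implementation: each rule is represented here as a hypothetical pair of (variable names in the URL template, add dictionary), and the function names are made up for the example.

```python
def rule_generates(rule_vars, add, supplied):
    """Return True if a rule could regenerate exactly the supplied variables.

    rule_vars: names of the {variables} appearing in the rule's URL template
    add:       the rule's add dictionary
    supplied:  the keyword arguments given to build()
    """
    # The names must be exactly the template variables plus the add keys.
    if set(supplied) != set(rule_vars) | set(add):
        return False
    # Values fixed by the add dictionary must agree with what was supplied.
    return all(supplied[name] == value for name, value in add.items())


def first_generating_rule(rules, supplied):
    """Check the rules from top to bottom, the same order used for matching."""
    for rule_vars, add in rules:
        if rule_generates(rule_vars, add, supplied):
            return rule_vars, add
    return None


# Two example rules: '/{controller}' with add={'action': 'index'}
# and '/{controller}/{action}' with no add dictionary.
rules = [
    (['controller'], {'action': 'index'}),
    (['controller', 'action'], {}),
]
```

With these rules, the variables controller='admin', action='index' select the first rule even though the second could also produce a URL, because the first rule is the one that would match that URL when it is parsed again.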

QuickStart

Now that the service is set up, let’s convert the variables back to a URL:

>>> for vars in [
...     {u'hi': u'james'},
...     {u'bye': u'fred'},
... ]:
...     generation = ruleset.generate_url(vars)  ###, dict(host='example.com', scheme='http', port='80'))
...     if generation.success:
...         print "Success: %r" % generation.result
...     else:
...         print "Failed: %s" % generation.error
Success: u'http://example.net/james'
Failed: No rule matched

Notice that the second attempt fails because there is no rule for handling a variable bye.

Here’s an example rule:

>>> from urlconvert import rule
>>> r = rule(u'{}://{}:{}/{controller}/{action}')
>>> r
<urlconvert.Rule object at 0x...>

As you can see the Rule object has two attributes, .to_vars and .to_url:

>>> hasattr(r, 'to_url')
True
>>> hasattr(r, 'to_vars')
True

.to_vars
    A converter for converting a dictionary of URL components to a dictionary of routing variables.
.to_url
    A converter for converting a dictionary containing both a dictionary of routing variables and a dictionary representing the current URL to a dictionary of URL parts representing the generated URL.

Each of the parts of the URL is then handled by a different URL part handler. As an example, the path component is handled by a StandardPathHandler(). Each of the handlers has to_partial_url() and to_partial_vars() methods which behave exactly like the to_url() and to_vars() methods but only work on the particular part of the URL.

>>> from urlconvert import StandardPathHandler
>>> s = StandardPathHandler('/{controller}/{action}')
>>> vars = s.to_vars(Conversion('/some/path')).value
>>> vars
{'action': 'path', 'controller': 'some'}

From this example you might think that if you were to call s.to_url() you would end up with the string /some/path but actually it is a little more complicated than that. The objects return a tuple which always contains three items: the original vars dictionary, the result of the conversion and a list of all the keys from the vars dictionary which were needed to perform the conversion. If the conversion failed the last list will be empty.

>>> s.to_url(Conversion(vars), state).value
({'action': 'path', 'controller': 'some'}, '/some/path', ['action', 'controller'])
>>> conversion = s.to_url(Conversion({}), state)
>>> print conversion.error
The variable 'action' in the rule could not be found in the variables supplied

The reason for this structure is that the variables you specify can be matched at any part of the URL. The Rule object which processes all the different parts needs to make sure they all received the same vars dictionary to process, that all the variables in the vars dictionary were matched (whether they were matched by any one part or by multiple parts) and it needs the result from each part so that it can assemble the final URL.
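
To make the reasoning concrete, here is a minimal sketch of how the three-item tuples could be combined. It is an assumption about the general idea rather than the real Rule code, and the function name assemble is hypothetical.

```python
def assemble(vars, part_results):
    """Combine per-part results into a URL, or return None on failure.

    part_results is a list of (vars, fragment, keys_used) tuples in URL
    order, mirroring the three-item tuples described above.
    """
    used = set()
    fragments = []
    for part_vars, fragment, keys_used in part_results:
        # Every part must have been given the same variables dictionary.
        if part_vars != vars:
            return None
        used.update(keys_used)
        fragments.append(fragment)
    # Every supplied variable must have been consumed by at least one part.
    if used != set(vars):
        return None
    return ''.join(fragments)
```

A variable that no part consumed makes the whole rule fail, which is what allows a rule set to move on to the next rule.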

In addition to the StandardPathHandler, URLConvert also provides a StandardSchemeHandler, StandardDomainHandler and StandardPortHandler. See the documentation for each for the details of how they handle different URLs.

>>> from urlconvert import StandardSchemeHandler, StandardDomainHandler, \
...     StandardPathHandler, StandardPortHandler

Now you’ve seen some of the inner workings of URLConvert let’s get back to the basics.

Having a single rule for every possible URL in your site is often a bit restrictive so URLConvert provides a RuleSet object which tries each rule in turn until it finds a match.

>>> from urlconvert import RuleSet
>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{}:{}/{controller}'))
>>> ruleset.add(StandardRule('{}://{}:{}/{controller}/{action}'))
>>> ruleset.to_vars(Conversion(u'http://example.com/admin/view')).value
{'action': 'view', 'controller': 'admin'}
>>> ruleset.to_url(Conversion({'controller':'admin'}), state=state).value
u'http://example.com/admin'

Notice that the first rule didn’t match when we converted /admin/view because there were too many / in the path but that the second rule did. In the second example the first rule matched.
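
The behaviour can be imitated with a toy matcher that works on the path only. This sketch is not how StandardRule is implemented; match_path and first_match are hypothetical names, and real rules also handle the scheme, host and port.

```python
def match_path(template, path):
    """Match a path against a template such as '/{controller}/{action}'.

    Returns a dictionary of variables, or None if the rule doesn't apply.
    """
    template_parts = template.split('/')
    path_parts = path.split('/')
    # A different number of / characters means the rule cannot match.
    if len(template_parts) != len(path_parts):
        return None
    result = {}
    for expected, actual in zip(template_parts, path_parts):
        if expected.startswith('{') and expected.endswith('}'):
            # A {name} segment captures whatever is in that position.
            result[expected[1:-1]] = actual
        elif expected != actual:
            # A literal segment must be identical.
            return None
    return result


def first_match(templates, path):
    """Try each template in turn and return the first successful match."""
    for template in templates:
        result = match_path(template, path)
        if result is not None:
            return result
    return None
```

Calling first_match(['/{controller}', '/{controller}/{action}'], '/admin/view') skips the first template because of the extra / and returns the variables from the second.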

In normal operation you wouldn’t want to have to worry about the conversion objects. Just wrap whichever object you are working with in a URLConvert() object and use that instead:

>>> from urlconvert import URLConvert
>>> rulset = URLConvert(ruleset)
>>> rulset.to_url({'controller': 'admin'}, state=state)
u'http://example.com/admin'

Caution

You should never use the same path instance more than once. This is because a RuleSet instance might modify the default converter (which you’ll learn about next) or change the path in some other way so if you were to use the same instance elsewhere, it might already have been modified.

Sometimes it can be useful to manually update variables matched against a particular URL. For example the above rules would be rather irritating if your dispatch mechanism always needed to know the controller and action to dispatch to because the first rule doesn’t result in an action variable being matched. You can therefore redefine the rules like this:

>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{}:{}/{controller}', add={'action':'index'}))
>>> ruleset.add(StandardRule('{}://{}:{}/{controller}/{action}'))
>>> ruleset = URLConvert(ruleset)
>>> ruleset.to_vars('http://example.com/admin/view', state=state)
{'action': 'view', 'controller': 'admin'}
>>> ruleset.to_vars('http://example.com/admin', state=state)
{'action': 'index', 'controller': 'admin'}

Notice that this time the conversion of http://example.com/admin adds an action variable with the value index. Updating variables only has an effect when converting a URL to the variables dictionary, not when converting the variables to a URL. When you are converting a variables dictionary to a URL you are expected to pass the correct variables.

Caution

The default variables support in URLConvert is not the same as that in Routes. In Routes you can do this:

'/admin/:action/:id', controller='admin', action='index', id='1'

This route would then match /admin, /admin/, /admin/index, /admin/index/, /admin/index/1 and /admin/index/1/. Each of these would correctly produce the routing variables {'controller': 'admin', 'action': 'index', 'id': '1'}. The problem comes when generating the route for the same routing dictionary, which URL do we choose? Routes chooses the shortest.

Also URLConvert requires that default variables be specified when generating URLs, otherwise the rule will not be applied.

The beauty of URLConvert isn’t just that you can specify some quite complex logic in a simple rule but that you can create your own Rule objects which express completely different logic or different ways of dealing with URLs and that these rules will work perfectly well alongside the standard rules in a ruleset. Even more importantly, if you have some extremely complex logic to express which isn’t well-represented by either the standard rules or custom rules you could create yourself, you can always fall back to the very first example and hand-code your logic from scratch because even Rule objects are simply implementations of the basic URLConverter objects and can be intermixed with them.

Let’s use the SimpleURLConverter in the current ruleset as an example:

The simplest converter looks like this and will convert any URLs starting with http://example.com/ assuming the following characters make up the controller name:

All such converters should be derived from Rule:

>>> from urlconvert import Rule
>>>
>>> class SimpleConverter(Rule):
...     def to_url(self, conversion, state):
...         vars = conversion.value
...         if vars.has_key('controller'):
...             conversion.value = 'http://mydomain.com/'+vars['controller']
...         else:
...             conversion.value = None
...         return conversion
...
...     def to_vars(self, conversion, state=None):
...         if conversion.value.startswith('http://example.com/'):
...             value = conversion.value[len('http://example.com/'):]
...             if '/' not in value:
...                 conversion.value = {'controller': value}
...             else:
...                 conversion.error = 'URL %r not matched' %conversion.value
...                 conversion.value = None
...         else:
...             conversion.error = 'URL %r not matched' %conversion.value
...             conversion.value = None
...         return conversion
...
>>> convert = SimpleConverter()
>>> ruleset = RuleSet()
>>> ruleset.add(SimpleConverter())
>>> ruleset.add(StandardRule('{}://{}:{}/{controller}/{action}'))
>>> ruleset = URLConvert(ruleset)
>>> ruleset.to_vars('http://example.com/admin', state=state)
{'controller': 'admin'}
>>> ruleset.to_vars('http://example.com/admin/view', state=state)
{'action': 'view', 'controller': 'admin'}

In this case the first conversion was handled by SimpleConverter with the second being handled by the StandardRule. Custom converters like SimpleConverter have to return None if they can’t convert the URL or variables passed to them.

So far all the examples have only used the path component. Let’s look at some more complex examples which use variables in other parts of the URL:

>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{}:81/{controller}/{action}'))
>>> ruleset.add(StandardRule('{}://{}:82/{controller}/{action}', add={'port': 82}))
>>> ruleset.add(StandardRule('{}://{}:{port}/{controller}/{port}'))
>>> ruleset.add(StandardRule('{}://{}:{port}/{controller}/{action}'))

In order to see what is going on when matching these rules we’ll need to enable logging. All messages are logged to the urlconvert logger which we can enable to log information to the standard output with these lines:

>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)

Now let’s test the rules:

>>> ruleset = URLConvert(ruleset)
>>> ruleset.to_vars('http://example.com:81/admin/view', state=state)
{'action': 'view', 'controller': 'admin'}

In the first case you will be able to see from the log output that this is matched by the first rule.

>>> ruleset.to_vars('http://example.com:82/admin/view', state=state)
{'action': 'view', 'controller': 'admin', 'port': 82}

This example is matched by the second rule and since a port of 82 is specified in the add dictionary, this is also added to the variables dictionary.

Notice that the port is an integer. This is because the StandardPortHandler automatically converts the port to and from a Python integer as necessary. We’ll discuss conversions later on.

>>> ruleset.to_vars('http://example.com:83/admin/83', state=state)
{'controller': 'admin', 'port': 83}

If the same variable is used in two parts of a URL then their string representation must be the same in both instances for the route to match as demonstrated by the example above. If they don’t match, the next rule is used:

>>> ruleset.to_vars('http://example.com:84/admin/view', state=state)
{'action': 'view', 'controller': 'admin', 'port': 84}

If you wanted to match any port but did not want the port variable to end up in the variables dictionary you could have used this route instead of the final route we did use:

StandardRule('{}://{}:{}/{controller}/{action}')

Clearly for these rules to work in the way you expect the order of the routes is important. If you had added the final route first it would have been applied to all the URLs entered. You should always add the most specific rules first.

Any of the variables dictionaries produced above could also be converted back to a URL by a rule, and converting that URL to variables again would give the original variables dictionary. In this case the rules applied to create the dictionaries would also be the ones applied to generate the URLs but this needn’t always be the case.

Let’s take another example and look at the scheme:

>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{}:{}/normal'))
>>> ruleset.add(StandardRule('https://{}:{}/secure', add={'scheme':'https'}))
>>> ruleset = URLConvert(ruleset)
>>> ruleset.to_url({'scheme':'https'}, state=state)
'https://example.com/secure'

The important point to note here is that the URL is generated from the second rule, not the first. This is because the {} characters in the scheme part of the URL for the first rule mean “don’t bother looking at the scheme part” and not “accept any value for the scheme”. The variables we specified included scheme and the first rule cannot match any variable named scheme.

There is one more point we have largely glossed over so far and that is how variables are encoded and decoded from the URL form. This turns out to be a slightly complex issue for three reasons:

  • Different characters are allowed at different parts of a URL
  • Any characters in the part of the URL after the port may be percent-encoded, that is, written as a % character followed by two hexadecimal digits, but some characters must be encoded. Different people have different ideas about which characters should be encoded and there is more than one right answer since the same URL can be represented in different ways
  • The decoded path_info segments might themselves be encoded in a character set other than US-ASCII

URLConverter takes the following approach. Each standard handler for each part of the URL has its own corresponding Converter object. Converters have an encode() and a decode() method, both of which take a dictionary of variables as their only argument and return the encoded or decoded versions of the string representations of that dictionary’s values. The encoding and decoding gets done right at the edge of the application so URLs are decoded just before the handler’s to_partial_vars() method returns and are encoded just before the handler’s to_partial_url() method returns the assembled URL fragment.
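
As a rough illustration of the encode() and decode() pair, here is a hypothetical converter built on the standard library. The class name PercentConverter is made up for this sketch and is not one of URLConvert’s converters. It is written for Python 3, where quote and unquote live in urllib.parse.

```python
from urllib.parse import quote, unquote


class PercentConverter(object):
    """Hypothetical converter: percent-encodes and decodes variable values."""

    def encode(self, vars):
        # safe='' means that / is encoded too, so a value can never
        # introduce an extra path segment into the generated URL.
        return dict((key, quote(str(value), safe='')) for key, value in vars.items())

    def decode(self, vars):
        # Turn %XX sequences back into the characters they represent.
        return dict((key, unquote(value)) for key, value in vars.items())
```

Encoding {'controller': 'admin/view'} with this class gives {'controller': 'admin%2Fview'}, and decoding that result gives the original dictionary back.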

Here are the standard converters and a description of the rules they apply:

In addition to the standard encoders mentioned above there are two non-standard alternatives you should be aware of:

NullEncoder

Does no encoding or decoding and simply returns the unmodified variables dictionary. This can be useful if it is important for you to get hold of the raw URL characters or if you want to do your encoding and decoding manually outside of the URLConvert system.

Utf8PathInfoEncoder

Same as the StandardPathInfoSegmentEncoder except that it treats the decoded variables as UTF-8 strings which it then decodes again to Python Unicode objects. Using this class instead of StandardPathInfoEncoder allows you to use Unicode strings as path info segments.

Warning

Although being able to use UTF-8 encoded unicode in URLs sounds great fun it is quite possible it will cause problems with other software or servers on your system. Potential dangers include double-decoding which could result in characters you don’t expect being present in your variables dictionary leading to security vulnerabilities. If in doubt stick to the StandardPathInfoDecoder.

Note

For further information about URLs, character sets and allowed characters visit http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.2 and more importantly http://www.ietf.org/rfc/rfc2396.txt

Now that you have seen the encoders available let’s see how to use them. Let’s use a NullEncoder with the StandardPathInfoHandler in a particular rule:

>>> from urlconvert import NullEncoder
>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{}:{}/std/{controller}'))
>>> ruleset.add(StandardRule('{}://{}:{}/null/{controller}', path_info_encoder=NullEncoder))
>>> rulset = URLConvert(ruleset)
>>> rulset.to_vars('http://example.com/std/admin%2Fview')
{'controller': 'admin/view'}
>>> rulset.to_vars('http://example.com/null/admin%2Fview')
{'controller': 'admin%2Fview'}

Notice that the NullEncoder doesn’t decode the / character contained in the pathinfo segment.

Danger

Be aware that if you are using the standard configuration it is quite possible for variables to end up containing characters such as / and \. This means that variables returned by to_vars() should not be used directly as filesystem path components because a malicious user could construct a path you may not be expecting.

As an example imagine a web application serves files from /var/www/mysite/public and it uses a rule with a path_info component of /{filename}. If a user entered /index.html the file /var/www/mysite/public/index.html would be served. If a user instead entered /..%2F..%2F..%2F..%2Fetc%2Fpasswd then the file /var/www/mysite/public/../../../../etc/passwd (which resolves to /etc/passwd) would be served. This is almost certainly not what you wanted so be sure to properly validate or escape variables derived from user-submitted content depending on their intended use.
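
One common defence, sketched here with the standard library only, is to resolve the requested path and refuse anything that ends up outside the public directory. The helper name safe_join is hypothetical and is not something URLConvert provides. The sketch assumes Python 3 for os.path.commonpath.

```python
import os


def safe_join(root, filename):
    """Return the absolute path of filename under root, or None if it escapes."""
    root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root, filename))
    # After resolving any .. segments the file must still be inside root.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

With a root of /var/www/mysite/public, the decoded value ../../../../etc/passwd is rejected while index.html is accepted.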

Servers like Apache which can run CGI scripts treat the path component of a URL as being made up of two parts, a script name and a path info. As an example consider this URL:

http://example.com/cgi-bin/dispatch.py/admin/view

If you wanted to perform a controller match on this URL you might be tempted to write a rule like this:

>>> rule = StandardRule('{}://{}:{}/cgi-bin/dispatch.py/{controller}/{action}')

Consider what would happen if you had this rule in your application and another user deployed it via a dispatch.py script at this URL:

http://example.com/cgi-bin/some/other/directory/dispatch.py/admin/view

If URLConvert were to try to match this URL, the rule wouldn’t be applied because there are a different number of / characters in the part of the URL after the port.

When applications are set up like this the server actually sets two environment variables called SCRIPT_NAME and PATH_INFO. In the case of the first URL SCRIPT_NAME would be set to cgi-bin/dispatch.py and PATH_INFO would be set to /admin/view. For the second URL, SCRIPT_NAME would be cgi-bin/some/other/directory/dispatch.py and PATH_INFO would be the same. Because the PATH_INFO component is always the same it is this variable which the StandardRule uses to match against the path and not the entire URL after the port.

This means that you should never include the SCRIPT_NAME part of a URL in your rules because StandardRule ignores it anyway to enable your application to work no matter which URL it is mounted under.

URLConvert also correctly includes the SCRIPT_NAME component of a URL when it is generating a URL so you can be sure all your URLs will be correctly adjusted even if the application is not mounted at the root of a URL structure. This turns out to be a huge bonus because it means you don’t have to manually re-code all your URLs if you want to move the application within an existing URL structure.
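
The idea can be sketched with two small helpers: one strips the mount point before matching, the other puts it back when generating. These functions are hypothetical and only illustrate the principle; they are not part of URLConvert.

```python
def _mount_prefix(script_name):
    """Normalise SCRIPT_NAME to '' or '/some/prefix' (no trailing slash)."""
    stripped = script_name.strip('/')
    return '/' + stripped if stripped else ''


def strip_script_name(full_path, script_name):
    """Return the PATH_INFO part of full_path, or None if not under the mount."""
    prefix = _mount_prefix(script_name)
    rest = full_path[len(prefix):]
    # The prefix must match and must end on a segment boundary.
    if not full_path.startswith(prefix) or (rest and not rest.startswith('/')):
        return None
    return rest


def add_script_name(script_name, path_info):
    """Put the mount point back in front of a generated path."""
    return _mount_prefix(script_name) + path_info
```

Both example URLs above reduce to the same /admin/view once their own SCRIPT_NAME is stripped, so a single path rule covers either deployment.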

Because URLConvert already correctly handles SCRIPT_NAME and PATH_INFO it doesn’t include any sort of prefix functionality of the sort you might see in other URL conversion or dispatch packages because none is needed.

Here are some examples:

>>> rule = StandardRule('{}://{}:{}/{controller}/{action}')
>>> rule.to_url({'controller': 'admin', 'action': 'view'})
'http://example.com/admin/view'
>>> rule.to_vars('http://example.com/admin/view')
{'action': 'view', 'controller': 'admin'}

Now let’s change the SCRIPT_NAME:

>>> state['environ']['SCRIPT_NAME'] = 'cgi-bin/dispatch.py'
>>> rule.to_url({'controller': 'admin', 'action': 'view'})
'http://example.com/cgi-bin/dispatch.py/admin/view'
>>> rule.to_vars('http://example.com/cgi-bin/dispatch.py/admin/view')
{'action': 'view', 'controller': 'admin'}

All the variables you’ve seen so far have been dynamic variables, that is, variables whose value depends on the URL of the request. URLConvert supports another type of variable known as a static variable. Static variables are replaced into the rule string before the rule is applied. They are used for variables that are dependent on the way the application is installed. For example, say you had an administration controller but you wanted the person installing your application to be able to customise the URL at which that set of URLs was accessible. You might do the following:

>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{}:{}/[admin]/', add={'controller': 'admin', 'action': 'index'}))
>>> ruleset.add(StandardRule('{}://{}:{}/[admin]/{action}', add={'controller': 'admin'}))

If the admin static variable in the configuration was set to admin the rule set would work like this:

>>> ruleset.to_url({'controller': 'admin', 'action': 'view'}, state=state)
'http://example.com/admin/view'

However if the admin static variable was set to control the rule set would work like this:

>>> ruleset.to_url({'controller': 'admin', 'action': 'view'}, state=state)
'http://example.com/control/view'

Static variables are also substituted in the same way when converting URLs back to a variable dictionary.

One area where static variables are particularly useful is when it comes to using subdomain functionality. Often you want to match any subdomain relative to the usual domain for that site so you might do something like this:

>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{subdomain}.{sld}.{tld}:{}{}'))

This works fine when the application is hosted at a domain such as example.com:

>>> ruleset.to_vars('http://www.example.com', state=state)
{'subdomain': 'www', 'sld': 'example', 'tld': 'com'}

But the same rule wouldn’t match if the application was instead hosted at server.example.com because this time there are more . characters in the host part than the rule allows:

>>> ruleset.to_vars('http://www.server.example.com', state=state)
None

Since the host part of the URL at which your application is hosted is unlikely to change one solution to this problem is to use static variables. Here’s how:

>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{subdomain}.[domain]:{}{}'))

The state object will now need some configuration information. URLConvert uses the same configuration format as Paste Deploy. This means the config object should exist as a dictionary with two keys, app and global. The app dictionary contains keys relevant to various parts of an application, the global dictionary contains global settings which application configuration can use instead of the local configuration. This means that URLConvert will expect its configuration to be in state['config']['app']['urlconvert']['static']. Let’s set that up:

>>> state['config'] = {'app': {'urlconvert': {'static': {}}}}

Now we can apply the rules again:

>>> state['config']['app']['urlconvert']['static']['domain'] = 'server.example.com'
>>> ruleset.to_vars('http://www.server.example.com', state=state)
{'subdomain': 'www'}
>>>
>>> state['config']['app']['urlconvert']['static']['domain'] = 'example.com'
>>> ruleset.to_vars('http://www.example.com', state=state)
{'subdomain': 'www'}

Notice that this time, the same rule matched both hosts as long as the configuration was updated.

Static variables can be used at any part of a rule but sometimes you might want to use a variable extracted from the environment dictionary. A good example is the SCRIPT_NAME variable described earlier. For example, imagine that users have their own section of a site. You might want to include the username as part of the URL but it might not be useful to have the username in the variables dictionary because you might access it via some user-specific functionality in your application such as the AuthKit user object. Instead you can specify it as a semi-static variable like this:

>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{}:{}/[environ:REMOTE_USER]/{action}'))

Now when the user james is signed in the following will happen:

>>> state['environ']['REMOTE_USER'] = 'james'
>>> ruleset.to_vars('http://example.com/james/inbox', state=state)
{'action': 'inbox'}
>>> ruleset.to_url({'action': 'inbox'}, state=state)
'http://example.com/james/inbox'

If the semi-static variable isn’t present it will be replaced by the empty string ''.
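
The substitution step for both static and semi-static variables can be sketched with a single regular expression. The bracket syntax comes from the examples above; the function name substitute_static and the exact set of characters allowed in a name are assumptions made for this illustration.

```python
import re

# Matches [name] and [environ:NAME]. The bracket syntax is taken from the
# examples above; the regular expression itself is an assumption.
_STATIC = re.compile(r'\[(environ:)?([A-Za-z0-9_.]+)\]')


def substitute_static(rule, static, environ):
    """Replace static and semi-static variables in a rule string."""
    def replace(match):
        is_environ, name = match.groups()
        if is_environ:
            # A missing semi-static variable becomes the empty string.
            return environ.get(name, '')
        # Static variables come from the configuration dictionary.
        return static[name]
    return _STATIC.sub(replace, rule)
```

After substitution the rule string contains only literal text and {dynamic} variables, so it can be matched like any other rule.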

Tip

In fact the StandardRule makes use of a semi-static variable to implement the script name functionality. Before the URL is split the semi-static variable [environ:SCRIPT_NAME] is substituted into the rule.

There is one more feature of URLConvert we haven’t yet discussed: variable filters. Imagine you have a rule such as this representing a blog post:

>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{}:{}/{year}/{month}/{day}/{title}'))

This rule would currently be applied to both of these URLs:

http://example.com/2008/06/14/some-blog-post
http://example.com/admin/users/delete/james

For the rule to be useful we need to be able to filter the rule depending on whether the variables are valid or not. There are two ways to do this. The simplest is to specify a custom filter to go with the rule. Here’s an example:

filters:

>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{}://{}:{}/{year:digit(2,4)}/{month:integer(2, min=1, max=12)}/{day:digit(2)}/{title}'))

This will now only match the first URL:

>>> ruleset.to_vars('http://example.com/2008/06/14/some-blog-post', state=state)
{'year': '2008', 'month': 6, 'day': '14', 'title': 'some-blog-post'}
>>> ruleset.to_vars('http://example.com/admin/users/delete/james', state=state)
None

Notice that the variables which use the digit filter are left as strings whereas the ones which use the integer filter are converted to integers. Both the integer and digit filters take the number of characters to expect as the first argument. If you specify a second argument, as was done in the case of year, the first is interpreted as the minimum number of digits and the second is interpreted as the maximum. You can also specify the minimum and maximum values that the number (when converted to an integer) may represent. For example the month can only take the values 1 to 12.

You can add your own filters by creating a function and adding it to the state['config']['app']['urlconvert']['filters'] dictionary. Filter functions can take any number of arguments as long as the values only contain the letters a-z, A-Z and the numbers 0-9. The arguments are not Python expressions even though at first sight they might appear to be.
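
To show what filter functions of this kind might look like, here is a sketch of digit and integer written as plain Python functions. The signatures are guesses based on the rule syntax above and are not URLConvert’s real filter interface; the parameter names min and max deliberately mirror the rule syntax even though they shadow the built-ins. The sketch uses re.fullmatch, which needs Python 3.

```python
import re


def digit(value, min_len, max_len=None):
    """Accept a string of digits of the given length; keep it as a string."""
    max_len = min_len if max_len is None else max_len
    if re.fullmatch(r'[0-9]{%d,%d}' % (min_len, max_len), value):
        return value
    return None


def integer(value, min_len, max_len=None, min=None, max=None):
    """Like digit() but convert to an int and optionally check its range."""
    if digit(value, min_len, max_len) is None:
        return None
    number = int(value)
    if min is not None and number < min:
        return None
    if max is not None and number > max:
        return None
    return number
```

Because a valid result can be 0, callers of a filter like integer should test the return value with "is None" rather than relying on truthiness.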

regex are difficult. :integer

Even though the variable filter functionality is fairly powerful, it is rarely particularly useful. The rule would still match an invalid date such as 2008/02/31 even though we’ve been quite careful with the arguments specified in the filter. A better approach is to use URLConvert’s other result filter mechanism which acts on the whole variable dictionary.
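
A check that looks at the whole variables dictionary might look like the following sketch. The function name valid_date and the convention of returning None for a rejected dictionary are assumptions for illustration; how such a function would be attached to a rule is not shown here.

```python
import datetime


def valid_date(vars):
    """Return vars unchanged if year/month/day form a real date, else None."""
    try:
        # datetime.date raises ValueError for impossible dates such as
        # 31 February, and int() raises it for non-numeric strings.
        datetime.date(int(vars['year']), int(vars['month']), int(vars['day']))
    except (KeyError, ValueError):
        return None
    return vars
```

This rejects 2008/02/31, which the per-variable filters alone would accept.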

methods(‘post’,’get’)

where you specify a

Branching of rules (with a prefix?)

To Document

Need to talk about internationalisation, to_python(), from_python()

Frequently Asked Questions

  1. What is the equivalent of static routes?
  A. Static routes can be emulated like this:

    >>> ruleset = RuleSet()
    >>> ruleset.add('http://google.com', defaults={'static': 'google'})
    >>> ruleset.to_url(dict(static='google'))
    'http://google.com'
    
  1. What is the equivalent of named routes?

A. URLConverter doesn’t support named routes because you can achieve the same result by specifying the full variables dictionary. If you really don’t want to do this you can emulate named routes like this:

>>> from urlconvert import NamedRule
>>> ruleset = RuleSet()
>>> ruleset.add(StandardRule('{*}://{*}:{*}{*}/{controller}/{action}?{*}'))
>>> varmap = {'home': {'controller': 'home', 'action': 'index'}}
>>> converter = LookupName(url_converter=ruleset, key='name', varmap=varmap, state=state)
>>> converter.to_url({'name': 'home'})
'http://example.com/home/index'

In this case the name key is simply replaced with the home dictionary from the varmap. This achieves the same thing as a named route in Routes; you just specify the conversions you want in the varmap.

>>> class NamedRuleConverter(URLConverter):
...     def __init__(self, converter):
...         self.converter = converter
...
...     def to_url(self, vars, state):
...         if vars.keys() == ['name']:
...         return self.converter(vars, state)
...    def to_vars(self,
>>> ruleset.to_url(dict(static='google'))
'http://google.com'
  1. How do I use Django-style regular expressions?

A. Regular expression matching is deliberately not included in URLConverter because whilst it is a very handy way of matching a URL, the same regular expression cannot be used for converting back from the match dictionary to a URL, therefore defeating the object of a two-way conversion system like URLConvert. If you must, you can implement regular expression matching using a custom URLConverter class. Use the SimpleConverter as an example but include the regular expression match in the to_vars() method.

  1. How effective is URLConvert in a CGI environment?

A. Very effective. URLConvert doesn’t actually instantiate rules until the first time a conversion takes place and even then it will only instantiate the next rule if the previous rule in a ruleset wasn’t applied. This means that CPU cycles aren’t wasted instantiating all your Rules unless they are needed. This makes URLConvert well suited to environments such as Google App Engine.
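
The general technique of deferring instantiation can be sketched like this. It is an illustration of the idea only: LazyRuleSet is a hypothetical class, rules are modelled as plain callables that return a dictionary or None, and nothing here reflects URLConvert’s actual internals.

```python
class LazyRuleSet(object):
    """Hypothetical rule set that builds each rule only when it is reached."""

    def __init__(self):
        self._factories = []   # callables that create a rule when called
        self._rules = []       # rules instantiated so far, in order

    def add(self, factory):
        self._factories.append(factory)

    def to_vars(self, url):
        for index, factory in enumerate(self._factories):
            if index == len(self._rules):
                # First time this rule has been reached: build it now.
                self._rules.append(factory())
            result = self._rules[index](url)
            if result is not None:
                return result
        return None
```

If the first rule handles a request, the factories for the remaining rules are never called during that request, which matters when every request starts a fresh process as in CGI.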

# XXX This means the on-demand rule compilation isn’t needed because the Rule itself won’t be instantiated.

  1. Does URLConvert work with Pylons?

A. Yes, most definitely. It was designed with Pylons in mind. In fact it will work with any WSGI application or middleware which conforms to the standard for URL variables, namely that a dictionary of them be made available in the environment under the wsgi_org.urlvars key.
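As a rough illustration of what this means for application code, here is a minimal WSGI application that reads the variables dictionary from the environment. The key name is copied from the answer above, so check what your middleware actually sets; the function name and the fallback values are hypothetical.

```python
URLVARS_KEY = 'wsgi_org.urlvars'  # key name as given in the text above


def application(environ, start_response):
    # The middleware is assumed to have stored the matched variables here
    urlvars = environ.get(URLVARS_KEY, {})
    controller = urlvars.get('controller', 'unknown')
    action = urlvars.get('action', 'index')
    body = ('%s/%s' % (controller, action)).encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```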

For Pylons 0.9.7 users, all that is required is that you replace this line in config/middleware.py:

app = RoutesMiddleware(app, global_conf, app_conf)

with these lines:

from urlconvert import URLConvertMiddleware
app = URLConvertMiddleware(app, global_conf, app_conf)

You will probably also want to add URLConvert to your install_requires line in setup.py so that it is automatically installed along with other dependencies of your Pylons application.

Filters

{port|integer(is=81)}

SubDomains

{}[.ox.ac.uk]

>>> ruleset.to_vars('http://example.com/admin/view')
{'action':'view', 'controller': 'admin'}

You can

You’ve seen how the path_info is handled by a StandardRule object. The standard Rule implementation does the following:

  • Splits the URL into scheme, host, port, script_name, path_info and query_string components
  • Applies a different handler to each component of the URL
  • Returns the URL or variables dictionary if the rule applies
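The first of these steps can be sketched with the standard library. This is only an illustration of the splitting, not URLConvert's own code: the function name split_url, the default-port table and the way script_name is passed in are all assumptions.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {'http': 80, 'https': 443}


def split_url(url, script_name=''):
    """Split a URL into the six components listed above (hypothetical helper)."""
    parts = urlsplit(url)
    port = parts.port
    if port is None:
        # No explicit port in the URL, so fall back to the scheme's default
        port = DEFAULT_PORTS.get(parts.scheme)
    path = parts.path
    # Only strip script_name when it is a whole leading path segment
    if script_name and (path == script_name or path.startswith(script_name + '/')):
        path_info = path[len(script_name):]
    else:
        script_name = ''
        path_info = path
    return {
        'scheme': parts.scheme,
        'host': parts.hostname or '',
        'port': port,
        'script_name': script_name,
        'path_info': path_info,
        'query_string': parts.query,
    }


print(split_url('http://example.com:8080/app/admin/view?page=2', '/app'))
```

Each value in the returned dictionary could then be passed to its own handler, as the second step describes.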

Because this example is only really using the path_info component of the URL we can use the vars_to_path_info() and path_info_to_vars() methods instead of the full to_url() and to_vars() methods. These will only generate the path_info component not the full URL. Let’s try it:

>>> rule.path_info_to_vars('admin/view')
{'action': 'view', 'controller': 'admin'}
>>> rule.vars_to_path_info({'action': 'view', 'controller': 'admin'})
'/admin/view'
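To show why a single template can drive both directions, here is a small standalone sketch. It is not how StandardRule is implemented; the function names and the assumption that each variable matches one path segment are for illustration only.

```python
import re


def compile_template(template):
    """Turn '/{controller}/{action}' into a regex with one named group per variable."""
    pattern = ''
    pos = 0
    for found in re.finditer(r'\{(\w+)\}', template):
        # Literal text between variables must match exactly
        pattern += re.escape(template[pos:found.start()])
        # Each variable is assumed to match a single path segment
        pattern += '(?P<%s>[^/]+)' % found.group(1)
        pos = found.end()
    pattern += re.escape(template[pos:])
    return re.compile('^' + pattern + '$')


def match_path_info(template, path_info):
    match = compile_template(template).match(path_info)
    return match.groupdict() if match else None


def build_path_info(template, vars):
    try:
        return template.format(**vars)
    except KeyError:
        return None  # the template needs a variable that was not supplied


template = '/{controller}/{action}'
print(match_path_info(template, '/admin/view'))
print(build_path_info(template, {'controller': 'admin', 'action': 'view'}))
```

Because the regular expression is derived from the template rather than written by hand, the two directions cannot drift apart.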

In order to test a full URL we need to provide some extra information. URLConverter replaces unspecified parts of the URL with their equivalents from the current URL, so it needs to know what the current URL is, and it obtains this information from the environment. In a WSGI application this is available as the first positional parameter to the WSGI callable, usually named environ. In a CGI application it is available as os.environ, but you’ll need to import the os module. In this example we’ll create a fake environment so you can see how it works:

>>> environ = {
...     'SCRIPT_NAME':'',
...     'PATH_INFO':'',
...     '':'',
...     '':'',
...     '':'',
...     '':'',
...     '':'',
... }
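For comparison, here is a sketch of how a current URL can be rebuilt from the standard WSGI and CGI keys, roughly following the URL reconstruction approach described in the WSGI specification. The helper name current_url and the rule that port 443 implies https are assumptions for illustration, and SERVER_NAME is used rather than any forwarded headers to keep the example short.

```python
from urllib.parse import quote

DEFAULT_PORTS = {'http': '80', 'https': '443'}


def current_url(environ):
    """Rebuild the current URL from a WSGI- or CGI-style environ (hypothetical helper)."""
    port = environ.get('SERVER_PORT', '80')
    scheme = environ.get('wsgi.url_scheme')
    if scheme is None:
        # CGI has no wsgi.url_scheme, so guess the scheme from the port
        scheme = 'https' if port == '443' else 'http'
    url = scheme + '://' + environ['SERVER_NAME']
    if port != DEFAULT_PORTS.get(scheme):
        # Only show the port when it is not the default for the scheme
        url += ':' + port
    url += quote(environ.get('SCRIPT_NAME', ''))
    url += quote(environ.get('PATH_INFO', ''))
    if environ.get('QUERY_STRING'):
        url += '?' + environ['QUERY_STRING']
    return url


environ = {
    'wsgi.url_scheme': 'http',
    'SERVER_NAME': 'example.com',
    'SERVER_PORT': '80',
    'SCRIPT_NAME': '',
    'PATH_INFO': '/admin/view',
    'QUERY_STRING': '',
}
print(current_url(environ))
```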

For example, it is useful to treat each segment of the PATH_INFO component as though it directly represented a variable. It is also useful to treat the sections of the host as routing variables, but whereas the PATH_INFO segments are separated by a /, the host components are separated by a dot (.). This is a simplistic example but it demonstrates the

Let’s look at how this simple converter solves the three problems stated above.

  • when a request comes in you could use a dispatch system as simple as this:

    class Handle:
    
        def home(self):
            return "Home Page"
    
        def admin(self):
            return "Admin Page"
    
    print("Content-type: text/plain\n")
    print(getattr(Handle(), convert.to_vars(url)['controller'])())
    

    The URLs http://example.com and http://example.com/admin would now resolve to two pages

  • you could modify the no

  • use a convention such that key variables parsed from the URL represent the module, class and method which should handle that request

  • by specifying the variables

The simplest converter looks like this and will convert any URL of the form http://example.com/controller:

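from urlconvert.parse import urlparse

class SimpleConverter(object):
    def to_url(self, vars, state):
        # Build the URL by appending the controller name to the current URL
        if 'controller' in vars:
            return state['current_url'] + vars['controller']
        return None

    def to_vars(self, url, state=None):
        # Treat the path, minus its surrounding slashes, as the controller name
        parts = urlparse(url)
        controller = parts.path.strip('/')
        if controller:
            return {'controller': controller}
        return None

convert = SimpleConverter()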