URLConvert v0.2.0 documentation

API Documentation¶

exception urlconvert.RuleParseError¶

class urlconvert.RuleSet(rules, cache_generation=True, cache_matching=True)¶

generate(routing_vars, default_url_parts=None)¶

generate_url(routing_vars, default_url_parts)¶

match(url_parts)¶

match_url(url, script=None)¶

exception urlconvert.URLConvertError¶

urlconvert.addExtras(add)¶: Add the extra variables to the dictionary (as long as they don’t conflict)

urlconvert.build_url(scheme, host, port=None, script=None, path=None, query=None, try_to_hide_port=True)¶

urlconvert.convertAndCache(converters, cache=True, type=None)¶

urlconvert.decodePart(part, unquote=<function unquote>, encoding=None)¶

urlconvert.decodePath(encoding=None)¶

Decodes all % escapes from a URL path segment to Unicode.

encoding specifies the encoding from which the value should be decoded. If set to None, no decoding is performed.
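
As a rough illustration of the behaviour described above (not the library's own code), the following Python 3 sketch resolves % escapes using the standard library. The helper name decode_segment is hypothetical, and returning raw bytes when encoding is None is only one possible reading of "no decoding is performed".

```python
from urllib.parse import unquote_to_bytes


def decode_segment(segment, encoding=None):
    """Resolve % escapes in one path segment (illustrative helper)."""
    raw = unquote_to_bytes(segment)  # "%C3%A9" becomes b"\xc3\xa9"
    if encoding is None:
        # Assumption: with no encoding, the bytes are returned undecoded.
        return raw
    return raw.decode(encoding)
```

Calling decode_segment("caf%C3%A9", "utf-8") gives the text "café", while omitting the encoding leaves the caller with bytes.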

urlconvert.decodeQuery(encoding=None)¶

Decodes all % escapes from a URL query part to Unicode.

encoding specifies the encoding from which the value should be decoded. If set to None, no decoding is performed.

urlconvert.decodeScript(encoding=None)¶

Decodes all % escapes from a URL script part to Unicode.

encoding specifies the encoding from which the value should be decoded. If set to None, no decoding is performed.

urlconvert.encodePart(part, encoding=None, quote=<function quote>)¶

urlconvert.encodePath(encoding=None, quote=<function quote>)¶

RFC 3986, section 2.5 (http://tools.ietf.org/html/rfc3986), states that percent-encoded byte values should be decoded as UTF-8.

urlconvert.encodeQuery(encoding=None)¶

urlconvert.encodeScript(encoding=None)¶

urlconvert.extract_host(service, converter=None)¶

Calculate the host from a services dictionary.

Attempts to get the host by trying each of these in order:

The .config.server.display_host attribute of the service object if such an attribute exists
The first part of the X_FORWARDED_FOR key in the environ (can’t be trusted)
The HTTP_HOST key in the environ (can’t be trusted)
The SERVER_NAME key in the environ (can be trusted, but might not be the host the user should use)

The converter argument is an optional converter which is used to decode and validate the host once it has been converted. If None, the decode_host converter is used.
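
The lookup order above can be sketched as a simple fallback chain. This is a minimal stand-in, not the library's implementation: pick_host is a hypothetical name, the display_host argument stands in for the .config.server.display_host attribute, and treating "first part" as the first comma-separated entry is an assumption. The environ keys are used exactly as listed above.

```python
def pick_host(environ, display_host=None):
    """Return the first host found, following the documented order."""
    if display_host:
        # Explicit configuration wins over anything in the environ.
        return display_host
    forwarded = environ.get("X_FORWARDED_FOR")
    if forwarded:
        # Assumption: "first part" means the first comma-separated entry.
        return forwarded.split(",")[0].strip()
    if environ.get("HTTP_HOST"):
        return environ["HTTP_HOST"]
    # Trusted, but may not be the host the user should use.
    return environ["SERVER_NAME"]
```

Only the configured value and SERVER_NAME come from the server itself, which is why the two header-derived values are marked as untrusted in the list.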

urlconvert.extract_path(service, converter=None)¶

Calculate the path from a services dictionary. The returned path does not include the first /.

Attempts to get the path by trying each of these in order:

The PATH_INFO variable in environ which is expected to start with a / character.

The converter argument is an optional converter which is used to decode and validate the path once it has been converted. If None, the decode_path converter is used.
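
To make the "does not include the first /" rule concrete, here is a small sketch under stated assumptions: pick_path is a hypothetical helper, it works on a plain WSGI-style environ dictionary, and it treats a missing PATH_INFO as an empty path.

```python
def pick_path(environ):
    """Return PATH_INFO without its leading slash (illustrative helper)."""
    # Assumption: a missing PATH_INFO is treated as an empty path.
    path_info = environ.get("PATH_INFO", "")
    if path_info.startswith("/"):
        return path_info[1:]
    return path_info
```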

urlconvert.extract_port(service, converter=None)¶

Calculate the port from a services dictionary. Returns the port as a unicode string.

Attempts to get the port by trying each of these in order:

The .config.server.display_port attribute of the service object if such an attribute exists
The X_PORT_FORWARDED_FOR key in the environ (totally unofficial)
The SERVER_PORT key in the environ (can be trusted, but might not be the port the user should use)

The converter argument is an optional converter which is used to decode and validate the port once it has been converted. If None, the decode_port converter is used.

urlconvert.extract_query(service, converter=None)¶

Calculate the query string from a services dictionary.

Attempts to get the query by trying each of these in order:

The QUERY_STRING variable in environ

The converter argument is an optional converter which is used to decode and validate the query once it has been converted. If None, the decode_query converter is used.

urlconvert.extract_scheme(service, converter=None)¶

Calculate the scheme from a services dictionary.

Attempts to get the scheme by trying each of these in order:

The .config.server.display_scheme attribute of the service object if such an attribute exists
The wsgi.url_scheme key in the environ
Based on the port (with a call to extract_port())

The converter argument is an optional converter which is used to decode and validate the scheme once it has been converted. If None, the decode_scheme converter is used.
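
The scheme lookup can be sketched in the same way. The helper pick_scheme is hypothetical, display_scheme stands in for the .config.server.display_scheme attribute, and the port rule (443 means https, anything else http) is an assumption, since the text above only says the scheme is based on the port.

```python
def pick_scheme(environ, display_scheme=None):
    """Return the first scheme found, following the documented order."""
    if display_scheme:
        return display_scheme
    if environ.get("wsgi.url_scheme"):
        return environ["wsgi.url_scheme"]
    # Assumption: port 443 implies https, any other port implies http.
    if str(environ.get("SERVER_PORT")) == "443":
        return "https"
    return "http"
```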

urlconvert.extract_script(service, converter=None)¶

Calculate the script name from a services dictionary.

Attempts to get the script by trying each of these in order:

The SCRIPT_NAME variable in environ

The converter argument is an optional converter which is used to decode and validate the script once it has been converted. If None, the decode_query converter is used.

urlconvert.extract_url(service, with_port=False, with_script=False, with_query=False, scheme_converter=None, host_converter=None, port_converter=None, path_converter=None, script_converter=None, query_converter=None)¶

urlconvert.extract_url_parts(service, with_port=True, with_script=True, with_query=True, scheme_converter=None, host_converter=None, port_converter=None, path_converter=None, script_converter=None, query_converter=None)¶

urlconvert.generateDynamicHost(host_part)¶

urlconvert.generateDynamicPath(part, path_part)¶

urlconvert.generateStaticOrDynamic(part, expected_value=None, dynamic=None)¶

urlconvert.joinVars(add)¶

urlconvert.makeUnicode()¶

urlconvert.matchDynamic(part, dynamic)¶: Match the part to the routing variable given by dynamic.

urlconvert.matchDynamicDomain(host_part)¶: Match the host part of a URL against the variables defined in the string host_part.

urlconvert.matchDynamicPath(part, path_part)¶: Match the path or script part of a URL against the variables defined in the string path_part.

urlconvert.matchHost()¶

The RFC describes the host as:

This is a sequence of domain labels separated by
".", each domain label starting and ending with an alphanumeric
character and possibly also containing "-" characters.

We follow this here.
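
The quoted rule translates fairly directly into a regular expression. The sketch below is one possible encoding of it, not the pattern the library uses; the names LABEL, HOST_RE and is_valid_host are hypothetical.

```python
import re

# One label: starts and ends with an alphanumeric, may contain "-" inside.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
# A host: one or more labels separated by ".".
HOST_RE = re.compile(r"%s(?:\.%s)*" % (LABEL, LABEL))


def is_valid_host(host):
    """Return True if every label of host follows the quoted rule."""
    return HOST_RE.fullmatch(host) is not None
```

With this pattern my-site.example.org is accepted, while -bad.com and a-.com are rejected because a label starts or ends with a hyphen.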

urlconvert.matchPath()¶

This converter takes a permissive approach and doesn’t encode any characters which according to the RFC don’t need to be encoded:

alphanumeric: 0123456789
              ABCDEFGHIJKLMNOPQRSTUVWXYZ
              abcdefghijklmnopqrstuvwxyz

mark:         -_.!~*'()

unreserved:   alphanumeric
              mark

pchar:        unreserved
              escaped
              :@&=+$,

The only tricky part is how to handle semicolons. In theory a path segment can be split into parameters separated by a ‘;’, but unlike the situation with a /, URLConvert doesn’t split on that character. This converter assumes that the ; should not be escaped.

This means the path_info segments may include the following characters with no escaping:

0123456789
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
-_.!~*'()
:@&=+$,
;

Any other character in a variable value to be used as a segment is percent-encoded, that is, written as % followed by two hexadecimal digits. For example / becomes %2F.
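
The unescaped character set above can be reproduced with the standard library's quote function by passing those characters as safe. This is a sketch of the encoding rule only, not the converter itself; SAFE_IN_SEGMENT and encode_segment are hypothetical names, and UTF-8 is assumed as the byte encoding.

```python
from urllib.parse import quote

# Punctuation listed above as allowed without escaping; letters and
# digits are never escaped by quote().
SAFE_IN_SEGMENT = "-_.!~*'():@&=+$,;"


def encode_segment(value):
    """Percent-encode one path segment, leaving the listed characters alone."""
    return quote(value, safe=SAFE_IN_SEGMENT, encoding="utf-8")
```

For instance, encode_segment("a/b c;d") returns "a%2Fb%20c;d": the slash and the space are escaped, the semicolon is not.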

urlconvert.matchPort()¶: Matches that the port represents an integer in the range 1-65535.
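
A range check of this kind might look like the following sketch. The helper is_valid_port is hypothetical and takes the port as a string, on the assumption that the port arrives as text, as it does from extract_port().

```python
import re


def is_valid_port(value):
    """Return True if value is a decimal integer in the range 1-65535."""
    if not re.fullmatch(r"[0-9]+", value):
        return False
    return 1 <= int(value) <= 65535
```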

urlconvert.matchQuery()¶: I couldn’t find the documentation for query strings, so I’m assuming it is the same as the path component for the time being.

urlconvert.matchScheme(make_lowercase=True, allowed_values=None)¶

A converter which takes a fully decoded scheme as a Unicode string and ensures it is valid.

The RFC states that scheme names:

consist of a sequence of characters beginning with a
lower case letter and followed by any combination of lower case
letters, digits, plus ("+"), period ("."), or hyphen ("-").  For
resiliency, programs interpreting URI should treat upper case letters
as equivalent to lower case in scheme names (e.g., allow "HTTP" as
well as "http").

   scheme        = alpha *( alpha | digit | "+" | "-" | "." )

In practice for web applications the scheme will only be http or https so you can specify allowed_values if you prefer to be more restrictive.

Since schemes are case insensitive it is often easier to just work with lowercase schemes. Unless you set make_lowercase to False, schemes returned from this converter will be made lowercase.
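
The grammar and the two options can be sketched as a plain function. This is a stand-in for illustration rather than the converter: check_scheme is a hypothetical name, it borrows the make_lowercase and allowed_values parameter names from the signature above, and it raises ValueError where the real converter would presumably report a conversion error.

```python
import re

# scheme = alpha *( alpha | digit | "+" | "-" | "." ), upper case tolerated.
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def check_scheme(scheme, make_lowercase=True, allowed_values=None):
    """Validate a scheme and optionally lowercase and restrict it."""
    if not SCHEME_RE.fullmatch(scheme):
        raise ValueError("invalid scheme: %r" % scheme)
    if make_lowercase:
        scheme = scheme.lower()
    if allowed_values is not None and scheme not in allowed_values:
        raise ValueError("scheme not allowed: %r" % scheme)
    return scheme
```

A web application that only serves http and https could call check_scheme(value, allowed_values=("http", "https")) to reject everything else.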

urlconvert.matchScript()¶: Same as matchPath()

urlconvert.matchStatic(part, expected_value)¶

urlconvert.mergeParts()¶: Takes a dictionary of dictionaries and merges them into a single dictionary, immediately returning an error conversion if the same variable is used in different dictionaries and has a different value in each.
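
The merging rule can be illustrated with a short sketch. The function merge_parts is hypothetical, and it raises ValueError on a conflict, whereas the converter described above returns an error conversion.

```python
def merge_parts(parts):
    """Merge a dictionary of dictionaries, rejecting conflicting values."""
    merged = {}
    for variables in parts.values():
        for key, value in variables.items():
            if key in merged and merged[key] != value:
                # Same variable, different value in another dictionary.
                raise ValueError("conflicting values for %r" % key)
            merged[key] = value
    return merged
```

A variable that appears in several dictionaries with the same value is accepted; only differing values count as a conflict.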

urlconvert.missingKey(parts, input)¶

urlconvert.parse_re(rule, part)¶

Parse the Unicode string rule to a regular expression. part can be host or path and determines which characters need escaping.

Some examples:

hello.{name}.com becomes ^\.hello\.(?P<name>[a-zA-Z-%_\.0-9]+)\.com$
/hello/{name} becomes ^\/hello\/(?P<name>[a-zA-Z-%_\.0-9]+)$
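
The general idea of turning a rule with {name} placeholders into a regular expression with named groups can be sketched as follows. This is a simplified approximation, not parse_re itself: rule_to_regex is a hypothetical helper, it ignores the part argument, it escapes literal text with re.escape (so the generated pattern text will not be character-for-character identical to the examples above), and its value class is meant to accept the same characters as the one shown, written with the hyphen last.

```python
import re

VAR_RE = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")
# Letters, digits, "%", "_", "." and "-".
VALUE_CLASS = r"[A-Za-z0-9%_.-]+"


def rule_to_regex(rule):
    """Compile a rule such as "/hello/{name}" into an anchored regex."""
    pieces = []
    position = 0
    for found in VAR_RE.finditer(rule):
        # Literal text before the placeholder is matched verbatim.
        pieces.append(re.escape(rule[position:found.start()]))
        pieces.append("(?P<%s>%s)" % (found.group(1), VALUE_CLASS))
        position = found.end()
    pieces.append(re.escape(rule[position:]))
    return re.compile("^" + "".join(pieces) + "$")
```

Matching "/hello/world" against rule_to_regex("/hello/{name}") yields a match whose name group is "world".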

urlconvert.parse_vars(rule)¶

urlconvert.plainDecode(encoding)¶

urlconvert.plainEncode(encoding)¶

urlconvert.removeExtras(add)¶: Remove the extra variables from the dictionary (as long as they don’t conflict)

urlconvert.rule(rule, add=None)¶

urlconvert.ruleToParts()¶

urlconvert.urlToParts()¶: Takes a URL or rule and splits it up into a scheme, host, port and path.
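
For plain URLs, a comparable split is available from the Python 3 standard library. The sketch below only illustrates the four parts named above; split_url is a hypothetical helper, it does not handle rules containing {name} placeholders, and the dictionary keys are an assumption rather than the converter's actual output format.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into scheme, host, port and path (illustrative helper)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL has no explicit port
        "path": parts.path,
    }
```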