
File Uploads in Python

25 Sep, 2007

File uploads are one of those things in Python that are still rather tricky to handle. First of all, you need a form like this one, with enctype="multipart/form-data" and a file input field:

<form action="up" method="post" enctype="multipart/form-data">
Upload file: <input type="file" name="myfile" /> <br />
             <input type="submit" name="submit" value="Submit" />
</form>

In your Python code you can then get the uploaded file data like this:

import cgi
form_data = cgi.FieldStorage()
file_data = form_data['myfile'].value

and you can write it somewhere like this:

fp = open('some/file', 'wb')
fp.write(file_data)
fp.close()
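
Reading .value pulls the whole upload into memory as one string. A FieldStorage file field also exposes the underlying file object as its .file attribute, so the copy can be done in chunks instead. The following is only a minimal sketch of that idea: save_upload is a hypothetical helper name, and an in-memory buffer stands in for form_data['myfile'].file so that the snippet runs on its own. It still only runs once the request has been parsed.

```python
import io
import os
import shutil
import tempfile


def save_upload(source, dest_path, chunk_size=64 * 1024):
    """Copy the file-like object `source` to dest_path, chunk by chunk."""
    with open(dest_path, 'wb') as dest:
        shutil.copyfileobj(source, dest, chunk_size)


# Stand-in for form_data['myfile'].file (an assumption for this demo)
fake_upload = io.BytesIO(b'example upload data')
target = os.path.join(tempfile.mkdtemp(), 'upload.bin')
save_upload(fake_upload, target)
```

The with block also closes the destination file if the copy raises an exception, which the explicit fp.close() version does not.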

This is all well and good, but what if you want to stream that data to a service such as Amazon S3, or provide feedback to the user about how much of the file has been uploaded? The example above can't help with this because you can only access the data once the whole file has been uploaded.

Here's how you can solve this problem. It is a bit of a nasty solution because you need to create your own file class which calls your callback on every write. You also need your own FieldStorage class which has its make_file() method overridden to return the open file object you supply instead of the tempfile it would use by default. It only works for forms with one file field, since every field that needs a file is handed the same object, but it demonstrates the principles on which you can build your own solution.
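
To see the file-with-a-callback idea on its own, outside any framework, here is a rough sketch. It is not the code used in this post: ProgressWriter is a hypothetical name, it wraps an existing file object rather than subclassing the Python 2 file builtin, and it calls the callback after each write with the running byte count.

```python
import io


class ProgressWriter(object):
    """Wrap a writable binary file object and report each write."""

    def __init__(self, wrapped, callback):
        self._wrapped = wrapped
        self._callback = callback
        self.bytes_written = 0

    def write(self, data):
        result = self._wrapped.write(data)
        self.bytes_written += len(data)
        # Report the running total, including the chunk just written
        self._callback(self.bytes_written)
        return result

    def __getattr__(self, name):
        # Delegate everything else (tell, seek, close, ...) to the wrapped file
        return getattr(self._wrapped, name)


seen = []
sink = ProgressWriter(io.BytesIO(), seen.append)
for chunk in (b'abc', b'de', b'fghi'):
    sink.write(chunk)
print(seen)  # [3, 5, 9]
```

Wrapping instead of subclassing means the same class could sit in front of anything with a write() method, not only a file on disk.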

Here is a Pylons controller using this system:

import cgi
import logging

from upload.lib.base import *

log = logging.getLogger(__name__)

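class ProgressFile(file):
    # A regular file that reports every write() to an optional callback
    def write(self, *k, **p):
        if hasattr(self, 'callback'):
            # Called before the data is written, so file.tell() inside the
            # callback gives the position before this chunk
            self.callback(self, *k, **p)
        return file.write(self, *k, **p)

    def set_callback(self, callback):
        self.callback = callback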

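def stream(file_object):
    # Build a FieldStorage subclass that writes uploaded data to
    # file_object instead of the temporary file cgi would create

    class CustomFieldStorage(cgi.FieldStorage):
        def make_file(self, binary=None):
            self.open_file = file_object
            return self.open_file

    return CustomFieldStorage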

class UpController(BaseController):

    def index(self):
        return """
            <html>
            <body>
            <h1>Upload</h1>
            <form action="up" method="post" enctype="multipart/form-data">
            Upload file: <input type="file" name="myfile" /> <br />
                         <input type="submit" name="submit" value="Submit" />
            </form>
            </body>
            </html>
        """

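    def upload(self):

        def callback(file, *k, **p):
            # file.tell() is the number of bytes written so far
            log.debug("Logged %s", [file.tell()])

        fp = ProgressFile('somefile', 'wb')
        fp.set_callback(callback)
        try:
            # FieldStorage parses the request in its constructor, so the
            # upload is streamed to fp while this call runs
            custom_field_storage = stream(fp)(
                environ=request.environ,
                strict_parsing=True,
                fp=request.environ['wsgi.input']
            )
        finally:
            fp.close()
        return 'done'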

As the file is uploaded, it is now streamed to the open somefile object as required, and the callback() function is called on every write so that you can find out how much data has been written with file.tell(). If you try this, you will see that the file uploads fine and you receive the done message. The output logs then look something like this as each write() call is logged:

13:36:22,447 DEBUG [upload.controllers.up] Logged [30011568L]
13:36:22,448 DEBUG [upload.controllers.up] Logged [30011619L]
13:36:22,448 DEBUG [upload.controllers.up] Logged [30011661L]
13:36:22,448 DEBUG [upload.controllers.up] Logged [30011662L]
13:36:22,448 DEBUG [upload.controllers.up] Logged [30011702L]
13:36:22,473 DEBUG [upload.controllers.up] Logged [30011762L]
13:36:22,474 DEBUG [upload.controllers.up] Logged [30011812L]
13:36:22,474 DEBUG [upload.controllers.up] Logged [30011876L]
13:36:22,474 DEBUG [upload.controllers.up] Logged [30011942L]
13:36:22,474 DEBUG [upload.controllers.up] Logged [30012006L]
13:36:22,475 DEBUG [upload.controllers.up] Logged [30012048L]
13:36:22,475 DEBUG [upload.controllers.up] Logged [30012049L]
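
Logging every write is noisy, and a raw byte count is not much use as user feedback. One possible refinement, sketched here under assumptions, turns the byte count into a percentage and reports only when the whole-number percentage changes. make_percent_callback is a hypothetical helper. The total is assumed to come from int(request.environ['CONTENT_LENGTH']), which covers the entire multipart body including boundaries and other fields, so the figure is only approximate.

```python
def make_percent_callback(total_bytes, report):
    """Return a callback(bytes_written) that reports whole-percent progress."""
    state = {'last': -1}

    def callback(bytes_written):
        if total_bytes <= 0:
            return  # size unknown, nothing sensible to report
        percent = min(100, bytes_written * 100 // total_bytes)
        if percent != state['last']:
            # Only report when the percentage actually changes
            state['last'] = percent
            report(percent)

    return callback


reports = []
on_progress = make_percent_callback(1000, reports.append)
for written in (10, 15, 250, 255, 1000):
    on_progress(written)
print(reports)  # [1, 25, 100]
```

With the controller above, the callback passed to set_callback() would call on_progress(file.tell()), and report could store the value somewhere a separate status request can read it.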

Copyright James Gardner 1996-2020 All Rights Reserved.