Python Setuptools Egg Plugins
+++++++++++++++++++++++++++++

:Posted: 2010-01-15 13:21
:Tags: Python, Pylons
:Headline: Learn how to use the egg plugin feature of setuptools.

One very under-used feature of ``setuptools`` is its ability to allow you to build eggs which serve as *plugins* for other *host* eggs.

The way plugins work is actually very simple. Imagine we are building a database abstraction layer where the host package is called Database and provides the high level interface, and the plugins are Psycopg2Database and Pysqlite2Database which provide the lower level implementation. By installing new plugins you can add support for new databases to the Database package.

To make this more concrete let's think about an ``insert_record()`` function which allows users to insert a record and which returns the ID of the newly-created row. It turns out that different SQL engines expose slightly different ways of obtaining the ID of the last inserted row, so the implementation for the different databases will happen in the plugins.

For this to work the Database package needs to be able to find all the plugins, know which database they are for, and load their implementation of the ``insert_record()`` function.

With ``setuptools``, each Python object in a plugin is accessed via an *entry point*. An entry point is just a name that "points" to an object. We have two types of object: a string representing the engine name, and the function for inserting a record. Let's choose the entry point names ``engine_name`` and ``insert_record`` respectively for these objects.

Entry points have to exist in *entry point groups* so you'll need to create one of those too. You can have multiple entry points in a single entry point group and you can have multiple entry point groups in a single egg. Let's call our entry point group ``database.engine``.

The host egg doesn't need any changes to support entry points but the plugins need the entry points specified.

Let's imagine the Psycopg2Database has a ``psycopg2database.helper.insert_record()`` function and a string at ``psycopg2database.engine_name``, and that the Pysqlite2Database has a ``pysqlite2database.helper.insert_record()`` function and a ``pysqlite2database.engine_name`` string for the engine name.

.. tip ::

   Confusingly, ``pysqlite2`` is the name of the Python module for accessing SQLite3 databases and ``psycopg2`` is the name of the Python module for accessing PostgreSQL 8 databases, hence the naming above.

Here are the implementations of the objects the entry points point to:

``psycopg2database.helper.insert_record()`` ::

    def insert_record(
        connection,
        table_name,
        data_dict,
        primary_key_column_name=None,
        engine=None,
    ):
        print('psycopg2 plugin not implemented yet')

``psycopg2database.engine_name`` ::

    engine_name = "psycopg2"

``pysqlite2database.helper.insert_record()`` ::

    def insert_record(
        connection,
        table_name,
        data_dict,
        primary_key_column_name=None,
        engine=None,
    ):
        print('pysqlite2 plugin not implemented yet')

``pysqlite2database.engine_name`` ::

    engine_name = "pysqlite2"

To make ``setuptools`` aware that the entry points point to these objects, change the Psycopg2Database ``setup.py`` to add the ``entry_points`` argument to ``setup()`` like this:

::

    setup(
        ...
        entry_points="""
            [database.engine]
            insert_record=psycopg2database.helper:insert_record
            engine_name=psycopg2database:engine_name
        """,
        ...
    )

and change the Pysqlite2Database ``setup.py`` to add the ``entry_points`` argument to ``setup()`` like this:

::

    setup(
        ...
        entry_points="""
            [database.engine]
            insert_record=pysqlite2database.helper:insert_record
            engine_name=pysqlite2database:engine_name
        """,
        ...
    )

In each case the entry point must be under the entry point group name (``[database.engine]``) and it must start with the entry point name followed by an ``=`` sign. The part after the ``=`` is the module path followed by a ``:`` followed by the name of the Python object being pointed to.

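
To make the ``name=module.path:object`` syntax concrete, here is a minimal sketch of how such a line could be split and resolved by hand. It is only an illustration and not how ``setuptools`` does it internally. ``resolve_target()`` is a hypothetical helper, and the standard library's ``json`` module stands in for a plugin module so that the example runs without any eggs installed:

```python
import importlib


def resolve_target(line):
    """Split 'name=module.path:object' and import the object it points to.

    Hypothetical helper for illustration only; setuptools has its own parser.
    """
    name, target = (part.strip() for part in line.split("=", 1))
    module_path, object_name = target.split(":", 1)
    module = importlib.import_module(module_path)
    return name, getattr(module, object_name)


# ``json:dumps`` is a stand-in for a real target such as
# ``psycopg2database.helper:insert_record``.
name, obj = resolve_target("insert_record=json:dumps")
print(name, obj([1, 2]))
```

The name on the left is what the host looks up, and the object on the right is what it eventually gets back when the entry point is loaded.
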
At this point I usually re-install the plugin eggs to ensure ``setuptools`` finds the updated entry points. ::

    python setup.py develop

Now we need to be able to use the plugins. The code snippet below shows you how to get all the engine names and ``insert_record()`` functions from every installed plugin. Notice that iterating over the entry points is not enough on its own: you need to call ``.load()`` on each one to get the actual Python object the entry point points to:

::

    from pkg_resources import iter_entry_points

    dist_plugins = {}
    for ep in iter_entry_points(
        group='database.engine',
        # Use None to get all entry point names
        name=None,
    ):
        if ep.dist not in dist_plugins:
            dist_plugins[ep.dist] = {}
        dist_plugins[ep.dist][ep.name] = ep.load()
    print(dist_plugins)

If you run it you'll get this output:

::

    {Psycopg2Database 0.1.0 (/home/james/Desktop/Cur/Psycopg2Database/trunk): {'insert_record': <function insert_record at 0x...>, 'engine_name': 'psycopg2'}, Pysqlite2Database 0.1.0 (/home/james/Desktop/Cur/Pysqlite2Database/trunk): {'insert_record': <function insert_record at 0x...>, 'engine_name': 'pysqlite2'}}

It would be useful to present this as a single dictionary with the engine name as the key and the function as its value. This code does this:

::

    plugins = {}
    for k, v in dist_plugins.items():
        plugins[v['engine_name']] = v['insert_record']

Now let's turn this into useful functionality.

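
As an aside, the grouping and flattening steps above can be tried without building any eggs. The sketch below replaces ``iter_entry_points()`` with a list of stand-in objects. ``FakeEntryPoint`` and ``fake_insert()`` are hypothetical and exist only for this illustration; they mimic the ``name``, ``dist`` and ``load()`` members used above, and plain strings stand in for the distribution objects:

```python
class FakeEntryPoint(object):
    """Hypothetical stand-in with the three members the loop relies on."""

    def __init__(self, dist, name, obj):
        self.dist = dist
        self.name = name
        self._obj = obj

    def load(self):
        return self._obj


def fake_insert(*args, **kwargs):
    # Placeholder for a plugin's real insert_record() implementation.
    return "inserted"


entry_points = [
    FakeEntryPoint("Psycopg2Database 0.1.0", "engine_name", "psycopg2"),
    FakeEntryPoint("Psycopg2Database 0.1.0", "insert_record", fake_insert),
    FakeEntryPoint("Pysqlite2Database 0.1.0", "engine_name", "pysqlite2"),
    FakeEntryPoint("Pysqlite2Database 0.1.0", "insert_record", fake_insert),
]

# Step 1: group the loaded objects by the distribution they came from.
dist_plugins = {}
for ep in entry_points:
    dist_plugins.setdefault(ep.dist, {})[ep.name] = ep.load()

# Step 2: flatten to {engine name: insert function}.
plugins = {}
for objects in dist_plugins.values():
    plugins[objects["engine_name"]] = objects["insert_record"]

print(sorted(plugins))
```

Grouping by distribution first matters because the ``engine_name`` and ``insert_record`` entry points of one plugin arrive as two separate items, and the distribution is what ties them together.
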
The ``insert_record()`` in the Database package looks like this:

::

    insert_record(connection, table_name, data_dict, primary_key_column_name=None, engine=None)

Let's update it so that it loads the correct plugin based on the name:

::

    from pkg_resources import iter_entry_points

    plugins_loaded = False
    plugins = {}

    def load_plugins():
        # Without this declaration the assignment at the end of the
        # function would only create a local variable
        global plugins_loaded
        dist_plugins = {}
        for ep in iter_entry_points(
            group='database.engine',
            # Use None to get all entry point names
            name=None,
        ):
            if ep.dist not in dist_plugins:
                dist_plugins[ep.dist] = {}
            dist_plugins[ep.dist][ep.name] = ep.load()
        for k, v in dist_plugins.items():
            plugins[v['engine_name']] = v['insert_record']
        plugins_loaded = True

    def insert_record(
        connection,
        table_name,
        data_dict,
        primary_key_column_name=None,
        engine=None,
    ):
        if not plugins_loaded:
            load_plugins()
        if engine not in plugins:
            raise Exception('No driver for the %r engine' % engine)
        # Use the plugin's insert method
        return plugins[engine](
            connection,
            table_name,
            data_dict,
            primary_key_column_name,
            engine,
        )

If you saved the above as ``database_helper.py`` you could test it as follows:

::

    >>> from database_helper import insert_record
    >>> insert_record(1,2,3,4, 'mysqldb')
    Exception: No driver for the 'mysqldb' engine
    >>> insert_record(1,2,3,4, 'psycopg2')
    psycopg2 plugin not implemented yet
    >>>

As you can see, an exception is raised when ``mysqldb`` is specified because the plugin doesn't exist, but when ``psycopg2`` is specified the correct function in the plugin gets called.

This code can still be improved though. It's reasonable to assume that the ``helper`` module for each plugin would need to import the Python database module for the database it is abstracting. Since the current code loads every entry point whether it is needed or not, all the Python database modules would need to be present for every plugin that existed.
This isn't a huge problem because presumably you wouldn't install plugins for databases where you hadn't also installed the underlying driver, but we can still do better.

Let's create a new dictionary called ``loaded`` and update the code so that only the ``engine_name`` entry point is loaded:

::

    from pkg_resources import iter_entry_points

    plugins_loaded = False
    plugins = {}
    loaded = {}

    def load_plugins():
        global plugins_loaded
        dist_plugins = {}
        for ep in iter_entry_points(
            group='database.engine',
            # Use None to get all entry point names
            name=None,
        ):
            if ep.dist not in dist_plugins:
                dist_plugins[ep.dist] = {}
            # Store the entry point itself, not the object it points to
            dist_plugins[ep.dist][ep.name] = ep
        for k, v in dist_plugins.items():
            plugins[v['engine_name'].load()] = v['insert_record']
        plugins_loaded = True

Now in the ``insert_record()`` function we can load the actual entry point:

::

    def insert_record(
        connection,
        table_name,
        data_dict,
        primary_key_column_name=None,
        engine=None,
    ):
        if not plugins_loaded:
            load_plugins()
        if engine not in plugins:
            raise Exception('No driver for the %r engine' % engine)
        if engine not in loaded:
            loaded[engine] = plugins[engine].load()
        # Use the loaded plugin's insert method
        return loaded[engine](
            connection,
            table_name,
            data_dict,
            primary_key_column_name,
            engine,
        )

That's all there is to it. You should now be able to go away and write your own plugins. For some information about how entry points are used in Pylons, read the Pylons Book chapter 17.

By the way, if you hadn't noticed yet, the Database, Psycopg2Database and Pysqlite2Database packages are real and use roughly the mechanism described here. If you fancy writing a plugin for your favorite database and releasing it as an egg, feel free. The beauty of egg plugins is that I don't even need to be involved because the Database module will respond to your plugin automatically.

Good luck!
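
As a postscript, the lazy-loading behaviour can be checked without installing any plugins. In the sketch below ``CountingEntryPoint`` and ``make_insert()`` are hypothetical stand-ins invented for this illustration. The ``plugins`` dictionary is filled in by hand with unloaded entry points, which is an assumption about the state ``load_plugins()`` would leave behind:

```python
class CountingEntryPoint(object):
    """Hypothetical stand-in that records how often load() is called."""

    def __init__(self, obj):
        self._obj = obj
        self.load_calls = 0

    def load(self):
        self.load_calls += 1
        return self._obj


def make_insert(label):
    # Builds a placeholder for a plugin's insert_record() implementation.
    def insert_record(connection, table_name, data_dict,
                      primary_key_column_name=None, engine=None):
        return "%s insert into %s" % (label, table_name)
    return insert_record


# Unloaded entry points keyed by engine name (filled in by hand here).
plugins = {
    "psycopg2": CountingEntryPoint(make_insert("psycopg2")),
    "pysqlite2": CountingEntryPoint(make_insert("pysqlite2")),
}
loaded = {}


def insert_record(connection, table_name, data_dict,
                  primary_key_column_name=None, engine=None):
    if engine not in plugins:
        raise Exception("No driver for the %r engine" % engine)
    if engine not in loaded:
        # First use of this engine: load its implementation now.
        loaded[engine] = plugins[engine].load()
    return loaded[engine](connection, table_name, data_dict,
                          primary_key_column_name, engine)


insert_record(None, "person", {}, engine="psycopg2")
insert_record(None, "person", {}, engine="psycopg2")
print(plugins["psycopg2"].load_calls, plugins["pysqlite2"].load_calls)
```

The ``psycopg2`` entry point is loaded once despite two calls, and the ``pysqlite2`` one is never loaded, so its underlying driver would not need to be importable.
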