Thursday, May 17, 2007

Python CGI Howto

I'm trying to use this blog as a place to document my new experiences with Python: some of them are very basic (like this post) and some are somewhat more advanced (for example the internal workings of Crunchy). So please don't be surprised at my "noobie" take on some things...

Anyway, the other day I was playing around with HacketyHack (just to know my enemy, as it were). Overall, I was quite impressed by it - although it does have slightly different aims to Crunchy. One of the best "features" of HacketyHack is its overall polish, and this is largely due to its excellent website. So I decided to make use of the Sourceforge webspace that Crunchy has had for a while (it used to be our main site) and at the same time learn how CGI works.

In case you didn't know, CGI stands for Common Gateway Interface and allows web pages to be generated on the fly by arbitrary programs. It works quite literally: the web server runs the program and sends whatever the program prints to standard output back to the user's browser. Python, of course, has libraries to make CGI scripting easy. Here is a very simple Python CGI script:

#!/usr/bin/python
import cgi

cgi.test()

I know that normally you wouldn't worry about that first line, but for CGI scripts it is absolutely essential: it tells the system which interpreter to run the script with, and without it you get an Internal Server Error. The second line imports Python's CGI helper library, which includes a useful test function that we call in the final line. To make it work you have to place this script in your cgi-bin/ directory and give it global read and execute permissions using the *nix command:

$ chmod +rx filename

Once you've done this you can visit the page in your browser (it will probably be at a path something like /cgi-bin/test.py, depending on how you named your script). You should see a detailed description of the environment in which your script is running, but it won't look very pretty.
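
To give a feel for the kind of information that page reports, here is a minimal sketch of a script that dumps its own environment as plain text. It is written for Python 3 and is not what cgi.test() does internally; format_environment and main are names I made up for illustration. Under CGI the web server passes request details to the script through environment variables such as REQUEST_METHOD and QUERY_STRING.

```python
#!/usr/bin/env python3
import os


def format_environment(environ):
    """Return the environment as sorted 'NAME=value' lines (illustrative helper)."""
    return "\n".join(
        "%s=%s" % (name, environ[name]) for name in sorted(environ)
    )


def main():
    # Header first, then a blank line, then the body as plain text.
    print("Content-Type: text/plain")
    print()
    print(format_environment(os.environ))


if __name__ == "__main__":
    main()
```

Keeping the formatting in its own function means it can be tried out on an ordinary dictionary without a web server.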

Writing a proper hello world CGI script is very simple. It looks like this:

#!/usr/bin/python

print "Content-type: text/html"
print

print "<html><body>Hello World</body></html>"

Some more explanation: we don't actually need the cgi helper library here, so we don't load it. The first line of Python code prints a Content-type header, which gives the MIME type of what is coming next. Then we print a blank line to indicate that we have finished generating headers. Finally we print out the HTML code that we want to send.
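
The same header, blank line, body structure can be sketched for Python 3, where print is a function. This is only an illustration: build_response is a hypothetical helper of mine, not part of any library.

```python
#!/usr/bin/env python3
import sys


def build_response(body, content_type="text/html"):
    """Return a CGI response: one header line, a blank line, then the body."""
    return "Content-Type: %s\n\n%s\n" % (content_type, body)


if __name__ == "__main__":
    # Whatever is written to standard output is what the server sends back.
    sys.stdout.write(build_response("<html><body>Hello World</body></html>"))
```

Building the response as a string makes the blank line between the header and the body easy to see, and easy to check.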

Of course this is just a very short introduction to CGI scripting with Python, and if you want to create a larger dynamic website in Python you're probably better off using one of the Python web frameworks (like Twisted or Pylons). One day I might write a post about them...

Thursday, May 3, 2007

SVN Merging using Subclipse

Up until a few days ago I was laboriously merging in changes in Subclipse by hand: copying files across manually, trying to make sure I didn't miss anything. Then I discovered the merge tool, and now I can merge in complex changes that would have taken hours in just a few minutes.

The process is remarkably simple. Start by right-clicking the root of the tree that you want the code to be merged into (in the file browser) and go to Team -> Merge. This gives you a somewhat unintuitive dialog: it took me a while to figure out that it was only asking for a source to merge from (even though one section does say To:). For most reasonable cases you want to use the same SVN path in the From: and the To: sections. In the From revision box you will want to put the revision that you last merged in, and in the To section select Head Revision.
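
For comparison, those dialog settings correspond roughly to the command-line form svn merge -r N:HEAD SOURCE TARGET, where N is the revision you last merged. The sketch below only builds that argument list and does not run anything. The function name, the repository URL and the revision number are all made up for illustration.

```python
def build_merge_command(source_url, last_merged_revision, working_copy="."):
    """Build the svn argument list for merging source_url into working_copy."""
    # Same source path for both ends of the range: from the last merged
    # revision up to HEAD.
    revision_range = "%d:HEAD" % last_merged_revision
    return ["svn", "merge", "-r", revision_range, source_url, working_copy]


if __name__ == "__main__":
    # Hypothetical repository URL and revision number.
    command = build_merge_command("https://example.org/svn/project/trunk", 120)
    print(" ".join(command))
```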

Now you're ready to go: just hit Merge and SVN will work its magic (although it might take a while - be patient). Once it has finished it will display a log of its actions in the Console window. Most of it will have worked fine, but there will probably be some conflicts, marked with a red C. All you have to do here is locate the relevant files in the file browser, right-click on them and hit Team -> Edit Conflicts. This will open a new editor window with two panes: your version on the left and the other version on the right.

In the Edit Conflicts window all conflicts will be highlighted, and all you have to do is resolve them by making suitable changes to the left-hand pane and saving (this is easy: they're rarely more than a few lines of code). Once things are sorted out, right click on the file in the file browser again and select Team -> Mark as Resolved. Once you've resolved all the conflicts you're done. Subclipse really is a wonderful piece of software.

One last tip: I find it useful to commit my branch just before merging changes: that way if something goes wrong I just have to hit Team -> Revert and everything is better again.

Note: For those of you who haven't tried Eclipse, Subclipse is an Eclipse plugin that adds SVN support - I use it in conjunction with Pydev.

Monday, April 30, 2007

A Python Plugin System

A couple of months ago André mentioned that having a plugin system in Crunchy would be a "nice thing". I was feeling bored when I read his email, and so decided to go away and write one.

This post documents my attempt, which I like to think was actually rather successful.

Basically, Crunchy plugins are .py files that reside in a specific directory. They are automatically imported and initialised by the Crunchy core system on startup.

So the first thing we need to do is figure out what that special path is and enumerate all the .py files in it:


import os
import os.path
import imp

pluginpath = os.path.join(os.path.dirname(imp.find_module("pluginloader")[1]), "plugins/")
pluginfiles = [fname[:-3] for fname in os.listdir(pluginpath) if fname.endswith(".py")]


I'm afraid both lines are rather complicated, but they certainly demonstrate the power of Python. Because this code is run from within the module pluginloader we look up its (absolute) path and join that to the (relative) subpath plugins/. This gives us the absolute path to the plugin directory in a system-independent way.

The second line is (as you probably realised) a list comprehension. It first generates a list of the files in the plugin directory, then filters that list for files ending in .py, and finally removes the trailing .py from each filename to give a list of module names. All in one line!
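
Here is a small sketch of the same discovery step that can be run on its own, written for Python 3. It is a variation of mine and not the Crunchy code: it takes the directory as a parameter, uses os.path.splitext instead of slicing off the last three characters, and sorts the result so the order is predictable. The function name and the file names are made up.

```python
import os
import tempfile


def find_plugin_names(plugin_dir):
    """Return the sorted module names of the .py files in plugin_dir."""
    return sorted(
        os.path.splitext(fname)[0]
        for fname in os.listdir(plugin_dir)
        if fname.endswith(".py")
    )


if __name__ == "__main__":
    # Demonstrate on a throwaway directory with hypothetical file names.
    with tempfile.TemporaryDirectory() as plugin_dir:
        for fname in ("colourise.py", "doctest_runner.py", "README.txt"):
            open(os.path.join(plugin_dir, fname), "w").close()
        print(find_plugin_names(plugin_dir))
```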

Now that we have a list of modules, all we have to do is import them, which we can do with another list comprehension:


import sys

# Make the plugin directory importable, then import each plugin by name.
if pluginpath not in sys.path:
    sys.path.insert(0, pluginpath)
imported_modules = [__import__(fname) for fname in pluginfiles]


Once again, the Python code is disarmingly simple: all it does is add the plugin directory to the module import path and import all the modules, generating a list of the imported module objects along the way.
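
The import step can also be sketched with importlib.import_module, which the Python documentation recommends over calling __import__ directly. This is a Python 3 illustration under my own assumptions, not the Crunchy code: import_plugins and the demo plugin file are hypothetical.

```python
import importlib
import os
import sys
import tempfile


def import_plugins(plugin_dir, names):
    """Import each named module from plugin_dir and return the module objects."""
    if plugin_dir not in sys.path:
        sys.path.insert(0, plugin_dir)
    # Needed when the files were created after the interpreter started.
    importlib.invalidate_caches()
    return [importlib.import_module(name) for name in names]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as plugin_dir:
        # Hypothetical plugin file, written only for this demonstration.
        with open(os.path.join(plugin_dir, "demo_plugin_hello.py"), "w") as f:
            f.write("GREETING = 'hello'\n")
        modules = import_plugins(plugin_dir, ["demo_plugin_hello"])
        print(modules[0].GREETING)
```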

In the Crunchy Plugin API we ask that all initialisation code be placed in a custom register() function inside each plugin module. We can now call these functions with just one more line:


[mod.register() for mod in imported_modules]


And there you have it, a simple but powerful plugin system written in just 10 lines of Python code. Enjoy!
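
A slightly more defensive take on the register() step would skip any module that doesn't define the function, rather than failing on it. The sketch below is my own illustration and not part of Crunchy: register_all is a hypothetical name, and the two modules are stand-ins built in memory.

```python
import types


def register_all(modules):
    """Call register() on each module that defines it; return the names called."""
    registered = []
    for mod in modules:
        register = getattr(mod, "register", None)
        if callable(register):
            register()
            registered.append(mod.__name__)
    return registered


if __name__ == "__main__":
    # Two stand-in modules; only the first defines register().
    good = types.ModuleType("good_plugin")
    good.register = lambda: print("good_plugin registered")
    bare = types.ModuleType("bare_plugin")
    print(register_all([good, bare]))
```

A plain for loop is used here because the calls are made for their side effects, so there is no need to build a list of return values.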

Edit: setuptools does the same job, with many more features and more flexibility.

...and back again...

Hi, again.

I've been lazy, and let this lapse (I know, no posts since July last year!). Now I want to resurrect my blog - in order to post some interesting things that I've been working on.

Firstly some news about Crunchy: since I last posted it has been renamed from Crunchy Frog to just plain Crunchy. This was because of another project called CrunchyFrog. The AJAX based IO system that I mentioned last July has finally come together in the last couple of months as part of a complete rewrite of Crunchy - with a plugin-based architecture (more on that soon), a very neat HTTP Server (with COMET capabilities) &c &c.

And finally some personal news: I successfully completed the Summer of Code last year as a student, and am acting as a mentor this year (for Bryan Psimas - a very promising student from the States). I'm also working in the City of London this year, as a techie at an investment bank.