Wad of Stuff: python


Wednesday, 12 August 2009

Requiring at least one inline FormSet

Last month I posted an article about my Improved Django FormWizard. This month I've released a simple subclass of Django's BaseInlineFormSet that demonstrates how you can require a user to enter at least one entry in an inline formset.

After updating to wadofstuff.django.forms 1.1.0 you can use the RequireOneFormSet class as the formset argument to inlineformset_factory().

If the formset contains no entries when it is validated, a ValidationError is raised and ends up in formset.non_form_errors. You will need to check this in your templates if you wish to display the error message to your users.
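
Stripped of the Django plumbing, the rule can be sketched in plain Python. This is only an illustration of the check, not the code shipped in wadofstuff.django.forms: the ValidationError class and the require_one() helper are hypothetical stand-ins, and the list of dictionaries imitates a formset's cleaned_data, where an empty dictionary is an untouched form and a DELETE flag marks a form scheduled for removal.

```python
class ValidationError(Exception):
    """Hypothetical stand-in for django.forms.ValidationError."""


def require_one(cleaned_data_list):
    """Return the number of completed forms, raising if there are none.

    A form counts as completed when it has data and is not flagged
    for deletion.
    """
    completed = sum(
        1 for data in cleaned_data_list
        if data and not data.get('DELETE', False)
    )
    if completed < 1:
        raise ValidationError('At least one entry is required.')
    return completed


# One blank form and one filled-in form: validation passes.
print(require_one([{}, {'title': 'Dune'}]))
```

In a real formset a check like this would live in the formset's clean() method, so the error surfaces as a non-form error rather than being attached to any single form.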

Saturday, 25 July 2009

Inlines support for Django generic views

Django's excellent Generic Views provide developers with most of what they need to get a site up and running (if they aren't using the Admin of course). The flexibility of these views is such that for most sites you don't need much else. Extending these views is also well documented and probably covers off 95% of the situations where the plain generic views fall short.

In a recent project I found another area where the generic views fall short: inline formsets.

I have added the ability to use inline formsets in generic views. The new view functions are drop in replacements for Django's with the addition of a new inlines argument.

Installation

From Source

Download wadofstuff-django-views.

To install it, run the following command inside the unpacked source directory:


python setup.py install

From pypi

If you have the Python easy_install utility available, you can
also type the following to download and install in one step:


easy_install wadofstuff-django-views

Or if you're using pip:


pip install wadofstuff-django-views

Or if you'd prefer you can simply place the included wadofstuff
directory somewhere on your Python path, or symlink to it from
somewhere on your Python path; this is useful if you're working from a
Subversion checkout.

Note that this application requires Python 2.4 or later. You can obtain
Python from http://www.python.org/.

Usage

wadofstuff.django.views.create_object(..., inlines=None)
wadofstuff.django.views.update_object(..., inlines=None)

These functions are identical to the Django ones except for the addition of the
inlines argument. This argument consists of a list of dictionaries that will
be passed as arguments after the parent_model argument to
inlineformset_factory(parent_model, ...).

For example, arguments to a generic view might typically look like:

crud_dict = {
   'model':Author,
   'inlines':[{
       'model':Book,
       'extra':2,
       'form':BookForm,
   },{
       'model':Article,
   }],
   # ... other generic view arguments
}

would translate to calls to inlineformset_factory() like:


inlineformset_factory(Author, model=Book, extra=2, form=BookForm)

and


inlineformset_factory(Author, model=Article)

The view function will create a formset for each inline model and add them to the template context. In the example above the context variables would be named book_formset and article_formset.
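
The translation from the inlines list to factory calls can be sketched without Django. Everything here is a stand-in for illustration: the empty model classes, the recording_factory() function that takes the place of inlineformset_factory(), and the build_inline_formsets() helper are all hypothetical, and the naming rule (lowercased model class name plus "_formset") is an assumption inferred from the example above.

```python
class Author(object):
    """Stand-in parent model."""


class Book(object):
    """Stand-in inline model."""


class Article(object):
    """Stand-in inline model."""


class BookForm(object):
    """Stand-in form class."""


def build_inline_formsets(parent_model, inlines, factory):
    """Call factory once per inline dict and name each result for the template."""
    context = {}
    for options in inlines or []:
        formset_class = factory(parent_model, **options)
        # Assumed naming rule: lowercased model class name plus "_formset".
        context[options['model'].__name__.lower() + '_formset'] = formset_class
    return context


calls = []


def recording_factory(parent_model, **kwargs):
    """Record the arguments that inlineformset_factory() would receive."""
    calls.append((parent_model, kwargs))
    return '%sFormSet' % kwargs['model'].__name__


context = build_inline_formsets(
    Author,
    [{'model': Book, 'extra': 2, 'form': BookForm}, {'model': Article}],
    recording_factory,
)
print(sorted(context))  # ['article_formset', 'book_formset']
```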

Update: A quick change to allow the views to be imported from wadofstuff.django.views. Bumped version to 1.0.1.

Tuesday, 21 July 2009

Atom Feed for SVN Commit Log

Recently the developers of Extjs added a page that opened up their subversion commit log so users could see what was being added/fixed. Unfortunately they decided to only publish it using the output of svn log -v --xml rendered in an Ext.grid.GridPanel.

It is a nice example of what you can do with their framework, but this format is not so friendly to use or keep tabs on so I've created a mashup that converts it to Atom Syndication Format so interested users can subscribe to it in their favourite feed reader.

The URL to subscribe to is http://feeds.feedburner.com/ExtjsSvnCommitLog. I'm hoping the developers publish their own feed soon. If they do then I'll stop mine.

Here is the script I wrote to convert it to Atom. With a few tweaks it should be usable for any SVN commit log.

My first attempt at this was to use Yahoo! Pipes to try and munge it but I couldn't get it to work. I'd be interested to see if Pipes was capable of doing this.

Since writing it I also found this XSLT script that does something similar.

Update: I've cleaned the script up and made it reusable for any SVN log source. See the documentation for details on how to use it.

Improved Django FormWizard

A few months back I had a project that I thought needed a wizard-style interface for one of its forms. For a while now Django has included the FormWizard class in django.contrib.formtools.wizard so I decided to use that. However, I immediately hit a couple of issues with it.

FormWizard requires you to output the previous_fields context variable in each of the form's step templates. Django's FormWizard previous_fields is just a string of raw HTML. If you are using some other kind of markup to do your forms e.g. XForms, or even javascript widgets from dojo or extjs, then this is no good.

I quickly came up with a fix for this and opened #10557 which has now been slated to be included in Django 1.2.....

Wait a minute! Django 1.1 isn't even out yet!

The second issue was that FormWizard didn't handle FormSets.

Who knows when 1.2 will be out so I've bundled up my code and released it as wadofstuff.django.forms 1.0.0.

From the README:

Functions

security_hash(request, form, exclude=None, *args)

Calculates a security hash for the given Form/FormSet instance.

This creates a list of the form field names/values in a deterministic
order, pickles the result with the SECRET_KEY setting, then takes an md5
hash of that.

Allows a list of form fields to be excluded from the hash calculation. This
is useful for form fields that may have their values set programmatically.
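
As a rough illustration of the recipe described above (sorted name/value pairs, pickled together with a secret, then hashed with md5), here is a standalone sketch. It is not the package's implementation: it takes a plain dictionary instead of a Form or FormSet, and the security_hash_sketch() name and the SECRET_KEY value are made up for the example.

```python
import hashlib
import pickle

# Hypothetical stand-in for Django's settings.SECRET_KEY.
SECRET_KEY = 'not-a-real-secret'


def security_hash_sketch(form_data, exclude=None, secret_key=SECRET_KEY):
    """Hash a mapping of field names to values in a deterministic order."""
    excluded = set(exclude or ())
    # Sorting by field name makes the result independent of dict ordering.
    items = sorted(
        (name, value) for name, value in form_data.items()
        if name not in excluded
    )
    payload = pickle.dumps((items, secret_key), protocol=2)
    return hashlib.md5(payload).hexdigest()


first = security_hash_sketch({'name': 'alice', 'age': '30'})
second = security_hash_sketch({'age': '30', 'name': 'alice'})
print(first == second)  # True: field order does not matter
```

Excluding a field means later changes to its value leave the hash untouched, which is the point of the exclude argument for fields that are set programmatically.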

Classes

BoundFormWizard

A subclass of Django's FormWizard that adds the following functionality:

Renders previous_fields as a list of bound form fields in the template context rather than as raw html.
Can handle FormSets.

The usage of this class is identical to that documented here with the exception that when rendering previous_fields you should change your wizard step templates from:


{{ previous_fields|safe }}

to:


{% for f in previous_fields %}{{ f.as_hidden }}{% endfor %}

The latest stable release for the forms module can be obtained by:

Running easy_install wadofstuff-django-forms
Downloading the 1.0.0 release and running python setup.py install.

Note: The BoundFormWizard class can only handle regular FormSets and not ModelFormSets. I have a ModelFormWizard class that I'll be adding to a future release that can handle simple Models.

Tuesday, 26 May 2009

Python ipaddr performance

Last weekend while I was cleaning up my IP address summarization script (I added a setup.py, created a Cheese Shop entry and a downloadable archive) I had a look at the state of IP address manipulation in Python and found a new module called ipaddr. What sparked my interest in this module was that it had already been integrated into the upcoming Python 2.7 and 3.1 releases as a standard library module.

It seemed to contain all the functionality of IPy but with much cleaner code. It also has a function called collapse_address_list() that is similar to my summarize function but it can handle a list of non-contiguous networks and IP addresses and collapse them down.

When I wrote my summarize() function I spent quite a while optimizing the algorithm so that it can handle large address ranges in a reasonable amount of time. For example, on my development box it can summarize the two worst case ranges: IPv4 0.0.0.0 to 255.255.255.254 in 0.1s and IPv6 :: to ffff:ffff:ffff:ffff:ffff:ffff:ffff:fffe in 0.2s.

I had to see how well ipaddr performed so I wrote a little benchmarking script that summarized a /24. The results were very poor with it taking 64.27 seconds to perform 1000 runs. I then ported my summarize() code to use ipaddr and modified collapse_address_list() to use it. The result was a 50 times improvement in performance (1000 runs took 1.21 seconds).
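
A benchmark along these lines can be sketched with timeit. The original benchmarking script is not shown here, so this is a reconstruction under assumptions: it targets Python 3 and uses the standard library's ipaddress module (whose collapse_addresses() plays the role of collapse_address_list()) rather than ipaddr itself, and it runs 100 iterations instead of 1000 to keep it quick. Timings will differ from the figures quoted above.

```python
import ipaddress
import timeit

# Every address in 192.168.1.0/24, expressed as 256 separate /32 networks.
hosts = [ipaddress.ip_network('192.168.1.%d/32' % i) for i in range(256)]


def collapse():
    """Collapse the 256 single-address networks into the fewest networks."""
    return list(ipaddress.collapse_addresses(hosts))


elapsed = timeit.timeit(collapse, number=100)
print('100 runs took %.3f seconds' % elapsed)
print(collapse())
```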

ipaddr is developed by Google and the coders use Rietveld for reviewing patches. So I've submitted my patches there. Feel free to review and comment on the code.

Here's hoping that they are accepted and benefit everyone by making it into the next release(s) of Python!

Monday, 2 March 2009

Full Django Serializers - Part II

In the first part of this article I covered the excludes and extras options to the Wad of Stuff serializer. In this article I introduce the relations option.

Relations

The Wad of Stuff serializers allow you to follow and serialize related fields of a model to any depth you wish. This is why it is considered a "full serializer" as opposed to Django's built-in serializers that only return the related field's primary key value.

When using the relations argument to the serializer you may specify either a list of fields to be serialized or a dictionary of key/value pairs. The dictionary keys are the field names of the related fields to be serialized and the values are the arguments to pass to the serializer when processing that related field.

This first example shows the simplest way of serializing a related field. The Group model has a ManyToManyField to the Permission model. In this case there is only one permission but if there were more then each would be serialized.

>>> print serializers.serialize('json', Group.objects.all(), indent=4, relations=('permissions',))
[
    {
        "pk": 2,
        "model": "auth.group",
        "fields": {
            "name": "session",
            "permissions": [
                {
                    "pk": 19,
                    "model": "auth.permission",
                    "fields": {
                        "codename": "add_session",
                        "name": "Can add session",
                        "content_type": 7
                    }
                }
            ]
        }
    }
]

The simple case may be all you need but if you want more control over exactly which fields or extras are included, excluded, and the depth of relations to follow then you need to pass a dictionary in the relations option. This dictionary is a series of nested dictionaries that are unrolled and passed as arguments when serializing each related field.

>>> print serializers.serialize('json', Group.objects.all(), indent=4, relations={'permissions':{'fields':('codename',)}})
[
    {
        "pk": 2,
        "model": "auth.group",
        "fields": {
            "name": "session",
            "permissions": [
                {
                    "pk": 19,
                    "model": "auth.permission",
                    "fields": {
                        "codename": "add_session"
                    }
                }
            ]
        }
    }
]

The relations option in this example roughly translates to a call to serialize('json', permissions_queryset, fields=('codename',)) when the permissions field is serialized.
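
To make the unrolling concrete, here is a toy version that works on plain dictionaries instead of Django models. It is a sketch under assumptions, not the serializer's actual code: serialize_sketch() and its helper are hypothetical names, nested dictionaries with a "pk" key stand in for related model instances, and a list of them stands in for a ManyToManyField.

```python
def _serialize_related(value, name, options):
    """Follow a relation if it was requested, else emit only the primary key."""
    if name in options:
        # The nested options become keyword arguments of the recursive call.
        return serialize_sketch(value, **options[name])
    return value['pk']


def serialize_sketch(obj, fields=None, excludes=(), relations=()):
    """Serialize a dict-based stand-in for a model instance."""
    # A dict of relations carries per-field options; a list or tuple carries none.
    if isinstance(relations, dict):
        options = relations
    else:
        options = dict.fromkeys(relations, {})
    out = {}
    for name, value in obj['fields'].items():
        # excludes wins over fields, as described for the real serializer.
        if name in excludes or (fields is not None and name not in fields):
            continue
        if isinstance(value, list):  # many-to-many stand-in
            out[name] = [_serialize_related(v, name, options) for v in value]
        elif isinstance(value, dict) and 'pk' in value:  # foreign key stand-in
            out[name] = _serialize_related(value, name, options)
        else:
            out[name] = value
    return {'pk': obj['pk'], 'model': obj['model'], 'fields': out}


content_type = {'pk': 7, 'model': 'contenttypes.contenttype',
                'fields': {'model': 'session', 'name': 'session',
                           'app_label': 'sessions'}}
permission = {'pk': 19, 'model': 'auth.permission',
              'fields': {'codename': 'add_session', 'name': 'Can add session',
                         'content_type': content_type}}
group = {'pk': 2, 'model': 'auth.group',
         'fields': {'name': 'session', 'permissions': [permission]}}

result = serialize_sketch(group, relations={'permissions': {'fields': ('codename',)}})
print(result['fields']['permissions'])
```

Because each nested dictionary is simply passed on as keyword arguments, a relations entry can itself contain fields, excludes, or a further relations entry, which is how a relation is followed to any depth.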

Serializing deeper relations

The power of the relations option becomes obvious when you see it in action serializing related fields that are 2 or more levels deep. Below, the content_type ForeignKey field on the Permission model is also serialized.

>>> print serializers.serialize('json', Group.objects.all(), indent=4, relations={'permissions':{'relations':('content_type',)}})
[
    {
        "pk": 2,
        "model": "auth.group",
        "fields": {
            "name": "session",
            "permissions": [
                {
                    "pk": 19,
                    "model": "auth.permission",
                    "fields": {
                        "codename": "add_session",
                        "name": "Can add session",
                        "content_type": {
                            "pk": 7,
                            "model": "contenttypes.contenttype",
                            "fields": {
                                "model": "session",
                                "name": "session",
                                "app_label": "sessions"
                            }
                        }
                    }
                }
            ]
        }
    }
]

Combining options

You may also combine the other options when serializing related fields. In the example below I am excluding the content_type.app_label field from being serialized.

>>> print serializers.serialize('json', Group.objects.all(), indent=4, relations={'permissions':{'relations':{'content_type':{'excludes':('app_label',)}}}})
[
    {
        "pk": 2,
        "model": "auth.group",
        "fields": {
            "name": "session",
            "permissions": [
                {
                    "pk": 19,
                    "model": "auth.permission",
                    "fields": {
                        "codename": "add_session",
                        "name": "Can add session",
                        "content_type": {
                            "pk": 7,
                            "model": "contenttypes.contenttype",
                            "fields": {
                                "model": "session",
                                "name": "session"
                            }
                        }
                    }
                }
            ]
        }
    }
]

That wraps up this series of articles. Head over to Wad of Stuff at Google Code to grab the source and don't hesitate to open a ticket if you find any bugs or have any enhancement requests.

Friday, 27 February 2009

Django Full Serializers - Part I

Introduction

The wadofstuff.django.serializers python module extends Django's built-in serializers, adding 3 new capabilities inspired by the Ruby on Rails JSON serializer. These parameters allow the developer more control over how their models are serialized. The additional capabilities are:

excludes - a list of fields to be excluded from serialization. The excludes list takes precedence over the fields argument.
extras - a list of non-model field properties or callables to be serialized.
relations - a list or dictionary of model related fields to be followed and serialized.

De/Serialization Formats

At the moment the module only supports serializing to JSON and Python. It will also only deserialize data that is in the original Django format. i.e. it won't deserialize the results of using the excludes, extras, or relations options.

Source

The source for the serialization module can be obtained here.

Examples

Project Settings

You must add the following to your project's settings.py to be able to use the JSON serializer.

SERIALIZATION_MODULES = {
 'json': 'wadofstuff.django.serializers.json'
}

Backwards Compatibility

The Wad of Stuff serializers are 100% compatible with the Django serializers when serializing a model.

>>> from django.contrib.auth.models import Group
>>> from django.core import serializers
>>> print serializers.serialize('json', Group.objects.all(), indent=4)
[
    {
        "pk": 2,
        "model": "auth.group",
        "fields": {
            "name": "session",
            "permissions": [
                19
            ]
        }
    }
]

Excludes

>>> print serializers.serialize('json', Group.objects.all(), indent=4, excludes=('permissions',))
[
 {
     "pk": 2,
     "model": "auth.group",
     "fields": {
         "name": "session"
     }
 }
]

Extras

The extras option allows the developer to serialize properties of a model that are not fields. These properties may be almost any standard python attribute or method. The only limitation being that you may only serialize methods that do not require any arguments.

For demonstration purposes in this example I monkey patch the Group model to have a get_absolute_url() method.

>>> def get_absolute_url(self):
...     return u'/group/%s' % self.name
...
>>> Group.get_absolute_url = get_absolute_url
>>> print serializers.serialize('json', Group.objects.all(), indent=4, extras=('__unicode__','get_absolute_url'))
[
 {
     "pk": 2,
     "model": "auth.group",
     "extras": {
         "get_absolute_url": "/group/session",
         "__unicode__": "session"
     },
     "fields": {
         "name": "session",
         "permissions": [
             19
         ]
     }
 }
]
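
The lookup behind an option like extras can be sketched in a few lines of plain Python. This is an illustration only, with a hypothetical collect_extras() helper and a stand-in Group class: each name is fetched with getattr() and called if it turns out to be callable, which is why only methods that take no arguments can be serialized.

```python
class Group(object):
    """Stand-in for the Django Group model."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def get_absolute_url(self):
        return '/group/%s' % self.name


def collect_extras(obj, extras):
    """Resolve each extra by name, calling it when it is a method."""
    result = {}
    for name in extras:
        value = getattr(obj, name)
        if callable(value):
            # No arguments are available here, so the method must need none.
            value = value()
        result[name] = value
    return result


print(collect_extras(Group('session'), ('__str__', 'get_absolute_url')))
```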

Stay tuned for the second part of this article where I demonstrate the ability to serialize related fields such as ForeignKeys and ManyToManys.

Tuesday, 10 February 2009

Django: DRY Custom Model Forms and Fields

With the release of Django 1.0, the ability to add a list of validators to a model field was removed. This was replaced with the clean*() methods in the new forms classes. Unfortunately when you write your own form subclasses you may end up repeating a lot of the parameters in your form field definitions that you had already declared in your model fields such as max_length, required, help_text, etc. So in the spirit of DRY I worked out a way to avoid this.

A few months back I was updating an old Django site to use the latest 1.0 release and needed to re-implement the functionality of the following, slightly contrived, pre-1.0 model:

from django.db import models
from django.core.validators import MatchesRegularExpression, isNotOnlyDigits

class Acronym(models.Model):
 acronym = models.CharField(maxlength=3, unique=True, db_index=True,
     validator_list=[isNotOnlyDigits,
         MatchesRegularExpression(r'^[A-Z0-9]+$',
         error_message='This field may only contain uppercase letters and' \
         ' numbers.')],
     help_text='A three letter acronym.')
 definition = models.CharField(maxlength=64)

The new model looked like this after I fixed up all of the parameters:

from django.db import models

class Acronym(models.Model):
 acronym = models.CharField(max_length=3, unique=True, db_index=True,
     help_text='A three letter acronym.')
 definition = models.CharField(max_length=64)

The problem above is that the extra validation is now not being done and needs to be reimplemented either in a new form or form field class.

My first attempt probably looked like this:

import re
from django import forms
from models import Acronym

TLA_RE = re.compile(r'^[A-Z0-9]+$')

class NaiveAcronymForm(forms.ModelForm):
 acronym = forms.RegexField(TLA_RE, max_length=3, required=True,
     label='Code', help_text='A three letter acronym.')
 class Meta:
     model = Acronym

The amount of repetition is already becoming obvious and the code above only provides half the solution. We also need to implement the isNotOnlyDigits validation.

import re
from django import forms
from models import Acronym

TLA_RE = re.compile(r'^[A-Z0-9]+$')

class NaiveAcronymForm(forms.ModelForm):
 acronym = forms.RegexField(TLA_RE, max_length=3, required=True,
     label='Code', help_text='A three letter acronym.')

 def clean_acronym(self):
     # A clean_<field>() method must return the cleaned value.
     value = self.cleaned_data['acronym']
     if value.isdigit():
         raise forms.ValidationError(u"This value can't be comprised solely of digits.")
     return value

 class Meta:
     model = Acronym

I got to this point in my own code and thought that there must be a way to leverage all the work already being done by Django to generate a form field from a model field. When I make a change to my model fields I want those changes to automatically flow on to my forms.

The solution I came up with was this:

import re
from django import forms
from models import Acronym

TLA_RE = re.compile(r'^[A-Z0-9]+$')

class AcronymField(forms.RegexField):
 """Form field for three letter acronym."""
 default_error_messages = {
     'invalid': u'This field may only contain uppercase letters and ' \
         'numbers.',
     'notonlydigits': u'''This value can't be comprised solely of digits.'''
 }

 def __init__(self, *args, **kwargs):
     """Initialize the field with the acronym regex."""
     super(AcronymField, self).__init__(TLA_RE, min_length=3, *args,
         **kwargs)

 def clean(self, value):
     """Ensure acronym matches regex and is not only digits."""
     value = super(AcronymField, self).clean(value)
     if value.isdigit():
         raise forms.ValidationError(self.error_messages['notonlydigits'])
     return value

class AcronymModelForm(forms.ModelForm):
 """Form for Acronym model."""
 def __init__(self, *args, **kwargs):
     """Programmatically declare fields."""
     super(AcronymModelForm, self).__init__(*args, **kwargs)
     field = self.Meta.model._meta.get_field('acronym')
     self.fields['acronym'] = field.formfield(form_class=AcronymField)

 class Meta:
     model = Acronym

How does this work? First of all, AcronymField is subclassed from forms.RegexField and the __init__() method overridden. The new __init__() calls the parent class's __init__() with my extra parameters, the regexp and min_length, as well as passing the original positional and keyword arguments. This ensures that any extra parameters from the model such as required, label, max_length, and help_text are preserved. I also override the clean() method and again call the superclass's clean() method before checking that the value is not only digits.

Next I subclass AcronymModelForm from forms.ModelForm and override the __init__() method again. Here the superclass's __init__() method is called, then the acronym form field is replaced with one that uses the new AcronymField class. It is within the call to field.formfield(...) (see django.db.models.fields.Field class for the gory details) that all the values for required, label, and help_text are taken from the model field and the form field instance created.

While this results in more lines of code for a model form definition, it ultimately results in fewer presentation and validation errors caused by model fields and form fields getting out of sync. This approach could probably be generalized even further to a subclass of ModelForm but I've only had to use it for a handful of models so I leave that as an exercise for the reader.

Update: fixed formatting mistake so you can more clearly see the before and after models.

Friday, 19 September 2008

Using Splunk's iplocation search command behind a proxy

Splunk has an iplocation search command that will add City and Country fields to your search results. It does this by looking up the IP addresses it finds using the hostip.info API. Unfortunately if your Splunk server doesn't have direct Internet access then this script will fail.

The script itself is a very simple Python script that uses urllib.urlopen() to make the API call. Getting it to use your proxy server is easy.

Make a backup of the original script:

$ cd $SPLUNK_HOME/etc/searchscripts
$ cp iplocation.py iplocation.py.bak

Edit iplocation.py and add the following line below the LOCATION_URL definition:

PROXIES = {'http':'http://proxy.example.com:8080'}

Then find the line that reads:

location = urllib.urlopen( LOCATION_URL + ip )

and change it to:

location = urllib.urlopen( LOCATION_URL + ip, proxies=PROXIES )

Then perform your search and pipe it to iplocation. Make sure to limit your search as the script will do an HTTP request for every IP address it finds.
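
The proxies argument shown above belongs to the Python 2 urllib.urlopen(). If the script is ever run under Python 3, where urllib.request.urlopen() has no such argument, the equivalent is to build an opener around a ProxyHandler. The following is a sketch under that assumption: the LOCATION_URL value is a made-up placeholder (the real one is defined in iplocation.py) and fetch_location() is a hypothetical helper.

```python
import urllib.request

# Placeholder only; the real value is defined in iplocation.py.
LOCATION_URL = 'http://lookup.example.com/?ip='
PROXIES = {'http': 'http://proxy.example.com:8080'}

# An opener that sends every http:// request through the proxy.
opener = urllib.request.build_opener(urllib.request.ProxyHandler(PROXIES))


def fetch_location(ip):
    """Look up one IP address through the proxy and return the raw response."""
    with opener.open(LOCATION_URL + ip, timeout=10) as response:
        return response.read()
```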

Tuesday, 17 June 2008

SUNWapch2u, SUNWPython, and mod_python mismatched expat issue

It is fantastic that Sun have shipped a recent version of Python with Solaris 10 8/07. Unfortunately it seems the SUNWapch2u package builders aren't talking to the SUNWPython package builders which has resulted in the well documented “Expat Causing Apache Crash” issue when you try to build mod_python linked against SUNWapch2u and SUNWPython.

The mismatched expat versions used by each are:

SUNWapch2u: expat 1.95.2
SUNWPython: expat 1.95.8
SUNWlexpt: expat 1.95.7 (neither Apache nor Python uses this, but it is there just to add to the confusion)

In order to build a working mod_python you will have to compile your own copy of expat 1.95.8 as well as your own apache2 ensuring that you pass the “--with-expat=...” option to apache2's configure script.

I've opened a case with Sun to see if they'll fix this. Watch this blog for updates.

Update 2007.11.13: Changed SUNWapache2 to correct package name SUNWapch2u.

Update 2008.06.17:

With the release of the following Solaris 10 patches the issue described above has been resolved:

120543-11 SunOS 5.10: Apache 2 Patch fixes Bug ID 6630259 "If Python and Apache 2 are used together with libexpat, httpd crashes".
137147-04 SunOS 5.10: libexpat patch updates libexpat to version 2.0.x.
121606-03 GNOME 2.6.0: Python patch fixes Bug ID 6630230 "Link Python dynamically to /usr/sfw/lib/libexpat.so".

This combination of patches allows you to build your own mod_python linked against Sun's apache2 and python 2.4. If you can wait for Bug ID 6630237 "Supply mod_python with Apache 2" to be delivered then you won't even have to do that!

Wednesday, 5 September 2007

Solaris 10 8/07 is available

Also known as Solaris 10 Update 4. The list of what's new is here. Some highlights are:

Python updated to 2.4.4 with 64 bit support and it now lives in /usr/bin. The 8/07 what's new docs don't mention this but it is mentioned in the HW 7/07 what's new.
Gnu Zebra replaced with the Quagga routing suite.
IP Instances: LAN and VLAN Separation for Non-Global Zones
DTrace can now be used in Non-Global zones.

A few things will need a little further investigation:

Sun Service Tags
Coherent Console

You can download it now from Sun.

Tuesday, 27 March 2007

Summarizing an IP address range

As I've mentioned previously, I've found the IPy Python module to be extremely useful for manipulating IP addresses. One such use is a script I've written to summarize an IP address range into the networks that make it up. The script supports both IPv4 and IPv6 addresses.

Example usage:

$ ./summarize.py 192.168.1.0 192.168.1.8
192.168.1.0/29
192.168.1.8

or alternatively as a python module:

>>> from summarize import summarize
>>> summarize('192.168.1.0', '192.168.1.254')
['192.168.1.0/25', '192.168.1.128/26', '192.168.1.192/27', '192.168.1.224/28', '192.168.1.240/29', '192.168.1.248/30', '192.168.1.252/31', '192.168.1.254']

The source for this script is available for download from the Wad of Stuff repository.
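
For readers curious how a range can be broken into networks, here is one greedy approach, sketched for Python 3 with the standard library's ipaddress module instead of IPy. It is not the code from summarize.py: at each step it takes the largest block that is aligned on the current start address and still fits inside the range. Unlike the output above, a single address comes out with an explicit /32 (or /128) suffix.

```python
import ipaddress


def summarize(first, last):
    """Return the networks that exactly cover the range first..last."""
    first = ipaddress.ip_address(first)
    last = ipaddress.ip_address(last)
    if first.version != last.version or first > last:
        raise ValueError('need two addresses of one version, first <= last')
    bits = first.max_prefixlen
    start, end = int(first), int(last)
    networks = []
    while start <= end:
        # The alignment of start caps the block size: count trailing zero bits.
        size = bits if start == 0 else (start & -start).bit_length() - 1
        # Shrink the block until it no longer runs past the end of the range.
        while start + (1 << size) - 1 > end:
            size -= 1
        address = type(first)(start)
        networks.append('%s/%d' % (address, bits - size))
        start += 1 << size
    return networks


print(summarize('192.168.1.0', '192.168.1.8'))
```

The standard library offers the same operation as ipaddress.summarize_address_range(), which is a handy cross-check for a hand-written version like this one.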

Friday, 9 February 2007

Sorting IP addresses in python

Here's a quick example of a function to sort a list of IP addresses in python using the decorate-sort-undecorate idiom (aka Schwartzian Transform) and the IPy module.

def sort_ip_list(ip_list):
    """Sort an IP address list."""
    from IPy import IP
    ipl = [(IP(ip).int(), ip) for ip in ip_list]
    ipl.sort()
    return [ip[1] for ip in ipl]

And here is an example of it in use:

>>> l = ['127.0.0.1', '192.168.0.10', '192.168.0.1', '10.1.1.1', '172.16.255.254']
>>> sort_ip_list(l)
['10.1.1.1', '127.0.0.1', '172.16.255.254', '192.168.0.1', '192.168.0.10']
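
On Python 3 the same result is available without IPy and without decorating by hand, because sorted() accepts a key function. This variant is a sketch that assumes the standard library's ipaddress module is an acceptable substitute for IPy.

```python
import ipaddress


def sort_ip_list(ip_list):
    """Sort IP address strings by their numeric value."""
    return sorted(ip_list, key=lambda ip: int(ipaddress.ip_address(ip)))


addresses = ['127.0.0.1', '192.168.0.10', '192.168.0.1', '10.1.1.1', '172.16.255.254']
print(sort_ip_list(addresses))
```

The key function is called once per element, so this is the decorate-sort-undecorate idiom with the bookkeeping handled by sorted() itself.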

Thursday, 8 February 2007

Manipulating IPv4 and IPv6 addresses in python

Eighteen months ago, after many years of programming in perl for most of my systems programming needs, I decided to give python a go after much coaxing by a colleague, Alec Thomas. At the time I had started developing an IP address management system, designed for ISPs/Telcos who need to manage hundreds of address blocks and allocations to customers and internal infrastructure. I did a quick evaluation of both Ruby on Rails and Django and decided on Django for a few reasons:

I didn't have to manage my database schema and models separately. Django allowed me to define my data models in a single place and it handled the job of creating the database tables. (This was before Rails had migrations).
Django's built in administrative interface was a huge time saver and allowed me to focus on developing my application rather than designing forms.
After programming in perl for so long the cleanliness of the python language really appealed to me. Ruby to me just seemed like OO-Perl done properly but with all the $@!#{} perlisms left in.

While developing the application I found an extremely useful python module called IPy (originally developed here), that handles IPv4 and IPv6 addresses and networks.

Here's a sample of how you can use it:

>>> from IPy import IP
>>> ip = IP('127.0.0.0/30')
>>> for x in ip:
...  print x
...
127.0.0.0
127.0.0.1
127.0.0.2
127.0.0.3
>>> ip2 = IP('0x7f000000/30')
>>> ip == ip2
1
>>> ip.reverseNames()
['0.0.0.127.in-addr.arpa.', '1.0.0.127.in-addr.arpa.',
'2.0.0.127.in-addr.arpa.', '3.0.0.127.in-addr.arpa.']

I'll be discussing some tools I've developed with this module at a later date.