Object Oriented Databases with Django
Three times in my Django-programming career I've had the same problem: I have a highly polymorphic domain that I'm trying to represent in Django. As is well known in the world of RDBMSes, this leads to very large and complex sets of database tables, with correspondingly slow lookups and heavy validation requirements.
Let's, for a simple example, imagine building an ecommerce solution for selling antiques. The antiques we have in stock all belong to one or more categories of antiques. So far so simple: an Antique model with a many-to-many relationship with a Category model. For each antique we need to store various bits of information: information on its quality, provenance, and damage. But, crucially, the information we need to store for each antique depends on its categories. So for a piece of furniture our shop browsers will expect certain information, while collectors of miniature paintings will expect something else.
We could solve this problem by allowing each antique to have any number of properties, and then entering all the relevant information ourselves each time we put a new antique in the system. This would work, but it would be very error prone: we might forget a salient piece of data. It is also laborious: in most cases there are sensible defaults for properties that we'd do well to use. And it raises the question of what data type to store the data in. Some properties might be dates, others quality levels, yet others names.
So we could create a fiendishly complex model where categories define certain required parameters, along with their default values and data types, and some flexible parameter table holds the parameters that are non-default for a particular antique, in such a way that any data can be input (or maybe different parameter tables for different data types). It works, but it is very inelegant and laborious to code.
There's got to be a better way.
There is (in my humble opinion).
This kind of polymorphism belongs in object-oriented code, not relational database semantics. In my application (which has nothing to do with antiques, but I can't say what it is without giving the client away), I represent categories as classes: each category maps to one class, and the Django model holds the name of that class. The class then defines the required data, the defaults, the validation methods, and so on (the classes all inherit from a base class, so most of this is pretty easy).
# A sample category class, with the properties it expects, and their data types.
class FurnitureAntique(Category):
    scratches = ListOf(LocatedDefect)
    period = TimePeriod()
    primaryWood = Choice(WOOD_TYPES)
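The post does not show the base class or the property types, so here is a minimal sketch of how they might work, written for Python 3. `Property`, the `declared_properties` helper, the `default` argument and the contents of `WOOD_TYPES` are all assumptions made for illustration. Only the names `Category`, `Choice`, `FurnitureAntique` and `primaryWood` come from the example above.

```python
class Property:
    """Hypothetical base for a declared property: a default plus a validator."""

    def __init__(self, default=None):
        self.default = default

    def validate(self, value):
        # The base property accepts anything.
        return value


class Choice(Property):
    """A property whose value must be one of a fixed set of choices."""

    def __init__(self, choices, default=None):
        self.choices = tuple(choices)
        # Assumption: fall back to the first choice when no default is given.
        super().__init__(self.choices[0] if default is None else default)

    def validate(self, value):
        if value not in self.choices:
            raise ValueError("%r is not one of %r" % (value, self.choices))
        return value


class Category:
    """Base class: collects declared properties, fills defaults, validates."""

    @classmethod
    def declared_properties(cls):
        props = {}
        # Walk base classes first so subclasses can override a declaration.
        for klass in reversed(cls.__mro__):
            for name, attr in vars(klass).items():
                if isinstance(attr, Property):
                    props[name] = attr
        return props

    def __init__(self, **data):
        props = self.declared_properties()
        unknown = set(data) - set(props)
        if unknown:
            raise TypeError("unexpected properties: %s" % sorted(unknown))
        for name, prop in props.items():
            setattr(self, name, prop.validate(data.get(name, prop.default)))


WOOD_TYPES = ("oak", "mahogany", "walnut")  # illustrative values only


class FurnitureAntique(Category):
    primaryWood = Choice(WOOD_TYPES)
    period = Property(default="unknown")


chair = FurnitureAntique(primaryWood="walnut")
```

With this sketch, a property that is left out takes its declared default, and a value outside the declared choices is rejected when the instance is built rather than when it is saved.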
The classes so defined are held in a simple registry (a dictionary mapping name to class):
# Add the category class to the registry, so it can be retrieved by name later.
Category.registerClass(FurnitureAntique)
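The registry itself is not shown in the post. A minimal sketch of one way it could look follows. The method names `registerClass` and `getCategoryByName` are the ones used in this post, but the `_registry` attribute, keying on the class's `__name__`, and the error raised for an unknown name are assumptions.

```python
class Category:
    # Assumed storage: one dictionary shared by all category classes.
    _registry = {}

    @classmethod
    def registerClass(cls, category_class):
        # Key on the class name, which is what the database row would store.
        Category._registry[category_class.__name__] = category_class
        return category_class

    @classmethod
    def getCategoryByName(cls, name):
        try:
            return Category._registry[name]
        except KeyError:
            raise LookupError("no category class registered as %r" % name)


class FurnitureAntique(Category):
    pass


Category.registerClass(FurnitureAntique)
found = Category.getCategoryByName("FurnitureAntique")
```

Because `registerClass` returns the class in this sketch, it could also be applied as a class decorator. Failing loudly on an unknown name matters here, since a stale class name in the database would otherwise surface much later as a confusing `None`.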
And the Django database model for a category lets you easily get the corresponding class:
from django.db import models

# The Django model representing a category of antique
class AntiqueCategory(models.Model):
    categoryClass = models.CharField(maxlength=64)

    def getClass(self):
        # Retrieve the corresponding category class from the registry by name
        return Category.getCategoryByName(self.categoryClass)
Then I have another Django model for the antiques. This has a text field that holds pickled data for its properties. I have added a method to the model that unpickles the data, finds the classes for the categories, and creates and returns a new instance of the category class with the unpickled data. (Below I've shown this where each antique has one and only one category. In my app it is a many-to-many relation, and for instances that belong to more than one category the method correctly uses multiple inheritance and an anonymous class.)
import pickle

class Antique(models.Model):
    pickle = models.TextField()
    category = models.ForeignKey(AntiqueCategory)  # Using a many-to-one for clarity here.

    def getCategoryInstance(self):
        # Look up the category class, then build an instance from the unpickled properties.
        cls = self.category.getClass()
        data = pickle.loads(self.pickle)
        return cls(**data)
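The many-to-many case mentioned above is not shown, so here is a rough sketch of how an anonymous class could be built with multiple inheritance, written for Python 3 and without Django so that it runs on its own. The helpers `combined_class` and `load_instance`, the `defaults` dictionaries, the `MiniaturePainting` class and the class name `"CombinedCategory"` are all hypothetical, not taken from my application.

```python
import pickle


class Category:
    defaults = {}

    def __init__(self, **data):
        values = {}
        # Merge defaults from every class in the hierarchy, base classes first.
        for klass in reversed(type(self).__mro__):
            values.update(vars(klass).get("defaults", {}))
        values.update(data)  # stored, non-default values win
        self.__dict__.update(values)


class FurnitureAntique(Category):
    defaults = {"primaryWood": "oak"}


class MiniaturePainting(Category):
    defaults = {"medium": "watercolour"}


def combined_class(category_classes):
    """Return one class that inherits from every given category class."""
    classes = tuple(category_classes)
    if len(classes) == 1:
        return classes[0]
    # type(name, bases, namespace) builds a new class at runtime.
    return type("CombinedCategory", classes, {})


def load_instance(pickled, category_classes):
    """Unpickle the stored properties and build the combined instance."""
    data = pickle.loads(pickled)
    return combined_class(category_classes)(**data)


stored = pickle.dumps({"primaryWood": "walnut"})
item = load_instance(stored, [FurnitureAntique, MiniaturePainting])
```

Here `item` is an instance of both category classes, with the stored wood type and the painting default for `medium`. Note that in Python 3 `pickle.dumps` returns bytes, so a text column would need an encoding step, and that unpickling should only ever be done on data the application wrote itself.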
This allows me to store any number of well-validated properties in the antique table, corresponding to validation requirements in the category table, without having to have lots and lots of tables in the database.
The downside is that I can't use the database's optimised search facilities (such as indexes) to search the pickled properties. In my application this doesn't matter (it would be difficult to create a user interface that allowed the user to specify such a highly polymorphic search in any case), but in other domains this may be a killer for this approach.
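To make that cost concrete, here is a small sketch of what searching pickled properties amounts to: every row has to be fetched and unpickled in Python before it can be tested. The function `search_pickled` and the sample rows are made up for illustration, with plain `(id, blob)` tuples standing in for database rows.

```python
import pickle


def search_pickled(rows, **criteria):
    """Yield the ids of rows whose unpickled properties match every criterion."""
    for row_id, blob in rows:
        props = pickle.loads(blob)  # no index can help: each row is decoded
        if all(props.get(key) == value for key, value in criteria.items()):
            yield row_id


rows = [
    (1, pickle.dumps({"primaryWood": "oak", "period": "Georgian"})),
    (2, pickle.dumps({"primaryWood": "walnut", "period": "Victorian"})),
    (3, pickle.dumps({"primaryWood": "oak", "period": "Victorian"})),
]

oak_ids = list(search_pickled(rows, primaryWood="oak"))
```

The work grows linearly with the number of antiques, whereas an indexed column lets the database skip most rows.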
Now, in the example above I'm using a very simple set of class properties in the Category class to define what properties we expect. Rather than create a whole new set of validation building blocks, I suspect I can reuse the newforms fields from Django for this same purpose (with the addition of fields representing compound data). But I haven't got far trying this yet. In my application, parameters are all of one type (an enumerated choice), so I haven't needed this yet, but I'm keen to have a go at some point.
Comments
Just found your site and I am looking forward to reading and learning. We are trying to connect Django and Flex, potentially using djangoamf. Since I am a novice in both arenas I'm hoping I find, or you post, on methods to doing this.
Posted by: Brian Cragin | July 30, 2007 09:02 PM