ah well, life happens, and things get in the way - i’ve already gotten behind schedule - just a lot of distractions this last week. i’ve still been working on the database though, which feeds into ideas for the backend. and as usual, lots of other ideas occur along the way. they get jotted down on index cards, and eventually make their way into the database. and there’s a large backlog of blog entries as well.
anyway, what i’m doing is trying to cram all of my information into this (relational) database. currently the weak spots are
- allowing multiple values in a field
- composite field types
- type parsers
- relationships
- version history
i’m sure somebody, somewhere, has solved these problems before. there are some pretty advanced commercial databases out there. sure would be nice to be able to use one of those for the backend…
multiple field values - eg type=“person, performer”, date=“1789 or 1794”, date=“oct 1994 to dec 1998”, filter=“apples and oranges but not lemons”. right now i’m just storing these in the database just like that (ie text). presumably a parser could come along and store the information “properly”, though i’m not sure yet what form that will take. neomem had code to handle multiple item links, but it really needs to be expanded to be able to store arbitrary types of information with arbitrary relationships (ie not just AND).
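as a rough sketch of what storing these “properly” might look like (this is not neomem code - the function name, the operator names, and the precedence order are all assumptions made up for illustration), a tiny recursive splitter can turn the text into a nested structure:

```python
# operators tried from loosest-binding to tightest-binding.
# this precedence is an assumption for illustration, not a settled design.
OPERATORS = [
    (" but not ", "but_not"),
    (" or ", "or"),
    (" and ", "and"),
    (",", "list"),
    (" to ", "range"),
]


def parse_values(text):
    """turn field text into a nested (operator, parts) tuple, or a plain string.

    deliberately naive: it splits on the first operator found in the table,
    so a comma inside a date like "jan 15, 1970" would be misread as a list.
    """
    text = text.strip()
    for separator, name in OPERATORS:
        if separator in text:
            parts = [parse_values(part) for part in text.split(separator)]
            return (name, parts)
    return text  # a single value, left as text for a type parser to handle


examples = [
    "person, performer",
    "1789 or 1794",
    "oct 1994 to dec 1998",
    "apples and oranges but not lemons",
]
for example in examples:
    print(example, "->", parse_values(example))
```

the leaves are still plain text here, so each one would still need to go through a parser for its own type.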
composite field types - most databases are way too limited in their field types, so much so as to be almost useless - i’m just storing everything as text for now. but basically, all types are composite types, made up of smaller bits of information (until you get to numbers and strings). eg a person is made up of name, address, home phone, etc. an address is made up of street, city, state, zip, etc. a weight is made up of a number and a unit (eg “150 lbs”). so it would be nice to handle all of this generically, to arbitrary depths. right now neomem makes a distinction between object types and field types, but that’s pretty arbitrary (and based in part on the distinction between records and fields, which again is somewhat arbitrary). i think there are some advanced databases that let you define composite types like this, though i haven’t yet looked for any open source projects.
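here’s a minimal sketch of the “everything is a composite type” idea, using python dataclasses rather than a database (the class names, field names, and the flatten helper are all hypothetical). the same walk handles a person, an address, or a weight, to any depth, until it reaches plain numbers and strings:

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Address:
    street: str
    city: str
    state: str
    zip_code: str


@dataclass
class Weight:
    number: float
    unit: str


@dataclass
class Person:
    name: str
    address: Address
    weight: Weight


def flatten(value, prefix=""):
    """walk a composite value down to its primitive leaves.

    returns a dict mapping dotted paths (eg "address.city") to leaf values.
    """
    if not is_dataclass(value):
        return {prefix: value}  # reached a number or string
    leaves = {}
    for f in fields(value):
        path = f"{prefix}.{f.name}" if prefix else f.name
        leaves.update(flatten(getattr(value, f.name), path))
    return leaves


person = Person(
    name="ann",
    address=Address("1 main st", "austin", "tx", "78701"),
    weight=Weight(150, "lbs"),
)
for path, leaf in flatten(person).items():
    print(path, "=", leaf)
```

nothing here distinguishes “object types” from “field types” - a weight is just a smaller composite than a person.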
type parsers - ideally, each type would have its own parser, which might call subparsers. eg a NumberUnit parser would call a Number parser to interpret “150” and a Unit parser to interpret “lbs” (the whole point being to do calculations and convert between units automatically, eventually). and dates, good grief - the number of times that wheel has been reinvented. it really should be a service of the operating system, ie pass it some text and it’ll interpret it for you. well i guess most operating systems do have such parsers but they’re nowhere near as powerful as they could be - hence, they’re basically useless. neomem’s current date parser is a step in that direction though. and all fields should also allow ranges, multiple values, and qualifiers as well, eg “jan 15, 1970 to summer 1975?”.
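the NumberUnit idea could be sketched roughly like this (not neomem’s parser - the function names, the regex, and the two-entry unit table are assumptions for illustration only):

```python
import re

# assumed tiny unit table: how many kilograms one of each unit is
UNIT_TO_KG = {"lbs": 0.45359237, "kg": 1.0}


def parse_number(text):
    """subparser: interpret the numeric part, eg "150"."""
    return float(text)


def parse_unit(text):
    """subparser: interpret the unit part, eg "lbs"."""
    unit = text.strip().lower()
    if unit not in UNIT_TO_KG:
        raise ValueError(f"unknown unit: {text!r}")
    return unit


def parse_number_unit(text):
    """composite parser: split the text, then hand each piece to a subparser."""
    match = re.fullmatch(r"\s*([-+]?\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s*", text)
    if match is None:
        raise ValueError(f"not a number with a unit: {text!r}")
    return parse_number(match.group(1)), parse_unit(match.group(2))


def convert(value, to_unit):
    """convert a (number, unit) pair to another unit via kilograms."""
    number, unit = value
    return number * UNIT_TO_KG[unit] / UNIT_TO_KG[to_unit]


weight = parse_number_unit("150 lbs")
print(weight, "is about", round(convert(weight, "kg"), 2), "kg")
```

ranges and qualifiers (eg a trailing “?”) would need their own wrapper parsers around this one.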
relationships - there are a lot of different ways of handling this, and i haven’t figured out the best way to do it yet. in some cases all you really need is to store one way information (eg .type=book), since to go the other way you can just do a search (currently how neomem handles relationships). sometimes you’d like two way storage (eg lotr.author=tolkien, tolkien.books=lotr,silmarillion) - and doing it automatically would be nice (ie you add it one place and the db adds the other side). and sometimes you’d like to attach information to the relationship (ie treat the relationship like any other object). ultimately it would be nice to have all relationships stored explicitly as objects - it would make things like displaying the entire network of objects and their relationships easier. and also for attaching “weights” to relationships, eg for building up networks of related objects.
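one possible shape for “relationships as objects”, as a sketch only (the class names, the INVERSES table, and the method names are hypothetical). each relationship is stored once, can carry a weight and notes, and the other direction falls out of a lookup instead of a second stored copy:

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    subject: str
    name: str
    target: str
    weight: float = 1.0
    notes: dict = field(default_factory=dict)  # anything attached to the link


class Graph:
    # assumed table saying which relationship name is the other side of which
    INVERSES = {"author": "books", "books": "author"}

    def __init__(self):
        self.relationships = []

    def relate(self, subject, name, target, weight=1.0):
        """store one relationship object and return it."""
        rel = Relationship(subject, name, target, weight)
        self.relationships.append(rel)
        return rel

    def get(self, obj, name):
        """follow a relationship from either end."""
        forward = [r.target for r in self.relationships
                   if r.subject == obj and r.name == name]
        inverse_name = self.INVERSES.get(name)
        backward = [r.subject for r in self.relationships
                    if r.target == obj and r.name == inverse_name]
        return forward + backward


graph = Graph()
graph.relate("lotr", "author", "tolkien")
graph.relate("silmarillion", "author", "tolkien")
print(graph.get("lotr", "author"))
print(graph.get("tolkien", "books"))
```

since the relationships live in one list, drawing the whole network is just a loop over that list.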
version history - it would be nice if the database tracked the history of any field. this would include the large text fields also, allowing you to track back through the changes. and the information should be easily accessible, for viewing, editing, or feeding into plotting.
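a bare-bones sketch of field-level history (in memory only, with hypothetical class and method names - a real version would live in the database): every write appends a timestamped value instead of overwriting, so the full history stays available for viewing or plotting.

```python
from datetime import datetime, timezone


class VersionedRecord:
    def __init__(self):
        # field name -> list of (timestamp, value), oldest first
        self._history = {}

    def set(self, field_name, value, when=None):
        """append a new version of the field rather than overwriting it."""
        when = when or datetime.now(timezone.utc)
        self._history.setdefault(field_name, []).append((when, value))

    def get(self, field_name):
        """return the most recent value of the field."""
        return self._history[field_name][-1][1]

    def history(self, field_name):
        """return a copy of all (timestamp, value) pairs for the field."""
        return list(self._history.get(field_name, []))


record = VersionedRecord()
record.set("weight", "152 lbs", datetime(2005, 10, 1, tzinfo=timezone.utc))
record.set("weight", "150 lbs", datetime(2005, 11, 1, tzinfo=timezone.utc))
print(record.get("weight"))
for when, value in record.history("weight"):
    print(when.date(), value)
```

for large text fields it would probably make more sense to store diffs than whole copies, but that’s beyond this sketch.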
i really don’t want to reinvent any wheels though - i’m going to do some research and see what other databases are out there. hopefully there’ll be something useful…
3 comments
November 8th, 2005 at 6:48 am
ivan
“multiple field values”
A very convenient thing in Zoot is that the program “understands” human expressions, for example: today, tomorrow, last week, Monday. It is very useful when you set up an organizer (tasks) database: you can see your tasks on various time scales.
November 8th, 2005 at 1:53 pm
bburns
yeah, that’s the sort of thing i had in mind - and outlook is similar. but if you’ve ever tried to put something like “may 2005” into excel or access, it forces it into 5/1/2005, which is not what you meant at all!
November 9th, 2005 at 2:38 am
ivan
Every task has a start date and a due date. So if you plan to do something in May 2005 but you don’t know exactly when yet, it is convenient to assign the task to the very beginning of May, when you will probably make a clear decision on the exact day.
But maybe this example doesn’t exhaust the problem at all.