The DRIPping folly of the "Information Principle"

A couple of relational theory-priests have been hurling anathemas in response to my DBMS2 concept, especially in the comments to a prior blog entry.   They are particularly hung up on Date's Relational Information Principle (DRIP), which basically asserts that the only proper way to store information is in tuples in a relational database.

That's a fine principle under certain assumptions, such as:

1.  Programmers' time and business users' time are expensive.
2.  Computers and networks are free.
3.  All database queries either have factual answers or else are unanswerable.
4.  Enterprises have complete control over the structure of their data.

Well, #1 is legitimate.  As for #2, however, computers and networks are far from free, and the same goes for the time of database administrators.  TCO matters a great deal, to the extent that it often outweighs current or hypothetical future programming costs.

DRIPpers sometimes pretend to acknowledge point #2 in passing, but they don't really come close to addressing it very often.  Rather, they tend to make vague and exaggerated claims about what current products can do, or else moan grouchily that the database vendors haven't yet solved some of the hardest problems of artificial intelligence. 

And that's the good news for DRIP.

A Google search on the string "information principle" "relevance ranking" comes up virtually empty.  The DRIPpers just aren't -- well, they aren't relevant to a world in which text search is becoming increasingly important, as are other kinds of query and analysis in which matching is multi-valued, contextually dependent, and often best done via "black-box" kinds of algorithms.

The DRIPpers seem equally ill-prepared to deal with a world in which data is managed by packaged applications, by other enterprises' applications, by the applications of companies your enterprise has recently acquired, in legacy paper file folders, and so on.

It's a nice theory, and Chris Date has surely done a great service educating people about some good ideas in database design.  But when his acolytes start trying to interfere in practical discussions of commercial technology use, the DRIPping can get hard to take.

________________________________________________________

FROM THE EDITOR:

This blog post has been edited.

Additionally, this comment thread has been closed, in line with Computerworld's Terms of Service. Computerworld wants to foster a civil and respectful debate over important IT issues, but this thread has become too personal and not useful to Computerworld's audience of IT professionals. Certain comments may be reposted at a later date, but new comments will be disabled.

Ian Lamont
Online Projects Editor
Computerworld.com
ian_lamont@computerworld.com

What People Are Saying

Some written material on the

Some written material on the "Askew Wall" should probably be available in some of the "Relational Database Writings" publications (Darwen, Date, McGoveran and a few others were authors for some of them; link: http://portal.acm.org/results.cfm?query=%2Bauthor%3AP112723%20author%3AP35984&querydisp=author%3AHugh%20Darwen&coll=GUIDE&dl=GUIDE&CFID=53085979&CFTOKEN=70479054). However, the Relational Database Writings publications may be difficult to come by. If a valid public link to an Askew Wall article or text is available, it would be of interest and most appreciated. The "Askew Wall" has also been the subject of lectures.

On a first try, all I found

On a first try, all I found on the "Askew Wall" was what looked like a lot of slides for a lecture, and that seemed to be focused to a great extent on DML (SQL bad but could be worse, Alphora the great hope, etc.)

EDIT:  Of course, that might have something to do with it being a .PDF named "SQL problems ..."

Mr. Monash states: "The

Mr. Monash states:
"The "True Relational" advocates, as I understand it, want to replace SQL DBMS by DBMS that would support another DML, one that they think would be better for programmers. That's fine, I guess. It might even perform well in certain cases. But I'm not sure what important problem they'd solve with that, given all the development tools in the world that insulate a lot of programmers from writing a lot of gnarly SQL if they don't want to."

As Mr. Darwen explains in an article on the "Askew Wall", there are practical problems that a DBMS built closer to the relational model would help to address. It is not just a different DML but much more than that.

The latter. Thanks,

The latter.

Thanks, Jim!

I'll correct the typo in the original post.

Clarification, please - pcw

Clarification, please -

pcw wrote "I had no problems running enterprise applications on fully normalised databases."

Curt Monash wrote "the claim that fully denormalized databases -- managed by a traditional SQL DBMS -- perform fine with current technology is obviously and massively true."

Mr. Monash, were you intending to express agreement with pcw about normalized databases, or did you actually mean to write what you did, "DEnormalized"?

Just as I was ready to write

Just as I was ready to write off these threads altogether, a couple of sober comments show up!

First of all, the claim that fully normalized databases -- managed by a traditional SQL DBMS -- perform fine with current technology is obviously and massively true.  It's not true for every enterprise, every application, or every datatype, but it's true a lot of the time.

The "True Relational" advocates, as I understand it, want to replace SQL DBMS by DBMS that would support another DML, one that they think would be better for programmers.  That's fine, I guess.  It might even perform well in certain cases.  But I'm not sure what important problem they'd solve with that, given all the development tools in the world that insulate a lot of programmers from writing a lot of gnarly SQL if they don't want to.

They also SEEM to be implying, via the urban legend of Required Technologies, Inc.'s implementation of "TransRelational(TM)" modelling/architecture/technology/whatever, that there will soon be a "True Relational" DBMS that gives great performance for close to the full spectrum of demanding database applications.   If that were true, I wouldn't have any trouble at all seeing what problem they think they're solving.   But it's hooey, just another of the inaccurate stories floating around that somehow get Chris Date's name attached to them.

CAM

I have read this discussion

I have read this discussion with interest - and the one that spawned it, and the article that kicked that one off. Maybe I'm filtering the material through my own prejudices, but nowhere in all of it can I find a clear statement or demonstration of why the Relational Model is a Bad Thing. Or even an Insufficient Thing. Or of why DBMSs' moving further away from truly implementing the RM than they already are would be a Good Thing.

There is an opinion stated - somewhat obliquely - that it leads to inefficient or ineffective applications. But I found no supporting material for this, no closely analyzed real-world examples or carefully argued thought experiments.

There is an assertion that relational databases capable of meeting real-world challenges are unacceptably complex. But isn't a (well-designed) system complex just to the extent that its real-world problem space is complex? Is there any proof, or even partial evidence, that a non-relational system could be less complex and still provide the equivalent functionality and robustness?

There is the claim that experience in the development of DBMSs provides special, valuable insights. Probably so. But one insight that seems to be lacking here is that the RM is a logical model, quite independent of physical implementation. There is no reason other than DBMS development cost why an RDBMS cannot offer a DBA a plethora of physical storage options that can be selected and tuned to support enterprise needs - either automatically or manually, with the control data stored in relations in either case.

There's no doubt in my mind that the ability to exchange and represent data in many styles - such as XML-based and object-oriented representations - carries many benefits. But neither is there any question that these are at their best when based on a relational DBMS that remains the core of the system and the repository for all application data and metadata.

Cutting through the personal

Cutting through the personal abuse there seems to be an underlying statement that so called "relational" DBMSs cannot perform.

When I was a working DBA 6 years ago I had no problems running enterprise applications on fully normalised databases. What the hell has happened to the technology that means that this is no longer the case?

Scarlet, I must invoke

Scarlet, I must invoke Orwell's NewSpeak again.

The kind of "thoughtful" approach you refer to would be valid in a normal environment, where about 90% of the info being published is correct and not more than 5-10% is erroneous; and where people are eager to learn and able to reason.

In an environment with an inverse ratio, where knowledge and reason are dismissed, and the objective is to sell one's ignorance to others equally ignorant, none of whom is interested in learning core substance, the thoughtful approach is costly and completely ineffective. Costly because one must continuously debunk nonsense that is incoherent and requires more effort and time to address than is practically possible. It wears you down, because you're essentially dealing with what is a dead hand in poker.

When I came to this country I was using the thoughtful approach. Over time I learned it is a waste of time and now I call a spade a spade.

Your comments suggest that you're still laboring under the delusion that the thoughtful efforts are worth it. I recommend you recognize reality for what it is.

Kristopher, Thanks for your

Kristopher,

Thanks for your long and clear comment.  By coincidence, I just got off the phone after a long and far-ranging conversation with a pretty senior database guy at IBM.  Although it wasn't really his area, I checked whether it is indeed the case by now that there's a whole lot of SQL access to IMS.  The answer, as I pretty much knew it would be, was "yes."

SQL updating of IMS is highly impractical.  But SQL queries aren't a big problem at all.  One can index pretty much anything ...

But when it comes to your views on text search -- think again.   Just think again, please. 

CAM