PhilosophyOfPick

From Pickwiki
Revision as of 18:56, 29 December 2009 by Rex Gozar (talk) (* revert spam)

Introduction

Here are a few things that strike me as interesting about the philosophy behind Multi-Value (MV or "Pick") systems. Some of them seem generally accepted; others perhaps reflect my own understanding, or prejudices. Oh well, never mind. Read on and see what you think.


Data representation

It is one of the most striking things about the MV model that all data is held as text. To me this fact indicates a massive philosophical difference - someone once called this a "liberal humanistic database" in comparison to those he called "technocratic fascist databases". You may think that's too harsh, but I agree with his point of view by and large (especially when I want to get something out of a Microsoft file).

What is the effect of this? Well, it's probably bad if you are looking to store massive amounts of scientific numerical data, because text is not the most compact way of representing numbers. In that case a specialised data file (or store) would be the way to go. For general-purpose data storage, however, I think this is excellent. One can inspect the data with the most primitive tool and glean some idea of its meaning and structure. That is to say, this system facilitates the exposure of the information within the data to a human intelligence. To my mind, the system allows for the fact that there is such a thing as human intelligence.
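To make the point concrete, here is a minimal sketch in Python (chosen only for illustration; it is not an MV language) of how a record held as text can be picked apart with nothing more than string splitting. It assumes the conventional MV delimiters, namely the attribute, value, and subvalue marks (characters 254, 253, and 252). The sample record, the function names, and the "]" and "\" display characters are illustrative assumptions, not taken from any particular system.

```python
# Conventional MV delimiter characters: attribute, value, and subvalue marks.
AM, VM, SVM = chr(254), chr(253), chr(252)


def parse_record(raw):
    """Split a raw record string into attributes -> values -> subvalues."""
    return [
        [value.split(SVM) for value in attribute.split(VM)]
        for attribute in raw.split(AM)
    ]


def show(raw):
    """Crude 'by eye' dump: one numbered line per attribute."""
    lines = []
    for number, attribute in enumerate(raw.split(AM), start=1):
        # Swap the unprintable marks for visible stand-ins (an assumed convention).
        readable = attribute.replace(VM, "]").replace(SVM, "\\")
        lines.append(f"{number:03d} {readable}")
    return "\n".join(lines)


# Hypothetical customer record: a name, a two-valued address, two phone numbers.
record = AM.join([
    "Jane Smith",
    "12 High St" + VM + "Leeds",
    "555-0100" + VM + "555-0101",
])

print(show(record))
# 001 Jane Smith
# 002 12 High St]Leeds
# 003 555-0100]555-0101
```

Nothing here knows about integers, reals, or dates; the structure is recovered entirely from the text, which is what makes the "primitive tool" inspection possible.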

I read a very interesting article called "Semiotics and GUI Design" - it's worth a google. It tells of three types of sign: the index, where there is a physical connection, like footprints; the icon, where there is a picture; and the symbol, which uses convention alone to determine meaning. Symbolic representation means that we can step back from the system and productively reason about it as a system. The article asserted that "the symbolic mode of semiosis is the one that is most distinctively human". I think this relates directly to the MV model, where data is held as text. This is the representation most readily analysed by simple programming; it is also the one most readily understood by an ordinary person.

It has always puzzled me why SQL DBMSs (arguably called Relational) have data types like "long integer" and "real" numbers. I recognise these terms from my days as a FORTRAN programmer, but why have they been dragged into the database? It seems to me that what should have been a general-purpose thing has been invaded by a programming mindset. Instead of a useful tool enhancing clarity of thought, we have added requirements of specialised technical competency. If you want to know why this is so, then follow the money. I understand the latest SQL specification is 1600 pages.


Give them Plenty of Rope

It is a lovely idea to allow the customers to do what they want. The system tries to figure out what that is, rather than making you fit into its world. The original BASIC didn't even need spaces between words if it could interpret the statement without them.


MV compared to Relational

The MV model is not a data model like the Relational Model. One reason is that the very term "data model" has been defined and redefined over the years in such a way that it excludes MV. This was not done in order to exclude MV; it just happens to be that way. However, it is still, to my mind, a way of modelling data, and I will continue to use the term "model" to describe it.

The Relational and Multi-Value models were conceived more or less at the same time and both were based on the idea of data being held in some form of table. However, two quite different approaches to handling data in computers were taken.

The Relational Model was based on E. F. Codd's work and aimed to set up a generic method of handling data. The method is formal and mathematical, and it allows data handling to be defined in a way that is suitable for formal proofs in an analysable logic. The usual way of storing data at the time was hierarchical, and thus specific to each application, which did not allow a generic logic to be developed. Codd used an array representation of relations to explain the "Relational View of Data". Arrays have rows and columns which, along with the term 'tablespace' in actual database implementations, may explain why people think of the Relational Model as being made up of tables.

As the overall concept has been refined over the years, we now have three levels. They are

  • The presentation layer
  • The logical layer
  • The storage layer

The logical layer is held to be the Relational Model. Here we have the "Relational Data Model" using "Relational algebra", where Codd's "Relational calculus" operates. This is where the Relational Database Management System (RDBMS) exists. In the ideal system (which, of course, does not exist), the storage layer can be adjusted and tuned for performance while the logical layer and its constraint logic remain unchanged.

SQL databases implement various forms of DBMS, each of which follows the relational model to some degree. SQL is used not only in the logical layer but also to define the storage structure. SQL databases seem to define various data 'types' that exist only because they were used in computer programming.


The Multi-Value Model was based on a project to report on information held in large computer data banks. The method involved defining the path to the stored data and how to format it in printouts. This path and formatting information was held in a data dictionary, which gave the ability to request data by a field name rather than by an access path. Don Nelson and Dick Pick extended this by designing a storage structure that made it simple to define the path to specific data fields. Rather than interposing a logical layer between presentation and storage, the storage layer was abstracted to match the presentation. All data was stored as a string of characters so that you could read the text pretty well 'by eye' directly off the storage structure.
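The dictionary idea described above can be sketched roughly as follows, again in Python purely for illustration. The dictionary layout here (field name mapped to an attribute position, a column heading, and a width) is a simplified assumption of mine; real MV dictionary items carry more than this, and the names `DICTIONARY`, `field`, and `report_line` are hypothetical.

```python
# Conventional MV attribute and value marks.
AM, VM = chr(254), chr(253)

# Hypothetical dictionary: field name -> (attribute position, heading, width).
DICTIONARY = {
    "NAME": (1, "Customer", 12),
    "CITY": (2, "City", 8),
    "PHONE": (3, "Phone", 10),
}


def field(record, name):
    """Fetch a field by name; the dictionary supplies the path (the position)."""
    position, _heading, _width = DICTIONARY[name]
    attributes = record.split(AM)
    raw = attributes[position - 1] if position <= len(attributes) else ""
    return raw.split(VM)  # a field may hold several values


def report_line(record, names):
    """Format the first value of each named field using the dictionary widths."""
    cells = []
    for name in names:
        _position, _heading, width = DICTIONARY[name]
        first = field(record, name)[0]
        cells.append(first[:width].ljust(width))
    return " ".join(cells).rstrip()


# Hypothetical record: name, city, and a two-valued phone attribute.
record = AM.join(["Jane Smith", "Leeds", "555-0100" + VM + "555-0101"])

print(field(record, "PHONE"))
print(report_line(record, ["NAME", "CITY"]))
```

The caller asks for "PHONE" or "CITY" and never mentions a position; both the path and the formatting live in the dictionary, which is the separation the paragraph above describes.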

In the Multi-Value model there is no DBMS as such. The definition used for data presentation does not provide constraint checking on data updates; all controls are enforced by the application. However, a large number of application generators and fourth-generation languages have been developed in MV. I remember writing one myself that took my specification and generated working code. It took me two weeks to write and produced 50,000 lines of code in 20 minutes for something over 100 programs (data entry, menus, and reports). I think the reason why I (and so many others) could do this was that the particular type of BASIC used in MV is really good at handling strings, including strings of source code.
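As a toy illustration of both points (controls living in the application, and code generated from a specification by plain string handling), here is a small Python sketch. It is not the generator described above and it is not MV BASIC; the specification format, the field names, and the function names are all hypothetical.

```python
# Hypothetical specification: field name -> (attribute position, required, numeric).
SPEC = {
    "NAME": (1, True, False),
    "QTY": (2, True, True),
}


def generate_validator(function_name, spec):
    """Return Python source text for a function that checks a list of attributes."""
    lines = [f"def {function_name}(attributes):", "    errors = []"]
    for name, (position, required, numeric) in spec.items():
        index = position - 1
        lines.append(
            f"    value = attributes[{index}] if len(attributes) > {index} else ''"
        )
        if required:
            lines.append("    if value == '':")
            lines.append(f"        errors.append('{name} is required')")
        if numeric:
            lines.append("    if value != '' and not value.isdigit():")
            lines.append(f"        errors.append('{name} must be numeric')")
    lines.append("    return errors")
    return "\n".join(lines) + "\n"


# The generated program is just a string until it is compiled and run.
source = generate_validator("check_order", SPEC)
namespace = {}
exec(source, namespace)
check_order = namespace["check_order"]

print(source)
print(check_order(["Widget", "3"]))
print(check_order(["", "x"]))
```

The constraint checks exist only in the generated application code, not in the data store, and the generator itself is little more than string concatenation driven by the specification.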