11. 3 : ORGANISATION OF DATA FOR EFFECTIVE RETRIEVAL
SPECIFICATION STATEMENTS (AQA 5521)
This section is about RELATIONAL DATABASES
Describe the nature and purpose of a relational database and how it works
Understand that data needs to be organised in a relational database to allow for effective updating and retrieval
Understand how data can be retrieved to produce meaningful information
Recall the relevant advantages of relational databases over flat file information storage and retrieval systems
Select and justify appropriate file and relational database structures for particular applications
DATABASES
ACCESS is a GENERIC piece of software which carries out this function. Remember that you need to refer to any types of software by their generic type rather than by actual product name. Talk about a RELATIONAL DATABASE rather than ACCESS. A basic definition of a database could be: "a collection of information related to a particular topic, or designed for a particular purpose".
For this section you need to be able to consider the key factors above.
Need to know the difference between a FLAT FILE and a RELATIONAL database.
I'm putting together a guide to using ACCESS to create databases, which will be available to download from here in PDF format shortly.
Think of a database as a filing cabinet, with drawers full of folders which contain document files. All organisations need to store and retrieve data.
A FLAT FILE has only one TABLE. It tends to contain redundant data very quickly.
DATA tends to be stored independent of the actual programmes: on tables which are laid out in a particular way.
With RELATIONAL DATABASES, there can be several TABLES which are linked in various ways: these are the RELATIONSHIPS which give the database its name. The system is known by the letters RDMS or RDBMS) What matters are the RELATIONSHIPS as this determines the movement of data through the system.
The result is more complex than a flat file system, and may require a database MANAGER or administrator, and this means that there is more that could go wrong. If there is only one system, rather than several smaller ones, this means that there may be a reliance on the system working correctly
On the other hand, there is the possibility ( as with networks ) of having levels of access, and certainly a lot of time will be saved, as there is no need to duplicate entries. For example, all patients at a doctor's surgery will have a code for their doctor, rather than putting the doctor's name on every patient's record. There is also greater CONSISTENCY, as data is copied where necessary rather than duplicated. Information can also be updated quickly. For example, if a doctor leaves the surgery and another doctor comes and takes over his patients, all that has to happen is that the details of the doctor which apply to that particular code are change: 1 change can affect hundreds of linked records.
There are various ways in which TABLES can be linked. I look at the example of a LIBRARY. Many 6th form students are unfamiliar with the old system whereby each book had a little envelope at the front with a card. This card was removed from the book, and stamped with a date, and placed into a cardboard envelope which each user of the library had. The envelope was then filed away in a long narrow wooden box, so that the library had a record of the book being borrowed. The stamped date was a reminder to the user as to when the book had to be returned.
With each book placed in a database, it is easy for the library to locate any item, and set reminder flags, so that as a book is returned, it can be put to one side if someone has requested it.
The various links or RELATIONSHIPS can include:
ONE TO ONE
ONE TO MANY
MANY TO ONE
MANY TO MANY
These may also be referred to in textbooks as ENTITY RELATIONSHIPS.
Do some reading to make sure that you know what each of these involves.
As the DATABASE is being set up, there has to be a process called NORMALISATION. This involves thinking through the structure of the database so that there is the minimum of duplication, so as to produce an EFFICIENT database.
You should also be familiar with the terms FIELD, RECORD, PRIMARY KEY and INDEX, OBJECT, TABLE, FORM, QUERY, REPORT and MACRO.
Suitable uses for a RELATIONAL DATABASE:
Storage of data for individuals, to allow mail merge etc.
Bookings
Equipment records
Admissions data for schools and colleges - student records
Doctors' surgery - patient records / records of drugs prescribed...
Library catalogues
What databases are used at KES ?
Examination entry system
Attendance system
Student record system
Timetabling
Remember that any data stored on a database is subject to the DATA PROTECTION ACT, and any that contain personal information must be REGISTERED. There are various aspects of the ACT which are featured in Unit 1 of the course (See relevant section of Geography Pages)
Go HERE to download a guide to using ACCESS. You can also located one at BRISTOL UNIVERSITY which relates to Access 97 but is broadly transferrable.
SQL = Structured Query Language
This is used to query databases. There are various queries that can be used...
What problems might there have been if a company was to convert from a FLAT FILE to a RELATIONAL database...
DATA MINING and Supermarket LOYALTY CARDS could be discussed at this point...
Querying tends to take one of three forms:
SORTING
Make sure that data can be sorted logically: in alphabetical or date order.
or
FILTERING
Filtering involves removing certain records from view so that you can only see the records you need to see. Records aren't deleted.
or
QUERIES
These can be SELECT queries, ACTION queries or CROSS TAB queries.
TABLES and RELATIONSHIPS
These lie at the heart of any database.
Consider an ESTATE AGENT. These will traditionally have a list of properties on their 'books' which they will try to match to buyers. Sometimes it will be obvious that a buyer is only interested in one type of property, or one geographical area, or perhaps has a tight budget, so the estate agent should be able to consider this when deciding which mailings to send to a customer. Increasingly, estate agents are moving online, and will also use databases to store information on their houses, and produce mailings for buyers on their contacts list.
Which TABLES would you expect there to be in an estate agents' database ?
(This work links in with a SPREADSHEET activity I did on houses earlier in the year...)
BUYERS is one table you would need.
This could include a TYPE FIELD, which would be used to indicate whether a sale was definite, or whether the person was just testing the waters and seeing what was available (or perhaps had a fetish for looking round other people's houses...)
SELLERS would be another table.
HOUSES would be another table.
How would you link the tables together ?
BUYERS need to be linked up with HOUSES.
SELLERS need to be linked up with HOUSES.
For HOUSES and SELLERS the relationship should be ONE TO ONE. In most cases, people only sell one house...
For HOUSES and BUYERS the relationship should be MANY TO MANY. This is because there are many potential buyers for a property, especially in a sought after area in a rising market, and equally there may be some buyers who have an interest in more than one house, so the database has to be able to cope with this. Once a BUYER has expressed an interest in a property they may still go round other houses.
If this database was NORMALISED, you may decide to have just one TABLE for both BUYERS and SELLERS, where you would call it CUSTOMERS perhaps and have a FIELD which identified the customer as either a BUYER or SELLER (or even both if they were selling their house with an estate agent and looking at another property on the estate agent's database...)
In terms of the CUSTOMER table, what fields would you need ?
Think through the information that an estate agent would need about its customers: starting with the obvious personal details.
What information would the database store about each PROPERTY ? Again, think it through....
Try to remove REDUNDANT DATA whenever necessary. Data also needs to be CONSISTENT. The structure of the database needs to be FLEXIBLE. You also need to check for data INTEGRITY: the data have to be compatible with all the tables, so that each table has to be considered in association with the others.
DATABASE DESIGN
Before you start designing your database, before even sitting in front of the computer, plan out the following details - it's difficult to alter structure and definitions once you're under way.
Names of the components of the database: TABLES, FORMS, QUERIES and REPORTS.
How the information is going to be stored. If storing names, how are these going to be held ? Allow for the necessary flexibility: for example, think about addresses - what if a person has no house number, or a 5 line address ? Could the database cope with all these changes ?
How are you going to link the tables ?
What about the fields, and the names of the fields ? Choose obvious names e.g. a table storing details on MUSIC could be called MUSIC...
What about DATA VALIDATION ?
The size of the fields: this affects the size of the final database, and the speed at which data may be retrieved...
Data FORMATS or DATA TYPES: there are various options for each field...
DATE / TIME
CURRENCY - useful for financial information / price of objects etc.
AUTO NUMBER: useful when you want each record to have the next available number...
HYPERLINK
YES / NO: a logical entry...
OLE: Object Linking and Embedding: graphical objects...
Important to remember GARBAGE IN - GARBAGE OUT...
Data can be taken from more than one table at the same time as a result of a QUERY.
Data can be RELATED between the tables, but you must create a RELATIONSHIP between tables as long as there is a common field.
Relationships take several forms:
ONE TO MANY
The most common relationship is ONE TO MANY, where each record can be associated with many records in another table.
Each PRIMARY KEY can appear many times as the FOREIGN KEY.
Must have a PRIMARY KEY at the 'ONE' end of the relationship..
ONE TO ONE
Each record in the primary table can be associated with only one record: the related fields must be the primary keys.
To find a particular record, the database will use an INDEX. It is possible to set up a number of indices so that the databases can be set up to be queried on a number of fields.
A good example of the production of DATABASES is contained in the SKILLSBUILDER books which are used for GCSE work.