
Showing posts with label An introduction to data and information. Show all posts

19.8.11

7 Unit summary

7.1 What have you learnt in this unit?

This unit began by exploring some basic issues involving computers:
  • the nature of data and information;

  • why human beings need (and want) computers;

  • the prevalence of computers in modern life.

The unit looked briefly at how a computer-based society affects the average person, who (knowingly or not) has a persona consisting of data held about him or her by many diverse organisations.
Much of this unit consisted of case studies illustrating the possibilities for computer use. They raised some of the issues posed by computing technologies, such as:
  • the distinction between data and information;

  • what computers can do with data to produce information;

  • how computers can be used to work with data and search for it, control machines, and support commercial operations.

There are a number of themes running through this unit.
  • Data requires encoding.

  • In order to function, a computer requires data which may be stored in databases.

  • Data has to be transmitted from place to place.

  • At the heart of a computer system there are one or more programs.

  • Many current computer systems are distributed, in that they consist of a number of computers which cooperate and communicate with each other in order to function.

  • Information has to be fit-for-purpose.

  • Security and trustworthiness are major concerns with many systems.

  • Computer systems have drawbacks and adverse effects, as well as social, political, legal and ethical implications.

You should be able to define the following terms in your own words.











  • case study
  • computer
  • computer program
  • computer system
  • data
  • database
  • database server
  • distributed system
  • gateway
  • global positioning system (GPS)
  • hit
  • information
  • internet
  • keyword
  • parameter
  • perceptual data
  • search engine
  • sensation
  • sign/symbol
  • World Wide Web (the web)

6.2.3 Security: are my credit card details safe?

Many people now shop regularly on the web. However, many others don't because they fear that an unscrupulous person could obtain their credit card details. They also fear that if they provide their names and addresses to a firm on the web, they will be bombarded with junk mail (or its electronic equivalent, junk email). Some worry that, since anyone can put up a website, the seller may be bogus, so that no goods appear after the sale has been completed, or that the goods won't be as advertised.
Consequently, other important issues raised by this case study are security and trustworthiness. The internet is a remarkably open medium. It does not take too much effort to ‘capture’ the data that flows along communication lines. Someone could theoretically read your credit card details as they are transmitted between your computer and that of the seller. (I use the term ‘theoretically’ because there exist techniques which enable the data to be transformed into a form which would be virtually impossible to read.)
You can be reasonably confident of buying from a website if it displays one of two things.
  1. The address shown in the bar at the top of the screen should start with ‘https’ instead of ‘http’. The letter ‘s’ means you are connected to a secure web server using techniques to protect your details from electronic snoopers.

  2. An icon representing a small key is present. This also indicates that the web server you are connected to is a secure one.
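The first of these checks can be expressed in a few lines of code. The following is a minimal sketch, assuming Python and its standard urllib.parse module; the function name and the example addresses are invented for illustration.

```python
from urllib.parse import urlparse


def uses_secure_scheme(address: str) -> bool:
    """Return True if the address starts with 'https' rather than 'http'."""
    return urlparse(address).scheme.lower() == "https"


# Hypothetical addresses, used only to exercise the check.
print(uses_secure_scheme("https://shop.example/checkout"))  # True
print(uses_secure_scheme("http://shop.example/checkout"))   # False
```

A check like this only tells you how the data is being transmitted. It says nothing about whether the seller is reputable, which is why the other precautions in this subsection still matter.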

Another safety precaution is to deal only with web sellers you know are reputable. Consumer organisations often have schemes for accrediting web sellers who are legitimate and secure dealers. Friends and neighbours may also be able to recommend reputable and secure web sellers.

SAQ 9

Is web selling, as practised by a firm like Lakeland, an example of a distributed system? Explain.

Answer

Yes, it is a distributed system. It consists of user PCs, web servers, and database servers, with data and information being transferred between them using networks (in this case the internet).

6.3 Summary

This section examined how computers can be used to control machines. It used the household washing machine as a case study and explored how the microcomputer contained in such a machine is programmed to:
  • provide an interface for the user to operate the machine;

  • control the way the machine carries out the operations chosen by the user.

The washing machine case study also illustrated the necessity of building safety features into computer-controlled mechanisms.
Computers are also used to support selling goods and services via the web. A case study of a successful company showed what the information requirements for such a system are, and examined how two or more computers can cooperate as part of a distributed system to satisfy these requirements in a way that is secure.

6.2.1 Using a sales website

A visitor to a sales website is usually able to:
  • browse through the details of the goods for sale;

  • search for a particular product;

  • check on the availability of goods;

  • read reviews of the products by other purchasers;

  • register to receive newsletters which detail new items of interest;

  • buy products using credit or debit cards, and in some cases, other payment methods such as cheques.


Some sales sites also allow the user to:
  • see what items are most popular;

  • check the status of their order.

An indication of what Lakeland's website offers can be gained from its home page shown in Figure 16.



Figure 16 The Lakeland home page offers product and keyword searches, a quick way to browse various categories of products, what's selling best, and a product of the day

Exercise 17

The Lakeland website offers both a product search and a keyword search.
  1. Describe what you think are the differences between these two types of search.

  2. Why should there be two ways to search?

    Discussion

    1. A keyword search allows the user to type in words, such as ‘clothes storage’, and the search engine will look through Lakeland's website for products which fit this description.
      A product search is associated with the catalogues that customers receive. The product numbers in the catalogue can be used to access a product very quickly. For example, typing in ‘5692’ will display the product designated by that number.

    2. The two types of search serve different customers: those who are interested in a particular type of product (keyword search) and those who have their eyes on an actual product (product search).
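The difference between the two searches can be sketched in code. This is only an illustration, assuming Python: the catalogue contents are invented (the unit does not say which product ‘5692’ designates), and the two function names are hypothetical.

```python
# Hypothetical catalogue: product number -> description (invented data).
CATALOGUE = {
    "5692": "clothes storage bag",
    "1040": "kitchen cleaner spray",
    "2217": "under-bed clothes storage box",
}


def product_search(number):
    """Direct lookup by catalogue number; returns None if the number is unknown."""
    return CATALOGUE.get(number)


def keyword_search(phrase):
    """Return the numbers of products whose description contains every keyword."""
    words = phrase.lower().split()
    return sorted(
        number
        for number, description in CATALOGUE.items()
        if all(word in description.lower() for word in words)
    )


print(product_search("5692"))             # clothes storage bag
print(keyword_search("clothes storage"))  # ['2217', '5692']
```

The product search goes straight to one item, whereas the keyword search has to examine the descriptions and may return several items for the customer to choose from.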



6.2.2 Database servers

To be able to search a website like Lakeland's requires not only a web server but a database server. Like a web server, a database server is a computer that responds to requests from other computers. Its task is to find and extract data from a database.
The web and database servers form part of a distributed system. This means that separate computers exchange data and information across a network (in this case the internet) to produce results for a user. For example, suppose I use the keyword search to ask for ‘kitchen cleaners’. This request is transferred to the web server, which has an index of products which can be categorised as kitchen cleaners. It then sends these product numbers to the database server, which locates the correct items in the product database and returns information about them (pictures, description and price) via the web server to my browser.
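The route taken by the ‘kitchen cleaners’ request can be mimicked with two small functions standing in for the two servers. This is a sketch under stated assumptions: it is written in Python, both ‘servers’ run in one program rather than on separate networked computers, and the index, product numbers and prices are invented.

```python
# Database server: holds the product database (contents invented).
PRODUCT_DATABASE = {
    "1040": {"description": "kitchen cleaner spray", "price": 3.99},
    "1055": {"description": "oven cleaning pads", "price": 4.49},
}


def database_server(product_numbers):
    """Find and extract the records for the requested product numbers."""
    return [PRODUCT_DATABASE[n] for n in product_numbers if n in PRODUCT_DATABASE]


# Web server: holds an index from search phrase to product numbers (invented).
KEYWORD_INDEX = {"kitchen cleaners": ["1040", "1055"]}


def web_server(keyword_request):
    """Look up product numbers in the index, then ask the database server for details."""
    numbers = KEYWORD_INDEX.get(keyword_request.lower(), [])
    return database_server(numbers)


# The browser's request passes through the web server to the database server
# and the product details come back the same way.
for product in web_server("kitchen cleaners"):
    print(product["description"], product["price"])
```

In a real distributed system each function call here would be a message sent across the network, but the division of labour is the same: the web server deals with the request, the database server with the data.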
Compared with the simple data from which the complex DNA database is built, the data processed by the database servers at a company like Lakeland is complex (text, graphics, pictures).

Exercise 18

For a company like Lakeland:
  1. what are its customers' information requirements?

  2. what are the company's information requirements?

    Discussion

    1. Customers are likely to want to know:
      • the items for sale;

      • what items look like;

      • details about performance;

      • cost;

      • other information, such as availability, guarantees, delivery costs, and time of delivery.


    2. Lakeland's requirements will be much broader and include information relating to:
      • product suppliers;

      • wholesale cost;

      • availability and delivery arrangements;

      • its own storage and distribution system;

      • who buys products from Lakeland and why;

      • what other products such buyers might be interested in;

      • where there might be sufficient customers to warrant opening a new shop;

      • hiring, training, promoting and retaining staff;

      • competitors and what they offer that Lakeland does not;

      • accounts and finance;

      • legal matters such as product and supplier liability, employment law, contract law.


    Your answer will probably differ from this list, but you should have a number of points that are similar to items above.
    Let's see how Lakeland's system addresses some of these requirements in terms of the data they store in their databases. This will include:

  1. information about customers:
    • names and contact details,

    • credit card information,

    • the passwords (if any) they use to gain access to the Lakeland website;


  2. information about their products for sale:
    • pictures,

    • specifications,

    • price,

    • special offers;


  3. the current stock (inventory) of a particular item;

  4. the position of a product in terms of its popularity;

  5. orders that have been made:
    • when they were made,

    • when they were or will be dispatched,

    • whether the full order can be satisfied from stock.


What is interesting is that although the data listed above is quite ‘rich’ (i.e. complex), the processing required to extract the data is not very complicated. For example, it takes very little computational effort to extract a customer's current order. All that is necessary is for the area of the database containing order details to be searched for a match with the customer's name or with an order number.
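To show how little processing such a search needs, here is a minimal sketch in Python. The order records, field names and function name are all invented for illustration; a real system would hold the orders in a database rather than in a list.

```python
# Hypothetical order records (invented data and field names).
ORDERS = [
    {"order_number": 1001, "customer": "A. Smith", "items": ["5692"], "dispatched": False},
    {"order_number": 1002, "customer": "B. Jones", "items": ["1040"], "dispatched": True},
]


def find_orders(orders, customer=None, order_number=None):
    """Return the orders that match the given customer name or order number."""
    return [
        order
        for order in orders
        if (customer is not None and order["customer"] == customer)
        or (order_number is not None and order["order_number"] == order_number)
    ]


print(find_orders(ORDERS, order_number=1002)[0]["customer"])  # B. Jones
```

The whole search is a comparison of one or two fields in each record, however rich the rest of the record may be.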

6.1.4 Controlling the machine

The major task of a washing machine microcomputer is to control the actions of the machine in accordance with the wash programme selected. To do this, the computer is electrically attached to a variety of:
  • actuators that cause mechanical parts of the system to work;

  • sensors that sense the state of some aspect of the machine, such as water temperature.

There is an actuator to open or close the water input valve. Another controls the turning of the drum, and another the pumping of water through the machine. There is also one for pumping the water to the drain and one for controlling the water that washes through the tray holding washing powder and fabric softener. Lastly, there is an actuator that turns on a heating coil if the water temperature is below the desired temperature for the wash.
As regards sensors, my machine has one to check water level, and another to check water temperature. It also has a sensor that weighs the dry laundry at the start of the cycle, providing data that the microcomputer program uses to determine how much water to use for each wash. This enables my machine to optimise water use, and hence conserve resources (water and the electricity to heat the water).

Exercise 16

The general wash programme cycle on my machine goes through the following steps.
  1. It weighs the dry laundry once the door is shut and the ‘Start’ button is pressed.

  2. It determines how much water should be used and opens the valve.

  3. It pumps water through the washing powder tray for a period of time to flush the powder into the drum, and then pumps the water directly into the drum.

  4. It locks the main door when the drum is in motion, or when the water level is higher than the bottom of the door (a safety feature).

  5. It heats the water to the correct temperature for the programme and point in the washing cycle.

  6. It turns the drum and recycles the water onto the laundry for the duration of the washing cycle.

  7. It drains the water from the drum.

  8. It spins the laundry at the correct speed for a certain time.

The programme stops.
For each of the eight steps listed above, list the actuator or sensor involved.

Discussion

  1. The weight sensor.

  2. The valve actuator.

  3. The actuators for the water pump and for the valves channelling water in different ways through the machine.

  4. The actuator for the door lock and the water level sensor.

  5. The water temperature sensor, and the actuator for the heating coil.

  6. The actuator for the motor spinning the drum, the actuator for the water pump and the actuators for the valves leading into and out of the drum.

  7. The actuator for the pump and the actuator for the drain valve (or the actuator for water channelling, depending on the design or type of the machine).

  8. The actuator for the motor spinning the drum and the actuators that drain water from the drum (as in point 7).
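The answers above can be recorded as a small table that a program could work through step by step. This is a sketch in Python rather than the way any real washing machine is programmed; the short device names are labels chosen for this illustration.

```python
# Each step: (description, sensors read, actuators driven), following the
# discussion above. The short device names are labels chosen for this sketch.
WASH_CYCLE = [
    ("weigh dry laundry", ["weight"], []),
    ("admit water", [], ["inlet valve"]),
    ("flush powder tray", [], ["pump", "channel valves"]),
    ("lock door", ["water level"], ["door lock"]),
    ("heat water", ["water temperature"], ["heating coil"]),
    ("wash", [], ["drum motor", "pump", "channel valves"]),
    ("drain", [], ["pump", "drain valve"]),
    ("spin", [], ["drum motor", "pump", "drain valve"]),
]


def devices_used(cycle):
    """Collect every sensor and actuator mentioned anywhere in the cycle."""
    sensors, actuators = set(), set()
    for _description, step_sensors, step_actuators in cycle:
        sensors.update(step_sensors)
        actuators.update(step_actuators)
    return sorted(sensors), sorted(actuators)


sensors, actuators = devices_used(WASH_CYCLE)
print(sensors)    # ['water level', 'water temperature', 'weight']
print(actuators)
```

Laid out like this, it is easy to see that the same few actuators (the pump, the valves, the drum motor) are reused at several points in the cycle.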

SAQ 8

What role do computer control systems play in many mechanical devices?

Answer

They manage sensors and actuators to control the actions of mechanical devices such as cars, microwaves and washing machines.
They provide interfaces that enable the user to control the workings of the machine.
They control the machine's actions in response to the user's choices and the state of the machine.


6.2 Selling on the web

The web is fast becoming a medium for selling everything from books to clothes, gardening tools to beauty products, investment advice to travel services. Web-based selling seems to be concentrated in three main categories of company:
  • existing catalogue sales companies which have put their catalogues online to allow customers to buy using the web;

  • existing companies whose products are largely information and which have used the web as a means of providing a personalised service or one with a very quick response;

  • companies which have started from scratch using the web as their only sales medium.


Of these companies, the most successful financially have been those in the second category. For example, Pergamon Press, which publishes many professional journals, has moved to making all its journals available electronically. Consequently, not only is its selling done over the web, so is its distribution! The company still charges a subscription fee which remains more or less equal to what it was before Pergamon began using the web. However, most subscribers (largely the medical and legal professions) have chosen to use the convenience of electronic delivery, and this has drastically reduced printing, warehousing and distribution costs for Pergamon.
One successful company in the first category is Lakeland. This is a British company which began by providing plastic bags for the farming industry, and now sells a wide variety of household items, food ‘treats’, and seasonal items for Christmas and summer. The company ran an established mail order business before venturing onto the web, and it also has a chain of shops in major UK towns. Thus the venture onto the web was, for Lakeland, an extension into a new selling medium. The company already had expertise in presenting its products from its mail order catalogue business. It also had an established infrastructure of suppliers, warehouses and links to delivery companies.
In the last category of companies, one of the success stories of the web has been the online bookseller, Amazon, which has expanded into all kinds of consumer goods such as smaller electronics, videos and DVDs. It also acts as an agent for sales of second-hand goods. So if a book is out of print, you may be able to buy it second-hand through the Amazon website.

6.1.3 Ensuring safety

Ensuring that a user can't choose a wash temperature that's too hot for the ‘hand wash’ programme is an example of ensuring safety. In other words, the washing machine microcomputer is trying to prevent the user making choices that are not sensible. Of course, I could put a load of delicate washing in and choose the ‘cotton’ programme which has a temperature of 90°C. The computer program controlling the machine has no way of knowing that I've put silks or woollens in and not cottons. The worst that would happen, however, is that I would ruin some expensive clothing due to my own negligence.
What about the safety of the user? A washing machine could be dangerous if anyone could put their hand into the drum when it was moving, or when the water was very hot (anything over 40°C can scald), or when the water level was high enough to spill out of the door. The programme on my machine does allow the user to open the door to insert additional items during the cycle, but only when safety conditions are met (drum not moving, water not too hot, water not too high). It is incumbent on the designer of any such system to ensure that basic safety requirements are met. While it may not result in serious harm if, for example, one can open the door when water is above the level of the bottom of the door, customer satisfaction would surely plummet were this to happen.
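The three safety conditions can be written as a single test that the control program applies before releasing the door lock. The following is a minimal sketch in Python, not the program in any actual machine; the function and parameter names are hypothetical, and the 40°C limit is taken from the scalding figure quoted above.

```python
SCALD_LIMIT_C = 40.0  # the text notes that water over 40 °C can scald


def door_may_open(drum_moving: bool, water_temp_c: float, water_above_door: bool) -> bool:
    """Allow the door to open only when all three safety conditions are met."""
    return (
        not drum_moving
        and water_temp_c <= SCALD_LIMIT_C
        and not water_above_door
    )


print(door_may_open(drum_moving=False, water_temp_c=30.0, water_above_door=False))  # True
print(door_may_open(drum_moving=True, water_temp_c=30.0, water_above_door=False))   # False
```

Note that every condition must hold at once: a cool, still drum is not enough if the water is still above the bottom of the door.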
Some computer-controlled applications (e.g. controlling a flying aircraft) have to go further towards ensuring that an operator doesn't jeopardise situations due to negligence. These are not discussed in this unit, but you should be aware that they exist. They are called safety-critical systems, which means that serious harm or loss of life could occur if these systems break down, or do not function properly.

Exercise 15

It is common in modern cars to have central locking. This usually involves pressing a button on a key fob and sending a signal to the car from a short distance which locks or unlocks all doors simultaneously. A button on the control panel may work in a similar way to lock and unlock all the doors from inside.
  1. Can you identify any safety situations that would affect the lock-control program in the car's microcomputer?

  2. What kind of information might a driver need about the door locks?

    Discussion

    1. It might be dangerous to allow someone to unlock the doors while the car is in motion. For example, a child might press the button on the control panel, unlocking the doors, then accidentally open the door and fall out. With very small children, it might be dangerous for the child to be able to unlock any door (even when the car is stationary) without the driver knowing. Thus one safety consideration might be to ensure that it is not possible to override child-proof locks accidentally or through carelessness.

    2. The driver might simply need a light to tell him or her whether the locks were engaged or not.



6.1.2 Choosing programmes and parameters

Another part of the interface shown in Figure 15 allows the user to select one from a variety of predetermined washing programmes, and to change some of the parameters. If I choose the ‘cotton’ programme, for example, the microcomputer's program assumes that I wish to wash this load of laundry at 60°C, use the main wash programme, and spin the washing at the highest speed. Sometimes this programme is fine, but at other times I may want to select the higher temperature of 90°C in order, say, to sterilise the laundry (e.g. nappies), or a lower temperature (e.g. to prevent dark colours fading).
I may also select the pre-wash if my laundry is especially dirty or the additional rinse if a member of the family has sensitive skin which may react to residues of washing powder.
Finally, the microcomputer's program ensures I don't do anything silly. For example, if I select the ‘hand wash’ programme, it will not allow me to change the temperature to one higher than the pre-programmed 30°C.
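The idea of a programme with default parameters that the user may change only within limits can be sketched as follows. This is an illustration in Python, assuming a simple table of programmes; the temperatures are those mentioned in this section, while the table layout and function name are invented.

```python
# Default and highest permitted temperatures (°C) for two programmes,
# using the figures given in the text. The table layout is an assumption.
PROGRAMMES = {
    "cotton": {"default_temp": 60, "max_temp": 90},
    "hand wash": {"default_temp": 30, "max_temp": 30},
}


def choose_temperature(programme, requested=None):
    """Return the wash temperature, refusing anything above the programme's limit."""
    settings = PROGRAMMES[programme]
    if requested is None:
        return settings["default_temp"]
    if requested > settings["max_temp"]:
        raise ValueError(
            f"{requested}°C is too hot for the '{programme}' programme"
        )
    return requested


print(choose_temperature("cotton"))      # 60
print(choose_temperature("cotton", 90))  # 90
```

Asking for 40°C on the ‘hand wash’ programme raises an error in this sketch; a real machine would more likely just ignore the button press or refuse to move the dial.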

Exercise 14

What kind of interface would you expect on a very simple microwave oven (one without predetermined programmes)?

Discussion

Since power level (e.g. defrost, low, medium and high) and time are important when microwaving food, the user needs to be able to select these two parameters.
Typically, a microwave interface will have buttons or a dial for selecting the power level, and a numeric keypad or dial for setting the time in terms of minutes and seconds. The display might indicate the power level chosen, and will certainly show the time remaining.
The interface will also have two other important controls: a ‘Start’ button and an ‘Open door’ button.
You may have said something a bit different, depending on your familiarity with microwave ovens.

6 Controlling things; selling things

6.1 Controlling things

As you learned in Section 1, computers can collect, process, store and distribute information. This section shows that they can also be used to:
  • control machines and simple mechanisms;

  • conduct a special kind of commerce: selling on the web.

Let us examine more closely that common household appliance, the automatic washing machine. Virtually all such machines sold in the last decade or so are controlled using a microcomputer of some type. Before that, such control was provided by mechanical systems. However, because these had moving parts they suffered from wear, and tended to break down frequently or require replacement. Also, the nature of mechanical control systems limited how complex they could be. Consequently, they tended to be quite simple, and therefore less ‘automatic’.
The main tasks of a microcomputer in a modern washing machine are to:
  • present an interface to the users that lets them know what possibilities there are and what the current state of the machine is;

  • allow the user to select one from a variety of predetermined washing programmes;

  • change some of the parameters (such as water temperature) to suit particular conditions;

  • initiate, control and finally halt the actions of the machine in accordance with the wash programme selected;

  • in some machines, ensure that the washing is done efficiently with minimum inputs of water or washing powder, in the interests of reducing resource use and maximising environmental protection;

  • ensure safe operation of the machine.

Let's examine some of these tasks in more detail.

5.2.1 Transforming the natural to the designed

The artist Christine Martell lives in Oregon in the United States and works with beads and visual images. I asked her to describe how she makes use of a computer to create her visual images of flowers and trees. She writes of her work:
I start by finding flowers that are compelling in some way, most often in form and colour. I take photographs with a 35 mm camera having a macro lens.
I'm usually looking for a line that might suggest movement or gesture. I find a place that might be the resting place in the movement, and focus the camera there. Often times the background is out of the focal range.
When I have the film developed, I choose a lab that tends to make the photographs saturated [with little or no admixture of white] and rich. I prefer to bring the colour ‘down’ electronically rather than try to enhance it. I have the prints made in 5 by 7 inch size.
I scan the photograph into the computer, using a simple consumer grade scanner. I copy the image to make a working copy. I keep the original photo scan as a separate file, so I can move back and forth between the images to restore original edges and details.
When I draw into the images electronically using a drawing tablet, I am usually looking to create a dynamic energy; to express a movement and visually emphasise the contrast between that energy and the stillness of the flower. I draw back into the images with my digitising tablet, using Painter software. I hardly ever use filters [standard effects made available by ‘painting’ software, equivalent to using a lens filter on a conventional camera] as the effect is too uniform for my taste. Once in a while, if I need a uniform texture for a background, I'll use a filter… or I might start with a filtered texture, then draw into it.
The computer gives me the freedom to mix the visual effects of media that would not readily combine in traditional media. I also can work through many more ideas electronically.
Figure 14 shows one of Christine Martell's original scanned-in photographs beside her final image. Of course, in a way even her original photograph is art. She has been careful to use her skill to select a viewpoint, a moment and a field of focus, and then to choose a developing laboratory that will do what she needs with the colour saturation. Finally, in order to achieve the result she wants, she uses a drawing tablet with a ‘painting’ program to ‘paint’ effects.



Figure 14 (a) The photograph, already ‘art’, and (b) the completed visual image (courtesy of Christine Martell). She used an ordinary 35 mm camera with a special, but commonly available, lens to produce the original photograph, and a computer, scanner, drawing tablet and a painting program to produce the final image.

Most of us will never be professional artists. However, we can aspire to be creative for our own pleasure and the pleasure of those around us. The computer offers considerable scope for doing this.
Note that Christine Martell uses a standard scanner to scan in her photographs. Having an electronic drawing tablet is more unusual, but these are easily purchased and anyone with sufficient manual control can use one.
Perhaps the biggest advantage the computer gives Christine Martell as an artist is that she can make as many electronic copies of her scanned image as she requires. This allows her to try different effects, freely discard those that she is not satisfied with, throw away mistakes, or use the power of the computer to make many different images from the same original photograph. To do the equivalent by hand using an actual photograph would be far costlier, and some mistakes which are not erasable would require the artist to throw away prints.
The other main advantage of the computer as an artist's tool is that it can produce effects that would be difficult using traditional media. Christine Martell mentions one, i.e. to be able to reduce the saturation of her colours. She can do this selectively to parts of a photograph – something which is virtually impossible using conventional film-developing methods. By choosing to have her film developed in such a way that the colours are deep and saturated, she gives herself the freedom, using her computer, to alter those colours to whatever saturation she desires.
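To give a feel for what ‘reducing saturation selectively’ means to a program, here is a small sketch in Python using the standard colorsys module. It is not how Painter or any other painting program works internally; the tiny image, the function names and the choice of region are all invented for illustration.

```python
import colorsys


def desaturate(pixel, factor):
    """Scale the saturation of one (r, g, b) pixel; components run from 0 to 1."""
    hue, lightness, saturation = colorsys.rgb_to_hls(*pixel)
    return colorsys.hls_to_rgb(hue, lightness, saturation * factor)


def desaturate_region(image, rows, cols, factor):
    """Return a copy of the image with saturation reduced inside the region only."""
    return [
        [
            desaturate(pixel, factor) if r in rows and c in cols else pixel
            for c, pixel in enumerate(row)
        ]
        for r, row in enumerate(image)
    ]


# A hypothetical 2 x 2 image of fully saturated red pixels.
image = [[(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
         [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]

# Remove all colour from the top-left pixel only; the rest stay red.
result = desaturate_region(image, rows=range(0, 1), cols=range(0, 1), factor=0.0)
print(result[0][0])  # (0.5, 0.5, 0.5)
print(result[1][1])  # (1.0, 0.0, 0.0)
```

Because the function returns a new image and leaves the original untouched, it also mirrors the artist's habit of keeping the original scan and experimenting on a working copy.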

SAQ 7

  1. What characteristic of computer systems enables them to be used creatively? In which part of a computer system does this characteristic reside?

  2. Give an advantage a computer system offers the creative artist and state why it is an advantage.

    Answer

    1. The flexibility of a computer system is key to being able to use it creatively. It is the computer program that makes such flexibility possible.

    2. It is possible to make many copies of something. This enables the artist to experiment freely, throwing away mistakes or results that don't please, or making many different versions from a single original.



5.3 Summary

This section contrasted two kinds of data: simple data that builds into large and complex structures, which need large and complex programs to handle them; and complex data, which a complex but easy-to-use program helps a non-expert to handle in interesting, creative and flexible ways.
The case study on DNA illustrated how simple data (consisting of only four elements) can be combined into very large and complex structures (genes and chromosomes). You learned how such large and complex structures, when stored in databases, present certain computational problems. The difficulty of finding anything in such large databases, where data may be very repetitive or partial or its location unknown, means that huge computational effort is required, both to build the database in the first place and then to use it effectively.
In contrast, the second case study examined how complex data, such as the graphical representation of a scene, can be made relatively easy to use by a non-expert. The case study showed how the flexibility of a computer and its ability to make and store multiple copies provides great scope for creativity.

5.2 Art and the common computer

Art is difficult to define. But all art involves the exercise of human skill. A natural object, such as a piece of driftwood, a flower, a bird song, can move us to admire it as beautiful or intriguing or comforting, but it isn't art. Artists (be they photographers, painters, sculptors, actors, musicians, authors or dancers) use their skill to transform natural objects, materials or signs (paint, clay, their own body or voice, the sounds of musical instruments, words) into something else: something with value in its own right rather than for the way in which it might be used.
And what, you may ask, do computers have to do with art?
Central to this unit is the idea that a computer is essentially a tool. And because of the flexibility of programming, it is an exceedingly flexible tool. With the right sort of program and appropriate peripheral devices, a computer can be used by artists to produce art. This subsection will examine how computers can be used to produce visual art.
If you examine a photograph, a painting or a view out of your window carefully, you will notice that what you are looking at is, for the most part, incredibly complex. Colours vary across an almost infinite colour spectrum. There are apparent lines or edges, and objects within the view will be clearly or fuzzily defined depending upon lighting conditions and distance from the person viewing the scene.

Exercise 13

If you were asked to develop a coding system that enabled you to store the view from your window in the form of perceptual data in a computer, how do you think it would compare, in terms of complexity, with that of the DNA code?

Discussion

DNA has a very simple code: just four values or letters. A scene such as the one I see out of my window at the moment is highly complex. It contains innumerable colours, light and shade, lines and edges, and visual depth, with objects nearby appearing focused and those further away progressively less distinct. So I would say that encoding this for use by a computer would require a complex code.
You may have a somewhat different answer, but your answer should have taken into account the complexity of virtually any scene.
Fortunately, applications for processing graphical data (even complex graphical data like photographs of scenes outside my window) take care of this complexity. They let the user work with such graphical data, not at the level of individual codes, but at higher levels of abstraction, such as deepening a colour's hue, altering the contrast, and so on.
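One rough way to compare the two codes is to count how many binary digits (bits) each needs for a single item of data. The sketch below, in Python, does this for one DNA base and for one point (pixel) of a picture. The figure of 256 levels for each of red, green and blue is an assumption made for illustration (it is a common choice, but not one discussed in this unit), and the function name is invented.

```python
def bits_per_symbol(distinct_values: int) -> int:
    """Smallest number of bits able to tell apart this many distinct values."""
    return (distinct_values - 1).bit_length()


# DNA: four possible letters at each position.
dna_bits = bits_per_symbol(4)

# One pixel, ASSUMING 256 levels each of red, green and blue.
pixel_bits = 3 * bits_per_symbol(256)

print(dna_bits)    # 2
print(pixel_bits)  # 24
```

Even this simple count understates the difference, since a scene also involves edges, depth and focus, none of which a single pixel value captures on its own.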

Screening for genetic defects

Now that scientists have mapped the human genome, computers can be used to detect genetic defects.
Screening for genetic diseases existed before the application of computers. Family histories were used, together with a knowledge of inheritance patterns and statistics, to determine the likelihood of a couple having offspring with genetic disorders such as sickle cell anaemia.
Some genetic disorders such as phenylketonuria have had simple chemical detection tests available for some time. Once detected, careful control of diet prevents mental retardation, demonstrating the value of detecting the presence of a genetic disease before any symptoms have appeared.
What the computer adds to the screening process is the power to compare very long genetic sequences (i.e. sequences of base pairs) against the human genome in a way that would be far too time consuming (and therefore expensive) to be carried out by hand. Once a particular gene and type of defect has been identified, it becomes possible to develop a test to find out whether a patient has that genetic defect well before any signs of it appear.
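The kind of comparison involved can be illustrated with a toy example. The sketch below, in Python, compares two short sequences of the same length position by position and reports where they differ. The sequences are invented, and real screening works with vastly longer sequences and far more sophisticated matching methods than this.

```python
def find_differences(reference: str, sample: str):
    """Return (position, reference base, sample base) wherever the sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("this toy comparison needs sequences of equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Invented sequences, far shorter than anything found in a real genome.
reference = "GATTACAGATTACA"
sample = "GATTACAGCTTACA"

print(find_differences(reference, sample))  # [(8, 'A', 'C')]
```

Checking fourteen letters by hand is trivial; checking millions of them is exactly the sort of repetitive task that makes the computer indispensable here.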
Genetic tests are used for several reasons, including:

  • prenatal diagnostic testing;

  • testing to predict adult-onset disorders such as Huntington's and Alzheimer's disease;

  • forensic and identity testing.

Example 5 Breast cancer and genetics

Breast cancer is one of the commonest cancers in women (it occurs in men as well, albeit rarely). The success of treatment following early diagnosis led to a great deal of research into ways of identifying the cancer in the population at large. Some time before the mapping of the human genome it was already known that between 10 and 15 per cent of breast cancers are familial in origin (i.e. groups of related individuals show a greater than average tendency to develop the disease).
Following the mapping of the human genome, it was determined that about one-third of familial cancers are attributable to defects in two genes known as BRCA1 and BRCA2. Now there is a genetic test to determine whether or not a woman whose family history includes a high incidence of breast cancer is carrying these defective genes. If she is, her risk of developing breast cancer over her lifetime is between 56 and 85 per cent, and she has a greater than average probability of developing ovarian cancer.
However, there is little point in having a test if there are not corresponding means of providing help. In the case of breast cancer, increased frequency in screening can help detect the cancer at an early stage (and thereby increase the effectiveness of treatment). More controversial is the preventive removal of breast tissues, which imposes a heavy emotional and physical burden without being completely effective. As with so many technological developments, there are costs associated with their use.
It is hoped that using information related to the human genome will lead to ways in which genetic defects can be corrected or their effects lessened.
There are a number of genetic databases that can be accessed over the internet. Using them to detect defects involves searching enormous databases of genetic sequences, which requires huge computational effort.
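To give a feel for what searching a genetic sequence involves, here is a minimal sketch in Python. The sequence and the pattern are invented, and the function find_all is a hypothetical helper written for this example. Real genetic databases rely on much more sophisticated indexing and matching, so treat this only as an illustration of the basic idea.

```python
def find_all(sequence, pattern):
    """Return every start position at which `pattern` occurs in `sequence`."""
    positions = []
    start = sequence.find(pattern)
    while start != -1:
        positions.append(start)
        # Resume one character later so overlapping matches are found too.
        start = sequence.find(pattern, start + 1)
    return positions


sequence = "ATGCGATACGATTACGATGCGA"  # invented sequence of 22 bases
print(find_all(sequence, "CGAT"))    # prints [3, 8, 14]
```

Even in 22 letters the four-letter pattern turns up three times. In a sequence billions of letters long, both the time taken and the number of hits grow accordingly.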
This case study on DNA has illustrated three main points:
  1. DNA data is coded in a very simple way (with just four letters of the alphabet);

  2. such a simple code can still generate complex, multiple structures;

  3. searching such a structure is a time-consuming task.

SAQ 6

How can a simple code, such as the DNA bases, become such a complex problem for computing?

Answer

Although the code is simple, the bases combine in very complex structures called genes (analogous to words in a language) that can be combined into more complex structures called chromosomes (analogous to a volume of a large encyclopedia). Searching for a particular genetic defect in the genetic structure of the human being is not a trivial task. Apart from the size of the search, there are likely to be many instances of the same combination of base pairs (just as searching for the word ‘king’ in the collected works of Shakespeare would yield a large number of hits, including some false ones such as ‘lurking’).
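The ‘king’ and ‘lurking’ problem can be sketched in a few lines of Python. The sentence below is invented for the illustration. The first search counts every place the four letters appear; the second uses a word-boundary test so that only the whole word counts.

```python
import re

# An invented sentence containing 'king' both as a word and inside other words.
text = "The king was lurking behind the King's men thinking of kings"

# Plain substring search: counts every place the letters k-i-n-g appear.
substring_hits = len(re.findall("king", text, flags=re.IGNORECASE))

# Whole-word search: \b marks a word boundary, so 'lurking' is excluded.
word_hits = len(re.findall(r"\bking\b", text, flags=re.IGNORECASE))

print(substring_hits)  # prints 5
print(word_hits)       # prints 2
```

Three of the five hits are false ones. Text has spaces to mark where words begin and end; a sequence of bases has no such convenient markers, which is part of what makes genetic searching hard.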

5.1.2 The human genome

All life is ‘encoded’ chemically in genes. What this means is that the structure of an organism, the organs it possesses, its colouring, and so on are all determined by different genes. A very simple organism may have just a few genes, and a complex one tens of thousands. The ‘map’ of an organism's genes is referred to as its genome. It shows, in essence, which genes give rise to which characteristics or traits of the organism. The word ‘template’ would describe the genome better than ‘map’.
Figure 13 shows the 23 pairs of human chromosomes that constitute the structure of the human genome. These chromosomes contain between 30,000 and 40,000 genes in total. For each human characteristic, such as eye or hair colour, the human genome shows where the genes are that control that characteristic.

Until recently, the idea of mapping the genome of even a simple organism was just that, an idea. The work involved in extracting genetic material, examining it and mapping it to known traits, would be analogous to sitting a dozen people down at typewriters and asking them to write a multi-volume encyclopedia. It could have been done, but it would have been time consuming (and therefore costly).
Why do it? DNA acts like a computer program. Just as programs instruct a computer to produce certain outputs, DNA instructs the body to develop proteins that make up tissues, cells, antibodies, and so on in a certain way. If there is a defect in a person's genetic makeup then problems can occur; for example, that person might be more susceptible than average to certain diseases. Mapping the human genome offered some enticing possibilities:
  • better understanding of diseases, particularly complex and threatening diseases like cancers;

  • an understanding of the relationship between different human groups. For example, are we descended from one pair of proto-humans, or did different groups have many different origins?




Figure 13 The human chromosomes. An X and a Y chromosome are shown as the final pair, meaning that the individual would be a male (females have two X chromosomes)
An international effort to map the human genome began in 1995, when it was estimated that the project would require US$3 billion and take eight years. But, due to the development of computer-controlled robotic laboratory techniques and improvements in information technology (IT) systems, the Sanger Centre announced in 2000 the first draft of the human genome.

5.1.1 What is DNA?

DNA (deoxyribonucleic acid) is frequently in the news for four main reasons.
  1. DNA can be used in crime detection to eliminate innocent suspects from enquiries or, conversely, to identify with a very high degree of probability the guilty.

  2. DNA is now used in medicine to detect the possibility that diseases having a genetic origin may occur in an individual. This enables doctors to prescribe preventative treatments.

  3. It is hoped that discoveries about DNA will yield important new treatments for hitherto intractable diseases and conditions.

  4. DNA can be used to identify victims of disasters, and establish whether people are related.



Figure 12 illustrates the following characteristics of DNA.
  • DNA has the shape of an immensely long twisted ladder (the famous double helix) in which each pair of chemical bases in the strand can be thought of as a rung in the ladder.

  • It consists of pairs of chemical bases called adenine (A), cytosine (C), guanine (G) and thymine (T).

  • The bases (which in Figure 12 are colour coded) can only be paired according to the rules: A to T and C to G.

  • A ‘rung’ or pair of bases (e.g. A–T) is called a base pair.

  • A nucleotide is a base pair plus its attached ‘structural’ molecules (i.e. the sides of the ladder).

  • Sequences of base pairs constitute genes which are the sections of a DNA strand that form discrete units of heredity (such as eye colour).

  • A complete DNA strand constitutes a chromosome (a human being has 46 of these combined into 23 pairs).

  • The four letters (A, C, G, and T) representing the DNA bases constitute ‘signs’ symbolising the building blocks of DNA. You can think of a set of signs as a code.

Figure 12 A DNA strand, bases, nucleotides, genes, and a chromosome. (a) A small section of a DNA strand as though it were untwisted. Each box represents a base (A, C, G or T). Each pair of bases forms one nucleotide. Several nucleotides make up a gene (shown by brackets). (b) How the strand of DNA in (a) is twisted into the famous double helix. (c) A chromosome formed from one DNA strand.
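The pairing rules (A to T and C to G) are simple enough to write down as a small lookup table. The sketch below is only an illustration of the code as a set of signs: the names PAIRING, partner_strand and is_valid_pair are invented for this example, and the program knows nothing about the chemistry involved.

```python
# Each base and the only base it is allowed to pair with.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def partner_strand(strand):
    """Return the bases that pair with each base of `strand`, in order."""
    return "".join(PAIRING[base] for base in strand)


def is_valid_pair(base1, base2):
    """Check a single 'rung' of the ladder against the pairing rules."""
    return PAIRING.get(base1) == base2


print(partner_strand("ATTCGGA"))  # prints TAAGCCT
print(is_valid_pair("A", "T"))    # prints True
print(is_valid_pair("A", "C"))    # prints False
```

Because each base has exactly one permitted partner, knowing one side of the ladder is enough to work out the other.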

Exercise 11

The English alphabet is a system of signs that consists of 26 letters, from A to Z. There are rules that govern how letters can form words in English. For example, the combination ‘m-s’ cannot be used to begin a word, but is acceptable within or at the end of a word. This limits the number of English words it is possible to form.
Words, and parts of words, can be combined to make longer words. For example, adding an ‘s’ to ‘dog’ makes ‘dogs’, and preceding ‘mill’ with ‘wind’ gives ‘windmill’. Rules also determine that ‘windmill’ is all right, but ‘millwind’ is not.
  1. Considering these facts, how many words do you think the English language has?
    Now think about things that can be said using the English language: utterances. These consist of words strung together according to a set of rules known as grammar.

  2. How many utterances do you think it's possible to make in English?

    Discussion

    1. A standard, reputable dictionary will have between 30,000 and 50,000 entries. Even this is only part of the story since most dictionaries do not include slang, dialect words or words that exist for only a very short period of time. Neither do they contain specialised vocabularies that exist in certain professional and trade groups (e.g. among doctors). Thus the likely total vocabulary of English is (at a guess) in excess of 100,000 words.

    2. The number of utterances possible in English is virtually infinite. This is because, even given the rules of grammar, they can vary in length and word order.

    Exercise 11 shows how a relatively simple code (signs like the alphabet) can be combined in simple and complex ways to produce an enormous variety of possible ‘products’ (utterances in English).

    Exercise 12

    Think of the DNA bases (A, C, G, and T) as forming a code similar to the alphabet, i.e. four ‘signs’ that can be combined according to rules to form genes. The genes in turn are combined into structures called chromosomes (i.e. DNA strands) of which the human being has 46 in 23 pairs. Given this structure, a gene is analogous to an English word, a chromosome to a volume of English utterances, and all 23 pairs of chromosomes to the volumes of an encyclopedia.
    1. At a guess, how many base pairs, like A–T, do you think the 23 pairs of human chromosomes have?

    2. What might that answer tell you about how difficult a problem it is to develop a full understanding of the human genetic structure?

    Discussion

    1. The longest human chromosome has about 263 million base pairs, the shortest 50 million. For all 23 pairs the total exceeds 3.2 billion (i.e. 3,200,000,000).

    2. The base pairs in a gene can vary, which is what gives us genetic diversity. So the problem of trying to understand the genetic structure of humans is roughly analogous to trying to read and understand all the sentences in a huge, multi-volume encyclopedia!

    These two Exercises demonstrate that having a simple code is no guarantee of a simple system! What can be produced lies not in the simplicity or complexity of the code, but in the possibilities for combinations and the stringing together of small parts to form larger products. In other words, simple elements of data can generate a huge amount of information.
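A little arithmetic shows how quickly the possibilities grow. With four signs there are 4 possible sequences of length one, 16 of length two, and in general 4 multiplied by itself once for every position. The short Python sketch below (the function name possible_sequences is invented for this example) simply carries out that multiplication; it counts every arrangement of letters, whether or not it would mean anything biologically.

```python
def possible_sequences(length, code_size=4):
    """Number of different sequences of `length` signs from a code of `code_size` signs."""
    return code_size ** length


for n in (1, 2, 3, 10, 20):
    print(n, possible_sequences(n))

# 1 4
# 2 16
# 3 64
# 10 1048576
# 20 1099511627776
```

A sequence of only 20 bases already has more than a million million possible forms, and the human chromosomes contain billions of base pairs.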




5 Computers as tools for working with data

5.1 Genetic databases and disease

Section 2 looked at data and information from two different perspectives: that of the individual and that of a commercial organisation. The type of data you have will dictate both why you want to process it using a computer and, to a large extent, how that is done.
This section contains two short case studies whose unifying theme is that the computer and its programs are tools for working with data. The two studies provide an interesting contrast between:
  • simple data in large and complex structures (which require large and complex programs to handle them), and

  • complex data which a complex program helps a non-expert to handle in some interesting, creative, flexible ways.

This subsection uses a case study to show how simple data (the four bases in DNA) can be combined in different ways to create a huge and complex collection of information.

4.4 Summary

This section described how computers can be used in geographical applications (and in doing so it discussed maps and showed how modern maps are composed of layers of different data).
It discussed the GPS to demonstrate how computers can communicate in order to solve a problem, such as navigation.
It also showed how the geographical data that supports both map-making and the GPS navigation system can be presented in different forms such as a map, a list of directions, a moving graphical display on a navigation device such as a GPS receiver, or spoken directions. The reasons why one form of presentation is preferable to another were discussed: it depends on fitness-for-purpose, i.e. on the requirements of the user and/or the situation in which the information is needed.
Finally the section described how computers can be used to find information on the web. The two activities associated with this section introduced you to gateways and to the simple and advanced use of search engines.

4.2.3 Using a search engine more effectively

The search shown in Figure 9(b) is an example of how to use a search engine in a simple way. However, one of the problems with finding information on the web is that there is so much! And not all of it is relevant to what you want. My search for ‘rugby’ and ‘wales’ using the Google search engine yielded about 420,000 results or ‘hits’ (see the information contained in the blue strip on Figure 9(b)). The first few sites listed will probably tell me what I want to know. But what about all the others? Are they all about the game of rugby in Wales?
The answer is ‘no’. A website about rugby in New South Wales, Australia also appeared as a result of this search. Google didn't make a mistake since the site contains the chosen keywords. However, it wasn't smart enough to distinguish between Wales and New South Wales.
If you are just looking (‘surfing’) for information in a general way, too much information isn't always a problem. Where it becomes irritating and counterproductive is when you are looking for some quite specific information.

Example 4

Suppose you're interested in genealogy, and your surname is Bird. If you search on the web by typing in the keywords ‘bird’ and ‘family’, the web server will return every website it finds with those two words in it, so you'll probably find scientific and hobby sites on bird ‘families’ such as the passerines! It's clearly not what you want, but do you need to examine all the websites returned (which could run into hundreds) to find the one you're looking for?
The answer is that there are ‘tricks’ that you can use to narrow down your search to eliminate at least some of the things you aren't looking for. Each search engine has its own ‘tricks’, though the concepts of making more targeted searches are common to most search engines. Search engine screens will generally have a selectable topic called something like ‘Advanced Search’ or ‘Search Tips’.
One obvious trick is to choose your keywords carefully. The more specific the keywords you choose, the more likely you are to get what you want. For example, if you want to find information on antique chairs, typing in just the keyword ‘antique’ will return all websites that use the word antique, and typing in the keyword ‘chair’ by itself will return all websites that use the word chair. But typing in both keywords will only return websites that use both words. The more keywords you add, the more targeted will be the websites returned to you. So adding ‘British’ to ‘antique’ and ‘chair’ will only return websites that have all three words in them.
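The effect of adding keywords can be imitated with a toy example. The four 'web pages' below are invented, and the function search is a hypothetical stand-in for what a search engine does; a real search engine works from a vast prebuilt index rather than reading every page each time.

```python
# Four invented 'web pages', each reduced to a short piece of text.
pages = {
    "page-a": "antique clocks and watches",
    "page-b": "office chair sale",
    "page-c": "antique chair restoration",
    "page-d": "british antique chair dealers",
}


def search(pages, keywords):
    """Return the names of pages whose text contains every keyword."""
    wanted = {keyword.lower() for keyword in keywords}
    return sorted(
        name
        for name, text in pages.items()
        if wanted <= set(text.lower().split())
    )


print(search(pages, ["antique"]))                      # ['page-a', 'page-c', 'page-d']
print(search(pages, ["antique", "chair"]))             # ['page-c', 'page-d']
print(search(pages, ["british", "antique", "chair"]))  # ['page-d']
```

Each extra keyword can only keep the list of hits the same or shorten it, which is why more specific searches return fewer, better-targeted results.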

Exercise 10

How could you adapt this trick of using more keywords to help you look for the Bird family?

Discussion

You could choose to enter the keywords ‘bird’ and ‘genealogy’ (the study of family lineages). This will almost certainly eliminate websites about storks and flamingos, or you could add an additional term to ‘bird’ and ‘family’ by specifying ‘bird family history’.

Interestingly, if you have misspelled the keyword ‘genealogy’ as ‘geneology’, some search engines will not match it to websites containing the term ‘genealogy’. Others will respond with the closest word possible. Google, for example, will respond to ‘geneology’ with the message ‘Did you mean genealogy’ together with some websites related to genealogy. Some search engines can't match ‘family’, say, with its plural ‘families’. So if, in a particular search, you don't get any matches (called hits), one strategy is to try making plural keywords singular and vice versa. Also remember to check your spelling carefully.
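One way a program might come up with a ‘closest word’ is to compare the keyword with a list of words it already knows and pick the most similar. The sketch below uses get_close_matches from Python's standard difflib module to do this. The word list and the function did_you_mean are invented for the example, and this is not a description of how Google or any other search engine actually does it.

```python
import difflib

# A small invented vocabulary standing in for the words a search engine knows.
known_words = ["genealogy", "general", "gateway", "family", "history"]


def did_you_mean(keyword, vocabulary):
    """Return the most similar known word, or None if nothing is close enough."""
    matches = difflib.get_close_matches(keyword, vocabulary, n=1)
    return matches[0] if matches else None


print(did_you_mean("geneology", known_words))  # prints genealogy
print(did_you_mean("zzzz", known_words))       # prints None
```

A program that does no such comparison will treat ‘geneology’ as a perfectly good keyword and simply fail to find pages that spell the word correctly.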
Another useful strategy is to look for phrases rather than individual words. In Exercise 10, I mentioned that you might use ‘bird family history’ to look for information on the Bird family. This might yield a response that includes anything about the animal ‘bird’ using the scientific term ‘family’ and any use in any context of the word ‘history’. However, if you were to enclose the words ‘family’ and ‘history’ in quotation marks (as ‘family history’), the web server will only return websites that contain the word ‘bird’ and the phrase ‘family history’.
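The difference between separate keywords and a phrase can be seen in a small sketch. The two page texts are invented, and the function matches is a hypothetical helper; it splits text on spaces only, so it assumes text without punctuation.

```python
def matches(text, words=(), phrases=()):
    """True if `text` contains every word in `words` and every phrase in `phrases`."""
    tokens = text.lower().split()
    # Pad with spaces so a phrase must line up with whole words.
    padded = " " + " ".join(tokens) + " "
    has_words = all(word.lower() in tokens for word in words)
    has_phrases = all(" " + phrase.lower() + " " in padded for phrase in phrases)
    return has_words and has_phrases


page_1 = "The Bird family history society"
page_2 = "A history of the finch family of bird species"

# Three separate keywords: both pages match.
print(matches(page_1, words=["bird", "family", "history"]))  # prints True
print(matches(page_2, words=["bird", "family", "history"]))  # prints True

# The word 'bird' plus the phrase 'family history': only the first page matches.
print(matches(page_1, words=["bird"], phrases=["family history"]))  # prints True
print(matches(page_2, words=["bird"], phrases=["family history"]))  # prints False
```

The second page contains all three words, but not next to each other in the right order, so the phrase search rejects it.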

SAQ 5

  1. What is a search engine? How does it differ from a browser?

  2. In carrying out a web search, how many computers (at least) are involved?

  3. What makes a computer actually do work?

  4. In what way is a gateway useful?

    Answer

    1. A search engine is a computer program that uses keywords to help users locate websites containing information they want.

    2. At least two are involved: the user's computer (the client) and the web server.

    3. A program of instructions, stored in the computer, called a computer program.

    4. A gateway provides a pre-chosen set of links on the web for a particular topic. Instead of searching the whole of the web for information, a gateway provides a very focused means of getting information that usually has been compiled by an expert.





4.2.2 Using the web more effectively: gateways

A gateway on the web is a website intended to direct users to other preselected websites containing information on a particular topic. The term can also refer to a computer that acts as a message router on the internet.
University librarians often set up gateways for particular areas of study, although they may be set up by anyone with sufficient expertise in a topic. Gateways may be fairly general, such as a gateway site for sciences, or more specific, such as a gateway for particle physics.
Professional or vocational bodies may also develop gateways useful to their members, as may hobby organisations. A well-known gateway for people interested in family history and genealogy is Cyndi's List. This is updated by volunteers who notify new links relevant to topics of interest such as seventeenth and eighteenth century ships' passenger lists, local history websites, lists of names of war veterans, and so on.

Many gateway sites are searchable, often using the same search engines (e.g. Google) that are available directly through browsers. Because the search engine limits its search to the gateway site's indexes, this can prove to be a more focused way to search, particularly if the topic is one that is likely, in the wider web, to yield lots of spurious results.
Figure 11 shows the main page of a gateway website about historical maps and cartography aimed at academics, students, historians and map collectors. It contains the following:
  • a selectable list of main topics on the left, each of which may contain links to other pages or other websites;

  • selectable boxes at the top giving the index to the site, a site map page explaining how the site is organised, an ‘ABOUT’ link telling the user who hosts the site (the Institute of Historical Research at the University of London), and a ‘WHAT'S NEW’ link with information about recent changes to the site;

  • welcoming messages (stating who the intended audience of the site is);




Figure 11 The main page of the gateway website for map history and the history of cartography