- Significance of Statistics
- Types of Statistical Data
- Nature of Statistical Data
- Sources of Statistical Data
- Methods of Collecting Data (Statistical Techniques)
- Methods of Recording Data
- Analysis of Data - Examining the Numerical Figures in Detail
Statistics - numerical figures collected systematically and arranged for a particular purpose.
Statistical data - information presented in the form of numbers, e.g.
- No. of students in a school
- Mean daily temperature of a place
- Amount of milk produced daily from a farm
- Amount of money earned from exports annually
Statistical methods - techniques of collecting, recording, analysing, presenting and interpreting statistical data.
- Illustrates relationship between 2 or more varying quantities e.g. beans production and acreage under cultivation.
- Summarises geographical information which saves time and space.
- Makes comparison between components e.g. province with the highest number of people.
- Prediction of future trends of weather and climate.
- Prediction of natural disasters e.g. droughts and floods.
- Planning for provision of social amenities e.g. hospitals and schools.
- First hand or original information from the field e.g.
- Mean daily temperature from a weather station
- 2nd hand information available in stored sources compiled by other researchers e.g.
- Reference books
- Video/audio tapes
- Census reports
- Discrete Data - Which is given in whole numbers e.g. 16 elephants, 1093 tonnes of wheat
- Continuous Data - Facts and figures which can take any value e.g.
- Fractions e.g. 23 ¼
- Decimals e.g. 6.20 mm
- Values within a range e.g. 0-30 °C
- Grouped Data - Which is not precise/exact but whose values fall in groups e.g.
Age group    Number of boys
15-19        32
20-24        8
- Primary Sources
- People or places which have 1st hand or original information. The information can be collected by observation, measuring, counting, photographing etc.
- Give first hand information
- The information can't be got from other sources
- Secondary sources
-Materials in which information collected by others was stored e.g. text books, reference books, etc.
-Use of eyes to observe features or weather then information is recorded immediately e.g. cloud cover, rocks, soil, land forms, vegetation, etc.
- Gives 1st hand information which is reliable.
- Relevant material to the study is collected.
- Time saving since one doesn’t have to look for data in many places.
- Data on past activities isn’t available.
- May be hindered by weather conditions e.g. mist and dust storms.
- Ineffective for people with visual disabilities.
- Tiresome and expensive as it involves a lot of travelling because physical presence is required.
-Gathering information from people by direct discussions then answers are recorded. It may be face to face or on a telephone. A questionnaire prepared in advance is used.
- One should be polite
- Warm and friendly
- Respondents/ interviewees should be assured information is confidential.
- Respondent should not be interrupted when answering questions.
- They should not be given clues but answers should come from them.
- Reliable first hand information is collected.
- Interviewer can seek clarification in case of ambiguous answers.
- Can be used with illiterate respondents.
- Interviewer can gauge the accuracy of responses.
- Time consuming since only one person can be handled at a time.
- Expensive and tiresome as extensive travelling is required to meet the respondents.
- May encounter language barrier if the respondent doesn’t speak the same language as the interviewer.
- A respondent may lie, exaggerate or distort facts leading to collection of wrong information.
- Administering questionnaires
-Set of systematically structured questions printed on paper, used in interviews or sent to respondents to fill in answers.
- Open-ended questionnaire-in which respondent is given a chance to express his views. The disadvantage is that different answers are given which are difficult to analyse.
- Closed-ended (rigid) questionnaire-in which respondents are given answers to choose from.
Characteristics of a good questionnaire
- Uses simple language
- Systematically arranged from simple to difficult
- Clear questions
- Doesn’t touch on respondent’s privacy
- Comparisons can be made since questions are similar.
- First hand information which is relevant to current trends and situation is collected.
- Saves money on travelling as physical presence isn’t required.
- Saves time as all respondents are handled at the same time.
- A lot of information can be collected.
- Difficult analysis due to different answers.
- Some questionnaires may be sent back blank by lazy respondents.
- Can’t be used on illiterate respondents.
- Some respondents may write wrong information.
- Content analysis
- Technique of collecting data from secondary sources.
- This is by reading, watching films, viewing photographs and listening to get what is relevant.
- Easy to get data if it has already been analysed.
- Cheap as there isn’t extensive travelling
- Saves time as all information is in one place.
- Possible to get old data
- Difficult to verify accuracy of data
- Data may be irrelevant to current trends
- Up to date data may not be readily available
-Determining distances, areas, height or depth using instruments and recording.
- Distance can be estimated by pacing, i.e. taking steps of equal and known length.
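Estimating a distance by pacing comes down to multiplying the number of paces by the length of one pace. A minimal sketch in Python follows; the function name `distance_by_pacing` and the figures (120 paces of 0.75 m) are hypothetical values chosen only for illustration.

```python
def distance_by_pacing(paces, pace_length_m):
    """Estimate a distance in metres from a pace count and the length of one pace."""
    return paces * pace_length_m

# Hypothetical example: 120 paces, each 0.75 m long
estimated_distance = distance_by_pacing(120, 0.75)
print(estimated_distance)  # 90.0
```

The estimate is only as good as the consistency of the pace, which is why the pace length should be measured beforehand.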
- Collecting Samples
-Getting a small part e.g. of soil, rock or vegetation to represent the whole to be used to carry out tests in the laboratory.
- Counting/census taking -Arithmetical counting and recording.
- Photographing -Capturing on film or video and still photographs.
-Using tools such as a hoe, pickaxe, spade or soil auger to get samples of soil and rocks.
- Feeling and touching
-Using fingers to feel the surfaces of soils and rocks to get their textures.
-Examining by taking a sample -a part representing the whole (population).
Types of Sampling
- Random Sampling
-Selection of members of a group haphazardly where every item has an equal chance of being selected e.g. to select 5 students to go for a tour from a class:
• Class members write their names on pieces of paper
• They are folded and put in a basket
• The basket is shaken and five papers are taken out
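The paper-and-basket procedure can be imitated in a short Python sketch. The class list and the function name `draw_names` are hypothetical, invented for this illustration; `random.sample` stands in for shaking the basket and drawing papers without putting them back.

```python
import random

def draw_names(class_members, how_many, seed=None):
    """Pick `how_many` distinct names so that every name has an equal chance."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(class_members, how_many)

# Hypothetical class of 20 students (names invented for this example)
class_members = [f"Student {i}" for i in range(1, 21)]
tour_group = draw_names(class_members, 5)
print(tour_group)
```

Each run gives a different group of five unless a seed is supplied.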
- Systematic Sampling
-Selection of members of a sample from an evenly distributed phenomenon at regular intervals e.g. after every 10 items/members.
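Picking every 10th item can be sketched in Python with list slicing. The population of 100 numbered items and the function name `systematic_sample` are assumptions made for the example.

```python
def systematic_sample(items, interval, start=0):
    """Take every `interval`-th item, beginning at index `start` (0-based)."""
    return items[start::interval]

# Hypothetical population: items numbered 1 to 100
population = list(range(1, 101))
sample = systematic_sample(population, interval=10, start=9)
print(sample)  # [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
```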
- Stratified sampling
-Selection of members of a sample by breaking the population into homogeneous groups e.g. to select 6 students to go for a tour:
• Break the class into boys and girls
• Select 3 students from each group by random or systematic sampling
• Combine units from each group to form the required sample.
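The tour example can be sketched in Python as follows. The group membership and the function name `stratified_sample` are hypothetical; random sampling is used within each group here, though systematic sampling would serve equally well.

```python
import random

def stratified_sample(groups, per_group, seed=None):
    """Draw `per_group` members at random from each group and combine them."""
    rng = random.Random(seed)
    sample = []
    for members in groups.values():
        sample.extend(rng.sample(members, per_group))
    return sample

# Hypothetical class split into two homogeneous groups
groups = {
    "boys": [f"Boy {i}" for i in range(1, 11)],
    "girls": [f"Girl {i}" for i in range(1, 11)],
}
tour_students = stratified_sample(groups, per_group=3)
print(tour_students)
```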
- Cluster Sampling
-Selection of a sample by dividing the population into clusters with similar characteristics. A sample is taken from each cluster, and the representative choices from each cluster are combined to form the sample e.g. to sample housing costs, an estate is chosen to represent each group, then representative choices are taken from each estate and combined to form a sample.
- It’s less expensive
- It saves time
- It avoids bias
- A poorly selected sample can lead to misleading information
- Systematic sampling to an evenly distributed population
- Random Sampling
-Conducting a test or investigation to provide evidence for or against a theory e.g. to determine the chemical composition of rocks and soils.
- First hand data is obtained
- Gives accurate results if properly conducted.
- It can lead to further discoveries
- May be expensive as it involves use of expensive equipment.
- May be time consuming
- Use of defective instruments may lead to inaccurate results
- Improper handling of equipment and chemicals may lead to accidents
-Methods of storing information to avoid losing it.
- Note Taking
- Writing in a note book what is being observed, answers during interviews and then notes are compiled in school or office when writing report.
- Filling In Questionnaires
- Writing the respondent's answers in the questionnaire, done either by the interviewer or by the respondent, who then sends it back.
- Making 4 vertical or slanting strokes and the 5th across the 4 to record data obtained by counting or measuring similar items.
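The stroke-counting idea can be imitated in text with a small Python sketch. The function name `tally_marks` is hypothetical, and `||||/` is only a typed approximation of four strokes with a fifth drawn across them.

```python
def tally_marks(count):
    """Return tally marks for `count`: each full group of five is written as
    '||||/' (four strokes plus one across), followed by any leftover strokes."""
    full_groups, leftover = divmod(count, 5)
    parts = ["||||/"] * full_groups
    if leftover:
        parts.append("|" * leftover)
    return " ".join(parts)

print(tally_marks(13))  # ||||/ ||||/ |||
```

Grouping in fives makes the final count quick to read: two full groups and three strokes give 13.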
- Drawing of tables and filling in data systematically e.g. weather recording sheets.
Month          J    F    M    A    M    J    J    A    S    O    N    D
Temp (°C)      24   24   23   22   19   17   17   18   19   20   22   23
Rainfall (mm)  109  122  130  76   52   34   28   38   70   108  121  120
- Field Sketching
- Summarising information observed in the field by making a rough drawing of landscape and labelling the essential information.
- Mapping/Drawing Maps
- Drawing of a rough map of an area of study and labelling in words or symbols accompanied by key.
- Photographing
- Recording the image of an object or landscape on film, which is processed to get photographs; the photographs are then labelled to avoid mix-up during storage.
- Labelling samples
- Tape Recording
- Recording conversations during interviews on audio tapes using a tape recorder.
- Permission should be got from the respondent to record his/her responses.
- It’s used if responses are too many to be recorded on a note book.
- It allows smooth flow of discussion as asking respondents to repeat answers would irritate them.
- Calculation of Percentages
-If in the study of a farm 10 hectares are devoted to coffee, what is the % of the area under coffee?
The table below shows the number of tourists who visited Kenya from various parts of the world in 2005 and 2006.
Place of origin No of tourists per year 2005 2006 Europe
Total 1159000 1247000
- Calculate percentage increase of tourists from Africa between 2005 and 2006.
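Percentage increase is the change divided by the original value, multiplied by 100. Since the row for Africa is not reproduced here, the sketch below applies the method to the totals row (1,159,000 in 2005 and 1,247,000 in 2006); the same steps would apply to any other row. The function name `percentage_increase` is an assumption for the example.

```python
def percentage_increase(old, new):
    """Return the increase from `old` to `new` as a percentage of `old`."""
    return (new - old) / old * 100

total_2005 = 1_159_000
total_2006 = 1_247_000
increase = percentage_increase(total_2005, total_2006)
print(f"{increase:.1f}%")  # 7.6%
```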
- Measures of Central Tendency
-Outstanding general characteristics of the data.
- Arithmetic Mean
- Easy to calculate for a small data
- Summarises data using a single figure
- Easy to understand and interpret
- Difficult to calculate for grouped data
- Affected by extreme values
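The arithmetic mean is the sum of the values divided by the number of values. As a minimal sketch, the Python below applies this to the twelve monthly temperatures from the weather recording sheet in these notes; the function name `arithmetic_mean` is an assumption for the example.

```python
def arithmetic_mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)

# Monthly temperatures (°C), January to December, from the weather recording sheet
monthly_temps = [24, 24, 23, 22, 19, 17, 17, 18, 19, 20, 22, 23]
mean_temp = arithmetic_mean(monthly_temps)
print(round(mean_temp, 1))  # 20.7
```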
-The middle value in a set of data arranged in order. The position of the median is given by M = (N+1)/2, where N is the number of values.
(I) 20, 50, 90, 100, 150, 180, 200, 220, 240, 300, 360.
(II) 20, 50, 90, 100, 150, 180, 200, 220, 240, 300.
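For set (I), N = 11, so the median is the (11+1)/2 = 6th value. For set (II), N = 10, so the position is 5.5 and the median is taken as the average of the 5th and 6th values. A short Python sketch of this working follows; the function name `median` is chosen for illustration.

```python
def median(values):
    """Middle value of the ordered data; for an even count,
    the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

set_i = [20, 50, 90, 100, 150, 180, 200, 220, 240, 300, 360]
set_ii = [20, 50, 90, 100, 150, 180, 200, 220, 240, 300]
print(median(set_i))   # 180
print(median(set_ii))  # 165.0
```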
- Easy to calculate in a small data set
- Easy to understand as it’s the value at the middle
- Difficult to calculate in a large data set
- Doesn’t show data distribution
- Calculation of Ranges
-Difference between the largest and smallest values. Calculate the range for the data above.
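As a minimal sketch, the range of the two data sets (I) and (II) listed under the median can be worked out in Python. The function name `data_range` is an assumption for the example.

```python
def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

set_i = [20, 50, 90, 100, 150, 180, 200, 220, 240, 300, 360]
set_ii = [20, 50, 90, 100, 150, 180, 200, 220, 240, 300]
print(data_range(set_i))   # 360 - 20 = 340
print(data_range(set_ii))  # 300 - 20 = 280
```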
-Most frequently recurring value in a set of data.
10, 2, 5, 9, 10, 11, 20, 15, 18, 10.
The mode is 10.
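Finding the mode amounts to counting how often each value occurs. A short Python sketch using the data above follows; the function name `modes` is hypothetical, and it returns a list because a data set may have more than one most frequent value.

```python
from collections import Counter

def modes(values):
    """Return the value or values that occur most often."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

data = [10, 2, 5, 9, 10, 11, 20, 15, 18, 10]
print(modes(data))  # [10]
```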
- Easy to find as no calculation is involved
- Easy to understand
- Rarely used as a measure of central tendency
- Arithmetic Mean
- Two-dimensional drawings which show the relationship between two types of data, also called variables.
- These are the dependent variable, which is affected by the other e.g. temperature (on the y-axis), and the independent variable, whose change is not affected by the other e.g. altitude (on the x-axis).
- Draw the x and y axes.
- Choose suitable scale to accommodate the highest and lowest value.
- Plot the values accurately using faint dots.
- Join the dots using a smooth curved line. If it’s a bar graph, the dots should be at the middle of the top line of each bar. Years should also be at the middle of the bars. You should also have decided on the width of the bars.
- In data without continuity e.g. crop production, there should be gaps between bars; for data with continuity e.g. rainfall, bars should not have gaps.
- Draw vertical lines on either side of the dot, then draw a horizontal line to join them through the dot.
- Shade uniformly if they represent only one type of data and differently if they represent more than one type of data.
- In a combined line and bar graph, temperature figures are plotted on the right-hand y-axis while rainfall is plotted on the left.
- Don’t start exactly at zero.
- Include temperature and rainfall scales.
- Start where the longest bar ends.
- Labelled and marked x and y axis starting at zero.
- Key if required e.g. in comparative bar graph.
- Accurately plotted and lines, curves or bars properly drawn.
- Easy to construct
- Easy to interpret
- Easy to read/estimate exact values.
- Shows trend or movement over time.
- Doesn’t give a clear impression on the quantity of data.
- May give false impression on the quantity especially when there was no production.
- Poor choice of vertical scale may exaggerate fluctuations in values.
- Difficult to find exact values by interpolation.
- Easy to construct.
- Easy to interpret.
- Easy to read.
- Gives a clear visual impression on the quantity of data.
- Poor choice of vertical scale may cause exaggeration of bars.
- Doesn’t show continuity/variation of data over time.
- Unsuitable technique when values exist in continuity.
- Not possible to obtain intermediate values from the graph.
- Easy to construct.
- Easy to read.
- It shows relationship between two sets of data.
- Difficult to choose suitable scale when values of variables differ by great magnitude.
- Considerable variation of data represented by the line may cause the line to cross the bars, thus obscuring the relationship.
- Doesn’t show relationship between the same sets of data of more than one place.
Temperature and Rainfall for Thika
Analysis and Interpretation
- The month with heaviest rainfall is May.
- The month with lowest rainfall is July.
- The hottest months were January and February.
- The months with lowest temperature were June and July.
Crop Production in Kenya in the Years 2001 and 2002
Amount in metric tonnes
Value of Export Crops from Kenya (KSh million)
- If the data has large figures, plot them in thousands, e.g. 195,262 becomes 195 and 184,988 becomes 185.
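Converting large figures to thousands before plotting is a divide-and-round step. A minimal Python sketch using the two figures quoted above follows; the function name `to_thousands` is an assumption for the example.

```python
def to_thousands(value):
    """Express a large figure in thousands, rounded to the nearest whole number."""
    return round(value / 1000)

print(to_thousands(195262))  # 195
print(to_thousands(184988))  # 185
```

The axis label should then state the unit, for example "thousands", so the reader can recover the original magnitude.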
- You can draw comparative/group/multiple line and bar graphs from the data.