Data can be defined as a representation of facts, concepts, or instructions in a formalized manner,
which should be suitable for communication, interpretation, or processing by humans or electronic
machines.
Data is represented with the help of characters such as letters (A-Z, a-z), digits (0-9) or special
characters (+, -, /, *, <, >, = etc.).
What is Information?
Information is organized or classified data, which has some meaningful value for the receiver.
Information is the processed data on which decisions and actions are based.
For the decision to be meaningful, the processed data must have the following characteristics −
Timeliness − Information should be available when required.
Accuracy − Information should be accurate.
Completeness − Information should be complete.
Data Processing Cycle
Data processing is the re-structuring or re-ordering of data by people or machines to increase its
usefulness and add value for a particular purpose. Data processing consists of the following basic
steps - input, processing, and output. These three steps constitute the data processing cycle.
Input − In this step, the input data is prepared in some convenient form for processing. The form
will depend on the processing machine. For example, when electronic computers are used, the
input data can be recorded on any one of several types of input media, such as magnetic
disks, tapes, and so on.
Processing − In this step, the input data is changed to produce data in a more useful form. For
example, pay-checks can be calculated from the time cards, or a summary of sales for the month
can be calculated from the sales orders.
Output − At this stage, the result of the preceding processing step is collected. The particular
form of the output data depends on the use of the data. For example, output data may be
pay-checks for employees.
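The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a payroll system: the time-card records, the HOURLY_RATE value, and the function names are all hypothetical and chosen just to mirror the pay-check example.

```python
# Hypothetical hourly rate, assumed only for this illustration.
HOURLY_RATE = 20.0


def read_time_cards():
    """Input: time-card data prepared in a convenient form (made-up records)."""
    return [
        {"employee": "A", "hours": 40},
        {"employee": "B", "hours": 35},
        {"employee": "A", "hours": 5},
    ]


def compute_pay(time_cards, hourly_rate):
    """Processing: total the hours per employee and convert them to pay."""
    hours = {}
    for card in time_cards:
        hours[card["employee"]] = hours.get(card["employee"], 0) + card["hours"]
    return {employee: total * hourly_rate for employee, total in hours.items()}


def format_pay_checks(pay):
    """Output: collect the results in the form the reader needs."""
    return [f"{employee}: {amount:.2f}" for employee, amount in sorted(pay.items())]


pay_checks = format_pay_checks(compute_pay(read_time_cards(), HOURLY_RATE))
for line in pay_checks:
    print(line)
```

Keeping each step in its own function makes the cycle visible: the input form or the output form can change without touching the processing step.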
The foundation of data analysis in statistics lies in the collection of data. Data consists of
unorganized facts and figures which are collected for a certain purpose, such as an analysis. The
medium through which data is collected is termed a source of data.
Sources of data are of two types; these are the following –
Statistical Data
This type of data source refers to the collection of data that is used for official purposes, such as
a population census, official surveys, etc.
Non-Statistical Data
This type of data source refers to the collection of data that is used for various administrative
purposes, mainly in the private sector.
Different Sources of Data
Sources of data can also be classified based on their collection methods, which are –
Internal Sources of Data
In several cases for a certain analysis, data is collected from records, archives, and various other
sources within the organisation itself. Such sources of data are termed as internal sources of data.
Example:
A school is performing an analysis to figure out the highest marks achieved in class 8 science
subjects for the last 10 years.
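As a rough sketch of this example, assume the school's own records hold the class 8 science marks for each year. The dictionary below is hypothetical data invented for illustration (and shortened to three years); only the idea of analysing records kept within the organisation comes from the text.

```python
# Hypothetical internal records: year -> class 8 science marks (made-up values).
school_records = {
    2021: [78, 91, 85],
    2022: [88, 94, 67],
    2023: [90, 73, 82],
}

# Highest mark achieved in each year, taken from the school's own records.
highest_by_year = {year: max(marks) for year, marks in school_records.items()}

# Highest mark across all recorded years.
overall_highest = max(highest_by_year.values())

print(highest_by_year)
print(overall_highest)
```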
External Sources of Data
Data may also be collected from various sources outside the organisation for analytical purposes.
Such sources of data collection are known as external sources of data.
Example:
As a patient, you are analysing the price charts of your nearby hospitals for the treatment of an ulcer.
Check Your Progress –
Q. What are the Different Types of Data Sources?
Sources of data can be categorized on two bases, i.e. the purpose of data collection and the type
of data source. This can be explained with the help of an illustration given below –
Types of Data
Data can be classified into two types –
Primary Data
Data which is considered as first-hand information collected by a surveyor, investigator, etc. is
defined as Primary Data. The source from which such data is collected is termed the primary
source of data collection for the concerned information.
Moreover, data is regarded as primary only if it has never undergone any prior statistical treatment.
Such data is usually published, and more data is derived from the published source for other
purposes. For instance, a country’s population census is a collection of primary data.
Features of Primary Data
Primary Data has the following characteristics –
i. Such data is being collected for the first time.
ii. Primary Data is original and thereby more reliable than other types of data.
iii. This kind of data has not been used for any statistical analysis before.
Secondary Data
Data which has already been collected, analyzed, published and has undergone statistical treatment
can be defined as Secondary data. This type of data is derived from primary data sources.
However, this kind of data can also be collected by surveyors, investigators, etc. to conduct
statistical analysis in order to derive newer information.
For example, the address you enter in food delivery apps is a common example of
secondary data. Your address is not new information unless you have just purchased a property.
In that case, the address of your new property will be considered
primary data. This example illustrates the difference between primary and secondary sources
of data.
Features of Secondary Data
Secondary Data consists of the following features –
Secondary data is considered as ‘second-hand information’.
Secondary data is not original.
This kind of data has gone through statistical analysis at least once.
Secondary data is generally less reliable than primary data.
Another simple example of Secondary Data is information found on openly editable websites
such as Wikipedia, where any user can edit the content of any page at any given time, as per his
or her wish.
Methods of Collecting Data in Statistics
Data collection is a key procedure carried out by most analysts during research. As an analyst,
if you are unable to collect the necessary data for your research, your whole venture will lose its
credibility.
So, data collection is an essential element in statistical analysis; it is a challenging duty which
requires dedication, determination, proper planning and the capability to finish the assignment.
The first step of data collection is figuring out what kind of data is required and then starting
your analysis by collecting a sample, through a specific sampling method, from a certain part of
the population.
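The text does not name a specific sampling method. One common choice is simple random sampling, where every member of the population has the same chance of being selected. The sketch below assumes a hypothetical population of 100 numbered respondents and a sample size of 10; it uses Python's standard random module, and the fixed seed is there only to make the run repeatable.

```python
import random

# Hypothetical population: respondent IDs 1..100 (assumed for illustration).
population = list(range(1, 101))
SAMPLE_SIZE = 10

# A seeded generator keeps the example repeatable; drop the seed in real use.
rng = random.Random(42)

# Simple random sampling without replacement: no respondent is picked twice.
sample = rng.sample(population, SAMPLE_SIZE)

print(sorted(sample))
```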
There are various methods of data collection which can be classified as per the type of data
involved, which are –
A. Collection of Primary Data
Collection of Primary Data can be done through various methods, which are –
Direct Personal Investigation
In this method, surveyors or investigators collect the data themselves. This method is suitable for
small projects where the required data needs to be reliable and excessive effort is not mandatory.
Collection with the Help of Investigators
In this method, a single correspondent or a group of correspondents collects the data for the surveyor. These
correspondents are trained investigators who are employed for this course of action. This type of
data collecting method is useful for a large population.
Collection Assisted by Questionnaires
When the amount of data which is required to be collected is significantly large, questionnaires
are used to make the data collecting process easier. Questionnaires are nothing but a set of
questions which, when answered, provide the required data. Surveyors can also mail
questionnaires to the respondents for added convenience.
B. Collection of Secondary Data
Collection of secondary data is much easier than collecting primary data. Secondary data is
available from various sources, both published and unpublished.
However, the investigator of this kind of data must check that the data is reliable and suitable for
analysis, and whether any bias was involved during sampling of the said data.
Questions:
1. What are the Different Types of Data Sources Based on the Type?
Ans. Based on the type, sources of data are of two types, which are internal sources and external
sources.
2. State the Methods of Collecting Primary Data.