Assignment 2
Water is a basic necessity of human life. It plays an important role in
economic and social development around the world, and a pivotal role in
poverty alleviation by improving food security and the environment. The
availability of safe, clean water raises the standard of living, while a lack
of it poses serious health risks and lowers both living standards and life
expectancy.
Clean water is scarce where poverty is widespread and the population is large.
This is the situation in Tanzania.
Tanzania has many freshwater sources, such as lakes, rivers, streams, dams and
groundwater. However, these are not evenly distributed throughout the
country, and some areas lack both surface water and groundwater sources.
We use data from the Tanzanian Ministry of Water to predict whether each water
pump is functional, functional but in need of repairs, or non functional.
The dataset has features such as the location of the pump, water quality, source
type and extraction technique used. The training set has 59,401 rows and 41
features, including an output column. The output column gives the status of
the water pump as one of three categories: functional, functional needs repairs,
or non functional.
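As a first look at the prediction target, the class balance of the output column can be checked before any modelling. The sketch below is illustrative only: the inline sample is made up, and the column name `status_group` and the exact label strings are assumptions about how the real training file is laid out.

```python
import csv
import io
from collections import Counter

# Made-up sample standing in for the real training file (assumption: the
# real data is a CSV whose output column is called "status_group").
SAMPLE_CSV = """id,amount_tsh,status_group
1,6000.0,functional
2,0.0,non functional
3,25.0,functional
4,0.0,functional needs repairs
5,0.0,non functional
6,20.0,functional
"""

def class_distribution(csv_file, target="status_group"):
    """Return (row count, column count, Counter of target labels)."""
    reader = csv.DictReader(csv_file)
    rows = list(reader)
    n_cols = len(reader.fieldnames or [])
    counts = Counter(row[target] for row in rows)
    return len(rows), n_cols, counts

n_rows, n_cols, counts = class_distribution(io.StringIO(SAMPLE_CSV))
print(n_rows, n_cols)
for label, n in counts.most_common():
    # Share of each class; a strongly skewed split would affect model choice.
    print(f"{label}: {n} ({n / n_rows:.0%})")
```

With the real file, the `io.StringIO(SAMPLE_CSV)` argument would be replaced by an open file handle to the training CSV.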
Feature : Description - Attribute type
Id : Unique Id Number - Ordinal Attribute
Amount_tsh : Amount of water available to the waterpoint - Numeric Attribute.
Date_recorded : The date the row was entered - Interval Attribute.
Funder : Who funded the well – Nominal Attribute.
Gps_height : Altitude of the well - Ratio Attribute.
Installer : Organization that installed the well – Nominal Attribute.
Longitude : GPS coordinate - Ratio Attribute.
Latitude : GPS coordinate - Ratio Attribute.
Wpt_name : Name of the waterpoint if there is one – Nominal Attribute.
Num_private : Interval Attribute.
Basin : Geographic water basin – Nominal Attribute.
Subvillage : Geographic location – Nominal Attribute.
Region : Geographic location - Ordinal Attribute.
Region_Code : Geographic location (coded) - Interval Attribute.
District_Code : Geographic location (coded) - Interval Attribute.
Lga : Geographic location - Ordinal Attribute.
Ward : Geographic location - Ordinal Attribute.
Population : Population around the well - Interval Attribute.
Public_meeting : True/False - Binary Attribute.
Recorded_by : Group entering this row of data – Nominal Attribute.
Scheme_management : Who operates the waterpoint - Nominal Attribute.
Scheme_name : Who operates the waterpoint - Nominal Attribute.
Permit : Whether the waterpoint is permitted (True/False) - Binary Attribute.
Construction_year : Year the waterpoint was constructed - Ordinal Attribute.
Extraction_type : The kind of extraction the waterpoint uses – Nominal Attribute.
Extraction_type_group : The kind of extraction the waterpoint uses – Nominal Attribute.
Extraction_type_class : The kind of extraction the waterpoint uses – Nominal Attribute.
Management : How the waterpoint is managed – Nominal Attribute.
Management_group : How the waterpoint is managed – Nominal Attribute.
Payment : What the water costs - Ordinal Attribute.
Payment_type : What the water costs - Ordinal Attribute.
Water_quality : The quality of the water - Ordinal Attribute.
Quality_group : The quality of the water - Ordinal Attribute.
Quantity : The quantity of the water - Ordinal Attribute.
Quantity_group : The quantity of the water - Ordinal Attribute.
Source : The source of the water – Nominal Attribute.
Source_type : The source of the water – Nominal Attribute.
Source_class : The source of the water – Nominal Attribute.
Waterpoint_type : The kind of waterpoint – Nominal Attribute
Waterpoint_type_group : The kind of waterpoint – Nominal Attribute
Status_group : Functionality – Nominal Attribute
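The attribute types above matter mainly because they suggest how each feature could be prepared for a classifier. The following is a rough sketch, not a prescribed pipeline: only a handful of features from the list are included, and the `ENCODING_BY_TYPE` mapping is a hypothetical choice of preprocessing step per attribute type.

```python
from collections import defaultdict

# A few entries copied from the feature list; the rest would be added
# the same way.
ATTRIBUTE_TYPES = {
    "Funder": "nominal",
    "Gps_height": "ratio",
    "Longitude": "ratio",
    "Population": "interval",
    "Permit": "binary",
    "Payment": "ordinal",
    "Basin": "nominal",
}

# Hypothetical preprocessing step per attribute type (an assumption made
# for illustration, not something stated in the assignment).
ENCODING_BY_TYPE = {
    "nominal": "one-hot encode",
    "ordinal": "integer-encode in rank order",
    "binary": "map to 0/1",
    "interval": "scale",
    "ratio": "scale",
}

def group_by_type(attribute_types):
    """Group feature names by their attribute type, keeping list order."""
    groups = defaultdict(list)
    for feature, kind in attribute_types.items():
        groups[kind].append(feature)
    return dict(groups)

groups = group_by_type(ATTRIBUTE_TYPES)
for kind, features in groups.items():
    print(f"{kind:8s} -> {ENCODING_BY_TYPE[kind]}: {', '.join(features)}")
```

Grouping features this way makes it easy to apply one transformation per group rather than handling each of the 41 columns separately.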