Pandas
Data Manipulation
• Data science covers everything needed to create and prepare data,
to manipulate, filter and cleanse it, and to analyse it.
• Pure Python, without any numerical modules, is poorly suited to the
numerical tasks that Matlab, R and other languages are designed for.
• One of the reasons behind Python’s popularity is its efficiency in:
• Data analysis, manipulation, processing, visualization, etc.
• Many of its libraries are written in C-like languages
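The steps named above (create, cleanse, filter, analyse) can be sketched with pandas roughly as follows. This is a minimal illustration, not a prescribed workflow: the toy data, the column names `city` and `temp_c`, and the choice to fill missing temperatures with the column mean are all assumptions made for the example.

```python
import pandas as pd

# Create: hypothetical toy data (column names are illustrative assumptions).
raw = pd.DataFrame({
    "city": ["Berlin", "Paris", None, "Rome", "Paris"],
    "temp_c": [21.5, None, 19.0, 27.0, 24.0],
})

# Cleanse: drop rows with no city, then fill missing temperatures
# with the mean of the remaining values (one possible strategy).
clean = raw.dropna(subset=["city"]).copy()
clean["temp_c"] = clean["temp_c"].fillna(clean["temp_c"].mean())

# Filter: keep only rows warmer than 20 degrees.
warm = clean[clean["temp_c"] > 20]

# Analyse: mean temperature per city.
mean_by_city = warm.groupby("city")["temp_c"].mean()
print(mean_by_city)
```

Each step returns a new object, so the raw data stays untouched and the intermediate results can be inspected separately.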
Dealing with Big Data
• The following concepts are associated with big data:
• volume: the sheer amount of data, whether it will be giga-, tera-, peta- or
exabytes
• velocity: the speed of arrival and processing of data
• veracity: uncertainty or imprecision of data
• variety: the many sources and types of data, both structured and unstructured
1. How useful is Python for these purposes?
If we used Python without any special modules, the language could
only perform poorly on all of these manipulation tasks.
• If we use Python in combination with its modules NumPy, SciPy, Matplotlib and Pandas, it
belongs to the top numerical programming languages.
• It is as efficient as Matlab or R, if not more so.
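The contrast between pure Python and Python with NumPy can be illustrated with a small sketch. It computes the same sum of squares twice, once with an interpreted loop and once as a vectorized NumPy expression. The input size and variable names are arbitrary choices for the example, and no timings are shown because they depend on the machine and on the library versions.

```python
import numpy as np

values = list(range(1_000))

# Pure Python: an explicit loop, executed step by step by the interpreter.
total_py = 0
for v in values:
    total_py += v * v

# NumPy: the same computation as one vectorized expression,
# carried out in compiled code on a typed array.
arr = np.array(values, dtype=np.int64)
total_np = int((arr * arr).sum())

print(total_py, total_np)
```

Both variants give the same result. On large arrays the vectorized form is usually the faster of the two, which can be checked locally with the standard `timeit` module.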