R programming

Pages
47
Uploaded on
07-04-2023
Written in
2022/2023

Introduction to R programming: complete details on R programming.

Unit 1
Introduction to R
1 Introduction
Statistical computing and large-scale data analysis called for a new category of computer language,
alongside the existing procedural and object-oriented programming languages, that would support these
tasks directly instead of requiring new software for each one. Plenty of data is available today, and it
can be analysed in different ways to provide useful insights for operations across many industries.
The lack of support, tools and techniques for varied data analysis has been addressed by the
introduction of one such language, called R.

1.1 History of R
Ross Ihaka and Robert Gentleman developed R as a free software environment for their teaching classes
when they were colleagues at the University of Auckland in New Zealand. Because they were both
familiar with S, a commercial programming language for statistics, it seemed natural to use similar
syntax in their own work. After Ihaka and Gentleman announced their software on the S-news mailing
list, several people became interested and started to collaborate with them, notably Martin Mächler.
Currently, a group of 18 people has rights to modify the central archive of source code. This group is
referred to as the R Development Core Team. In addition, many other people have contributed new
code and bug fixes to the project.

Here are some milestone dates in the development of R:

Early 1990s: The development of R began.

August 1993: The software was announced on the S-news mailing list. Since then, a set of active R
mailing lists has been created. The web page at www.r-project.org/mail.html provides descriptions of
these lists and instructions for subscribing.

June 1995: After some persuasive arguments by Martin Mächler (among others) to make the code
available as “free software,” the code was made available under the Free Software Foundation’s GNU
General Public License (GPL), Version 2.

Mid-1997: The initial R Development Core Team was formed (although, at the time, it was simply known
as the core group).

February 2000: The first version of R, version 1.0.0, was released. Ross Ihaka wrote a comprehensive
overview of the development of R. The web page
http://cran.r-project.org/doc/html/interface98-paper/paper.html provides a fascinating history.

1.2 What is R?
R is a programming language and environment for statistical computing, data science and graphics.
It was inspired by, and is mostly compatible with, the statistical language S developed at
Bell Laboratories (formerly AT&T, now Lucent Technologies). Although there are some very
important differences between R and S, much of the code written for S runs unaltered on R. R has
become so popular that it is widely regarded as one of the most important tools for computational
statistics, visualisation and data science.

1.3 Why R?
R has opened tremendous scope for statistical computing and data analysis. It provides techniques for
various statistical analyses such as classical tests, classification, time-series analysis, clustering,
linear and non-linear modelling, and graphical operations. The techniques supported by R are highly
extensible.
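As a rough illustration of the kinds of analysis listed above, the following sketch runs a classical test, a clustering and a simple time-series summary. It assumes only the iris and AirPassengers example datasets that ship with R; the choice of datasets and variables is ours and does not come from the original notes.

```r
# Classical test: compare mean sepal length of two iris species (Welch t-test).
two_species <- subset(iris, Species %in% c("setosa", "versicolor"))
two_species$Species <- droplevels(two_species$Species)
test_result <- t.test(Sepal.Length ~ Species, data = two_species)
print(test_result$p.value)

# Clustering: k-means with three clusters on the four numeric measurements.
set.seed(42)  # k-means starts from random centres, so fix the seed
clusters <- kmeans(iris[, 1:4], centers = 3)
print(table(clusters$cluster, iris$Species))

# Time-series analysis: autocorrelation of the built-in AirPassengers series.
air_acf <- acf(AirPassengers, plot = FALSE)
print(air_acf$acf[2])  # lag-1 autocorrelation (element 1 is lag 0)
```

Each call returns an ordinary R object (a list with named components), so results can be inspected or passed to further functions.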

S is the pioneer of statistical computing; however, it is a proprietary solution and is not readily available
to developers. In contrast, R is available freely under the GNU license. Hence, it helps the developer
community in research and development.

Another reason behind the popularity and widespread use of R is its superior support for graphics. It can
provide well-developed and high-quality plots from data analysis. The plots can contain mathematical
formulae and symbols, if necessary, and users have full control over the selection and use of symbols in
the graphics. Hence, other than robustness, user-experience and user-friendliness are two key aspects
of R.
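The point about formulae and symbols in plots can be sketched with base R graphics. The example below is only an illustration: the curve, the label position and the output file (a temporary PDF, chosen so the code also runs without a display) are assumptions of ours.

```r
# Draw to a temporary PDF so the sketch also runs without a display.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)

x <- seq(-3, 3, length.out = 200)
plot(x, dnorm(x), type = "l",
     xlab = expression(x),
     ylab = expression(f(x)),
     main = "Standard normal density")

# expression() is typeset as a formula via R's plotmath facility.
text(0, 0.15, expression(f(x) == frac(1, sqrt(2 * pi)) * e^{-x^2 / 2}))

# pch selects the plotting symbol; 17 is a filled triangle.
points(c(-1, 1), dnorm(c(-1, 1)), pch = 17, col = "red")

dev.off()
```

In an interactive session the pdf() and dev.off() calls can be dropped, and the plot appears in the default graphics window.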

Why Learn R?
The following points describe why the R language should be used (see the figure "Advantages of learning R language"):

If you need to run statistical calculations in your application, learn and deploy R. It easily integrates with
programming languages such as Java, C++, Python and Ruby.

If you wish to perform a quick analysis for making sense of data.

If you are working on an optimisation problem.

If you need to use re-usable libraries to solve a complex problem, leverage the 2000+ free libraries
provided by R.

If you wish to create compelling charts.

If you aspire to be a Data Scientist.

If you want to have fun with statistics.

R is free. It is available under the terms of the Free Software Foundation’s GNU General Public License in
source code form.

It is available for Windows, Mac and a wide variety of Unix platforms (including FreeBSD, Linux, etc.).

In addition to enabling statistical operations, it is a general programming language so that you can
automate your analyses and create new functions.

R has excellent tools for creating graphics such as bar charts, scatter plots, multipanel lattice charts, etc.

It has an object-oriented and functional programming structure, along with support from a robust and
vibrant community.

R has a flexible analysis toolkit, which makes it easy to access data in various formats, manipulate it
(transform, merge, aggregate, etc.), and subject it to traditional and modern statistical models (such as
regression, ANOVA, tree models, etc.).

R can be extended easily via packages. It relates easily to other programming languages. Existing
software as well as emerging software can be integrated with R packages to make them more
productive.

R can easily import data from MS Excel, MS Access, MySQL, SQLite, Oracle, etc. It can connect to
databases using ODBC (Open Database Connectivity) and packages such as ROracle.
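Several of the points above (manipulating data by transforming, merging and aggregating it, and writing your own functions) can be sketched in a few lines of base R. The two small tables, the unit price and the function name coefficient_of_variation are made up for illustration.

```r
# Two small, made-up tables standing in for imported data.
sales <- data.frame(
  store = c("A", "A", "B", "B", "C"),
  units = c(10, 14, 7, 9, 12)
)
stores <- data.frame(
  store  = c("A", "B", "C"),
  region = c("North", "South", "North")
)

# Merge: join the two tables on their shared "store" column.
combined <- merge(sales, stores, by = "store")

# Transform: add a derived column.
combined$revenue <- combined$units * 2.5  # 2.5 is an assumed unit price

# Aggregate: total units per region.
units_by_region <- aggregate(units ~ region, data = combined, FUN = sum)
print(units_by_region)

# A user-defined function that can be reused in later analyses.
coefficient_of_variation <- function(values) {
  sd(values) / mean(values)
}
print(coefficient_of_variation(combined$units))
```

With real data, the two data.frame() calls would typically be replaced by an import step such as read.csv() or a database query.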




Figure: Advantages of learning R language

1.4 Advantages of R over Other Programming Languages
Advanced programming languages like Python also support statistical computing and data visualisation
along with traditional computer programming. However, R wins the race over Python and similar
languages because of the following two advantages:

• Python needs third-party extensions for data visualisation and statistical computing, whereas R
provides much of this out of the box. For example, the lm function for linear regression is built
into R. Data can be passed directly to the function, which returns an object with detailed
information about the regression, including the coefficients, standard errors, residual values
and so on. Python has no built-in lm function; the same functionality has to be reproduced using
third-party libraries such as SciPy, NumPy and so on. Hence, R can do with a single line of code
what Python does with support from third-party libraries.
Note: SciPy is used for performing data analysis tasks and NumPy is used for representing the
data or objects.
• R has a fundamental data type, the vector, which can be organised and aggregated in different
ways even though the core is the same. The vector data type imposes some limitations on the
language because it is a rigid type. However, it gives R a strong logical base. Building on the vector
data type, R uses the concept of data frames, which are like a matrix with attributes and an internal
data structure similar to a spreadsheet or a relational database table. Hence, R follows a column-wise
data structure based on the aggregation of vectors.
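The two points above can be sketched together: vectors are combined column-wise into a data frame, and lm fits a regression on it in a single call. The numbers and the names hours, score, study and fit are invented for illustration.

```r
# Two plain vectors (made-up numbers for illustration).
hours <- c(1, 2, 3, 4, 5, 6)
score <- c(52, 55, 61, 64, 70, 73)

# A data frame is a list of equal-length vectors, one per column.
study <- data.frame(hours = hours, score = score)
str(study)

# Fit a linear regression with the built-in lm() function.
fit <- lm(score ~ hours, data = study)

print(coef(fit))                  # intercept and slope
print(summary(fit)$coefficients)  # estimates, standard errors, t and p values
print(residuals(fit))             # observed minus fitted values
```

The object returned by lm carries everything about the fit, so the accessor functions coef, summary and residuals all work from that single result.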

Note: R also has some disadvantages. For example, R does not scale efficiently to larger
data sets. Hence, the use of R is often limited to prototyping and sandboxing, and it is rarely used
for enterprise-level solutions. By default, R executes in a single thread and works on data held in
RAM, which leads to scalability issues as well. Developers from open-source
communities are working on these issues to make R capable of multi-threaded execution
and parallelisation, which will help R to utilise more than one processor core. There are big data
extensions from companies such as Revolution Analytics (Revolution R), and the issues are expected
to be resolved soon. Other languages like S-PLUS can store objects permanently on disk, hence
supporting better memory management and analysis of massive datasets.
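As a hedged sketch of what using more than one core can look like, the example below uses the parallel package that ships with R. It runs separate worker processes rather than threads, the task slow_square is a made-up placeholder, and the choice of two workers is an assumption about the machine.

```r
library(parallel)  # ships with R

# A deliberately slow, hypothetical task.
slow_square <- function(n) {
  Sys.sleep(0.1)
  n^2
}

# Start two worker processes (assumed to be available on this machine).
cluster <- makeCluster(2)

# Distribute the inputs 1..8 across the workers and collect the results.
results <- parLapply(cluster, 1:8, slow_square)

# Always shut the workers down when finished.
stopCluster(cluster)

print(unlist(results))
```

The data still has to fit in memory on each worker, so this addresses the single-thread limitation but not the RAM limitation described above.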

1.5 Benefits of Using R

Of the many attractive benefits of R, a few stand out: It’s actively maintained, it has good connectivity
to various types of data and other systems, and it’s versatile enough to solve problems in many
domains. Possibly best of all, it’s available for free, in more than one sense of the word.
1. It comes as free, open-source code: R is available under an open-source license, which means that
anyone can download and modify the code. This freedom is often referred to as “free as in
speech.” R is also available free of charge — a second kind of freedom, sometimes referred to as
“free as in beer.” In practical terms, this means that you can download and use R free of charge.
Another benefit, albeit slightly more indirect, is that anybody can access the source code, modify it,
and improve it. As a result, many excellent programmers have contributed improvements and fixes to
the R code. For this reason, R is very stable and reliable.
Any freedom also has associated obligations. In the case of R, these obligations are described in the
conditions of the license under which it is released: GNU General Public License (GPL), Version 2. The
full text of the license is available at www.r-project.org/COPYING. It’s important to stress that the GPL
does not pertain to your usage of R. There are no obligations for using the software — the obligations
