In these notes, I want to lay the
foundation of some core concepts you will
need throughout these notes. Let's
get started with the basics. So what is a
data structure? One definition that I really
like is: a data structure is a way of organizing
data so that it can be used efficiently. And
that's all a data structure really is: it is
a way of organizing data in some fashion so
that later on it can be accessed, queried,
or perhaps even updated quickly and easily.
So why data structures? Why are they important?
Well, they are essential ingredients in creating
fast and powerful algorithms. Another good
reason might be that they help us manage and
organize our data in a very natural way.
This last point is more of my own making,
and it's that they make code cleaner and easier
to understand. As a side note, one of the
major distinctions that I have noticed between
bad, mediocre, and excellent programmers is that
the ones who really excel are the ones who
fundamentally understand how and when to use
the appropriate data structure for the task
they're trying to finish. Data structures
can make the difference between having an
okay product and an outstanding one. It's
no wonder that every computer science undergraduate
student is required to take a course in data
structures. It may seem strange that before we even
begin talking about data structures, we
need to talk about the abstraction of data
structures. What I'm talking about is the
concept of an abstract data type. What is
an abstract data type, and how does it differ
from a data structure? Well, the answer is
that an abstract data type is an abstraction
of a data structure, which provides only the
interface to which that data structure must
adhere. The interface does not give any
specific details on how
the data structure should be implemented,
or in what programming language. One particular
example I like to give is to suppose that
your abstract data type is for a mode of transportation
to get from point A to point B. Well, as we
both know, there are many modes of transportation
to get from one place to another. So which
one do you choose? Some specific modes of
transportation might be walking or biking,
perhaps even taking a train, and so on. These
specific modes of transportation are
analogous to the data structures themselves.
We wanted to get from one place to another through
some mode of transportation; that was our
abstract data type. How exactly did we do it?
Well, that is the data structure.

Here are some examples of abstract data types,
paired with some of the possible underlying
implementations for each. As you can see, a
list can be implemented in two ways: you can
have a dynamic array or a linked list. They
both provide ways of adding, removing, and
indexing elements in the list.
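As a quick sketch of that idea, here is how the same List abstract data type can sit in front of both implementations, using Java's standard ArrayList (a dynamic array) and LinkedList (the class name ListAdtDemo is just an illustrative choice of mine):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListAdtDemo {
  public static void main(String[] args) {
    // Both classes implement the same List abstract data type,
    // so code written against the interface works with either one.
    List<String> dynamicArray = new ArrayList<>(); // dynamic array implementation
    List<String> linkedList = new LinkedList<>();  // linked list implementation

    for (List<String> list : List.of(dynamicArray, linkedList)) {
      list.add("a");                   // adding
      list.add("b");
      System.out.println(list.get(1)); // indexing: prints "b"
      list.remove("a");                // removing
    }
  }
}
```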
Next, we have the queue and map abstract data types, which
themselves can be implemented in a variety
of ways. Notice that under the implementation
for a queue, I put a stack based queue because
yes, you can have a queue, which is implemented
using only stacks. This may not be the most
efficient way of implementing a queue. But
it does work and it is possible. The point
here is that the abstract data type only defines
how a data structure should behave, and what
methods it should have, but not the details
surrounding how those methods are implemented.
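To make that concrete, here is a minimal sketch of the stack-based queue mentioned above. The SimpleQueue interface stands in for the abstract data type (the interface and class names are hypothetical choices of mine), and TwoStackQueue is one possible data structure behind it, built from two stacks:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// The abstract data type: it only says what a queue must do.
interface SimpleQueue<T> {
  void enqueue(T item); // add an item to the back
  T dequeue();          // remove the item at the front
  boolean isEmpty();
}

// One possible data structure behind it: a queue made of two stacks.
class TwoStackQueue<T> implements SimpleQueue<T> {
  private final Deque<T> inbox = new ArrayDeque<>();  // receives enqueued items
  private final Deque<T> outbox = new ArrayDeque<>(); // serves dequeued items

  @Override
  public void enqueue(T item) {
    inbox.push(item);
  }

  @Override
  public T dequeue() {
    // Lazily reverse 'inbox' into 'outbox' so the oldest item surfaces.
    if (outbox.isEmpty()) {
      while (!inbox.isEmpty()) {
        outbox.push(inbox.pop());
      }
    }
    return outbox.pop(); // throws NoSuchElementException when empty
  }

  @Override
  public boolean isEmpty() {
    return inbox.isEmpty() && outbox.isEmpty();
  }
}
```

Any other implementation, say one backed by a linked list or a circular array, could implement the same interface, and code written against SimpleQueue would not need to change.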
Alright, now that we're done with abstract
data types, we need to have a quick look at
the wild world of computational complexity,
to understand the performance that our data
structures are providing. As programmers,
we often find ourselves asking the same two
questions over and over again: how much time
does this algorithm need to finish, and how
much space does this algorithm need for its
computation? So, if
your program takes a lifetime of the universe
to finish, then it's no good. Similarly, if
your program runs in constant time, but requires
a space equal to the sum of all the bytes
of all the files on the internet,
your algorithm is also useless. So, to standardize
a way of talking about how much time and how
much space is required for an algorithm to
run, theoretical computer scientists have
invented Big O notation, amongst other things
like Big Theta, Big Omega, and so on. But
we're interested in Big O because it tells
us about the worst case. Big O notation only
cares about the worst case. So if your algorithm
sorts numbers, imagine the input is the worst
possible arrangement of numbers for your particular
sorting algorithm. Or, as a concrete example,
suppose you have an unordered list of unique
numbers, and you're searching for the number
seven, or the position where seven occurs
in your list. The worst case isn't when
seven is at the beginning or in the
middle of the list; it's when seven is the
very last element of the list. For that particular
example, the time complexity would be linear
with respect to the number of elements in
your list, because you have to traverse
every single element until you find seven.
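Here is that example as code (a hypothetical helper of my own, not anything prescribed by these notes):

```java
// Returns the index of 'target' in 'numbers', or -1 if it is absent.
// Worst case: 'target' is the very last element (or missing entirely),
// so all n elements are visited -- linear, O(n), time.
static int linearSearch(int[] numbers, int target) {
  for (int i = 0; i < numbers.length; i++) {
    if (numbers[i] == target) return i;
  }
  return -1;
}
```

For instance, linearSearch(new int[] {3, 1, 4, 1, 5, 7}, 7) only finds 7 at the final index, after checking everything before it.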
The same concept applies to space: you can
just consider the worst possible amount of
space your algorithm is going to need for
that particular input. There's also the fact
that Big O really only cares about what happens
when your input becomes arbitrarily large.
We're not interested in what happens when
the input is small. For this reason, you'll
see that we ignore things like constants and
multiplicative factors, so in Big O notation
you'll often see these come up again and again.
So to clarify one thing:
when I say n, n is usually always going to
be the size of the input coming into your algorithm;
there's always going to be some input
of size n. Constant time is written as
a one wrapped in a Big O: O(1). If your algorithm
takes a logarithmic amount of time to finish,
we say that's O(log(n)). If it's
linear, then we just say O(n). If it takes
quadratic time or cubic time, then we say
that's n raised to that power: O(n^2) or O(n^3).
If it's exponential, usually this is going to be something like
O(2^n), O(3^n), or in general O(b^n) for some base b greater
than 1. And then we also have factorial time,
O(n!). But we also have things in between
these, like O(sqrt(n)), O(log(log(n))),
O(n^5), and so on. Actually, almost
any mathematical expression containing n can
be wrapped in a Big O and is valid Big O notation.
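To give a feel for a few of these classes, here are small, self-contained Java sketches of my own (none of them come from these notes):

```java
// O(1): constant time -- a single array access, no matter how big n is.
static int firstElement(int[] a) {
  return a[0];
}

// O(n): linear time -- touches each of the n elements exactly once.
static long sum(int[] a) {
  long total = 0;
  for (int x : a) total += x;
  return total;
}

// O(n^2): quadratic time -- a nested loop over all pairs of elements.
static int countPairsSummingToZero(int[] a) {
  int count = 0;
  for (int i = 0; i < a.length; i++)
    for (int j = i + 1; j < a.length; j++)
      if (a[i] + a[j] == 0) count++;
  return count;
}

// O(log(n)): logarithmic time -- binary search halves the remaining
// range at every step (this one requires 'a' to be sorted).
static int binarySearch(int[] a, int target) {
  int lo = 0, hi = a.length - 1;
  while (lo <= hi) {
    int mid = lo + (hi - lo) / 2;
    if (a[mid] == target) return mid;
    else if (a[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1;
}
```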
Now, we want to talk about some properties
of Big O. To reiterate what we just saw,
Big O only really
cares about what happens when the input becomes
really big. So we're not interested in what happens when n
is small, only in
what happens when n goes to infinity. This
is how and why we get the first two properties.
The first is that we can simply remove constant
values added in our Big O notation, because
if you're adding a constant to infinity, well,
that's just infinity; and if you're multiplying
a constant by infinity, yeah, that's still
infinity.
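Written out, those two properties look like this (c stands for any constant greater than zero; the example values are my own):

```latex
O(n + c) = O(n), \qquad O(c \cdot n) = O(n), \qquad c > 0
% for example, O(2n + 30) = O(n)
```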