
Big O for Python Data Structures

Lists:

In Python, lists act as dynamic arrays and support a number of common operations through methods called on them. The two most common operations performed on a list are indexing and assigning to an index position; both are designed to run in constant time, O(1).

Let's imagine you wanted to test different methods of constructing a list that is [0,1,2...10000]. Let's go ahead and compare several approaches, such as appending to the end of a list, concatenating lists, or using tools such as casting and list comprehensions.

For example:

def method1():
    l = []
    for n in range(10000):
        l = l + [n]

def method2():
    l = []
    for n in range(10000):
        l.append(n)

def method3():
    l = [n for n in range(10000)]

def method4():
    l = range(10000) # Python 3: list(range(10000))

> %timeit method1()
%timeit method2()
%timeit method3()
%timeit method4()

1 loop, best of 3: 197 ms per loop
1000 loops, best of 3: 771 µs per loop
1000 loops, best of 3: 407 µs per loop
The slowest run took 7.05 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 308 ns per loop

We can clearly see that the most effective method is the built-in range() function in Python! Keep in mind that in Python 3, range() returns a lazy range object rather than a list, which is why method4's timing is measured in nanoseconds; materializing it with list(range(10000)) takes time comparable to the list comprehension, though still far less than repeated concatenation.

It is important to keep these factors in mind when writing efficient code. More importantly, begin thinking about how we are able to index with O(1). We will discuss this in more detail when we cover arrays in general. For now, take a look at the table below for an overview of Big-O efficiencies.
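As an illustrative sketch (not from the original post) of how these costs differ in practice, we can time an O(1)-per-call operation, appending to the end of a list, against an O(n)-per-call operation, inserting at the front, which must shift every existing element:

```python
import timeit

# Append to the end is O(1) amortized; insert at index 0 is O(n),
# since all existing elements must shift one position to the right.
append_time = timeit.timeit('l.append(0)', setup='l = []', number=10_000)
insert_time = timeit.timeit('l.insert(0, 0)', setup='l = []', number=10_000)

# 10,000 front-inserts do ~50 million element moves in total,
# so they take far longer than 10,000 appends.
print(f'append: {append_time:.6f}s  insert(0): {insert_time:.6f}s')
```

The exact timings depend on your machine, but the front-insert total should be dramatically larger, because its cost grows with the length of the list on every call.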


Dictionaries:

Something that is pretty amazing is that getting and setting items in a dictionary are, on average, O(1)! Hash tables are designed with efficiency in mind, and we will explore them in much more detail later on in the course as one of the most important data structures to understand. In the meantime, refer to the table below for Big-O efficiencies of common dictionary operations:
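A quick sketch of those constant-time operations (illustrative example, not from the original post):

```python
# Getting and setting items in a dict are average-case O(1),
# because the key's hash locates its slot directly.
prices = {}
prices['apple'] = 1.50       # set item: O(1)
prices['banana'] = 0.25      # set item: O(1)

assert prices['apple'] == 1.50       # get item: O(1)
assert 'banana' in prices            # membership test: O(1)
assert prices.get('cherry', 0) == 0  # lookup with a default, also O(1)

# Contrast with a list, where membership testing is O(n):
names = ['apple', 'banana']
assert 'banana' in names  # scans the list element by element
```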




Worst Case vs Best Case

Many times we are only concerned with the worst possible case of an algorithm, but in an interview setting it's important to keep in mind that the worst case and best case scenarios may have completely different Big-O times. For example, consider the following function:

def matcher(lst,match):
    '''
    Given a list lst, return a boolean indicating if match item is in the list
    '''
    for item in lst:
        if item == match:
            return True
    return False

> lst = [1,2,3,4,5,6,7,8]
matcher(lst,1)

True

> matcher(lst,11)

False

Note that in the first scenario, the best case was actually O(1), since the match was found at the first element. In the case where there is no match, every element must be checked; this results in a worst case time of O(n). Later on we will also discuss average case time.

Big-O notation

Big-O notation describes how quickly runtime grows relative to the input as the input gets arbitrarily large.

Now we want to develop a notation to objectively compare the efficiency of the two summing algorithms from the previous lecture. A good place to start would be to compare the number of assignments each algorithm makes.


The original sum1 function performs an assignment n+1 times, as we can see from its range-based loop: it assigns the final_sum variable n+1 times. We can then say that for a problem of size n (in this case just a number n) this function takes 1+n steps.

This n notation allows us to compare solutions and algorithms relative to the size of the problem, since sum1(10) and sum1(100000) would take very different times to run while using the same algorithm. We can also note that as n grows very large, the +1 won't have much effect. So let's begin discussing how to build a syntax for this notation.

Now we will discuss how we can formalize this notation and idea.

Big-O notation describes how quickly runtime will grow relative to the input as the input gets arbitrarily large.

Let's examine some of these points more closely:

  • Remember, we want to compare how quickly runtime grows, not exact runtimes, since those can vary depending on hardware.
  • Since we want to compare across a variety of input sizes, we are only concerned with runtime growth relative to the input. This is why we use n in the notation.
  • As n gets arbitrarily large, we only worry about the terms that grow the fastest as n gets large; for this reason, Big-O analysis is also known as asymptotic analysis.
As for syntax, sum1() can be said to be O(n), since its runtime grows linearly with the input size. In the next lecture we will go over more specific examples of various O() types. To conclude this lecture, we will show the potential for vast differences in the runtimes of Big-O functions.
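As a taste of how vast those differences become (an illustrative sketch, not from the original lecture), we can compare the step counts of a few common Big-O classes for growing input sizes:

```python
import math

# Step counts for common Big-O classes at a few input sizes.
# Even at n = 1000, O(n^2) is about 100x worse than O(n log n).
for n in (10, 100, 1000):
    print(f'n={n:>5}  O(n)={n:>7}  '
          f'O(n log n)={int(n * math.log2(n)):>9}  '
          f'O(n^2)={n**2:>9}')
```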

Introduction to Algorithm analysis and Big "O"

Why Algorithm Analysis:

Before we begin, let's clarify what an algorithm is. In this course, an algorithm is simply a procedure or formula for solving a problem. Some problems are famous enough that their algorithms have names, and some procedures are common enough that the algorithms associated with them also have names. So now we have a good question to answer:

How do we analyze algorithms, and how can we compare algorithms against each other?

Imagine if you and a friend both came up with functions to sum the numbers from 0 to N. How would you compare the functions and the algorithms within them? Let's say you both came up with these two separate functions:
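The two functions themselves are cut off in this excerpt. A pair consistent with the description elsewhere in these notes (a loop that assigns final_sum n+1 times, versus a closed-form alternative) might look like this; the names and exact bodies are a hypothetical reconstruction:

```python
def sum1(n):
    '''Sum 0..n with a loop: final_sum is assigned n+1 times, so O(n).'''
    final_sum = 0
    for x in range(n + 1):
        final_sum += x
    return final_sum

def sum2(n):
    '''Sum 0..n with Gauss's formula: a fixed number of steps, so O(1).'''
    return n * (n + 1) // 2

# Both produce the same answer, but with very different step counts.
assert sum1(10) == sum2(10) == 55
```

Both are correct, which is exactly why we need a way to compare algorithms beyond "does it work."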