{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Computational Complexity and Sorting$^1$\n", "Computational complexity refers to the resources (for example time or memory) required by different algorithms. In this lab, you'll see how different algorithms can solve the same task, sorting, with very different computational complexities. Unlike other problems we will explore in this school, sorting is interestingly a problem where quantum algorithms provide no generic speedup. \n", "\n", "## In this lab, you'll learn:\n", "* How three different sorting algorithms are implemented\n", "* What Big O notation is, and how we can use it to compare the computational complexity of different algorithms\n", "\n", "$^1$ This excerise is a barely modified version of the amazing tutorial of Santiago Valdrrama in [\"Sorting Algorithms in Python\"](https://realpython.com/sorting-algorithms-python/)\n", "\n", "In order for this code to run properly, it is important that cells are executed in sequential order.\n", "\n", "# Preliminaries and function definitions\n", "\n", "First, lets import some functions and modules for use as we go along. Everything in this section shouldn't need modification. The most important one for you might be randint, so I'll define it here\n", "\n", "randint(a, b)\n", "\n", "\n", " Return a random integer N such that a <= N <= b. Alias for randrange(a, b+1).\n" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "from random import randint # (for sampling random integers)\n", "from timeit import repeat #(for performing repeated timings)\n", "import matplotlib.pyplot as plt #for plotting results\n", "import numpy as np #for numerical tricks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "To time your codes, I provided a function for you, run_sorting_algorithm. This function takes in the string name of a sorting algorithm, for example \"Hanks_very_fast_sort\" and a list \"Hanks_important_numbers\" via run_sorting_algorithm(\"Hanks_very_fast_sort\",Hanks_important_numbers, number_of_repeats, number_of_shots) and it will print out the minimum time taken to sort, obtained by taking \"number_of_shots\" measurements in batchs of \"batchs\". You don't need to worry how the inner workings of this function actually works. Just what it does." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "def run_sorting_algorithm(sortlist,algorithm=sorted, batches=3,shots=10):\n", "\n", " # Set up the context and prepare the call to the specified\n", " # algorithm using the supplied list called list. Only import the\n", " # algorithm function if it's not the built-in sorted().\n", " setup_code = f\"from __main__ import {algorithm}\" \\\n", " if algorithm != \"sorted\" else \"\"\n", "\n", " stmt = f\"{algorithm}({sortlist})\"\n", "\n", " # Execute the code \"number\" different times in batchs of \"repeat\" and return the time\n", " # in seconds that each execution took where number and repeat are gotten from the variables passed in\n", " times = repeat(setup=setup_code, stmt=stmt, repeat=batches, number=shots)\n", "\n", " # Finally, display the name of the algorithm and the\n", " # minimum time it took to run\n", "\n", " print(f\"Algorithm: {algorithm}. Minimum execution time: {min(times)} seconds\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python, like all sensible languages, has a built-in sorting algorithm sorted that many people have spent years optimizing. Let's get you familar with run_sorting_algorithm by testing it on sorted. To do this, in the next cell:\n", "* Define a variable that stories the size of the list you would like to sort. Set the size to be 10.\n", "* Define an list of the size of the variable just defined which has each element randomly sampled from the integers 0 to 1000. Hint: This is a good place to use randint\n", "* Call run_sorting_algorithm with as inputs: algorithm=sorted, the list you have created, batches=3, and shots=10" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Algorithm: sorted. Minimum execution time: 1.4000002011016477e-06 seconds\n" ] } ], "source": [ "# define a variable that stores the size of the list you would like to sort\n", "\n", "\n", "# create a list with random elements from 0 to 1000 with a length given by your variable above\n", "\n", "\n", "#call run_sorting_algorithm\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Hopefully, your code eventually executed without errors, and you obtained a single line output that looks something like:\n", "\n", " Algorithm: sorted. Minimum execution time: 7.719267159700394e-06 seconds\n", "\n", "Don't fret if you get a time different than 7.7 microseconds. Runtime is depends upon your CPU, the amount of RAM you have, and what other applications are running. For me, this was done on my five year old labtop while copious browser tabs, including youtube, zoom, and slack were open. It's for these reasons that minimum executation time is a better metric for your computer's speed than an average. While an average will fluctuation wildly depending on what everything else on your computer is doing, the minimum should better reflect the optimal case for you.\n", " \n", "\n", "* In the cell below, copy the code used to time sorted and retime for 3 different lists of length 10.\n", "* Are the times the same? Do they differ?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is one more reason that your times can differ, which is important to computational complexity. That is the list itself. Consider the two lists:\n", "\n", " [1,2,3,4,5,6,7,8,9,10] & [989,452,223,1,653,545,594,823,129,210] \n", "\n", "Which one do you think should take longer to sort, and why?\n", "\n", "In general, the computational complexity of an algorithm cannot be described by a single number. Instead, there are many different ones. In the example of these two lists, you probably suspected that the first list, having smaller numbers that are already completely sorted, would be sorted by faster than the second list, which has numbers of differing sizes and is mixed up. When a sorting algorithm can take advantage of such features, then instead of having a single, fixed computational complexity, it can have a best-case scaling, a worse-case scaling, and a so-called average-case scaling based on the typical list it is passed. \n", "\n", "Since the actual runtime depends on many factors, it isn’t necessarily the best way to compare the complexity of different algorithms. Another option is to consider the theoretical runtime by use of Big $O$ notation. Big $O$ can also be used to compare different algorithms and decide which one is the most efficient.\n", "\n", "# Measuring Efficiency With Big O Notation\n", "\n", "Take $n$ as the size of the input to an algorithm, the Big $O$ notation represents the relationship between $n$ and the number of steps required to find a solution, while neglecting overall factors. Big $O$ uses a capital letter “O” followed by this relationship inside parentheses. For example, $O(n)$ represents algorithms that execute a number of steps proportional to the size of their input.\n", "\n", "Here are five commonly occuring complexities:\n", "\n", "--$O(1)$: The runtime is independent of input size. This would be amazing to have! No matter how much data we throw at it, it takes the same time.\n", "\n", "--$O(n)$: The runtime grows linearly with the input. A function that looks at every item of a list would be $O(n)$.\n", "\n", "--$O(n^2)$ The runtime is a quadratic function of the input. A simple code for finding pairs of values in a list that are identical is $O(n^2)$.\n", "\n", "--$O(2^n)$ The runtime grows exponentially with input. These algorithms are considered extremely inefficient. Many problems where the best known classical algorithms are $O(2^n)$ are known to have quantum algorithms that are polynomial. \n", "\n", "--$O(\\log n)$ The runtime grows linearly while the size of the input grows exponentially." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Bogosort, or things one would never do\n", "\n", "Just to demonstrate how chosing a bad algorithm can lead to different complexity, consider perhaps the silliest algorithm one could propose for sorting a list, randomly shuffling it until you get a sorted list. A helpful thing for translating your ideas into code is so-called pseudocode. Pseudocode is an artificial and informal language that helps programmers develop algorithms. Pseudocode for bogosort might look something like:\n", "\n", " while list is not SORTED:\n", " shuffle(list)\n", "\n", "which a programmer would then translate into the specific language they hope to use.\n", "\n", "The average runtime complexity of this algorithm is $O((n+1)!)$, worse than even the $O(2^n)$ we discussed above. \n", "\n", "# The Bubble Sort Algorithm in Python\n", "\n", "Its name comes from the way the algorithm works: With every new pass, the largest element in the list “bubbles up” toward its correct position.\n", "\n", "Bubble sort consists of making multiple passes through a list, comparing elements one-by-one, and swapping adjacent items that are out of order. A small optimization is that at at the end of each bubble loop, you keep track of if any \"bubblings\" have been perfomed. If none occur, that means the list is sorted and you can stop.\n", "\n", "Translating these sentences into pseudocode (which won't run in python), we have:\n", "\n", " bubble_sort(list)\n", " integer n = length of list\n", " for i from 1 to n\n", " boolean already_sorted = TRUE\n", " for j from 1 to n-i-1\n", " if list[j] > list[j+1]\n", " swap list[j] and list[j+1]\n", " already_sorted = FALSE\n", " if already_sorted = TRUE\n", " break\n", " return list\n", " \n", "In the next cell, try to write a python function called bubble_sort which implements this pseudocode.\n", "* Define a function bubble_sort which takes in a list, and returns a sorted version. The body of this function should follow the pseudcode above\n", "* To debug, it may prove useful to print your list at each j-th step, to compare to the example case below" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To test your code, I've included the following code for you with a preset list, such that you can debug step-by-step. Starting from the list \$8,2,6,4,5\$, the steps of progression should be\n", "\n", " [2, 8, 6, 4, 5]\n", " \n", " [2, 6, 8, 4, 5]\n", " \n", " [2, 6, 4, 8, 5]\n", " \n", " [2, 6, 4, 5, 8]\n", " \n", " [2, 4, 6, 5, 8]\n", " \n", " [2, 4, 5, 6, 8]\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Generate an list of consisting five integers\n", "list = [8,2,6,4,5]\n", "\n", "# Call the function using the name of the sorting algorithm\n", "# and the list you just created\n", "run_sorting_algorithm(\"bubble_sort\",list,1,1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Measuring Bubble Sort’s Big O Runtime Complexity\n", "\n", "When running the bubble sort, most of the time is spend making comparisons between elements of the list, and therefore the total time spend is roughly proportional to how many comparisons you perform. The bubble sort consists of two nested for loops in which the algorithm performs $n - 1$ comparisons, then $n - 2$ comparisons, and so on until the final comparison is done. This comes at a total of $\\frac{1}{2}n^2 - \\frac{1}{2}n$ comparisons.\n", "\n", "Since numerical prefactors aren't important to Big O, and $n^2$ grows much faster than $n$ for large $n$ we are left with bubble sort having an average- and worst-case complexity of $O(n^2)$.\n", "\n", "If for some weird reason, the algorithm receives an already-sorted list, the runtime complexity becomes $O(n)$ because the algorithm visit every element only once.\n", "\n", "$O(n)$, then, is the best-case runtime complexity of bubble sort. But keep in mind that best cases are an exception, and you should focus on the average case when comparing different algorithms.\n", "\n", "\n", "Now, lets run your bubble_sort for some larger lists, to see if we can observe this behavior in the next cell:\n", "* Make a for loop that increments a variable i from 1 to 3\n", "* During the loop, set list length variable to be $10^i$\n", "* Create a list of the length $10^i$ whose elements are random integers between 0 and 1000\n", "* Use run_sorting_algorithm with 3 batches of 5 shots" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(1, 4):\n", " # set a length of list variable dependent on i\n", " \n", " #create your list of length 10^i random integers\n", " \n", " #Call run_sorting_algorithm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Plotting your results\n", "\n", "You should have obtained 4 times. \n", "\n", "* Putting them into the list \"bubble_results\" in the cell below\n", "* Running this cell will plot your times on a log-log scale\n", "* By comparing to the lines drawn, can you extract an empircal value of the runtime complexity? (Hint: which line is closest to your data over the entire range)\n", "\n", "When you plot your data, you will also observe four lines are being plotting on the log-log plot with different $n$ dependences. Are the results in agreement with your theoretical expectations?\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "bubble_results=[]\n", "\n", "list_sizes=[10,100,1000,10000]\n", "\n", "plt.rcParams[\"figure.figsize\"]=(10,8)\n", "plt.loglog(list_sizes, bubble_results,linestyle=\"None\", marker='o',label=\"Data\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),0)*bubble_results/10**0,label=\"$O(1)$\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),1)*bubble_results/10**1,label=\"$O(n)$\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),2)*bubble_results/10**2,label=\"$O(n^2)$\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),3)*bubble_results/10**3,label=\"$O(n^3)$\")\n", "plt.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Insertion Sort\n", "\n", "Like bubble sort, the insertion sort is straightforwardish. But unlike bubble sort, it builds the sorted list one element at a time by comparing each item with the rest of the list and inserting it at the correct position.\n", "\n", "Translating these words into pseudocode, we have\n", "\n", " insertion_sort(list)\n", " n = length of list\n", " for i from 1 to n\n", " current_elem = list[i]\n", " j= i-1\n", " while j >= 0 and list[j] > current_elem\n", " list[j+1] = list[j]\n", " j = j - 1\n", " list[j+1] = current_elem\n", " return list\n", "\n", "In the next cell, try to write a python function called insertion_sort which implements this pseudocode.\n", "\n", "* Define a function insertion_sort which takes in an list, and returns a sorted version. The body of this function should follow the pseudcode above\n", "* To debug, it may prove useful to print your list at each j-th step, to compare again with the example case below" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def insertion_sort(sort_list):\n", " # insert the python code that implements the pseudo code listed above\n", " \n", " \n", " return sort_list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To test your code, I've included the same list, such that if necessary you can debug step-by-step. Starting from the list [8,2,6,4,5], the steps of progression are\n", "\n", " [2, 8, 6, 4, 5]\n", " [2, 6, 8, 4, 5]\n", " [2, 4, 6, 8, 5]\n", " [2, 4, 5, 6, 8]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Generate an list of list_LENGTH items consisting\n", "# of random integer values between 0 and 999\n", "list = [8,2,6,4,5]\n", "\n", "# Call the function using the name of the sorting algorithm\n", "# and the list you just created\n", "run_sorting_algorithm(\"insertion_sort\",list,3,5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Measuring Insertion Sort’s Big O Runtime Complexity\n", "\n", "Comparing your insertion_sort to bubble_sort, you might think these are dramatically different algorithms. In fact, like bubble_sort, insertion_sort has two nested loops (one for and one while). Thus, it still has an $O(n^2)$ average runtime complexity.\n", "\n", "Although insertion_sort and bubble_sort have the same complexity, in practice, insertion_sort is often more efficient than bubble_sort because within the while loop, fewer average comparsions are required.\n", "\n", "Let's test these ideas in practice by running your insertion_sort for some larger lists, to see if we can observe this behavior in the next cell (Hint, you should be able to copy your previous code with a minor change):\n", "\n", "* Make a for loop that increments a variable i from 1 to 4\n", "* During the loop, set list length variable to be $10^i$\n", "* Create a list of the length $10^i$ whose elements are random integers between 0 and 1000\n", "* Use run_sorting_algorithm with 3 batches of 5 shots\n", "* Copy your times into the \"insertion_results\" list and check that it also follows $O(n^2)$ scaling\n", "* On the same plot, you should find your original bubble_sort data. Does one method appear faster or slower than the other?\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(1, 4):\n", " # length of list\n", " \n", " # create a list of random elements\n", " \n", " #Call run_sorting algorithm\n", " " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "insertion_results=[]\n", "\n", "list_sizes=[10,100,1000,10000]\n", "\n", "plt.rcParams[\"figure.figsize\"]=(10,8)\n", "plt.loglog(list_sizes, bubble_results,linestyle=\"None\", marker='o',label=\"Bubble Data\")\n", "plt.loglog(list_sizes, insertion_results,linestyle=\"None\", marker='o',label=\"Insertion Data\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),0)*insertion_results/10**0,label=\"$O(1)$\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),1)*insertion_results/10**1,label=\"$O(n)$\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),2)*insertion_results/10**2,label=\"$O(n^2)$\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),3)*insertion_results/10**3,label=\"$O(n^3)$\")\n", "plt.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The Merge Sort Algorithm in Python\n", "\n", "merge_sort is a much more efficient sorting algorithm. It’s based on the divide-and-conquer approach, a powerful algorithmic technique used to solve complex problems.\n", "\n", "To properly understand divide and conquer, you should first understand recursion. Recursion breaks a problem down into smaller subproblems until they’re small enough to manage. In programming, recursion is usually expressed by a function calling itself.\n", "\n", "Divide-and-conquer algorithms typically follow the same structure:\n", "\n", "1. The original input is broken into several parts, each one representing a subproblem that’s similar to the original but simpler.\n", "2. Each subproblem is solved recursively.\n", "3. The solutions to all the subproblems are combined into a single overall solution.\n", "\n", "In the case of merge_sort, the divide-and-conquer approach divides the list into two equal-sized parts, sorts each half recursively, and finally merges these two sorted parts into a single sorted list." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Understanding an Implement of Merge Sort\n", "\n", "The implementation of merge_sort needs two different pieces:\n", "\n", "1. A function that recursively splits the input in half\n", "2. A function that sorts the halves and recombines them\n", "\n", "Here’s the code to merge two different lists:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "def merge(left, right):\n", " # If the first list is empty, then nothing needs\n", " # to be merged, and you can return the second list as the result\n", " if len(left) == 0:\n", " return right\n", "\n", " # If the second list is empty, then nothing needs\n", " # to be merged, and you can return the first list as the result\n", " if len(right) == 0:\n", " return left\n", "\n", " result = []\n", " index_left = index_right = 0\n", "\n", " # Now go through both lists until all the elements\n", " # make it into the resultant list\n", " while len(result) < len(left) + len(right):\n", " # The elements need to be sorted to add them to the\n", " # resultant list, so you need to decide whether to get\n", " # the next element from the first or the second list\n", " if left[index_left] <= right[index_right]:\n", " result.append(left[index_left])\n", " index_left += 1\n", " else:\n", " result.append(right[index_right])\n", " index_right += 1\n", "\n", " # If you reach the end of either list, then you can\n", " # add the remaining elements from the other list to\n", " # the result and break the loop\n", " if index_right == len(right):\n", " result += left[index_left:]\n", " break\n", "\n", " if index_left == len(left):\n", " result += right[index_right:]\n", " break\n", "\n", " return result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the above function in place, the other piece is a function that recursively splits the input list in half and uses merge() to produce the final result:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "def merge_sort(list):\n", " # If the input list contains fewer than two elements,\n", " # then return it as the result of the function\n", " if len(list) < 2:\n", " return list\n", "\n", " midpoint = len(list) // 2\n", "\n", " # Sort the list by recursively splitting the input\n", " # into two equal halves, sorting each half and merging them\n", " # together into the final result\n", " return merge(\n", " left=merge_sort(list[:midpoint]),\n", " right=merge_sort(list[midpoint:]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Measuring Merge Sort’s Big O Complexity\n", "\n", "To analyze the complexity of merge_sort, you should consider the two steps separately:\n", "\n", "1. merge() receives two lists whose combined length is at most n, and it combines both lists by looking at each element at most once. This leads to a runtime complexity of $O(n)$.\n", "\n", "2. The outer function, merge_sort() splits the input list recursively and calls merge() for each half. Since the list is halved until a single element remains, the total number of halving operations performed by this function is $\\log_2(n)$. Since merge() is called for each half, we get a total runtime of $O(n\\log_2(n))$.\n", "\n", "Interestingly, $O(n\\log_2(n))$ is the best possible worst-case runtime that can be achieved by a sorting algorithm.\n", "\n", "* Let's run the same analysis of different sized lists for the merge_sort().\n", "* Copy the results into the \"merge_results\" list and plot.\n", "* Do the results best agree with an $O(n\\log_2(n))$ scaling? How does it compare to the other two sorting algorithms? Is it always faster? Sometimes?\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(1,5):\n", " list_LENGTH=10**i\n", "\n", " list = [randint(0, 1000) for i in range(list_LENGTH)]\n", "\n", " run_sorting_algorithm(\"merge_sort\",list,3,5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "merge_results=[]\n", "list_sizes=[10,100,1000,10000]\n", "\n", "plt.rcParams[\"figure.figsize\"]=(10,8)\n", "plt.loglog(list_sizes, bubble_results,linestyle=\"None\", marker='o',label=\"Bubble Data\")\n", "plt.loglog(list_sizes, insertion_results,linestyle=\"None\", marker='o',label=\"Insertion Data\")\n", "plt.loglog(list_sizes, merge_results,linestyle=\"None\", marker='o',label=\"Merge Data\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),0)*merge_results/10**0,label=\"$O(1)$\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),1)*merge_results/10**1,label=\"$O(n)$\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),1)*np.log2(np.list(list_sizes))*merge_results/(10**1*np.log2(10)),label=\"$O(n\\log_2(n))$\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),2)*merge_results/10**2,label=\"$O(n^2)$\")\n", "plt.loglog(list_sizes, np.power(np.list(list_sizes),3)*merge_results/10**3,label=\"$O(n^3)$\")\n", "plt.legend()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }