Saturday, 29 March 2014

Sorting and efficiency

This week we're supposed to do a post about sorting algorithms and their efficiency. Last semester in CSC108 we learned about some sorting algorithms like bubble sort, insertion sort and so on, but we didn't focus much on what makes them unique in terms of efficiency. For the purposes of this entry I'll be focusing on three specific sorting algorithms, selection sort, merge sort and quick sort.

Before I get in to it, lets discuss efficiency of algorithms. When we're talking about the efficiency of algorithms we use Big O notation. We write it like this O(x) where x represents whatever argument describes the algorithm's efficiency. For example, if we're given a list of size n, and an algorithm that checks each value in the list once, in general it would mean the algorithm's efficiency is O(n). If an algorithm compares each value in the list against every other value in the list, we would have an algorithm with efficiency O(n^2). This is a simple way of looking at it, but I hope it illustrates the point. Now to the algorithms!

We'll start with Selection Sort. I chose to look at selection sort first because it's one of the most basic sorting algorithms. Unfortunately it also happens to be one of the more inefficient algorithms that we use. What does selection sort do? Well, given an unordered list, we find the lowest element in the list and then place it in the beginning of the new list. Now repeat on all the remaining elements (a list of length n - 1). Eventually this will give you an ordered list. Sounds pretty simple and intuitive right? Well it is, but in order to find the lowest element in a list, we have to compare a given value against every other value in the list. As we said above (when discussing efficiency of algorithms), this would require n^2 separate checks, thus giving us an efficiency of O(n^2) for Selection Sort.

Lets move on to Merge Sort. This sorting algorithm is actually pretty cool, but harder to grasp then selection sort. Given a list, we divide the list up until we have sublists each consisting of one element. Now we compare the first element of each list (at the start it's just one) and append the smaller element to a new list (note, we could also append the larger, we just have to be consistent). Now we will have n/2 sublists of 2 elements. We repeat by merging these new sublists by comparing the first element of each list and appending the smaller one to a new list. Eventually we will have a merged and ordered list of the initial n elements. This algorithm generally takes lgn splits to split up a list of n elements, and n elements will be inspected lgn times (once for each split when merging). Although the efficiency is more like lgn + nlgn, we drop the smaller term when describing big O efficiency and are left with O(nlgn) efficiency for Merge Sort.

Finally lets discuss Quick Sort. The way it works is given an unordered list, the first element is taken, and used as a partition. Then every element in the list that is smaller than this element is put into one list, while everything larger is put into another list. We then repeat this process on the sublists until eventually (once we combine all these sublists from smallest to largest) we end up with a sorted list. Since this algorithm has n comparisons per split, this algorithm like merge sort has O(nlgn) efficiency...In the best case scenario. The risk with Quick Sort is that in the worst case, the split points may not be anywhere close to the middle of the list, and in fact could be all the way to the left or the right, resulting in incredibly uneven divisions. The result of this scenario is an algorithm with O(n^2) efficiency. 


Sunday, 2 March 2014

Recursion

Recursion! (shakes fist). My worst enemy and my best friend. Recursion has been the source of much anguish recently because it took me a long time to wrap my head around the idea of a function calling itself in order to do something. I found myself constantly thinking why is this so useful, whats the point? Well, recursion makes incredibly complex functions, incredibly easy to write. Recursion constantly breaks down bigger problems in to smaller versions of the original problem until it reaches the end of the problem, or the "base case." Then the result of these base cases are returned up the recursive chain back to the original call of the function. Something a friend told me which helped me to write recursive functions, was "just focus on the base cases, once you get those right the rest will sort itself out."

An example that really helped me grasp the concept of recursion was the fibonacci sequence. A fibonacci number is the sum of the previous two fibonacci numbers. Seems like a simple enough formula right? F(n) = F(n - 1) + F(n - 2). But how do we write a program to give us say the 5th fibonacci number we want? With recursion no doubt! This problem was interesting for me because it requires two base cases. By the definition of the fibonacci number we need two previous numbers in order to find the next. Therefor we need to create these base cases, the first fibonacci number (0) and the second (1). Once we have these numbers the rest just falls in to place. Now once we declare these base cases (if the first fibonacci number is requested, return 0, and if the second fibonacci number is requested, return 1) we can solve for any general fibonacci number n. This is where I see the beauty in recursion. So we want to find the 5th fibonacci number right? Well all we need is the 4th and 3rd number. We don't know the 4th yet but from our base cases we know that the 3rd is 1 since the 1st and 2nd add to 1. Now we know the 4th fibonacci number is 2 by adding the 3rd and 2nd fibonacci numbers. And finally we know the 5th fibonacci number is 3! How cool is that? With recursion we could find any fibonacci number we want. Writing recursive functions has been incredibly rewarding, efficient and just fun. Now That I have a grasp of it, it has become one of the most useful tools in my programming knowledge, and I can see how useful it is in the world of programming.