COSC 1P03 Tutorial Exercise 10 Apr. 4-7, 2022 Sorting — QuickSort pivot selection Background: In lecture, we had a brief introduction to QuickSort. Your lab covered (or will cover) how to code the...

1 answer below »
Its about quick sorting. Answer the questions of the last page. That's it!


COSC 1P03 Tutorial Exercise 10 Apr. 4-7, 2022 Sorting — QuickSort pivot selection Background: In lecture, we had a brief introduction to QuickSort. Your lab covered (or will cover) how to code the algorithm, but it glosses over one point: how to select the pivot. • Recall that the QuickSort algorithm relies on taking a list of unsorted values, selecting some value as the 'pivot', and then partitioning the records into three groups: the records that precede the pivot, the pivot, and the records following it For example, without even considering indices at all: Suppose we were to select OCP as our pivot, we'd end up with: Recursing first on the left side, and choosing CAB; and then on the right side, and choosing TCP; would yield: And, of course, recursing into the remaining slices (e.g. ABA) would identify them as 'trivial cases', so they're sorted as well. For these particular records, and choices of pivots, everything was very 'balanced'. • i.e. the problem was broken up into two equal-sized 'sub-problems' at each step, meaning the execution tree had a maximum depth bounded at log(N) But what if we'd chosen poorly? COSC 1P03 Tutorial 10 Pivot selection — picking the 'last' record: Recall that, for each call, we'll have some 'lower-bound' and some 'upper-bound', marking the left and right boundaries of that slice to sort. Before getting into specific strategies for pivot selection, it's common to go with the easiest: picking the record in that upper-bound position. e.g.: But what would happen if we were to try the exact same algorithm, and sort the same (already-sorted) records? That is… far less impressive. COSC 1P03 Tutorial 10 But it makes sense, right? We're picking a record, and then breaking up the slice into everything before that record, that record, and what comes after it. If the array (and by extension, that slice) is already sorted, and we pick the right-most record within the slice, then clearly nothing could come after it, right? That means, at each step, we're decreasing the length of the problem by one. This means two things: • The maximum depth of recursion will be O(N) • Since each depth has a cost of O(N), the total algorithm is O(N×N), i.e. O(N²) Whether or not this is 'a problem' ostensibly depends on the expected usage. Sorting an already-sorted list can be quite common: for some applications, minor operations might have only slightly disorganized the array. For those cases, this would be catastrophic. To answer the question of how to mitigate the risk here, first consider why the initial example worked out so well: because it always picked a pivot that evenly-divided the remaining records within the slice. Can we force that to happen? Picking the median record: By coincidence, each 'rightmost record' happened to be the median record of that slice: • OCP was the median (or middle) record of the whole array • CAB was the median of FDD, ABC, and CAB • TCP was the median of XYZ, RNA, and TCP So what if we were to simply manually search for the median record within each slice? • This is the first question for the exercise at the end • Consider what would be involved with finding that median value ◦ Definitely give this some thought, before the TA explains it For now, we'll assume this is too expensive, and look for alternatives. Picking the median position: What if we were to simply pick the 'middle record' in the most literal sense possible? i.e. pivot_position = ( lower_bound + upper_bound ) ÷ 2 This is a more interesting question to answer, because its expected efficacy also depends on the use case: • By definition, it must solve the problem of 'sorting a sorted list' ◦ Since it's sorted, the 'middle record' must also be the 'median record' • But what about for unsorted values? Can we think of, say, a possible sequence of values where picking the 'middle values' will be far worse than a depth of O(log(N))? This is the second question for the exercise. COSC 1P03 Tutorial 10 Assuming you fiddled with the numbers long enough, you probably made some decent progress towards recognizing that there are some sequences that don't 'play as nicely' as others. And that's the real catch: there are an infinite number of sorting tasks wherein picking the median position would work terribly. Of course, there's also an infinite number where it'd work very well. But the point is even this approach leads to a worst-case scenario of O(N²) overall. And that's not really going to change. Our immediate goal isn't to guarantee a worst possible bound of O(N·log(N)). Rather, in the absence of additional domain knowledge, we'll commonly target the average case. This is counterintuitive to how we learned asymptotic complexity, where we normally focused on the upper-bounds, but sometimes performance in practice is what matters. Median of three: We've agreed that it isn't computationally feasible to pick the median record of a slice. But what if we simply tried to minimize the damage? The median of three algorithm is a very simple, but powerful, pivot selection scheme. For a given slice, you compare exactly three records: the leftmost, the rightmost, and the middle record ('middle' referring to the position). You choose as the pivot the median record of those three. As an exercise, and the third question at the end, re-attempt your selection of integers from the second question, but use median of three instead. COSC 1P03 Tutorial 10 Name: _____________________________________________ St. #: _____________________________________________ Picking the median value Is searching each slice for the 'median record' a good idea? Why or why not? Picking the middle value Can you think of a sequence of integers, such that picking the 'middle value' as the pivot would be a bad idea? You don't have to worry about getting to O(N) for the recursion depth. Just a few lines is enough to show how things can go wrong. Using the median of three Retry your numbers from the previous question, but use median of three this time. Did it help?
Answered Same DayApr 03, 2022

Answer To: COSC 1P03 Tutorial Exercise 10 Apr. 4-7, 2022 Sorting — QuickSort pivot selection Background: In...

Gaurav answered on Apr 04 2022
99 Votes
1:
It depends on the algorithm which will be used to find the median element, suppose if we able to
get the median element in O(n) time the during the recursion process, then the time complexity of whole sorting process would be: t(n/2) + t(n/2) + O(n) + O(n)
And whole process depends on the time...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here