CSCE 411 Design and Analysis of Algorithms Project 1 I. Some Basics on Line Fitting Suppose that there is a set P of n points in the two-dimensional plane, denoted by (x1, y1), (x2, y2), · · · , (xn,...

1 answer below »
I attached the file.


CSCE 411 Design and Analysis of Algorithms Project 1 I. Some Basics on Line Fitting Suppose that there is a set P of n points in the two-dimensional plane, denoted by (x1, y1), (x2, y2), · · · , (xn, yn), and suppose x1 < x2="">< ·="" ·="" ·="">< xn.="" given="" a="" line="" l="" defined="" by="" the="" equation="" y="ax+" b,="" we="" say="" that="" the="" error="" of="" l="" with="" respect="" to="" p="" is="" the="" sum="" of="" its="" squared="" “distances”="" to="" the="" points="" in="" p="" :="" error(l,p="" )="n∑" i="1" (yi="" −="" axi="" −="" b)2.="" the="" above="" error="" shows="" how="" well="" the="" line="" l="" approximates="" the="" points="" in="" p="" :="" the="" smaller="" the="" error="" is,="" the="" better="" it="" fits.="" we="" show="" an="" example="" in="" fig.="" 1.="" in="" this="" example,="" there="" are="" n="7" points,="" and="" a="" line="" that="" optimally="" fits="" them="" is="" shown.="" fig.="" 1.="" a="" line="" that="" optimally="" fits="" 7="" points.="" it="" is="" known="" that="" when="" n="" ≥="" 2,="" given="" the="" set="" of="" points="" p="" ,="" the="" line="" l="" that="" minimizes="" the="" error="" error(l,p="" )="" is="" y="ax+" b="" with="" a="n" ∑n="" i="1" xiyi="" −="" (="" ∑n="" i="1" xi)(="" ∑n="" i="1" yi)="" n="" ∑n="" i="1" x="" 2="" i="" −="" (="" ∑n="" i="1" xi)="" 2="" and="" b="∑n" i="1" yi="" −="" a="" ∑n="" i="1" xi="" n="" .="" if="" n="1," then="" any="" line="" passing="" through="" the="" only="" point="" trivially="" fits="" the="" point="" perfectly="" well,="" and="" the="" error="" would="" be="" error(l,p="" )="0." in="" the="" following,="" given="" a="" set="" of="" n="" ≥="" 1="" points="" p="" ,="" let’s="" use="" ε(p="" )="" to="" denote="" the="" minimum="" error="" that="" a="" line="" fitting="" them="" can="" achieve.="" (clearly,="" we="" can="" achieve="" the="" minimum="" error="" ε(p="" )="" by="" using="" the="" solution="" above.)="" fig.="" 2.="" two="" lines="" that="" fit="" 12="" points.="" fig.="" 3.="" three="" lines="" that="" fit="" 18="" points.="" ii.="" multi-line="" fitting="" problem="" sometimes="" it="" is="" hard="" to="" fit="" points="" well="" using="" just="" one="" line.="" instead,="" it="" is="" more="" natural="" to="" fit="" the="" points="" with="" multiple="" lines.="" for="" example,="" for="" the="" points="" in="" fig.="" 2,="" it="" is="" more="" natural="" to="" fit="" them="" using="" two="" lines.="" and="" for="" the="" points="" in="" fig.="" 3,="" it="" is="" more="" natural="" to="" fit="" them="" using="" three="" lines.="" it="" is="" not="" always="" simple,="" however,="" to="" know="" how="" many="" lines="" are="" the="" best="" choice="" beforehand.="" on="" the="" other="" hand,="" if="" there="" is="" no="" restriction="" on="" how="" many="" lines="" can="" be="" used,="" it="" would="" become="" trivial="" to="" fit="" the="" points="" perfectly:="" just="" have="" a="" different="" line="" pass="" through="" each="" pair="" of="" consecutive="" points="" in="" p="" .="" but="" that="" would="" make="" the="" problem="" meaningless.="" the="" problem="" is="" essentially="" a="" “modeling”="" problem:="" we="" want="" to="" model="" a="" set="" of="" “observed="" points”="" using="" a="" simple="" yet="" accurate="" model,="" and="" in="" this="" case,="" the="" model="" is="" a="" set="" of="" lines,="" each="" for="" a="" different="" interval="" of="" x.="" if="" too="" many="" lines="" are="" used,="" the="" model="" would="" not="" be="" “simple”="" any="" more.="" thus,="" intuitively,="" we="" would="" like="" a="" problem="" formulation="" that="" requires="" us="" to="" fit="" the="" points="" well,="" using="" as="" few="" lines="" as="" possible.="" let’s="" formulate="" our="" problem,="" which="" we="" shall="" call="" the="" “multi-line="" fitting="" problem”:="" input:="" we="" have="" the="" following="" inputs:="" 1)="" a="" set="" of="" n="" points="" in="" a="" 2-dimensional="" plane="" p="{(x1," y1),="" (x2,="" y2),="" ·="" ·="" ·="" ,="" (xn,="" yn)}="" with="" x1="">< x2="">< ·="" ·="" ·="">< xn.="" we="" shall="" use="" pi="" to="" denote="" the="" point="" (xi,="" yi),="" for="" i="1," 2,="" ·="" ·="" ·="" ,="" n.="" 2)="" a="" real="" number="" c=""> 0. Output: A partition of the points P into k segments: P1 = {p1, p2, · · · , pm1} P2 = {pm1+1, pm1+2, · · · , pm2} P3 = {pm2+1, pm2+2, · · · , pm3} ... Pk = {pmk−1+1, pmk−1+2, · · · , pmk = pn} that minimizes the cost function cost(P,C) = ε(P1) + ε(P2) + · · ·+ ε(Pk) + Ck. In the cost function cost(P,C), the part ε(P1) + ε(P2) + · · · + ε(Pk) is the total error of using k lines to fit the n points, and Ck is the penalty for using k lines (namely, for partitioning the n points into k segments). The greater k is, the greater the penalty term Ck is. Note that the value of k, as well as the turning points pm1 , pm2 , · · · , pmk−1 , are what an algorithm solving the problem needs to decide on. III. What to Submit In this project, you need to submit three items: (1) an algorithm report, (2) a program that implements your algorithm, (3) output of your program on test instances. A. Submission Item One: An Algorithm Report First, you are to design an efficient algorithm for the “Multi-Line Fitting Problem”, and submit it as a report. The report is just like what we usually do for algorithm design in a homework, and should have the four elements as usual: 1) The main idea of your algorithm. 2) The pseudo code of your algorithm. 3) The proof of correctness for your algorithm. 4) the analysis of time complexity for your algorithm. B. Submission Item Two: A Program That Implements Your Algorithm You need to implement your algorithm using a commonly used programming language, such as Python, C++, Java, etc. We encourage you to use Python if possible, but other languages are fine, too. Your program will take a list of instances of the “Multi-Line Fitting Problem” as input, and return a list of solutions (corresponding to those instances) as its output. To test your program, we will provide you with input instances as a “dictionary” in Python (after you have submitted the program), and ask you to submit the output solutions as a “dictionary” in Python. (Details on the two files, including their formats, will be explained in the next subsection for “Submission Item Three”.) When you submit your program, you need to explain clearly how to run your code. If your programming language is not Python, then you also need to include an explanation on how you turned the provided “dictionary” of “input instances” in Python to your programming language, and how you turned your “output solutions” in your programming language to a “dictionary” in Python (following the formats specified in the next subsection). C. Submission Item Three: Output of Your Program on Test Instances To test your algorithm/program, we will provide you with three sets of “input instances”, all in the Python language: 1) A set of instances of relatively small sizes. 2) A set of instances of medium sizes. 3) A set of instances of relatively large sizes. Of course, if your algorithm/program is correct, then you should get the correct solutions for all three sets of input instances. However, an algorithm is not only about correctness, but also about efficiency. If your algorithm’s time complexity is low, then computing should be easy. (When we tested the algorithm in Google Colab using Python, the set of small instances took less than 1 minute 20 seconds, the set of medium-sized instances took about 14 minutes 30 seconds, and the set of large instances took about 13 minutes 30 seconds.) However, if your algorithm’s time complexity is high, you may experience much longer running times, or may not be able to get the solutions to the large instances. To give you an idea what the three sets of input instances are like, we provide three “example” files (down- loadable from our course webpage): “examples_of_small_instances”, “examples_of_medium_instances”, “exam- ples_of_large_instances”. They are all Python dictionaries, whose format will be explained below. You can down- load them by clicking on the links in our course webpage. After downloading each file, you can open it using pickle.load(open(filePath, ‘rb’)), where “filePath” is the path of your downloaded file. (Of course, these three files are different from the three files we will send you for testing your program. But they are similar.) After you run your program on the three sets of “input instances”, save your results as three Python “dictionaries” in three files: 1) File “small_solutions” corresponding to the input instances of small sizes. 2) File “medium_solutions” corresponding to the input instances of medium sizes. 3) File “large_solutions” corresponding to the input instances of large sizes. Now let’s explain the formats of the “input instances” and “output solutions”. The “input instances” is a “dictionary” (in Python) that contains 4 key-value pairs (in the following, let’s use m to denote the number of provided instances of the “Multi-Line Fitting Problem”): 1) “key” is “n_list”, “value” is a list of m numbers. For i = 0, 1, · · · ,m − 1, n_list[i] is the number of points in the i-th instance (that are to be fitted by lines). (Note that here and in the following, the points will be indexed by 0, 1, 2, · · · instead of by 1, 2, 3, · · · .) 2) “key” is “ x_list, “value” is a list of m items. For i = 0, 1, · · · ,m − 1, x_list[i] is a list of n_list[i] numbers. For i = 0, 1, · · · ,m − 1 and j = 0, 1, · · · , n_list[i] − 1, x[i][j] is the x-coordinate of the j-th point in the i-th instance. Here x[i][0] < x[i][1]="">< x[i][2]="">< ·="" ·="" ·="">< x[i][n_list[i]− 1]. 3) “key” is “ y_list, “value” is a list of m items. for i = 0, 1, · · · ,m − 1, y_list[i] is a list of n_list[i] numbers. for i = 0, 1, · · · ,m − 1 and j = 0, 1, · · · , n_list[i] − 1, y[i][j] is the y-coordinate of the j-th point in the i-th instance. 4) “key” is c_list, “value” is a list of m numbers. for i = 0, 1, · · · ,m−1, c_list[i] is a positive real number, which is the input parameter c specified in the definition of the “multi-line fitting problem” for the i-th instance. (that is, for the i-th instance, c_list[i] is the extra cost for every extra line we use to fit points.) we can see that in the above dictionary, for i = 0, 1, · · · ,m−1, the four items – n_list[i], x_list[i], y_list[i], c_list[i] – describe all the information we need to know about the i-th instance. the “output solutions” x[i][n_list[i]−="" 1].="" 3)="" “key”="" is="" “="" y_list,="" “value”="" is="" a="" list="" of="" m="" items.="" for="" i="0," 1,="" ·="" ·="" ·="" ,m="" −="" 1,="" y_list[i]="" is="" a="" list="" of="" n_list[i]="" numbers.="" for="" i="0," 1,="" ·="" ·="" ·="" ,m="" −="" 1="" and="" j="0," 1,="" ·="" ·="" ·="" ,="" n_list[i]="" −="" 1,="" y[i][j]="" is="" the="" y-coordinate="" of="" the="" j-th="" point="" in="" the="" i-th="" instance.="" 4)="" “key”="" is="" c_list,="" “value”="" is="" a="" list="" of="" m="" numbers.="" for="" i="0," 1,="" ·="" ·="" ·="" ,m−1,="" c_list[i]="" is="" a="" positive="" real="" number,="" which="" is="" the="" input="" parameter="" c="" specified="" in="" the="" definition="" of="" the="" “multi-line="" fitting="" problem”="" for="" the="" i-th="" instance.="" (that="" is,="" for="" the="" i-th="" instance,="" c_list[i]="" is="" the="" extra="" cost="" for="" every="" extra="" line="" we="" use="" to="" fit="" points.)="" we="" can="" see="" that="" in="" the="" above="" dictionary,="" for="" i="0," 1,="" ·="" ·="" ·="" ,m−1,="" the="" four="" items="" –="" n_list[i],="" x_list[i],="" y_list[i],="" c_list[i]="" –="" describe="" all="" the="" information="" we="" need="" to="" know="" about="" the="" i-th="" instance.="" the="" “output="">
Answered 1 days AfterSep 25, 2022

Answer To: CSCE 411 Design and Analysis of Algorithms Project 1 I. Some Basics on Line Fitting Suppose that...

Ximi answered on Sep 27 2022
51 Votes
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here