The goal of this project is to use the concepts taught in this course to develop an efficient way of working with Big Data. You should have 2 files in your Linux system: hugefile1.txt and...

The goal of this project is to use the concepts taught in this course to develop an efficient way of working with Big Data.


You should have 2 files in your Linux system: hugefile1.txt and hugefile2.txt, with one billion lines in each one. If you do not, please go back to the Module 7 Portfolio Reminder and complete the steps there.


Create a program, using a programming language of your choice, to produce a new file, totalfile.txt, by adding the numbers on corresponding lines of the two files. Each line in totalfile.txt is the sum of the numbers on the same line of hugefile1.txt and hugefile2.txt.


For example, if the first 5 lines of your files look as follows:


$ head -5 hugefile*txt
==> hugefile1.txt <==
4131
29929
6483
7659
25003

==> hugefile2.txt <==
8866
19171
11029
4889
27069


then the first 5 lines of totalfile.txt look like this:


$ head -5 totalfile.txt
12997
49100
17512
12548
52072
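
As a baseline to measure later versions against, the line-by-line addition could be sketched in Python as below. This is only one possible starting point, not a required solution; the function name add_files is made up for illustration, and the sketch assumes every line holds exactly one integer.

```python
def add_files(path_a, path_b, path_out):
    """Write the line-by-line sum of two integer files to path_out."""
    with open(path_a) as fa, open(path_b) as fb, open(path_out, "w") as out:
        # zip pulls one line from each file at a time, so memory use stays
        # constant no matter how many lines the inputs contain.
        for line_a, line_b in zip(fa, fb):
            # int() ignores the trailing newline on each line.
            out.write(f"{int(line_a) + int(line_b)}\n")
```

Calling add_files("hugefile1.txt", "hugefile2.txt", "totalfile.txt") would produce the output file in a single sequential pass.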


Because files this large cannot be read into memory in their entirety, they must be processed in pieces, and you need to use concurrency to do so efficiently. Reading the files one line at a time will take a long time, so use what you have learned in this course to optimize the process. Be sure to record how long each version of your program takes to complete the task.
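
One way to approach the optimization and timing described above is sketched here in Python. The names add_files_batched, timed, and the batch_lines parameter are hypothetical. Reading and writing in batches is only one candidate optimization, and whether it actually beats a plain line-by-line loop depends on the system, which is why the timing helper is included.

```python
import time
from itertools import islice


def add_files_batched(path_a, path_b, path_out, batch_lines=1_000_000):
    """Sum two integer files, handling batch_lines lines per step."""
    with open(path_a) as fa, open(path_b) as fb, open(path_out, "w") as out:
        while True:
            # Only one batch from each file is held in memory at a time.
            batch_a = list(islice(fa, batch_lines))
            batch_b = list(islice(fb, batch_lines))
            if not batch_a or not batch_b:
                break
            sums = (int(x) + int(y) for x, y in zip(batch_a, batch_b))
            # One write call per batch instead of one per line.
            out.write("\n".join(map(str, sums)) + "\n")


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, _, seconds = timed(add_files_batched, "hugefile1.txt", "hugefile2.txt", "totalfile.txt") gives a wall-clock figure that can be recorded for each version of the program.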


Create two programs, where one program reads the first half of the files, and another program reads the second half. Use the OS to launch both programs simultaneously.
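
A single script that takes a line range can stand in for both programs. The sketch below assumes a hypothetical script name, sum_range.py, and hypothetical argument order; it is an illustration, not the required design. Note that because lines vary in length, the second process still has to read past the first half of each file to find its starting line, although it does not parse those lines.

```python
import sys
from itertools import islice


def add_line_range(path_a, path_b, path_out, start, stop):
    """Sum lines start..stop-1 (0-based) of two integer files into path_out."""
    with open(path_a) as fa, open(path_b) as fb, open(path_out, "w") as out:
        # islice skips the first `start` lines and stops before line `stop`.
        rows = zip(islice(fa, start, stop), islice(fb, start, stop))
        for line_a, line_b in rows:
            out.write(f"{int(line_a) + int(line_b)}\n")


def main(argv):
    # Expected arguments: file1 file2 output start stop
    path_a, path_b, path_out, start, stop = argv
    add_line_range(path_a, path_b, path_out, int(start), int(stop))


# Act as a command-line tool only when arguments are supplied.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

From the shell, the two halves could then be started together by ending each command with &, for example python3 sum_range.py hugefile1.txt hugefile2.txt part1.txt 0 500000000 & and python3 sum_range.py hugefile1.txt hugefile2.txt part2.txt 500000000 1000000000 &, followed by wait and then cat part1.txt part2.txt > totalfile.txt to join the results in order.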


Now, break up hugefile1.txt and hugefile2.txt into 10 files each, and run your process on all 10 sets in parallel. How do the run times compare to the original process?
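
The split-and-run-in-parallel step might look like the Python sketch below. All function names and the part-file naming scheme are assumptions made for illustration. The files are split by line count rather than by byte size so that part k of one file covers exactly the same lines as part k of the other; for one billion lines and 10 parts that means 100,000,000 lines per part.

```python
import shutil
from concurrent.futures import ProcessPoolExecutor


def split_by_lines(path, lines_per_part, prefix):
    """Split path into prefix00, prefix01, ... and return the part paths."""
    part_paths, out = [], None
    with open(path) as src:
        for i, line in enumerate(src):
            if i % lines_per_part == 0:  # time to start a new part file
                if out:
                    out.close()
                part_paths.append(f"{prefix}{len(part_paths):02d}")
                out = open(part_paths[-1], "w")
            out.write(line)
    if out:
        out.close()
    return part_paths


def add_files(path_a, path_b, path_out):
    """Write the line-by-line sum of two integer files to path_out."""
    with open(path_a) as fa, open(path_b) as fb, open(path_out, "w") as out:
        for line_a, line_b in zip(fa, fb):
            out.write(f"{int(line_a) + int(line_b)}\n")


def add_parts_in_parallel(parts_a, parts_b, out_prefix, workers=10):
    """Sum each pair of part files in its own worker process."""
    outs = [f"{out_prefix}{i:02d}" for i in range(len(parts_a))]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any worker error.
        list(pool.map(add_files, parts_a, parts_b, outs))
    return outs


def concatenate(part_paths, path_out):
    """Join the part results, in order, into one output file."""
    with open(path_out, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
```

When comparing run times, it is worth recording the time spent splitting and concatenating separately from the time spent adding, since both are extra sequential passes that the original process does not need.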

Answered Same Day Jul 05, 2022

Jahir Abbas answered on Jul 06 2022