COMP2017 / COMP9017 Assignment 1 Full assignment due: May 24th, 11:59pm AEST (Week 12 Friday) Milestone due: May 12th, 11:59pm AEST (Week 10 Sunday) This assignment is worth 15% of your final...

uploaded file


COMP2017 / COMP9017 Assignment 1 Full assignment due: May 24th, 11:59pm AEST (Week 12 Friday) Milestone due: May 12th, 11:59pm AEST (Week 10 Sunday) This assignment is worth 15% of your final assessment Task Description In this assignment, you will be implementing your own virtual filesystem. You will first program a library that simulates virtual file operations on a virtual disk. After this, you will extend the function- ality of the library using the Filesystem in Userspace (FUSE) module in Linux to enable it to behave as a real filesystem. Your assignment will also be tested for performance. You will need to implement a multithreaded solution and consider the synchronisation of parallel operations to ensure that your filesystem behaves correctly. Milestone To ensure you are making regular progress on this assignment, you must have achieved a minimum level of functionality in your submission by May 12th, 11:59pm AEST (Week 10 Sunday) to receive a portion of the marks. See the Submission section at the end of this document for further details. Working on your assignment Staff may make announcements on Ed (https://edstem.org) regarding any updates or clarifi- cations to the assignment. You can ask questions on Ed using the assignments category. Please read this assignment description carefully before asking questions. Please ensure that your work is your own and you do not share any code or solutions with other students. You can work on this assignment using your own computers, lab machines, or Ed. However, it must compile and run on Ed and this will determine the grade. It is important that you continually back up your assignment files onto your own machine, flash drives, external hard drives and cloud storage providers. You are encouraged to submit your assignment while you are in the process of completing it to receive feedback and to check for correctness of your solution. For Part 3 of this assignment, the FUSE kernel module and library need to be installed. This is not available on Ed. For Part 3, we recommend using your own Linux computer, a Linux virtual machine, 1 https://edstem.org COMP2017 / COMP9017 or the lab machines. We are using FUSE version 2.9 for this assignment. Please ensure that if you use FUSE on your own machine, that version 2.9 is installed (not version 3.0 or higher). Version 3.0 or higher of FUSE is NOT compatible. Minor version number differences such as 2.9.7 vs 2.9.9 are not an issue. If you use a Linux computer or virtual machine, you generally need to install the “fuse” package and also the libfuse development library. On Ubuntu, these packages are fuse and libfuse-dev, which you can install by sudo apt-get install fuse libfuse-dev. Make sure you don’t install fuse3 or libfuse3-dev which correspond to version 3. Once you have fuse installed, you can check the version you have with fusermount -V, which should show a 2.9.x version. In other distributions such as Fedora, the FUSE package may be called fuse2. The development library may be called fuse-devel. Post on Ed if you are having any difficulties installing the required packages. The School lab machines should have the required packages installed. Make sure you boot into Linux. Page 2 of 16 COMP2017 / COMP9017 Introduction Low-level storage devices, such as hard drives, are presented to the operating system as an array of fixed-size blocks of data. Filesystems are (usually) parts of the operating system that present an abstraction of this raw data. It is the filesystem’s job to internally organise these blocks, and present a consistent interface of “files” (and directories) to user programs, via system calls (such as read(), write(), open()). For this reason, filesystems are typically kernel space code. However, in this assignment you will only require the user space programming which you have been learning. As the last part of the assignment, you will use the special FUSE library and module which allows your user space filesystem to link into kernel code and present actual files as part of the real Linux directory tree. Real filesystems utilise storage devices as mentioned - such as a hard drive, or solid state drive, accessed over a variety of interfaces such as USB. For this assignment, your virtual filesystem will use 3 real files as a simulated storage device, and store the data of your “virtual files” in these. The file_data file contains the contents of all virtual files stored. Files will be stored as a contigu- ous series of bytes starting at various offsets in the file_data file. Note that file_data contains only the contents of your virtual files and not their filenames. The file_data file consists of 2n blocks, each of size 256 bytes. n is a positive integer with maximum value 24. Files do not have to start or end on block boundaries. Files may span multiple blocks or parts of blocks, but are always a contiguous section of bytes. There may be two or more files contained within one block if they are all less than 256 bytes in size. The directory_table file maps filenames to where the corresponding virtual file is stored in file_data. It consists of a series of 72 byte records. Each record first contains a 64 byte region for a null terminated string that represents the filename. A filename that starts with the null byte is considered to have been deleted. This is followed by the offset and length fields, both of which are 4 bytes, which are unsigned integer values stored in little-endian format. These represent the position of the first byte of the file from the beginning of file_data in bytes and the length of the file in bytes respectively. The maximum number of records the directory_table file can contain is 216. Your filesystem will not support any directories, links, users, owners, permissions, etc. The layout of an example file_data, with the corresponding directory_table, is shown in the figure below. Page 3 of 16 COMP2017 / COMP9017 The hash_data file contains verification data for your virtual filesystem. You will implement verifi- cation in Part 2. You will construct a Merkle hash tree (https://en.wikipedia.org/wiki/ Merkle_tree) over your 256-byte blocks from file_data. You will implement the simplest form in this assignment, a perfect binary tree of nodes which each contain a hash (all levels of the tree are fully complete with nodes). Every node in the tree, except the leaves, has 2 children. All leaves exist at the same level. A hash function H(x) takes arbitrary input data x and returns a fixed length output h, which is referred to as the hash of that input data. Each leaf node of your tree corresponds to one block of data in file_data and stores its hash value. Every other node in your tree stores the following hash of its 2 child nodes: if its child nodes contain hashes H(a) and H(b) (a corresponding to the smaller offset from the start of the file), the parent node stores value H(H(a)+H(b)) where + means to concatenate the two hash values. This continues up the Merkle tree to obtain the hash value of the root node. The figure below demonstrates conceptually how the Merkle hash tree is organised on top of the file_data example from the first figure. Page 4 of 16 https://en.wikipedia.org/wiki/Merkle_tree https://en.wikipedia.org/wiki/Merkle_tree COMP2017 / COMP9017 You will implement a variant of a Fletcher hash function, which will produce a 16 byte hash. Pseu- docode is included below. The hash tree is stored in flat array form in hash_data. The root hash value is stored at hash_data[0]. Hash values at level k in the tree (counting from the root node at level 0) will be stored low to high in the array hash_data index 2k − 1 to 2k+1 − 2 inclusive. For a file_data containing 2n blocks, the hash_data therefore has a file size of 16× (2n+1 − 1) bytes. The figure below demonstrates an example of how a Merkle tree would be stored in hash_data for the case of n = 2. Page 5 of 16 COMP2017 / COMP9017 Pseudocode for hash function 1 function fletcher_hash(uint32_t * data, length): //Consider data 4 bytes at a time 2 uint32_t a = 0; //Treat 4 byte blocks as little-endian integers 3 uint32_t b = 0; 4 uint32_t c = 0; 5 uint32_t d = 0; 6 for i = 0 ... length-1: 7 a = (a + data[i]) mod 2^32-1; 8 b = (b + a) mod 2^32-1; 9 c = (c + b) mod 2^32-1; 10 d = (d + c) mod 2^32-1; 11 uint8_t hash_value[16] = concatenate a, b, c, d //4x4 = 16 bytes in total 12 return hash_value If data passed into your hash function is not a multiple of 4 bytes in length, pad it to a multiple of 4 bytes with zero bytes. Page 6 of 16 COMP2017 / COMP9017 Part 1 Implement the following functions for your virtual filesystem in myfilesystem.c. This file has been included in the scaffold provided. Do not write any main() function; your code will be tested by directly calling the functions you implement. Notes for all functions For all functions accepting a filename, truncate the filename to 63 characters if the parameter exceeds this length. If this would result in an error (such as a duplicated filename), return the specified error code. You can assume that test cases will only test your function on a single error at a time. void * init_fs(char * f1, char * f2, char * f3, int n_processors); This function you provide will be called first and once before any further operations are performed. You need to initialise any data structures and perform any preparation you wish. Return a pointer to a memory area (void *) where you can store data you wish to use during filesystem operations. For all other functions, this pointer will be passed in so you can access your data as void * helper. As parameters, you are provided with the filenames of the 3 files that represent your virtual storage device, as described above in the introduction, as well as an indication of how many processors you have access to. f1 is the filename of your file_data file, f2 is directory_table and f3 is hash_data. You have the opportunity to setup threads or processes at this point to use in further op- erations, or you can create threads/processes during each individual operation. Your program needs to be ready to handle any number of consecutive filesystem operation calls after init_fs() completes. void close_fs(void * helper); This function will be called once and at the end of all operations. You need to clean up any data structures, threads etc. that you created in init_fs, including deallocating any dynamic memory. int create_file(char * filename, size_t length, void * helper); Create a file with the given filename and size length bytes, and fill it with zero
May 25, 2021
SOLUTION.PDF

Get Answer To This Question

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here