COMP 3380 Winter 2023 Project Overview This project is an opportunity for you to design and implement your own database based on publicly available data. You will model the database using the ER...

2 answer below »
I only Need Part 2. I already complete the part 1. the project on project.pdf and my database on the mainDara.zip


COMP 3380 Winter 2023 Project Overview This project is an opportunity for you to design and implement your own database based on publicly available data. You will model the database using the ER and EER concepts learned in class. You will design an interface for an analyst with no programming experience to query your database . The project will be 20% of your final grade. Project Teams The project is to be completed in team of two to five people. You are free to select your own teams by assigning yourself to a group in UMLearn. Anyone who is not in a team by January 26,2023 will be assigned a team by the course instructor. A note about team: some teams will have communication problems, or work distribution problems. ALL team members are responsible for effective team functioning. In extreme cases only contact your instructor to mediate a group meeting. Waiting just before a deadline to solve the problem is too late. Project Deliverables The project is divided into three parts. Parts of the project build on each other. When starting the project, read through every part’s description to have an idea of what’s coming. For example, in Part 3, you will have to explain decisions you made for Part 1, taking good notes while doing Part 1 will help with Part 3. Each part will be submitted to UMLearn. • Part 1 (35%): Finding data and designing a database. Due February 28th 2023 at 8pm. • Part 2 (50%): Creating and populating a database and implementing an interface. March 28th , 2023 at 8pm. • Part 3 (15%): Writing a report. Due April 6,2023 at 8pm. Project Marking Please see the marking rubrics in the individual part descriptions to help prioritize your work. In general, we will be looking for ambitious and creative projects that are well executed. Details listed in the descriptions are but minimum requirements, an excellent project deserving of an A+ will seek to surpass what is described in this document. All rubrics use a non-linear scale as follows to represent the quality of the work submitted. Each item will be graded /5. Furthermore, some items might carry more weight and are denoted by multipliers. • 5 – top of the class, amazing • 4 – great work, all required components covered and done well • 3 – OK work, some components missing, some things done well • 2 – poor work, some components missing, few components done well • 1 – very poor work, many components missing, components not done well • 0 – not submitted or does not demonstrate understanding the deliverable Part 1: Designing a Database A) Data Discovery The first part of the project involves finding some data to analyze. One viable source is the Winnipeg Open Data portal (https://data.winnipeg.ca), though you are free to use other sources (e.g., wikis). Data from any public source is acceptable but remember to acknowledge your source(s)! (Be aware of any copyright or licensing issues if you want to use this data beyond this course.) You can aggregate data from multiple sources, but all data must ultimately be connected. In other words, the ER diagram you draw with this data for Part 1.B must be a connected graph (even better, try to have a tightly connected graph, i.e., the graph remains connected if you start removing tables). Aim to find and/or create a dataset that will ultimately break down to more than 10 tables and 1000 rows with relatively few support tables compared to main tables after completing Part 1.B. A support table is a table that is usually small in both arity and cardinality, and is mostly used for lookup purposes (e.g., a table that just has Rank and Salary, where you can lookup a Salary based on Rank). B) Database Design You will draw an ER model (including EER components, if appropriate) which represents the database you have chosen to create. The model must include participation and cardinality constraints, as well as a brief justification for each. Justifications should explain the “why” of constraints, not merely putting them into words (e.g., “not every Song is written by an Artist” = bad, “some Songs are written by unknown Artists, and so aren’t in the Wrote table” = better). You will then convert your ER model to a relational model and normalize it as much as possible using the rules and standards discussed in class and in the lectures. Part 1: Deliverable A PDF containing a brief summary of your dataset (e.g., a short description of what it is, how much data), your ER model, and the final relational model (post-merge and post-normalization). Include your justifications for participation/cardinality constraints and the steps you took for translating your ER model to your final relational model (including steps for merging and normalizing). Part 1: Marking • Quality of dataset (e.g., size, connectedness) • Ratio of support tables to main tables • ER/EER diagram • Justification of participation and cardinality constraints • Translating and merging ER/EER diagram to relational model • Normalization Part 2: The Database and its Interface A) Database Creation and Population You will implement the database you designed in Part 1. If you were given feedback from Part 1, it should be incorporated with You will be given a Microsoft SQL Server schema you must create your database on. https://data.winnipeg.ca/ Once your database is created, you will populate it with your data. You are must to use a code-based method to add records, you will submit your code for the project. B) Implementing an Interface You will create a front-end interface which allows a person (say, an analyst) to access and use your database. It can be as simple (e.g., command line interface) or as feature-rich (e.g., complete GUI) as you want, and can be implemented using any programming language you want (e.g., Java, Python, as a website). However, you must consider the following requirements: • When using the interface, the database should be relatively secure according to what was discussed in class (e.g., can’t allow freely entering SQL commands to prevent SQL injection). • Your interface should support an analyst trying to answer interesting questions they might have of the data, which are not easily answerable. You might consider taking some time to come up with interesting questions an analyst might have and allow your interface to execute the relevant queries. You should support at least one query which includes GROUP BY, one that includes ORDER BY, and one that includes an aggregate function. Note that your interface does not necessarily have to make these components explicit, at a minimum, it should simply allow someone to run those queries. Some tips: o GROUP BY and aggregate functions are hard for humans to do on the fly, so are an easy way to create interesting queries. o Queries should be relevant and potentially useful to an analyst. Complex queries that are hundreds of words long, nested four layers deep could be interesting for you to implement but might be too convoluted for an analyst to ever reasonably run. o Don’t worry too much about optimizing your queries, as long as they run in a reasonable amount of time. However, consider informing the user with a message of some kind if a query is currently being executed (so that they don’t think your interface has crashed). • Content from all tables should be accessible one way or another. Part 2: Deliverable Everything required to create database tables, relationships, populate the tables, and finally the program you wrote for interacting with your database. Include a readme.md file with instructions on how to create and populate the database and run your program. Part 2: Marking • Database created and populated • Queries – correctness • Queries – complexity and interestingness • Interface – functionality • Interface – level of abstraction and ease-of-use Part 3: Project Report For this last part, you will write a short report (3 - 5 pages) detailing your progress through the project. If you took good notes when working through Parts 1 and 2, this should be relatively quick and easy to put together. The report should at least include the following information, but you should consider adding anything else that you find interesting or that might help someone reading your report understand what you did. • A summary of the data: Why was it chosen? What does if consist of? How large is it (file size, number of records)? Don’t forget to acknowledge the sources of your data! Include an ER diagram. • A discussion of the data model o Why was it broken down into those tables? o Did you face any difficult choices when designing the model (e.g., tricky participation/cardinality ratio decisions)? o Did the data model cleanly fit into the relational database? o Do you regret any decisions you made in your model? Did you change your model from part 1 when implementing it in part 2? What changes, and why? • Could the data be modelled in a different way, why or why not? Given the work completed, would you choose this model? • A list of interesting queries you can run using the interface: Explain what the queries return, you don’t have to include the SQL code. • Does this dataset require a relational database? Would other database system be a better choice in modelling this data? Why or why not? Would the “interesting queries” you wrote be easier or harder to re-create if you were using an alternative database? • Would this database be a good teaching tool for COMP 3380? Are there good problems for future students to solve in this database? (Optional) Part 3: Deliverable A PDF of your report. Part 3: Marking • Summary of data • Discussion of data model • Summary of the database • Summary of the interface • List of interesting queries • Other interesting discussions or summaries • Writing quality Comp 3380 Winter 2023 ASSIGNMENT 3 - Due Date Thursday March 30,2023 at 8:30 am This assignment is individual work. Be sure to fill out the honesty declaration available on the course website. You will not be able to see your grade without filling out the declaration. Please submit your assignment in Crowdmark Question 1 (10 marks) (a) Briefly describe the 3 approaches to database programming including at least one advantage and one disadvantage of each approach (3 marks) (b) Describe impedance mismatch issues, and possible solution (2 marks) (c) Describe any 2 issues arising from SQL injection any 2 protecting techniques (2 marks) (d) Describe at least 2 uses and 2 issues of indexing in relational database (2 mark) (e) Compare and contrast VIEW and TRIGGER in relational database (1 marks) Question 2 (20 marks) Consider the Parking Winnipeg dataset provide in UMLEARN in answering the following questions using a programming language (Python or Java) and clearly showing how you CONNECT-EXECUTE-CLOSE. a. Create a database table for the dataset introducing a SURROGATE key as the primary key. Submit you programming code for the create table showing the addition of surrogate key to the existing attributes (5 marks) b. Write an embedded SQL statement within your programming language to classify the dataset to year and day name hierarchy. See expected sample output in table 1 (5 marks) Year Day Count 2010 Monday 2343 c. Write an embedded SQL statement to retrieve the street, year, month, day, and the hour when parking authority issued the highest number of tickets. (5 marks) d. Write an embedded SQL query to create a database procedure (classifier) to classify the full fine into 3 categories (5 marks) a. “Common Ticket” if the FULLFINE is less than or equal to 70 b. “Valued Ticket” if the FULLFINE is between 71 and 200 inclusive c. “Most valued Ticket “ if the FULLFINE is greater than 200 d. “Unknown” if the full fine is null Bonus: Call and execute the procedure within your program (1 mark) Question 3 (5 MARKS) Proof using Armstrong inference rule 3 Given the following relation R and its functional dependencies, Find the closure of C Using Armstrong’s Axioms. R(A, B, C, D, X, Y, W) BDW -> C Y -> W DXY -> AB C -> BW BC -> Y Question 4 (20 marks) Use the Library Database for the following questions in Neo4j a. Create the database including all relationships in Neo4j (5 marks) Submit the queries and the outputs b. Submit the query, graph, and text for the following i. Write a Cypher
Answered 15 days AfterMar 20, 2023

Answer To: COMP 3380 Winter 2023 Project Overview This project is an opportunity for you to design and...

Sumit Kumar answered on Apr 04 2023
27 Votes
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here