Advanced Data Structures: Assignment 1 DrugBank and Drug DesignIn this assignment, we play with the DrugBank data to build a graph. You are expected to finish the following tasks.2 Your Tasks...

1 answer below »
Data Structure Assignment. Please read the attached pdf line by line. Must use intelliJ. Please and must comment for my understanding. Please complete before the deadline!


Advanced Data Structures: Assignment 1 DrugBank and Drug Design In this assignment, we play with the DrugBank data to build a graph. You are expected to finish the following tasks. 2 Your Tasks (Total: 12 marks) You should define classes named Vertex and DrugGraph with the following requirements (feel free to define extra variables, classes, and methods if needed). 1. You should define class Vertex with data attributes drugBankID, (this is used as node label), genericName, SMILES, url, drugGroups, and score, wasVisited (or known), dist, and path. This class has at least one method, named displayDrug to print out the information on the screen. (0.5 mark) 2. Define a method named readData under the DrugGraph class to realize the following functionality: • Load all the provided information from the given text file dockedApproved.tab to an array variable named vertices which is an attribute of the DrugGraph class. Each element in array vertices is an object of class Vertex. You should give reasonable initial values to variables wasVisited, dist, and path. (0.5 mark) • Load the provided data from the given text file sim mat.tab to an array variable named W which is also an attribute of the DrugGraph class. Note that, the data in sim mat.tab is the similarity matrix of the drugs, where the value at the i-th row and the j-th column is the Tanimoto similarity between the i-th drug and the j-th drug; the row and column orders of drugs are same as the drug order in file dockedApproved.tab. Since the values in sim mat.tab are similarities, we need to convert these values into dissimilarities (i.e., distances) in order to have proper values in array W. Suppose a value in the file is denoted by vij , the corresponding value in matrix W should be wij = { 1− vij if (1− vij) ≤ threshold and i 6= j ∞ otherwise , (1) where threshold ∈ (0, 1). We use threshold = 0.7 in this assignment. Here W is actually an weighted adjacency matrix (in terms of dissimilarity/distance). (1 mark) • Based on W, obtain the discrete (unweighted) adjacency matrix A which is an array variable under class DrugGraph. A value aij in matrix A is calculated as aij = { 1 if wij 6=∞ 0 otherwise , (2) which means that matrix A is a 0-1 matrix. (0.5 mark) 1 3. Define a method named BFS(int i) for breath-first search on a graph starting from a drug index i. (1 mark) 4. Define a method named findModules under the DrugGraph class to find the number of disconnected components (module) in the graph. This method should print the number of vertices in each module in an sorted (descending) order. Hint: you may create a variable of integer type named module under class Vertex to record the module ID for a drug. Within this method, you should repeatedly call the BFS method to traverse through unvisited vertices. For each BFS call, the traversed drugs are given the same module ID (e.g., one of 0, 1, 2, ...). Module 0 should be larger then Module 1, and so on. That is, Module 0 is the largest module, Module 1 the second, ... (1.5 mark) 5. Define a method named keepAModule(int moduleID) that keeps the connected component with a given module ID and removes all other modules. Note that, all the methods defined below are based on the kept module rather than the entire graph which is disconnected. (1 mark) 6. Define a wrapper method named findShortestPath(fromDrug, toDrug, method) to find the shortest path from fromDrug to toDrug on the connected module, where fromDrug and toDrug are the labels of two drugs. method is either ‘‘unweighted’’ or ‘‘weighted’’. This method should print the sequence of DrugBankIDs of the shortest path on the screen. To realize this wrapper method, you may need to define two extra methods to implement the unweighted algorithm (BFS-based algorithm) and weighted algorithm (Dijkstra’s algorithm) for finding the shortest path starting from a vertex. (3 marks) 7. Define a method named MSTPrim to find the minimum spanning tree on the connected module. In this method, you should write the list of DrugBank IDs (can the same order as the drugs in your array attribute) on the MST to a text file named MSTPrimResult.txt with each row for a DrugBank ID and associated dist value. This function should also print the total length (i.e., the sum of dist values on the MST) of the MST on screen. (2 marks) 8. In the main function, instance (named dg) of the DrugGraph should be created. The following methods of this instance should be called sequentially: (0.5 mark) (a) dg.readData(...) (b) dg.findModules(...) (Hint: there are 87 modules, the first module has 1801 vertices.) (c) dg.keepAModule(0) This call is to keep the largest connected component only and remove other components. (d) dg.findShorestPath(’DB01050’, ’DB00316’, ’unweighted’) (The result should be: ’DB01050’ - ’DB08868’ - ’DB00291’ - ’DB00265’ - ’DB00316’) (e) dg.findShorestPath(’DB01050’, ’DB00316’, ’weighted’) (The shortest path should be ’DB01050’ - ’DB00788’ - ’DB00264’ - ’DB01297’ - ’DB00316’.) (f) dg.MSTPrim(...) (Hint: the returned total cost should be 625.65.) 9. Your code should be well commented. (0.5 mark) 3 Submission • Your source code. • A PDF printout of your source code. • Output text file MSTPrimResult.txt (from Task 7). • A text file that contains all the printed out information on the screen/console. • If any of the above require files are not submitted, 0 mark will be given to the whole assignment. 2
Answered 5 days AfterNov 28, 2022

Answer To: Advanced Data Structures: Assignment 1 DrugBank and Drug DesignIn this assignment, we play with...

Vikas answered on Dec 03 2022
38 Votes
Source Code 1
Source Code
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.List;
import java.util.Queue;
import java.util.Random;
import java.io.*;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;

/**
* This is the class that represents an element of dockedapproved.tab using genericNam
e, SMILES, drugBankID, url,
* drugGroups, score
*
*/

class Vertex {
String genericName;
String SMILES;
String drugBankID;
String url;
String drugGroups;
String score;
boolean visited;
int dist;
String path;

public Vertex(String genericName, String SMILES, String drugBankID, String url, St
ring drugGroups, String score) {
this.genericName = genericName;
this.SMILES = SMILES;
this.drugBankID = drugBankID;
this.url = url;
this.drugGroups = drugGroups;
this.score = score;
this.visited = false;
this.dist = 0;
this.path = "";
}

/**
* This is the Method to display the information of Vertex class
*
* @param void
*
* @return void
*/
Source Code 2

void displayVertex() {
System.out.println(genericName + " " + SMILES + " " + drugBankID + " " + url +
" " + drugGroups + " " + score +
" " + visited + " " + dist + " " + path);
}
}

/**
* This is the main class of the program containing main function and all the calculati
ons and outputs are
* generated in this class
*
*/

class DrugGraph {
private HashMap mapWeighted = new HashMap<>();
private HashMap mapUnWeighted = new HashMap<>();
private Vertex vertices[];
private double tot;
private int size = 1932;
private int moduleToKeep;
private double W[][];
private int A[][];

/**
* This is the readData() method used to read the input from dockedapproved.tab
*
* @param void
*
* @return void
*/

void readData() throws IOException {
// read the text file and fill the data array
List list = new ArrayList<>();
String line = "";
size = 1932;

try {
BufferedReader br = new BufferedReader(new FileReader("dockedapproved.ta
b"));

while((line = br.readLine()) != null) {
String[] values = line.split("\\s+");

int idx = whileAlphabetic(values);

String name = "";
for(int i = 0; i < idx; i++) name += values[i] + " ";

String group = "";
for(int i = idx+3; i < values.length-1; i++) group += values[i] + " ";

String ID = "";
for(int i = 0; i < values.length; i++) {
if(values[i].startsWith("DB")) ID = values[i];
}
Source Code 3

list.add(new Vertex(name, values[idx], ID, values[idx+2], group, value
s[values.length-1]));
}

br.close();

vertices = new Vertex[list.size() -1];
for(int i = 1; i < list.size(); i++) {
vertices[i-1] = list.get(i);
}

}
catch(FileNotFoundException e) {
e.printStackTrace();
}

W = new double[1932][1932];
A = new int[1932][1932];

...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here