matlab ALGORITHMS IN BIOINFOMATICS (MATLAB ) 1. Given L=[1,3,4,4,5,7,9,11,14], make a quick conclusion that a PDP answer can never be found. DO NOT apply skiena's algorithm. 2. Given L =...

1 answer below »
ALOGORITHMS IN BIOINFOMATICS


matlab ALGORITHMS IN BIOINFOMATICS (MATLAB ) 1. Given L=[1,3,4,4,5,7,9,11,14], make a quick conclusion that a PDP answer can never be found. DO NOT apply skiena's algorithm. 2. Given L = [1,1,1,2,2,3,3,3,4,4,5,5,6,6,6,9,9,10,11,12,15], 1. Provide detailed discussion and exact calculation of search space (expression only, no actual number) when a naive brute force is proposed, i.e. try every possible value between 1 to 15. 2. Provide same analysis when a 'smart' brute force is proposed, i.e. only try valid values. 3. Looking the Longest Common Substring Problem: Given two strings X and Y, find the length of the longest common substring. For example X="abcd" Y="bbc", answer is "bc". 1) Propose a 'fast derived' algorithm. Find the size of the search space. Provide Big O. 2) Propose an improved version of (1) Provide Big O. 4. Given L={1,1,4,5,6,8,9,10,11,12} in PDP, why 3 can not be part of the answer? 5. For a complete search tree for 3-mers in the 3-letter alphabet {a, b, c}. What is the total number of leaves? What is the total number of nodes? How to represent the root? 6. Shortest Binary Superstring Problem: Given a set of strings of binary 0 or 1, find a shortest string that contains all of them. Description Input: Strings s_1, s_2, ..., s_n. Output: A string s that contains all strings s_1, s_2,..., s_n as substrings, such that the length of s is minimized. Example: Strings {000,001,010,011,100,101,110,111}. s=0001110100 Outline a very detailed naive brute forth solution for the problem. Use the following string set up {00000,010,0001,0111}, i.e 4 strings of length {5,3,4,4} Provide your search space and show exact the number of testing cases (in expression). Try to propose a more efficient brute forth solution and justify it but no analysis 7. Let DNA (5 x 40) be CCTGATAGACGCTATCTGGCTATCCACGTACGTAGGTCCT A G T A C T G G T G T A C A T T T G A T A C G T A C G T A C A C C G G C A A C C A A A C G T A C G T G C A C C C T C T T T C T T C G T G G C T C T G G C C A A C A G C C T C C G A T G T A A G T C A T A G C T G T A A C T A T T A C C T G C C A CTGTTATACAACGCGTCATGGCGGGGTATGCGTTTTGGTC Let l=3 and s=(5,4,3,2,1) what is the profile for the l-mer the the score(s) the consensus string What is a possible median string? Let v=GGGG, what is the totalDistance(v, DNA) what is the score and what is s? 8. The eight queens puzzle is the problem of placing eight chess queens on an 8x8 chessboard so that no two queens attack each other. Thus, a solution requires that no two queens share the same row, column, or diagonal. The eight queens puzzle is an example of the more general n-queens problem of placing n queens on an n x n chessboard, where solutions exist for all natural numbers n >= 4. Outline a detailed naive brute forth solution for the 4-queens problem. Provide your search space and show exact the number of testing cases (in decimal expression). Apply the same strategy, what is the number of testing cases for 8-queens problem? Could you do better? Try to propose a more efficient solution and justify it.
Answered 1 days AfterFeb 21, 2022

Answer To: matlab ALGORITHMS IN BIOINFOMATICS (MATLAB ) 1. Given L=[1,3,4,4,5,7,9,11,14], make a quick...

Dr Shweta answered on Feb 23 2022
94 Votes
ALGORITHMS IN BIOINFOMATICS (MATLAB)
Q.1 Given L= [1,3,4,4,5,7,9,11,14], make a quick conclusion that a PDP answer can never be found. DO NOT apply skiena's algorithm
Ans The PDP or partial dependence plots
in MATLAB shows the relationship or dependence of input factors and the response so at least two strings are needed which is not mentioned here.
Q.2 Given L = [1,1,1,2,2,3,3,3,4,4,5,5,6,6,6,9,9,10,11,12,15], 1. Provide detailed discussion and exact calculation of search space (expression only, no actual number) when a naive brute force is proposed, i.e., try every possible value between 1 to 15. 2. Provide same analysis when a 'smart' brute force is proposed, i.e., only try valid values.
Ans.For naïve brute force the condition for loop is P [1....... m] = T [s+1.......s+m] for each of the n - m +1 possible value of s. While for smart brute force, subsequence pairs are analyzed in a specific order ((Si, Sj), (Si+1, Sj+2)----) that reduces the time complexity from offset of motif length O(L) to offer of 1 O(1).
Q.3 Looking the Longest Common Substring Problem: Given two strings X and Y, find the length of the longest common substring. For example, X="abcd" Y="bbc", answer is "bc". 1) Propose a 'fast derived' algorithm. Find the size of the search space. Provide Big O. 2) Propose an improved version of (1) Provide Big O.
Ans. 'fast derived' algorithm - For two given sequences X and Y with string length n and m, the required memory is max{8*(n+1) *8*(m*1), L}, where L is the number of identical character pairs. The time complexity is O(|LCS(X,Y)|), where, |LCS(X,Y)| is the length of the Longest Common Substring of X,Y. The size of the search space-The fast derived algorithm for Longest common substring is also known as FAST_LCS. To obtain all the identical pairs with their levels, this algorithm firstly seeks the successors of the initial identical character pairs as per the successor table. Later, it traces the identical character pair at largest level. The improved version of (1) – We use a hash table algorithm to find the longest common substring,...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here