Exercise 1 (10 points) A palindrome is a word, phrase, number or other sequence of units that can be read the same way in either direction, e.g. the word level, the number 1234321, or the phrase Step on no pets.


Write a Python program that reads a text file and searches for all palindromes in this file. The program should write all palindromes found (except phrases), together with their multiplicity, to an output file. Handle all strings case-insensitively, i.e. the word Level is also a palindrome. The input and output files should be specified as command line arguments. Copy some arbitrary text (e.g. from the internet) and apply your program to it.
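
One possible starting point for Exercise 1 is sketched below. It is not a reference solution: the function names, the tab-separated output format, and the decision to ignore single-character tokens (min_length=2) are all assumptions made for illustration. Tokens are runs of letters and digits, so phrases are never reported.

```python
import re
import sys
from collections import Counter


def find_palindromes(text, min_length=2):
    """Count palindromic words and numbers in text, ignoring case.

    min_length is an assumption: single characters are trivially
    palindromes, and the exercise does not say whether they count.
    """
    # A token is a run of letters/digits; punctuation and spaces split tokens.
    tokens = re.findall(r"[^\W_]+", text.lower())
    return Counter(t for t in tokens if len(t) >= min_length and t == t[::-1])


def main(argv):
    """Read argv[1], write 'palindrome<TAB>count' lines to argv[2]."""
    if len(argv) != 3:
        raise SystemExit(f"usage: {argv[0]} INPUT_FILE OUTPUT_FILE")
    with open(argv[1], encoding="utf-8") as infile:
        counts = find_palindromes(infile.read())
    with open(argv[2], "w", encoding="utf-8") as outfile:
        for word, count in sorted(counts.items()):
            outfile.write(f"{word}\t{count}\n")
```

In a script, main(sys.argv) would be called from the usual entry-point guard. Lowercasing the whole text before tokenizing is what makes Level and level count as the same palindrome.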


Exercise 2 (12 points) Write a Python program that finds restriction sites in a DNA sequence. Restriction sites are positions where restriction enzymes cut the DNA. They are usually recognized by a short, specific sequence motif.


Here are the recognition sequences for the restriction enzymes PpuMI, MspA1I, and MslI:


PpuMI   RG^GWCCY
MspA1I  CMG^CKG
MslI    CAYNN^NNRTG


Note: K stands for G or T, M means A or C, N stands for A, C, G or T, R is A or G, W is A or T, and Y is short for C or T. The caret (^) indicates the cut site.
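
The ambiguity codes in the note map naturally onto regular-expression character classes. The sketch below shows one way to do the translation; the names IUPAC and motif_to_regex are hypothetical, and the table covers only the codes listed in the note.

```python
# Ambiguity codes from the note, written as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
    "R": "[AG]", "W": "[AT]", "Y": "[CT]",
}


def motif_to_regex(motif):
    """Translate a recognition sequence into (regex pattern, cut offset).

    The cut offset is the number of bases before the caret, i.e. the
    0-based index, within a match, of the first base after the cut.
    """
    # Accept the typographic caret (U+02C6) that PDF extraction may produce.
    motif = motif.replace("\u02c6", "^")
    cut_offset = motif.index("^")
    bases = motif.replace("^", "")
    pattern = "".join(IUPAC[base] for base in bases)
    return pattern, cut_offset
```

For example, motif_to_regex("RG^GWCCY") yields the pattern [AG]GG[AT]CC[CT] with a cut offset of 2.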


Given a file with DNA sequences, use regular expressions to look for all restriction sites of the three enzymes listed above and print the position after the cut site to an output file (e.g. the position of the G for PpuMI). Make sure that the names of the input and output files can be specified as command line arguments and that exactly two command line arguments have been specified.
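
A minimal sketch of the search step follows, assuming the whole sequence is held in one string. The patterns are written out by hand from the ambiguity codes, and ENZYMES and find_sites are hypothetical names. Two choices here are assumptions rather than requirements of the exercise: a lookahead is used so that overlapping sites are all reported, and matching ignores case because downloaded genome sequence may contain lowercase bases.

```python
import re

# (pattern, number of bases before the cut) for each enzyme.
ENZYMES = {
    "PpuMI": (r"[AG]GG[AT]CC[CT]", 2),      # RG^GWCCY
    "MspA1I": (r"C[AC]GC[GT]G", 3),         # CMG^CKG
    "MslI": (r"CA[CT][ACGT]{4}[AG]TG", 5),  # CAYNN^NNRTG
}


def find_sites(sequence, pattern, cut_offset):
    """Return 1-based positions of the first base after each cut site."""
    # A zero-width lookahead lets finditer report overlapping matches too.
    regex = re.compile(f"(?={pattern})", re.IGNORECASE)
    # m.start() is 0-based; adding cut_offset gives the 0-based index of
    # the base after the cut, and +1 converts to 1-based counting.
    return [m.start() + cut_offset + 1 for m in regex.finditer(sequence)]
```

For instance, find_sites("AAGGGACCCTT", *ENZYMES["PpuMI"]) returns [5], the 1-based position of the G that follows the cut.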


Use the UCSC Genome Browser http://genome.ucsc.edu/ to download the DNA sequence from human reference genome version hg38 of chromosome 22 band q13.1 (chr22 bp 37200001-40600000). Remove the header line (manually or skip it in your program) and use the created file as input for your program.
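
The input handling could look roughly like the sketch below. It assumes the download is FASTA-like, with header lines starting with ">" and the sequence wrapped over many lines; joining the lines matters because a recognition site may span a line break. The names parse_args and sequence_from_lines are hypothetical.

```python
def parse_args(argv):
    """Return (input_path, output_path); exit unless exactly two are given."""
    if len(argv) != 3:  # argv[0] is the program name itself
        raise SystemExit(f"usage: {argv[0]} INPUT_FILE OUTPUT_FILE")
    return argv[1], argv[2]


def sequence_from_lines(lines):
    """Join sequence lines into one string, skipping headers and blanks."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # header line (assumed FASTA-style) or empty line
        parts.append(line)
    return "".join(parts)
```

Since an open file is an iterable of lines, it can be passed directly, e.g. `with open(input_path) as handle: sequence = sequence_from_lines(handle)`.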


Many restriction enzymes have so-called palindromic recognition sequences. Read up on what palindromic means in the context of DNA sequences. Which of the three enzymes from above has such a palindromic recognition sequence (justify your choice)? What is the advantage of a palindromic recognition sequence?
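
In the DNA context, a sequence is usually called palindromic when it equals its own reverse complement. The sketch below checks this for motifs that contain ambiguity codes, under the assumption that each code is complemented as a set (K and M swap, R and Y swap, W and N map to themselves). The function names are hypothetical.

```python
# Complement table for plain bases and the ambiguity codes used above.
COMPLEMENT = str.maketrans("ACGTKMRYWN", "TGCAMKYRWN")


def reverse_complement(motif):
    """Return the reverse complement of a motif (cut marker removed)."""
    bases = motif.replace("^", "").upper()
    return bases.translate(COMPLEMENT)[::-1]


def is_palindromic(motif):
    """True if the motif reads the same on both strands (5' to 3')."""
    return motif.replace("^", "").upper() == reverse_complement(motif)
```

As a quick check with made-up inputs, is_palindromic("GAATTC") is True, while is_palindromic("GATTC") is False.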


Hint: Unlike for most computer scientists, for biologists the first base of a sequence is at position 1, not 0.

Answered 1022 days after Jan 10, 2020

Sharda answered on Oct 29 2022