BLAST interpretation

Following questions will require interpretation of BLAST searches. You may find these NCBI help webpages useful for your reference (https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs):
● The BLAST glossary (https://www.ncbi.nlm.nih.gov/books/NBK62051/).
● The guide to BLAST home and search pages (https://ftp.ncbi.nlm.nih.gov/pub/factsheets/HowTo_BLASTGuide.pdf).
● BLAST video tutorials (https://www.youtube.com/playlist?list=PL7dF9e2qSW0azL2xOKAtxDW7QI8UU4XZ6).

1. How many basepairs of the top sequence in this pairwise alignment are inferred to share positional homology with a basepair of the bottom sequence? Enter your count of such basepairs as a whole number without units.

2. INSTRUCTIONS:
   1. Read the imaginary email below.
   2. Run a BLASTN search of an appropriate NCBI database using the DNA sequence contained in the email as a query.
      1. Copy and paste the sequence from the email in the BLAST query box.
      2. Select the "Nucleotide collection (nr/nt)" database.
      3. Select the BLASTN program/algorithm.
      4. Under "Algorithm parameters", specify an "Expect threshold" of 1000 in the text box.
      5. Leave all other options at their default setting.
      6. Note: If you set the options correctly, you will retrieve some hits.
   3. Analyze the results of the BLASTN search by applying the concepts discussed in class.
   4. Write a response in the space provided (10 marks in total as indicated). Format your response as a coherent email [1 mark] in your own words. Your response should include a prediction regarding the homology or analogy of the sequence to the sequences you identified using BLASTN (if any) [1 mark], the name of the specific NCBI database that you queried, justification of your prediction based on interpretation of relevant BLAST outputs including sequence comparison metrics discussed in class [4 marks], and consideration of the evidence in the context of the problem described in the email [1 mark].
   5. Also, include in your response a question for the (imaginary) sender, requesting information which they might have, and which could help you better infer homology or analogy [1 mark].
   6. Cite any sources you refer to by including the author, year, and Digital Object Identifier (DOI) number (or website URL if relevant) [1 mark]. At least one source must be cited.
   7. Limit your response length to a minimum of 300 words and a maximum of 500 words [1 mark]. (Hint: Compose first in a google document, count words with the word count tool, then paste into the response box).
   8. Please do not sign your response with your real name. You may use a pseudonym instead if you wish.

IMAGINARY EMAIL:
__________________________________________________________________________________________________
From: [email protected]
Date: September 30, 2034

Dear Dr. Student,

I am the chief scientific officer of MarsX. We recently received a transmission from personnel at our Martian base station, and they have apparently identified living cells in samples obtained from deep in the martian subsurface. To verify whether these are the first forms of martian life discovered, or microbes introduced from earth, they used a portable sequencing device in an attempt to obtain identifying DNA sequences. Unfortunately, they ran out of reagents before sequencing was complete. This is the only sequence they obtained:

>001 Unknown martian DNA sequence
TCCGGTGGCAATGGCGGAGG

Our staff biologists cannot tell whether the sequence is from any known terrestrial organism or not. If not, we assume it could be of martian origin and demonstrate that martians have the same genetic material as terrestrial organisms. This is a matter of urgent importance for us, as our prediction regarding the origin of these organisms will determine what experimental equipment we send via our next earth-to-mars mission which is scheduled to launch in three days.

Any advice you can provide would be greatly appreciated, and you would be credited appropriately in any resulting scientific publications.

Thank you in advance,

Sincerely,
Dr. Chuck Darwin
--
Chief Scientific Officer, MarsX
__________________________________________________________________________________________________

Sequence similarity clustering case study material

Following questions will refer to the information below.

Below is a table of percent sequence identities calculated from pairwise alignments of 16S rRNA sequences from 13 species of bacteria in the genus Bacillus (abbreviated as "B." in the table). All the sequences were compared against all the others. So, for example, the percent sequence identity in the pairwise alignment of B. brevis 16S rRNA and B. alvei 16S rRNA is 89.2%. The image of this table is also available in PDF format here: link to PDF (https://drive.google.com/file/d/1yoUE0CCjosZDE8uvVzvW2h9doZ3L27cd/view?usp=sharing).

Regarding the table of Bacillus 16S rRNA pairwise sequence identities, which Bacillus species contains the 16S rRNA orthologue with the greatest proportion of identical nucleotide bases at sites which are inferred to be homologous following pairwise alignment with the Bacillus subtilis 16S rRNA sequence?

Select one:
a. B. stearothermophilus
b. B. laterosporus
c. B. cycloheptanicus
d. B. megaterium
e. B. polymyxa
f. B. marinus
g. B. coagulans
h. B. alvei
i. B. brevis
j. B. psychrophilus
k. B. macerans
l. B. macquariensis

Based on the table of pairwise percent identities of Bacillus 16S rRNA sequences, which of these rooted tree topologies would you infer? Assume that greater sequence identity implies more recent common ancestry. Select all that apply.

Select all that apply:
a.
b.
c.
d.

Types of homology case study material

Following questions will refer to the phylogenetic tree described below.

The image below is a phylogenetic tree of EF-Tu (EF-1α) and EF-G (EF-2) protein-coding genes. Branch supports are bootstrap percentages (bootstrap percentages from multiple methods are shown on some branches in bold font). The tip labels indicate the genus (or in some cases the genus and species) of the organism from which the gene sequences were derived. The brackets on the right side of the tree indicate the domain of life into which the genera/species are classified: Euk = Eukaryotes, Arc = Archaea, Eub = Eubacteria (Bacteria). This image is also available in PDF format here: link to PDF (https://drive.google.com/file/d/1dTTN-NutMPkiOpKwF0cYO0po70MvV7e0/view?usp=sharing).

Choose the best interpretation of the phylogenetic tree of EF protein genes shown in the case study material.

a. This tree is inconsistent with the Eocyte hypothesis, because organisms of the taxonomic domain Archaea form a paraphyletic group. This topology is consistent with a two-domain tree of life.
b. This tree refutes the Eocyte hypothesis, because organisms of the taxonomic domain Archaea form a monophyletic group. This topology is consistent with a three-domain tree of life.
c. This tree refutes the Eocyte hypothesis, because organisms of the taxonomic domain Archaea form a monophyletic group. This topology is consistent with a two-domain tree of life.
d. This tree supports the Eocyte hypothesis, because organisms of the taxonomic domain Archaea form a monophyletic group. This topology is consistent with a two-domain tree of life.
e. This tree supports the Eocyte hypothesis, because organisms of the taxonomic domain Archaea form a polyphyletic group. This topology is consistent with a two-domain tree of life.
f. This tree supports the Eocyte hypothesis, because organisms of the taxonomic domain Archaea form a paraphyletic group. This topology is consistent with a two-domain tree of life.
g. This tree supports the Eocyte hypothesis, because organisms of the taxonomic domain Archaea form a paraphyletic group. This topology is consistent with a three-domain tree of life.

Choose the statement that best characterizes the phylogenetic tree of elongation factor genes shown in the case study.

a. This is an arbitrarily rooted phylogram of homologous genes (which code for elongation factor proteins) containing two paralogous clades. The bifurcation between these two clades is found in some topologies inferred from bootstrapped alignments. Poorly supported branches in the topology of the tree shown are consistent with the root of the tree of life being on a branch between ancestral prokaryotes, implying that eukaryotes originated after diversification of prokaryotes.
b. This is an arbitrarily unrooted phylogram of homologous genes (which code for elongation factor proteins) containing two paralogous clades. The bifurcation between these two clades is found in some topologies inferred from bootstrapped alignments. Weakly supported branches in the topology of the tree shown are consistent with the root of the tree of life being on a branch between an ancestral bacterium and an ancestral archaeon, implying that eukaryotes originated after diversification of bacteria.
c. This is an arbitrarily rooted phylogram of homologous genes (which code for elongation factor proteins) containing two orthologous clades. The split between these two clades is found in all topologies inferred from bootstrapped alignments. Strongly supported branches in the topology of the tree shown are consistent with the root of the tree of microbial life being on a branch between ancestral prokaryotes, implying that prokaryotes originated after diversification of eukaryotes.
d. This is an arbitrarily rooted phylogram of homologous genes (which code for elongation factor proteins) containing three paralogous clades. The bifurcation between these three clades is found in all topologies inferred from bootstrapped alignments. Strongly supported branches in the topology of the tree shown are consistent with the outgroup of the tree of life being on a branch between ancestral prokaryotes, implying that eukaryotes originated after diversification of prokaryotes.
e. This is an unrooted phylogram of homologous genes (which code for elongation factor proteins) containing two paralogous clades. The bifurcation between these two clades is found in some topologies inferred from bootstrapped alignments. Strongly supported branches in the topology of the tree shown are consistent with the root of the tree of life being on a branch between ancestral prokaryotes, implying that eukaryotes originated after diversification of prokaryotes.
f. This is an arbitrarily rooted phylogram of homologous genes (which code for elongation factor proteins) containing two paralogous clades. The bifurcation between these two clades is found in all topologies inferred from bootstrapped alignments. Strongly supported branches in the topology of the tree shown are consistent with the root of the tree of life being on a branch between ancestral eukaryotes, implying that prokaryotes originated after diversification of eukaryotes.
g. This is an arbitrarily rooted phylogram of homologous genes (which code for elongation factor proteins) containing two paralogous clades. The bifurcation between these two clades is found in all topologies inferred from bootstrapped alignments. Strongly supported branches in the topology of the tree shown are consistent with the root of the tree of life being on a branch between ancestral prokaryotes, implying that eukaryotes originated after diversification of prokaryotes.
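The Bacillus table question asks for the species whose 16S rRNA sequence has the highest percent identity to B. subtilis. A minimal Python sketch of that lookup follows. Only the B. brevis / B. alvei value (89.2) is quoted in the case study text; the names "species X" and "species Y" and their values are hypothetical placeholders, and the real numbers must be read from the table itself.

```python
# Pairwise percent identities keyed by unordered species pairs.
# Only the B. brevis / B. alvei value comes from the case study text;
# "species X" and "species Y" and their values are hypothetical placeholders.
identities = {
    frozenset(("B. brevis", "B. alvei")): 89.2,
    frozenset(("B. subtilis", "species X")): 90.0,  # placeholder
    frozenset(("B. subtilis", "species Y")): 91.0,  # placeholder
}


def closest_by_identity(reference, table):
    """Return (species, percent identity) for the highest identity to reference."""
    best = None
    for pair, pct in table.items():
        if reference in pair and len(pair) == 2:
            (other,) = pair - {reference}
            if best is None or pct > best[1]:
                best = (other, pct)
    return best


print(closest_by_identity("B. subtilis", identities))
```

Pairs that do not include the reference species are ignored, so the quoted B. brevis / B. alvei entry does not affect the result for B. subtilis.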
Answered 1 day after Sep 29, 2021


Dr. Sulabh answered on Sep 30 2021
    Assignment solutions    
Bioinformatics Solutions
1. There are 38 nucleotides that are identical in the two given sequences, as identified by the sequence alignment software.
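The question asks about positional homology rather than identity. Under the usual reading, an alignment places a base of the top sequence in the same column as a base of the bottom sequence when the two are inferred to be positionally homologous, whether or not they match, so the count of identical bases equals the requested count only if the alignment has no mismatched columns. The sketch below computes both numbers; the function name and the example alignment are hypothetical and are not the alignment from the question.

```python
def alignment_counts(top, bottom, gap="-"):
    """Return (aligned_columns, identical_columns) for two gapped sequences.

    aligned_columns counts columns where neither sequence has a gap
    (matches and mismatches alike); identical_columns counts the subset
    where the two bases are the same.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(top, bottom):
        if a == gap or b == gap:
            continue  # a gap means no homologous partner in this column
        aligned += 1
        if a.upper() == b.upper():
            identical += 1
    return aligned, identical


# Hypothetical alignment for illustration only.
top = "ACGT-ACGTTA"
bottom = "ACGTTAC-TAA"
print(alignment_counts(top, bottom))  # (9, 8)
```

In this toy case nine columns pair a base with a base, and eight of those are identical, so the two metrics differ by the single mismatched column.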
The BLASTN search results for the query sequence “TCCGGTGGCAATGGCGGAGG” given in the question are as follows:
The research group is working on sequences of microbes obtained from Mars, in order to analyze the homology, analogy, and phylogenetic relationships between these microbes and other microbes with known sequences in the genome library.
1. According to the email written by the scientific officer, the main aim is to compare the given Martian sequence with the known sequences in the NCBI database, in order to assess homology, analogy, and phylogenetic relationships.
2. The BLASTN results for the sequence given above are as follows, showing the similarity of this sequence to other sequences in the database.
2. Based on the NCBI BLASTN results, the given Martian sequence has homology with sequences that are already present in the database, such as Candidatus, Firmicutes, Clavispora, and Aspergillus genomes. The sequence also shows similarity to the other sequences listed in the NCBI database table above.
3. Email
Dear Sir,
Based on the NCBI BLASTN results for the given Martian sequence, there is considerable similarity between this sequence and other sequences in the database.
The results were analyzed with the BLASTN program. The comparison metrics used were the max score, total score, query cover (100%), E-value, percent identity, accession length, and accession number. The given sequence shows its maximum similarity, 100%, to sequences from organisms such as Candidatus, Firmicutes, Clavispora, and Aspergillus. It is also similar to other sequences in the NCBI database, but at around 95% and 90% rather than 100%, for example to sequences from Bacteroidetes and Bifidobacterium. From the NCBI BLAST analysis it is clear that the Martian sequence given in the question has complete...
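When weighing a 100% identity hit for a 20-nucleotide query, it can help to estimate how many exact matches would be expected by chance alone. The sketch below is a back-of-envelope model that assumes independent, equiprobable bases; it is not the Karlin-Altschul E-value that BLAST reports. The function name is hypothetical, and the database size of 10^12 nucleotides is a hypothetical placeholder rather than the actual size of nr/nt.

```python
def expected_chance_matches(query_len, db_bases, both_strands=True):
    """Expected number of exact chance matches to a query of query_len bases.

    Assumes a random database of db_bases nucleotides with independent,
    equiprobable A/C/G/T, which is a simplification of real sequence data.
    """
    positions = max(db_bases - query_len + 1, 0)  # possible start positions
    p_match = 0.25 ** query_len                   # chance one position matches
    strands = 2 if both_strands else 1            # BLASTN searches both strands
    return positions * p_match * strands


DB_BASES = 10**12  # hypothetical database size in nucleotides
print(expected_chance_matches(20, DB_BASES))
```

Under these assumptions the expectation is on the order of one exact match, which suggests that perfect identity over only 20 bases is weak evidence on its own and is best read together with the E-values of the hits.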