






Task instructions

Download the instructions and complete the questions. Each question has a guide for how long it should be. Please use this as a guide, however you will be assessed on your ability to give succinct, but informative reports.








Learning Outcomes

The Learning Outcomes for this assessment are:



  • Explain the basic principles that underpin Bioinformatics analyses, and apply these principles when analysing biological data;

  • Analyse biological data using a variety of Bioinformatics tools; and

  • Interpret correctly the outputs from tools used to analyse biological data and make meaningful predictions from these outputs.




RMIT Classification: Trusted

Final Assessment – Drug Discovery in real life

A new pharmaceutical company, Holien-X, is interested in finding new therapeutics for a rare type of children's cancer, Neuroblastoma. They have gathered over 2000 patients with Neuroblastoma and have also managed to find genetically matched control patients, and wish to sequence both to obtain the largest genetic data set for this disease. Being a rare disease, these patients are very difficult to find, thus the method needs to be as accurate as possible.

Your first step is to decide which technique you will use to sequence this data. To convince Holien-X, you must list why you chose this method and any pros/cons of this method (up to ½ page and/or table/figures).

The sequencing was successful and, using differential expression analysis (i.e. comparison between the genes upregulated in Neuroblastoma patients compared to the control patients), Holien-X has discovered the following 3 upregulated targets:

>sp|Protein1
MPSCSTSTMPGMICKNPDLEFDSLQPCFYPDEDDFYFGGPDSTPPGEDIWKKFELLPTPP
LSPSRGFAEHSSEPPSWVTEMLLENELWGSPAEEDAFGLGGLGGLTPNPVILQDCMWSGF
SAREKLERAVSEKLQHGRGPPTAGSTAQSPGAGAASPAGRGHGGAAGAGRAGAALPAELA
HPAAECVDPAVVFPFPVNKREPAPVPAAPASAPAAGPAVASGAGIAAPAGAPGVAPPRPG
GRQTSGGDHKALSTSGEDTLSDSDDEDDEEEDEEEEIDVVTVEKRRSSSNTKAVTTFTIT
VRPKNAALGPGRAQSSELILKRCLPIHQQHNYAAPSPYVESEDAPPQKKIKSEASPRPLK
SVIPPKAKSLSPRNSDSEDSERRRNHNILERQRRNDLRSSFLTLRDHVPELVKNEKAAKV
VILKKATEYVHSLQAEEHQLLLEKEKLQARQQQLLKKIEHARTC

>sp|Protein2
MASGSCQGCEEDEETLKKLIVRLNNVQEGKQIETLVQILEDLLVFTYSERASKLFQGKNI
HVPLLIVLDSYMRVASVQQVGWSLLCKLIEVCPGTMQSLMGPQDVGNDWEVLGVHQLILK
MLTVHNASVNLSVIGLKTLDLLLTSGKITLLILDEESDIFMLIFDAMHSFPANDEVQKLG
CKALHVLFERVSEEQLTEFVENKDYMILLSALTNFKDEEEIVLHVLHCLHSLAIPCNNVE
VLMSGNVRCYNIVVEAMKAFPMSERIQEVSCCLLHRLTLGNFFNILVLNEVHEFVVKAVQ
QYPENAALQISALSCLALLTETIFLNQDLEEKNENQENDDEGEEDKLFWLEACYKALTWH
RKNKHVQEAACWALNNLLMYQNSLHEKIGDEDGHFPAHREVMLSMLMHSSSKEVFQASAN
ALSTLLEQNVNFRKILLSKGIHLNVLELMQKHIHSPEVAESGCKMLNHLFEGSNTSLDIM
AAVVPKILTVMKRHETSLPVQLEALRAILHFIVPGMPEESREDTEFHHKLNMVKKQCFKN
DIHKLVLAALNRFIGNPGIQKCGLKVISSIVHFPDALEMLSLEGAMDSVLHTLQMYPDDQ
EIQCLGLSLIGYLITKKNVFIGTGHLLAKILVSSLYRFKDVAEIQTKGFQTILAILKLSA
SFSKLLVHHSFDLVIFHQMSSNIMEQKDQQFLNLCCKCFAKVAMDDYLKNVMLERACDQN
NSIMVECLLLLGADANQAKEGSSLICQVCEKESSPKLVELLLNSGSREQDVRKALTISIG
KGDSQIISLLLRRLALDVANNSICLGGFCIGKVEPSWLGPLFPDKTSNLRKQTNIASTLA
RMVIRYQMKSAVEEGTASGSDGNFSEDVLSKFDEWTFIPDSSMDSVFAQSDDLDSEGSEG
SFLVKKKSNSISVGEFYRDAVLQRCSPNLQRHSNSLGPIFDHEDLLKRKRKILSSDDSLR
SSKLQSHMRHSDSISSLASEREYITSLDLSANELRDIDALSQKCCISVHLEHLEKLELHQ
NALTSFPQQLCETLKSLTHLDLHSNKFTSFPSYLLKMSCIANLDVSRNDIGPSVVLDPTV
KCPTLKQFNLSYNQLSFVPENLTDVVEKLEQLILEGNKISGICSPLRLKELKILNLSKNH
ISSLSENFLEACPKVESFSARMNFLAAMPFLPPSMTILKLSQNKFSCIPEAILNLPHLRS
LDMSSNDIQYLPGPAHWKSLNLRELLFSHNQISILDLSEKAYLWSRVEKLHLSHNKLKEI
PPEIGCLENLTSLDVSYNLELRSFPNEMGKLSKIWDLPLDELHLNFDFKHIGCKAKDIIR
FLQQRLKKAVPYNRMKLMIVGNTGSGKTTLLQQLMKTKKSDLGMQSATVGIDVKDWPIQI
RDKRKRDLVLNVWDFAGREEFYSTHPHFMTQRALYLAVYDLSKGQAEVDAMKPWLFNIKA
RASSSPVILVGTHLDVSDEKQRKACMSKITKELLNKRGFPAIRDYHFVNATEESDALAKL
RKTIINESLNFKIRDQLVVGQLIPDCYVELEKIILSERKNVPIEFPVIDRKRLLQLVREN
QLQLDENELPHAVHFLNESGVLLHFQDPALQLSDLYFVEPKWLCKIMAQILTVKVEGCPK
HPKGIISRRDVEKFLSKKRKFPKNYMSQYFKLLEKFQIALPIGEEYLLVPSSLSDHRPVI
ELPHCENSEIIIRLYEMPYFPMGFWSRLINRLLEISPYMLSGRERALRPNRMYWRQGIYL
NWSPEAYCLVGSEVLDNHPESFLKITVPSCRKGCILLGQVVDHIDSLMEEWFPGLLEIDI
CGEGETLLKKWALYSFNDGEEHQKILLDDLMKKAEEGDLLVNPDQPRLTIPISQIAPDLI
LADLPRNIMLNNDELEFEQAPEFLLGDGSFGSVYRAAYEGEEVAVKIFNKHTSLRLLRQE
LVVLCHLHHPSLISLLAAGIRPRMLVMELASKGSLDRLLQQDKASLTRTLQHRIALHVAD
GLRYLHSAMIIYRDLKPHNVLLFTLYPNAAIIAKIADYGIAQYCCRMGIKTSEGTPGFRA
PEVARGNVIYNQQADVYSFGLLLYDILTTGGRIVEGLKFPNEFDELEIQGKLPDPVKEYG
CAPWPMVEKLIKQCLKENPQERPTSAQVFDILNSAELVCLTRRILLPKNVIVECMVATHH
NSRNASIWLGCGHTDRGQLSFLDLNTEGYTSEEVADSRILCLALVHLPVEKESWIVSGTQ
SGTLLVINTEDGKKRHTLEKMTDSVTCLYCNSFSKQSKQKNFLLVGTADGKLAIFEDKTV
KLKGAAPLKILNIGNVSTPLMCLSESTNSTERNVMWGGCGTKIFSFSNDFTIQKLIETRT
SQLFSYAAFSDSNIITVVVDTALYIAKQNSPVVEVWDKKTEKLCGLIDCVHFLREVMVKE
NKESKHKMSYSGRVKTLCLQKNTALWIGTGGGHILLLDLSTRRLIRVIYNFCNSVRVMMT
AQLGSLKNVMLVLGYNRKNTEGTQKQKEIQSCLTVWDINLPHEVQNLEKHIEVRKELAEK
MRRTSVE

>sp|Protein3
MNNFGNEEFDCHFLDEGFTAKDILDQKINEVSSSDDKDAFYVADLGDILKKHLRWLKALP
RVTPFYAVKCNDSKAIVKTLAATGTGFDCASKTEIQLVQSLGVPPERIIYANPCKQVSQI
KYAANNGVQMMTFDSEVELMKVARAHPKAKLVLRIATDDSKAVCRLSVKFGATLRTSRLL
LERAKELNIDVVGVSFHVGSGCTDPETFVQAISDARCVFDMGAEVGFSMYLLDIGGGFPG
SEDVKLKFEEITGVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQ
TGSDDEDESSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTC
DGLDRIVERCDLPEMHVGDWMLFENMGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQF
QNPDFPPEVEEQDASTLPVSCAWESGMKRHRAACASASINV

Identify and compare these 3 potential targets and decide which one you would choose for a drug discovery campaign. To convince Holien-X, you must list the pros/cons of this target and show evidence for why this target is the best for a drug discovery campaign (approx. ½ - 1 page + figures/tables).

Holien-X has re-analysed their data using network-based methods and discovered new information which shows that Aurora Kinase B (UniProt code: Q96GD4) is the best druggable option as it:

  • is highly expressed and upregulated in the diseased patients compared to controls

  • has close homology to mouse

  • has a close homolog to construct a high quality homology model

  • has a known role in cancer

Your job is to download the AlphaFold homology model and confirm its suitability for a drug discovery campaign. For example, you may like to analyse the quality of this model (i.e. https://swissmodel.expasy.org/assess), analyse the properties of the protein, search for druggable pockets etc. Write these details into your report with figures to guide the team at Holien-X (approx. ½ - 1 page + figures/tables).

Holien-X has taken your advice on board and would like to conduct a Virtual Screen. Write a short proposal for the steps you will undertake in order to do this (up to ½ page and/or table/figures).

Congratulations, your virtual screen was successful. Holien-X has screened the compounds you suggested and has identified 5 compounds which bind to the wild-type protein but not an A105R mutant isoform (confirming your active site). They also reduce the growth of Neuroblastoma cell culture. Analyse the following table and let Holien-X know which compound you would choose to develop further and why (up to ½ page and/or table/figures).

SMILES String                                                       Activity (IC50)
O=C(C1=CC=CC(Cl)=C1F)N(CC2)CCN2CC3=CC=CC(CC4=NC=CS4)=N3             1 nM
CN(C)CC1=CC(C2=CC(C(C3=CN(CC)N=C3C4=CC=CC=C4)=NC=N5)=C5N2)=CC=C1    0.5 nM
CCN1N=C(C2=CC=C(F)C=C2)C([C@H]3CCCC(C(F)(F)F)C3)=C1                 1.5 nM
O=C(NC1=CCC=C([C@H]2CCOC2)C1)C3=NC=CN=C3                            1 mM
O=[N+]([O-])C1=CC=C(C2=NNC(SCC#CC)=N2)C=C1C                         1 mM

Holien-X has now developed your compound into Phase 2 clinical trials. Unfortunately, they are finding a subset of patients which are showing resistance to the drug. They have sequenced these patients, and all have the following nucleotide sequence.

>MutantProtein
atggcgcagaaagaaaacagctatccgtggccgtatggccgccagaccgcgccgagcggc
ctgagcaccctgccgcagcgcgtgctgcgcaaagaaccggtgaccccgagcgcgctggtg
ctgatgagccgcagcaacgtgcagccgaccgcggcgccgggccagaaagtgatggaaaac
agcagcggcaccccggatattctgacccgccattttaccattgatgattttgaaattggc
cgcccgctgggcaaaggcaaatttggcaacgtgtatctggcgcgcgaaaaaaaaagccat
tttattgtggcgctgaaagtgctgtttaaaagccagattgaaaaagaaggcgtggaacat
cagctgcgccgcgaaattgaaattcaggcgcatctgcatcatccgaacattgaacgcctg
tataactatttttatgatcgccgccgcatttatctgattctggaatatgcgccgcgcggc
gaactgtataaagaactgcagaaaagctgcacctttgatgaacagcgcaccgcgaccatt
atggaagaactggcggatgcgctgatgtattgccatggcaaaaaagtgattcatcgcgat
attaaaccggaaaacctgctgctgggcctgaaaggcgaactgaaaattgcggattttggc
tggagcgtgcatgcgccgagcctgcgccgcaaaaccatgtgcggcaccctggattatctg
ccgccggaaatgattgaaggccgcatgcataacgaaaaagtggatctgtggtgcattggc
gtgctgtgctatgaactgctggtgggcaacccgccgtttgaaagcgcgagccataacgaa
acctatcgccgcattgtgaaagtggatctgaaatttccggcgagcgtgccgatgggcgcg
caggatctgattagcaaactgctgcgccataacccgagcgaacgcctgccgctggcgcag
gtgagcgcgcatccgtgggtgcgcgcgaacagccgccgcgtgctgccgccgagcgcgctg
cagagcgtggcg

Holien-X would like to understand what the patient mutant is. Is it a modest mutation or significant? Where on the protein is this mutation occurring? Is it likely to influence compound binding or another aspect of the protein function? (½-1 page + table/figures)

Congratulations, based on your analysis the drug has now passed all approvals and is being used to treat these patients. Please add a summary sentence or two describing how you feel bioinformatics helped these patients.
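As a starting point for the resistance question, the patient nucleotide sequence can be translated and compared residue by residue with the wild-type protein. The sketch below is only an illustration, not part of the assignment: `translate` and `point_substitutions` are hypothetical helper names, the standard genetic code is assumed, and the wild-type protein sequence is not given in the task, so it would have to be retrieved separately (for example from the UniProt entry Q96GD4 named above).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop whitespace and line breaks
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unknown codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def point_substitutions(wild_type, mutant):
    """List substitutions such as 'A105R' between two equal-length proteins."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences differ in length; align them first")
    return [
        f"{w}{pos}{m}"
        for pos, (w, m) in enumerate(zip(wild_type, mutant), start=1)
        if w != m
    ]


# First line of the >MutantProtein record from the task.
first_line = "atggcgcagaaagaaaacagctatccgtggccgtatggccgccagaccgcgccgagcggc"
print(translate(first_line))  # MAQKENSYPWPYGRQTAPSG

# Toy example only (made-up four-residue peptides, not the real proteins).
print(point_substitutions("MAQK", "MAEK"))  # ['Q3E']
```

Reporting each change in the same notation as the A105R isoform mentioned in the task makes it easy to check whether a substitution falls in or near the proposed binding site. Insertions or deletions would need a proper alignment before this position-by-position comparison is meaningful.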
Answered 2 days after Oct 27, 2022


Dr Shweta answered on Oct 29 2022
Bioinformatics: Bioinformatics is the branch of science in which biological data are analysed and interpreted using computational tools. It is a multidisciplinary field that brings together physics, computer science, biology and mathematics [1].
Basic principles utilized in Bioinformatics:
The basic principles utilized in Bioinformatics are discussed below:
1. Retrieval of the sequence of interest and the desired manipulation – The sequence of interest is first searched for across all databases, or specifically in the nucleotide database chosen from the drop-down list, at the National Center for Biotechnology Information (NCBI) via the "Entrez" search engine. The search uses suitable keywords, a gene accession number, a gene name or a species name, and the results list the matching genes, related sequences, proteins, etc., mostly displayed in FASTA format [2].
2. Sequence alignment – Sequence alignment is essential for sequence search and assembly and for phylogenetics. It is commonly performed with the open-access programs BioEdit and MEGA, which are freely downloadable and easy to install. Sequences saved in FASTA format are used as input and aligned with the ClustalW program [3].
3. Phylogenetics – Phylogenetic analysis is used for the taxonomic, systematic and evolutionary analysis of gene sequences. Multiple sequences are clustered according to their genetic distances using MEGA, and a phylogenetic tree is constructed [4].
4. Similarity search – To understand the evolutionary relationship between different genes, sequences are compared using the similarity search tool BLAST (Basic Local Alignment Search Tool), provided by NCBI and the European Bioinformatics Institute (EBI). BLAST compares the nucleotide or protein sequence under study against all of the accessible sequence databases [5].
5. Primer design – Primer design is an essential step in developing markers such as simple sequence repeat (SSR) markers and is performed with Primer3 [6].
6. Advanced skills – Advanced bioinformatics skills include designing databases and specific algorithms for multiple sequence alignment and analysis, together with annotating data from oligonucleotide chips, mass spectrometry, microarrays and the various stages of next-generation sequencing [7,8].
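The first two principles (working with FASTA records and aligning sequences) can be illustrated with a short, dependency-free sketch. This is an assumption-laden illustration rather than what BioEdit, MEGA or ClustalW actually do: `read_fasta` and `global_alignment_score` are hypothetical helper names, the scoring values are arbitrary, and the alignment shown is a plain pairwise Needleman-Wunsch score, not ClustalW's progressive multiple alignment.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not ClustalW defaults.
    Only two rows of the dynamic-programming matrix are kept in memory.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Toy records (made-up sequences) showing a multi-line FASTA entry.
toy = StringIO(">seqA\nACGT\nAC\n>seqB\nACGTAC\n")
records = dict(read_fasta(toy))
print(records)
print(global_alignment_score("ACGT", "AGT"))  # three matches and one gap
```

For real data sets an established library or the tools named in the list above would normally be preferred; the sketch only shows what "sequences saved in FASTA format are used as input and aligned" means at the smallest scale.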
To convince Holien-X, the chosen method is discussed below along with its pros and cons.
The best technique for sequencing the required nucleotide data is next-generation sequencing (NGS). It is the best method for large-scale sequencing of disease genes for research and diagnostics. NGS is a massively parallel sequencing technology that provides ultra-high throughput, scalability and very high speed. It is used to determine the order of nucleotides in entire genomes or in selected regions of DNA or RNA. NGS quantifies discrete, digital sequencing read counts and offers a broader dynamic range. NGS-based RNA sequencing (RNA-Seq) is a powerful technique that lets investigators avoid the inefficiency and expense of legacy technologies such as microarrays. The technique has transformed the biological sciences, allowing labs to carry out a wide variety of applications and to study biological systems at a level that was not possible before [9].
The pros of this method are that it can sequence the complete exome (the protein-coding genes) or the whole genome in an unbiased manner, and that it is readily available, affordable and a one-step approach. The cons of this method are that if the diseased gene is outside the...