You will perform an analysis for a dataset using R and create an Rmarkdown document. This will involve creating functions for reading data files, plotting the data and performing some statistical analysis (more details below).


The dataset you will be using is single-cell RNA-seq data (gene expression profiles) of human embryonic cells (article) and can be accessed from GEO at this link: https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE75748&format=file&file=GSE75748%5Fsc%5Fcell%5Ftype%5Fec%2Ecsv%2Egz


The file contains 1019 columns. The first column contains the name of the gene, followed by 1018 columns containing the FPKM (expression levels) of that gene in the corresponding cell. Columns containing H1 represent measurements in H1 hESC cells, while the columns containing H9 represent measurements in H9 hESC cells.
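The layout described above can be illustrated with a small sketch. It uses a tiny in-memory stand-in for the real file, so the gene names, cell names, and values are hypothetical. Turning the first column into row names is one possible choice, not a requirement of the assignment; with it, the real file would give a data.frame with 1018 numeric columns.

```r
# Toy stand-in for the real file (hypothetical gene names, cell names, values).
csv_text <- "gene,H1_cell1,H1_cell2,H9_cell1,H9_cell2
GeneA,1.0,3.0,10.0,20.0
GeneB,0.0,0.0,5.0,7.0"

# First column (gene name) becomes the row names; check.names = FALSE keeps
# the column names exactly as they appear in the header.
expr <- read.csv(text = csv_text, row.names = 1, check.names = FALSE)

# For the real data, assuming the file was downloaded to the working directory:
# expr <- read.csv(gzfile("GSE75748_sc_cell_type_ec.csv.gz"),
#                  row.names = 1, check.names = FALSE)

# Select the cell columns by pattern, as the later functions are asked to do.
h1_cols <- grep("H1", colnames(expr), value = TRUE)
h9_cols <- grep("H9", colnames(expr), value = TRUE)

dim(expr)
h1_cols
h9_cols
```

If the gene names are kept as an ordinary first column instead, the same `grep` calls still pick out only the H1 or H9 columns, since the patterns are matched against column names.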


Your analysis should have the following functionality. In addition to writing the functions, you need to show examples of how the functions run.


IN R STUDIO






a) General statistics



  • Correctly read the dataset GSE75748_sc_cell_type_ec.csv.gz into a data.frame.

  • Write a function called getMedianExpression, which computes the median expression value for each gene for H1 and H9 cells and returns the corresponding data.frame. The function takes the data, the pattern for the columns corresponding to H1 cells and the pattern for the columns corresponding to H9 cells as arguments.

  • Write a function called getExpressionStatistics, which computes the mean and standard deviation of the expression value for each gene for H1 and H9 data and returns the corresponding data.frame. The function takes the data, the pattern for the columns corresponding to H1 cells and the pattern for the columns corresponding to H9 cells as arguments.

  • Write a function called getGenesWithStableExpression, which returns the names of the genes that have a standard deviation lower than the mean. The function takes the data and the pattern for the columns corresponding to either H1 or H9 cells as inputs.
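
One way the three functions listed above might be sketched is shown here. This is a minimal illustration, not a reference solution. It assumes the gene names are stored as the row names of the data.frame, and the toy data, the helper `selectCells`, the argument names, and the output column names are all hypothetical.

```r
# Toy data.frame with genes as row names (hypothetical names and values).
expr <- data.frame(
  H1_cell1 = c(1, 0, 4),
  H1_cell2 = c(3, 0, 4),
  H1_cell3 = c(2, 9, 4),
  H9_cell1 = c(10, 5, 0),
  H9_cell2 = c(20, 7, 0),
  row.names = c("GeneA", "GeneB", "GeneC")
)

# Hypothetical helper: numeric matrix of the columns whose names match `pattern`.
selectCells <- function(data, pattern) {
  as.matrix(data[, grepl(pattern, colnames(data)), drop = FALSE])
}

# Median expression per gene, separately for H1 and H9 cells.
getMedianExpression <- function(data, h1Pattern, h9Pattern) {
  data.frame(
    gene = rownames(data),
    medianH1 = apply(selectCells(data, h1Pattern), 1, median),
    medianH9 = apply(selectCells(data, h9Pattern), 1, median),
    row.names = NULL
  )
}

# Mean and standard deviation per gene, separately for H1 and H9 cells.
getExpressionStatistics <- function(data, h1Pattern, h9Pattern) {
  h1 <- selectCells(data, h1Pattern)
  h9 <- selectCells(data, h9Pattern)
  data.frame(
    gene = rownames(data),
    meanH1 = rowMeans(h1),
    sdH1 = apply(h1, 1, sd),
    meanH9 = rowMeans(h9),
    sdH9 = apply(h9, 1, sd),
    row.names = NULL
  )
}

# Names of genes whose standard deviation is lower than their mean
# in the cells matched by `pattern`.
getGenesWithStableExpression <- function(data, pattern) {
  cells <- selectCells(data, pattern)
  stable <- apply(cells, 1, sd) < rowMeans(cells)
  rownames(data)[which(stable)]
}

# Example runs on the toy data.
getMedianExpression(expr, "H1", "H9")
getExpressionStatistics(expr, "H1", "H9")
getGenesWithStableExpression(expr, "H1")
getGenesWithStableExpression(expr, "H9")
```

Under the strict "lower than" comparison, a gene that is zero in every matched cell (standard deviation 0, mean 0) is not reported as stable. Whether that is the intended behaviour is a judgment call the assignment text leaves open.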

Apr 23, 2021