Low quality data can have tremendous down-stream implications if it is not dealth with early. A common technique to mitigate the effects of low quality data is to 'mask' it. To mask the data means to change the base calls to some predefined letter that is known to represent data to be ignored.
The FASTQ file format has both DNA base call and quality information for each sequence in the file. The length of the sequence and the length of the quality information are the same and with some simple programming, the quality value for each base of a sequence can be found. Write a Python program that does the following;
Already registered? Login
Not Account? Sign up
Enter your email address to reset your password
Back to Login? Click here