Task 11: Adding a new Workflow Step

To find out what a DNA sequence actually encodes, further analysis is needed. This includes determining which RNA is generated from a given DNA sequence and translating that RNA into amino acids.

To emulate this, we need a new kind of sequencer: an RNA sequencer. While it has the same capability to analyze samples as the DnaSequencer, how this analysis works internally is completely different.

If you are interested: Learn how DNA to RNA encoding works in principle

11a: Even more Generalization

Create a new class Sequencer that acts as a superclass to the DnaSequencer. All sequencers have the model and serial_number attributes in common, as well as the analyze_sample(self, sample)-method.

Hint: Sometimes it is not useful to give an implementation for a method in a superclass yet. The pass keyword can be used to indicate that no implementation will be given:

def some_method(parameter0, parameter1):
    pass  # Indicates that no implementation will be given

Calling a method defined that way will simply do nothing (it returns None).
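
As an illustration of the hint, here is a minimal sketch of what such a superclass could look like. The attribute and method names come from the task text; the concrete values passed in are made up for the example, and your own solution may be structured differently.

```python
class Sequencer:
    """Common base class for all sequencers (a sketch, not the reference solution)."""

    def __init__(self, model, serial_number):
        self.model = model
        self.serial_number = serial_number

    def analyze_sample(self, sample):
        pass  # No implementation here; each subclass provides its own


# Hypothetical example values, chosen only for this demonstration
sequencer = Sequencer("Model X", "SN-0001")
result = sequencer.analyze_sample("ACGT")
print(result)  # A method whose body is only `pass` returns None
```

Because the body is only pass, the call is valid but has no effect, which is why subclasses are expected to override analyze_sample.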

11b: A new Tool in the Box

Create a new class RnaSequencer. This is a subclass of Sequencer and thus inherits all attributes and methods. Consider how to properly implement the __init__(self,…)-method.
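
One possible way to handle __init__ in the subclass is to delegate to the superclass via super(). The sketch below assumes a Sequencer superclass with the two attributes named in 11a; the example values are hypothetical.

```python
class Sequencer:
    """Assumed superclass from 11a, repeated here so the example runs on its own."""

    def __init__(self, model, serial_number):
        self.model = model
        self.serial_number = serial_number

    def analyze_sample(self, sample):
        pass


class RnaSequencer(Sequencer):
    def __init__(self, model, serial_number):
        # Let the superclass set up the attributes all sequencers share
        super().__init__(model, serial_number)


# Hypothetical example values
rna_sequencer = RnaSequencer("RNA-1", "SN-0002")
print(rna_sequencer.model, rna_sequencer.serial_number)
```

If the subclass needs no attributes of its own, leaving out __init__ entirely would also work, since Python then uses the inherited one.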

11c: Find and Replace

An RNA sequence is made up of the bases A, C, G and U (instead of T). Add a method dna_to_rna(…) to the RNA sequencer. It should be given a template strand DNA sequence and create a matching RNA sequence by replacing each base with its complement: A becomes U, T becomes A, C becomes G, and G becomes C.

The resulting RNA strand should be returned as a string, just like the DNA strand. Make sure to test your new method a bit. For example, the input "CATCATCAT" should give you "GUAGUAGUA".
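
The replacement can be sketched as a plain function. The mapping below is an assumption: it is the standard base complement (A to U, T to A, C to G, G to C), which agrees with the example input and output above. The function is written outside any class to keep the example short; where you place it in RnaSequencer is up to you.

```python
# Assumed mapping from template strand DNA bases to RNA bases
DNA_TO_RNA = str.maketrans({"A": "U", "T": "A", "C": "G", "G": "C"})


def dna_to_rna(dna):
    """Return the RNA strand matching the given template strand DNA."""
    # translate() replaces every character in a single pass
    return dna.translate(DNA_TO_RNA)


print(dna_to_rna("CATCATCAT"))  # GUAGUAGUA
```

Replacing all bases in one pass matters here: chaining several str.replace calls would convert a T into an A and then that same A into a U.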

Hint: Discuss whether you could also use a static or class method.

The RNA we generate here is actually mRNA. We skip the following tRNA step for simplicity.

11d: Three of a Kind

In the next step, the newly generated RNA sequence needs to be chopped into triplets. Add a new method extract_triplets(…) to the RnaSequencer. Its input is a string made up of RNA bases, which is to be cut into pieces of exactly 3 bases (i.e. letters) each. All pieces should be collected in one list, and any bases left over at the end are dropped. The output is this list of 3-letter strings.

For testing, the input "GUAGUACC" should yield ["GUA", "GUA"] (the remaining "CC" is left out).
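
A minimal sketch of the chopping step, again written as a plain function for brevity. It uses slicing with a step of 3 and stops before any incomplete group at the end.

```python
def extract_triplets(rna):
    """Cut an RNA string into a list of 3-letter pieces, dropping leftovers."""
    # Length of the part that divides evenly into triplets
    usable_length = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable_length, 3)]


print(extract_triplets("GUAGUACC"))  # ['GUA', 'GUA']
```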

11e: Tying it all together

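Add the analyze_rna(self, sample)-method to the RnaSequencer. It takes a Sample, obtains its template strand DNA and applies the helpers we created before: first dna_to_rna(…) to turn the DNA into RNA, then extract_triplets(…) to cut that RNA into triplets.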

The resulting list of triplets is to be stored back in the Sample, so you will have to add a new attribute triplets there.
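
To show how the pieces could fit together, here is a reduced sketch. The Sample class and its template_strand attribute are hypothetical stand-ins, since the real Sample from the earlier tasks is not shown here; the Sequencer superclass is left out to keep the example short, and the base mapping is the assumed standard complement.

```python
class Sample:
    """Hypothetical stand-in for the Sample class from earlier tasks."""

    def __init__(self, template_strand):
        self.template_strand = template_strand  # assumed attribute name
        self.triplets = []  # new attribute introduced in this task


class RnaSequencer:
    """Reduced sketch without the Sequencer superclass."""

    # Assumed mapping: standard complement with U instead of T
    _DNA_TO_RNA = str.maketrans({"A": "U", "T": "A", "C": "G", "G": "C"})

    def dna_to_rna(self, dna):
        return dna.translate(self._DNA_TO_RNA)

    def extract_triplets(self, rna):
        usable_length = len(rna) - len(rna) % 3
        return [rna[i:i + 3] for i in range(0, usable_length, 3)]

    def analyze_rna(self, sample):
        # Step 1: template strand DNA -> RNA
        rna = self.dna_to_rna(sample.template_strand)
        # Step 2: RNA -> triplets, stored back in the sample
        sample.triplets = self.extract_triplets(rna)


sample = Sample("CATCATCAT")
RnaSequencer().analyze_rna(sample)
print(sample.triplets)  # ['GUA', 'GUA', 'GUA']
```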