Commit 49f1bfc2 authored by Franziska Niemeyer

Upload solutions for exercises_A

parent e90f9b4a
%% Cell type:markdown id: tags:
# Python course 2021 - Exercises A
%% Cell type:code id: tags:
```
def print_type(variable):
    print(variable)
    print(type(variable))
```
%% Cell type:markdown id: tags:
## Part 1 - Variables
%% Cell type:markdown id: tags:
---
1.1) Save 3.14159265359 in a variable of type float!
%% Cell type:code id: tags:
```
pi = 3.14159265359
print_type(pi)
```
%% Output
3.14159265359
<class 'float'>
%% Cell type:markdown id: tags:
---
1.2) Convert variable from float to integer!
%% Cell type:code id: tags:
```
pi = int(pi)
print_type(pi)
```
%% Output
3
<class 'int'>
%% Cell type:markdown id: tags:
---
1.3) Convert variable back! What happens?
%% Cell type:code id: tags:
```
pi = float(pi)
print_type(pi)
```
%% Output
3.0
<class 'float'>
%% Cell type:markdown id: tags:
Converting to `int` truncated the value toward zero, so the decimal places were lost. Converting back to `float` yields 3.0 and cannot restore them.
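%% Cell type:markdown id: tags:
`int()` truncates toward zero rather than rounding, which is easiest to see with a negative number. A minimal sketch (the variable names here are only illustrative and not part of the exercise):
%% Cell type:code id: tags:
```python
import math

value = -3.7
truncated = int(value)       # drops the fractional part, moving toward zero
floored = math.floor(value)  # always moves down to the next lower integer
rounded = round(value)       # moves to the nearest integer

print(truncated, floored, rounded)
```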
%% Cell type:markdown id: tags:
---
1.4) Convert variable type to string!
%% Cell type:code id: tags:
```
pi = str(pi)
print_type(pi)
```
%% Output
3.0
<class 'str'>
%% Cell type:markdown id: tags:
---
1.5) Save 'Python' in a string variable!
%% Cell type:code id: tags:
```
python = "Python"
print_type(python)
```
%% Output
Python
<class 'str'>
%% Cell type:markdown id: tags:
---
1.6) Convert variable type to float! What happens?
%% Cell type:code id: tags:
```
python = float(python)
```
%% Output
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-13-fc22f6f198d4> in <module>()
----> 1 python = float(python)
ValueError: could not convert string to float: 'Python'
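%% Cell type:markdown id: tags:
The conversion fails with a `ValueError` because 'Python' is not a valid number. If a program should keep running in that case, the error can be caught. A small sketch, where the helper name `to_float_or_none` is made up for this illustration:
%% Cell type:code id: tags:
```python
def to_float_or_none(text):
    """Return text converted to float, or None if it is not a valid number."""
    try:
        return float(text)
    except ValueError:
        return None


print(to_float_or_none("3.14"))
print(to_float_or_none("Python"))
```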
%% Cell type:markdown id: tags:
---
1.7) What is a pitfall with regard to division when working with int/float?
%% Cell type:code id: tags:
```
a = 3
b = 2
print(a / b)
```
%% Output
1.5
%% Cell type:markdown id: tags:
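If you do not want to lose the decimal places, you need to make sure that no integer division is performed. In Python 3 the `/` operator always performs true division and returns a float, so `3 / 2` gives `1.5`; integer (floor) division only happens when you explicitly use `//`. In Python 2, `/` applied to two integers performed integer division.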
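%% Cell type:markdown id: tags:
The two division operators can be compared side by side. This is a short sketch for Python 3; the variable names are only illustrative:
%% Cell type:code id: tags:
```python
a = 3
b = 2

true_quotient = a / b    # true division, always returns a float
floor_quotient = a // b  # floor division, discards the fractional part
quotient, remainder = divmod(a, b)  # floor quotient and remainder in one call
negative_floor = -3 // 2  # floor division rounds down, not toward zero

print(true_quotient, floor_quotient, quotient, remainder, negative_floor)
```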
%% Cell type:markdown id: tags:
## Part 2 - Functions
%% Cell type:markdown id: tags:
Primer: 'ATGCCATGCATTCGACTACG'
%% Cell type:markdown id: tags:
---
2.1) Calculate length of primer and print it!
%% Cell type:code id: tags:
```
primer = "ATGCCATGCATTCGACTACG"
print(len(primer))
```
%% Output
20
%% Cell type:markdown id: tags:
---
2.2) Get number of 'G's and print it!
%% Cell type:code id: tags:
```
positions = [i for i in range(len(primer)) if primer[i] == 'G']
print(positions)
print(len(positions))
```
%% Output
[2, 7, 13, 19]
4
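%% Cell type:markdown id: tags:
If only the number of 'G's is needed and not their positions, the built-in string method `count` is a shorter alternative. A sketch that repeats the primer so that it runs on its own:
%% Cell type:code id: tags:
```python
primer = "ATGCCATGCATTCGACTACG"

# str.count returns the number of non-overlapping occurrences of a substring
g_count = primer.count("G")
print(g_count)

# enumerate yields (index, base) pairs, which avoids indexing with range(len(...))
g_positions = [i for i, base in enumerate(primer) if base == "G"]
print(g_positions)
```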
%% Cell type:markdown id: tags:
---
2.3) Write a function to analyze the nucleotide composition of a primer and print it!
%% Cell type:code id: tags:
```
def analyze_composition(seq):
    gc_content = seq.count("G") + seq.count("C")
    return 100 * gc_content / len(seq)

print("GC content:", round(analyze_composition(primer), 2), '%')
```
%% Output
GC content: 50.0 %
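%% Cell type:markdown id: tags:
The solution above reports only the GC content. One possible way to get the share of every nucleotide is `collections.Counter`. The function name `nucleotide_composition` is an assumption made for this sketch:
%% Cell type:code id: tags:
```python
from collections import Counter


def nucleotide_composition(seq):
    """Return the percentage of each of A, C, G and T in seq."""
    counts = Counter(seq)  # bases that do not occur count as 0
    return {base: 100 * counts[base] / len(seq) for base in "ACGT"}


primer = "ATGCCATGCATTCGACTACG"
for base, percent in nucleotide_composition(primer).items():
    print(f"{base}: {percent} %")
```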
%% Cell type:markdown id: tags:
---
2.4) Is it a suitable primer? Why (not)?
%% Cell type:code id: tags:
```
def compute_primer_properties(primer):
    length = len(primer)
    print(f"Length: {length}")
    gc_content = primer.count("G") + primer.count("C")
    gc_content = gc_content / length
    print(f"GC content: {gc_content * 100} %")
    # melting temperature estimate (Wallace rule): 4 degrees per G/C, 2 degrees per A/T
    temperature = 4 * (primer.count("G") + primer.count("C")) + 2*(primer.count("A") + primer.count("T"))
    print(f"Temperature: {temperature} degrees celsius")
    # GC clamp: both of the last two bases at the 3' end are G or C
    gc_clamp = (primer[-1] == "G" or primer[-1] == "C") and (primer[-2] == "G" or primer[-2] == "C")
    print(f"GC clamp: {gc_clamp}")

compute_primer_properties(primer)
```
%% Output
Length: 20
GC content: 50.0 %
Temperature: 60 degrees celsius
GC clamp: True
%% Cell type:markdown id: tags:
The primer's properties are all in a suitable range. However, to evaluate the actual suitability of the primer, its mapping uniqueness and mapping capability to the site of interest are also relevant.
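%% Cell type:markdown id: tags:
The check against "suitable ranges" can also be written down explicitly. The sketch below returns the properties instead of printing them and compares them with acceptance ranges. Note that the ranges (`LENGTH_RANGE`, `GC_PERCENT_RANGE`, `TM_RANGE`) are placeholder assumptions and are not taken from the exercise; replace them with the values given in your course or protocol. The function names are likewise made up for this illustration.
%% Cell type:code id: tags:
```python
# Placeholder acceptance ranges (assumptions, adjust as needed)
LENGTH_RANGE = (18, 24)          # primer length in bases
GC_PERCENT_RANGE = (40.0, 60.0)  # GC content in percent
TM_RANGE = (50, 65)              # melting temperature in degrees Celsius


def primer_properties(primer):
    """Compute the same properties as the solution above, but return them."""
    gc = primer.count("G") + primer.count("C")
    at = primer.count("A") + primer.count("T")
    return {
        "length": len(primer),
        "gc_percent": 100 * gc / len(primer),
        "tm": 4 * gc + 2 * at,  # same estimate as used above
        "gc_clamp": primer[-1] in "GC" and primer[-2] in "GC",
    }


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def is_suitable(primer):
    """Return True if all properties lie within the placeholder ranges."""
    props = primer_properties(primer)
    return bool(
        in_range(props["length"], LENGTH_RANGE)
        and in_range(props["gc_percent"], GC_PERCENT_RANGE)
        and in_range(props["tm"], TM_RANGE)
        and props["gc_clamp"]
    )


print(primer_properties("ATGCCATGCATTCGACTACG"))
print(is_suitable("ATGCCATGCATTCGACTACG"))
```
%% Cell type:markdown id: tags:
Like the solution above, this only checks properties of the sequence itself. It says nothing about whether the primer maps uniquely to the site of interest.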
%% Cell type:markdown id: tags:
**Additional exercises**
%% Cell type:markdown id: tags:
2.5) Test if the primer contains a hairpin structure.
%% Cell type:code id: tags:
```
def get_reverse_complement(sequence):
    """Returns the reverse complement of a DNA sequence (A, C, G, T only)"""
    bases = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}
    rev_comp = []
    for i in range(len(sequence)-1, -1, -1):
        rev_comp.append(bases[sequence[i]])
    return ''.join(rev_comp)


def get_common_substrings(min_length, seq, other_seq):
    """
    Computes all exact matches between seq and other_seq
    this method is naive and can be optimized
    output is a list of 4-tuples of the form
    (start position in seq, start position in other_seq, length of the match, matching string)
    """
    length_seq = len(seq)
    length_other_seq = len(other_seq)
    matches = []
    for i in range(length_seq):
        for j in range(length_other_seq):
            current_position_seq = i
            current_position_other_seq = j
            current_match_length = 0
            # extend the match to the right for as long as both sequences agree
            while current_position_seq < length_seq and current_position_other_seq < length_other_seq:
                if seq[current_position_seq] == other_seq[current_position_other_seq]:
                    current_position_seq += 1
                    current_position_other_seq += 1
                    current_match_length += 1
                else:
                    break
            if current_match_length >= min_length:
                matches.append((i, j, current_match_length, seq[i:i+current_match_length]))
    return matches


def has_hairpin_structure(sequence, min_length, min_distance):
    """
    Tests whether a given sequence contains a hairpin structure
    min_length describes the minimum length of the stem of the hairpin
    min_distance describes the minimum length of the loop of the hairpin
    """
    length_seq = len(sequence)
    rev_comp = get_reverse_complement(sequence)
    matches = get_common_substrings(min_length, sequence, rev_comp)
    for seq_position, rev_comp_position, _, _ in matches:
        # a match starting at rev_comp_position pairs with a partner arm in sequence
        # that ends (exclusive) at this position
        end_position_partner = length_seq - rev_comp_position
        # the loop is the gap between the end of the partner arm and the start of the match;
        # every stem appears with its arms in both orders, so testing this order is enough,
        # and a palindrome matched against itself is rejected because the gap is negative
        loop_length = seq_position - end_position_partner
        # arms need to be at least min_distance apart
        if loop_length >= min_distance:
            return True
    return False


print(primer)
print(get_reverse_complement(primer))
print(get_common_substrings(3, primer, get_reverse_complement(primer)))
print(has_hairpin_structure(primer, 3, 3))
```
%% Output
ATGCCATGCATTCGACTACG
CGTAGTCGAATGCATGGCAT
[(0, 9, 4, 'ATGC'), (0, 13, 3, 'ATG'), (1, 10, 3, 'TGC'), (4, 12, 4, 'CATG'), (4, 17, 3, 'CAT'), (5, 9, 6, 'ATGCAT'), (5, 13, 3, 'ATG'), (6, 10, 5, 'TGCAT'), (7, 11, 4, 'GCAT'), (7, 16, 4, 'GCAT'), (8, 12, 3, 'CAT'), (8, 17, 3, 'CAT'), (11, 5, 4, 'TCGA'), (12, 6, 3, 'CGA')]
True
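%% Cell type:markdown id: tags:
As a side note, the reverse complement can also be computed without an explicit loop, using `str.maketrans` and `str.translate`. This is an alternative sketch, not the method used in the solution above; the names `COMPLEMENT` and `reverse_complement` are chosen for this illustration:
%% Cell type:code id: tags:
```python
# translation table that maps each base to its complement
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Complement every base, then reverse the string with a slice."""
    return sequence.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGCCATGCATTCGACTACG"))
```
%% Cell type:markdown id: tags:
Unlike a dictionary lookup, `translate` leaves characters that are not in the table (for example 'N') unchanged instead of raising a `KeyError`.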