Course Level
CS1
Knowledge Unit
Fundamental Programming Concepts
Collection Item Type
Assignment
Other Material Type
Synopsis

This is the second of five programming assignments in a semester-long, CS1-like course named DNA, which introduces students to programming within the context of genomics. The assignment asks for a Python program that opens and reads a FASTA-formatted file of DNA and prints a neat summary of Chargaff’s numbers, defined here as the count, proportion, and percentage of each nucleotide (A, C, G, T) in the file. Students research and download the entire genome of a microbe of their choice. In addition to submitting source code, students practice scientific writing in a report on their program as applied to that genome. The report must include Introduction, Methods, Results, and Discussion sections.
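
For orientation, the kind of program the assignment asks for might be sketched as follows. This is a minimal illustration, not the course's reference solution: the function names (count_nucleotides, summarize, print_summary) and the made-up sample record are hypothetical. It also assumes that FASTA header lines begin with ">" and that characters other than A, C, G, and T (such as N) are left out of the totals.

```python
from collections import Counter


def count_nucleotides(fasta_lines):
    """Count A, C, G, T across all sequence lines of FASTA-formatted text."""
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        # Skip blank lines and FASTA header lines, which start with ">".
        if not line or line.startswith(">"):
            continue
        counts.update(line.upper())
    # Assumption: only the four standard bases are reported.
    return {base: counts[base] for base in "ACGT"}


def summarize(counts):
    """Return (base, count, proportion, percent) rows for A, C, G, T."""
    total = sum(counts.values())
    rows = []
    for base in "ACGT":
        n = counts[base]
        proportion = n / total if total else 0.0
        percent = 100 * n / total if total else 0.0
        rows.append((base, n, proportion, percent))
    return rows


def print_summary(rows):
    """Print the rows as a small aligned table."""
    print(f"{'Base':<6}{'Count':>10}{'Proportion':>12}{'Percent':>10}")
    for base, n, proportion, percent in rows:
        print(f"{base:<6}{n:>10}{proportion:>12.4f}{percent:>9.2f}%")


# A tiny made-up record standing in for a downloaded genome file.
sample = [">demo sequence (made up)", "ACGTAC", "ggta"]
print_summary(summarize(count_nucleotides(sample)))
```

Because count_nucleotides only iterates over lines, an open file handle can be passed in place of the sample list, for example inside a with open(...) block around a downloaded genome file.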

Recommendations

For recommendations about this specific assignment as well as general comments for the entire set of DNA-focused programming assignments, please see the attached recommendations document.

Engagement Highlights

We live in a post-genomic world where strings of sequenced DNA are the starting point for discovery from basic research to personalized medicine. In addition to the human genome, exciting interdisciplinary areas such as the computational explorations of the thousands of genomes in the microbial communities within us are leading to new definitions of personalized medical diagnosis and treatment. "If Charles Darwin had taken a couple of undergraduate interns with him on 'The Beagle', those students would have discovered, described and catalogued their share of new species ... therefore it is perhaps ironic that we are experiencing once again an age of exploration and discovery via the old fashion activities of collecting and cataloguing. This time it is not only organisms but DNA sequences ... That enticing, exhilarating idea of being on an expedition is (or could be) an aspect of DNA sequence analysis. The balance is tipped heavily toward vast, unknown territories of undeciphered data waiting to be explored" (LeBlanc and Dyer, 2007).

Note: I originally taught this course in Perl, using the text I wrote with my colleague in biology, Betsey Dyer: Perl for Exploring DNA (Oxford University Press, 2007). I now teach the course in Python, so I no longer use the book, but it includes some wonderful interdisciplinary insights, most of which were written by Betsey Dyer.

Computer Science Details

Programming Language
Python

Material Format and Licensing Information

Creative Commons License
CC BY-NC