Using DNA Sequencing to Deconstruct A Bacterial Genome
Our society has been immersed in technology long enough for almost everyone to have wondered about coding, programming, machine learning, or something of the sort. Whether we took an “Intro to HTML” course at home or an ICS class in high school, the first program we all learnt to code was the classic “Hello World”.
Soon enough, we came to the realization that it’s not all that easy. Computer programs can have millions of lines of code behind the UX/UI. Similarly, cells have their functional information written in code; it’s just hidden behind all the colourful organelle structures we see in textbooks. (Smooth transition, but not smoother than the smooth ER.)
Obviously, “coding a cell” is not a beginner-level exercise like “Hello World”. In fact, the first synthetic genome was only created about 10 years ago, and even though it was tiny, it was a massive stepping stone! (Peep Synthia, the first synthetic cell.)
But how does the process of getting a cell’s functional information, and actually being able to read and perhaps modify or replicate it, work? This is where we take a bottom-up approach, starting with the basic building blocks of life…
DNA
Deoxyribonucleic acid, the fundamental basis of life. We learnt back in sixth grade or so that DNA is what makes you, you! More precisely, it stores the cell's genetic instructions, which is the information that programs cell activities.
“Deoxyribo” refers to the sugar, deoxyribose, and “nucleic acid” refers to the long chain that the sugar forms together with phosphates and bases. To sum up, the building blocks of DNA are sugars, phosphates and bases.
DNA forms a sort of twisted, spiralling ladder: a double helix. The monomers, which are the single units, are called nucleotides, and each one is made of a sugar, a phosphate and a base. Nucleotides are chained together into two long strands. The sugar and phosphate molecules create the sides of the ladder, referred to as the backbone, and the bases form the rungs.
As you can see, there are 4 bases found in nucleotides: adenine (A), thymine (T), cytosine (C) and guanine (G). The bases on one strand pair with the bases on the other strand, so it's adenine with thymine, and guanine with cytosine. The order of the bases is what encodes complexity: the base patterns hold the genetic information that is read out when DNA is transcribed.
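For the coders in the room, the pairing rule is simple enough to write down. Here is a minimal Python sketch, assuming the sequence only contains the letters A, T, G and C. The function names are my own, not from any library. Since the two strands run in opposite directions, the partner strand is usually written reversed, which is what `reverse_complement` does.

```python
# Pairing rules from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the base that pairs with each position of `strand`."""
    return "".join(PAIRS[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the partner strand, written in its own reading direction."""
    return complement(strand)[::-1]

print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Any letter outside A, T, G and C raises a `KeyError`, which is a blunt but honest way of saying "that's not a base".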
Fact break: Human DNA has around 3 billion bases, and more than 99% of those bases are the same in all human beings — that’s definitely something to think about.
DNA Sequencing
DNA sequencing is a method used to determine the exact sequence of nucleotides (A, G, C, T) in a strand of DNA. The DNA base sequence carries the information a cell needs to assemble RNA molecules and proteins. Through DNA sequence information, we can understand how DNA functions, and even identify differences between organisms.
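Once two sequences are just strings of letters, spotting the differences between them is a small exercise. The sketch below is a toy illustration, assuming both sequences are already the same length and lined up. Real comparisons need alignment tools, and the two sequences here are made up.

```python
def differences(seq_a: str, seq_b: str) -> list:
    """List (position, base_in_a, base_in_b) wherever two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only compares sequences of equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

# Made-up example sequences, not real genomic data.
reference = "ATGGCCATTG"
sample = "ATGGCGATTG"
print(differences(reference, sample))  # [(5, 'C', 'G')]
```

Positions are counted from 0, so the single change sits at the sixth base.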
Biology meets Synthetic Biology
Eh, what? Biology is essentially the study of living organisms, which includes their physiology, behaviour, and other qualities. So far we have been in molecular biology territory, which deals with the structure and function of macromolecules. Now, to break down the macromolecular parts even further and engineer them to have new abilities, we need to assign functions to the parts.
This is where synthetic biology intersects: we create biological parts, DNA sequences that each encode a biological function, and use them to engineer DNA to perform different functions. The biological parts allow us to, essentially, “play with the parts”. I wrote an article explaining more about synbio ⏬
Sequence Analysis
In molecular biology, the goal is to understand what the A, G, C, T mean, what structures they form, and why they are important for the formation and function of bacteria. So it’s taking an engineering approach called first principles thinking.
“The idea is to break down complicated problems into basic elements and then reassemble them from the ground up” - FS Blog
What can we extract from this diagram?
- ATG: Also known as the start codon; each protein-coding sequence begins with it.
- Reading frame: In this region, the nucleotide sequence is read in consecutive groups of three (codons), which is how the ribosome travels along the message.
- Promoter: Transcription starts here, so this region controls whether a gene is turned on or off.
- Protein binding site: Here, a protein (such as a regulator) binds to the DNA.
- RBS: The ribosome binding site, which is recognized by the ribosomes to begin the synthesis of a protein.
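To make the start codon and reading frame ideas concrete, here is a rough Python sketch. It looks for the first ATG, then reads three letters at a time until it hits one of the three standard stop codons (TAA, TAG, TGA). The function name and the example sequence are made up for illustration, and real gene finders do far more than this.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}  # the three standard stop codons

def first_open_reading_frame(dna: str):
    """Return the codons from the first ATG up to (not including) the first in-frame stop, or None."""
    dna = dna.upper()
    start = dna.find(START)
    if start == -1:
        return None  # no start codon at all
    codons = []
    # Step through the sequence three bases at a time, starting at the ATG.
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOPS:
            return codons
        codons.append(codon)
    return None  # ran off the end without an in-frame stop codon

print(first_open_reading_frame("CCATGGCTAAATGAGG"))  # ['ATG', 'GCT', 'AAA']
```

Notice that the letters TAA do appear earlier in the example (inside GCTAAA), but they straddle two codons, so the frame reads straight past them. That is exactly why the reading frame matters.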
Biological Parts
While we're analyzing the DNA sequence, there are important features to identify that relate to proteins, RNA, and the DNA itself. Why? Because there are signals on the DNA that direct certain proteins to read the “instructions” and form the needed parts for the cell. Now the concept of “DNA is what makes you, you!” starts to make sense, because it's the instructions in the DNA that tell the proteins what to do, and the proteins do it.
Seems like we came back to base one, but I can assure you there are 3 other bases to cover 😅. Here’s where the sequence becomes different.
Synthetic biologists want to “profit” from different protein molecules, or assemble the DNA parts in a different order, and to do so, the DNA sequence needs to be broken into biological parts. From a sequence analysis approach to reading the DNA, we pivot to more of a circuit approach.
An important part to define:
- Terminator: A sequence at the end of a gene that causes transcription to stop.
Grade 10 science class flashbacks to battery circuits? I think so! In a gene circuit, it's noticeably easier to identify the biological parts, and clearer how to “play” around with them.
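One way to picture the circuit view in code is to treat each biological part as a labelled block and a gene as an ordered list of blocks. The sketch below is only an illustration under my own assumptions: the `Part` class, the `assemble` function, and the short sequences are placeholders I made up, not real standardized parts.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Part:
    kind: str      # "promoter", "rbs", "cds" (coding sequence), or "terminator"
    sequence: str  # placeholder DNA, made up for illustration

def assemble(parts) -> str:
    """Join parts in order, checking the promoter -> RBS -> coding sequence -> terminator layout."""
    expected = ["promoter", "rbs", "cds", "terminator"]
    kinds = [p.kind for p in parts]
    if kinds != expected:
        raise ValueError(f"expected part order {expected}, got {kinds}")
    return "".join(p.sequence for p in parts)

circuit = [
    Part("promoter", "TTGACA"),
    Part("rbs", "AGGAGG"),
    Part("cds", "ATGGCTTAA"),
    Part("terminator", "GCGC"),
]
print(assemble(circuit))  # TTGACAAGGAGGATGGCTTAAGCGC
```

Swapping in a different promoter or coding sequence is then just swapping one item in the list, which is the “play with the parts” idea in miniature.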
Goals in Synthetic Biology
This technology’s goals can be summarized as two objectives.
- Understanding biological processes
Biologists usually dissect, but it’s synthetic biologists who reconstruct, by understanding and playing around with biological processes like DNA sequences.
- Constructing new biological processes
Then the next step is to sustainably construct biological processes with complex parts that will carry out new functions, by putting together engineering models and rules. It's like adding some spice.
All in all, I doubt I need to ramble on about why synthetic biology will be a breakthrough, because after all,
“If we can program life at its most basic unit, what is there we cannot fix?” - yours truly
I’m a 15-year-old student researching biotech applications and learning more about applying engineering principles to life forms! If you enjoyed my article and would like to connect, here’s my LinkedIn and monthly newsletter.