
Tsar Nicholas

In the early hours of July 17, 1918, the deposed Tsar Nicholas, the Tsarina, their four daughters, the heir to the throne, the royal physician, and three servants were taken to the basement of a house in Ekaterinburg, where they were told they were going to have their picture taken. Once they had posed, they were executed, and their bodies were buried in unmarked graves.

In the decades that followed, the people of the Soviet Union were governed by Lenin, Stalin, Khrushchev, Brezhnev, Andropov, Chernenko, and Gorbachev. They endured Stalin's purges, won World War II, saw the great Yuri Gagarin become the first human in space, and lived through so many other things, all while the bodies of the last Tsar and his family remained lost somewhere cold.

When the Soviet Union fell in 1991, DNA analysis was starting to gain popularity thanks to molecular markers. Here I should explain what molecular markers are. Well, in my previous note, I mentioned that humans have about 30,000 genes that are in long chains of DNA; in fact, we have 46 of those long chains in each cell, which, because they stained very well when they were first seen, are called chromosomes.

So, in those long chains are our genes, but also a lot (and I mean a lot) of DNA whose purpose is not very clear and which apparently serves no function. This DNA represents about 98% of the human genome and is inherited in the same way as the rest of the DNA, meaning half comes from each parent. Thanks to that, it can provide important information, for example to identify individuals.

The gene responsible for eye color has different versions (alleles) that result in various eye colors. Molecular markers also have different alleles, which can be observed when analyzing the DNA of that region. These alleles do not produce any apparent characteristic in a person, but they can be told apart.

Those are the molecular markers: small regions of DNA that can be sequenced or identified in some way and are key, for example, to determining whether one person is the biological parent of another. When this technology began to be used, the Soviet Union had just fallen, and there was a sort of fascination with the bodies of the Tsar's family.

In 1994, the work identifying the remains of the imperial family using molecular markers was published in one of the most important journals. Although more than 70 years had passed, the bodies had been buried in a very cold place, so good material could still be obtained.

HUMTH01 molecular marker (blue bands) on the remains. Note that each of the children shares at least one allele with the Tsar and one with the Tsarina (each allele shows up as a band at a different height). Meanwhile, the doctor and servants are clearly not children of the royal couple.

Not all the DNA in our cells is in the nucleus. There are also mitochondria that have their own genome, let’s call it mitochondrial DNA. Mitochondria, unlike chromosomal DNA, are inherited only through the maternal line, meaning I have the same mitochondrial DNA as my maternal great-great-grandmother. With mitochondrial DNA, it was possible to identify the Tsar and the Tsarina because they had living relatives from their maternal line, and with the marker I mentioned earlier, it was determined that there were three children and four genetically unrelated individuals.
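As a rough illustration of the marker comparison described above, here is a small Python sketch that checks, at a single marker, whether a person's two alleles could have come one from each presumed parent. The genotypes and names are invented for the example (they are not the values from the published study), and a real identification combines several markers, because an unrelated person can match at one marker by chance.

```python
# Toy parentage check at ONE marker. All genotypes below are hypothetical.
# A genotype is a pair of alleles, written as repeat counts.

def could_be_child(child, mother, father):
    """True if one child allele can come from the mother and the other from the father."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)

mother = (6, 9)   # hypothetical genotype
father = (7, 10)  # hypothetical genotype

people = {
    "child_1": (9, 10),
    "child_2": (6, 7),
    "unrelated_1": (8, 8),
    "unrelated_2": (8, 9),  # shares 9 with the mother, but 8 fits neither parent
}

results = {name: could_be_child(g, mother, father) for name, g in people.items()}
for name, ok in results.items():
    print(name, "compatible" if ok else "excluded")
```

Note that the last person shares an allele with the mother and is still excluded, because the other allele has no possible source in the father. That is the kind of reasoning that separates the children from the doctor and the servants.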

Two children were missing, which fueled the rumor about Anastasia, one of the daughters who was said to have survived. She was supposedly found about to jump off a bridge in Berlin in 1920, and this woman maintained until her death in 1984 that she was the Tsar's daughter. Unfortunately for those of us who enjoy conspiracy theories, in 2007 a second grave was discovered containing the remains of the two missing royal children, including Anastasia.

Around the time that woman died, democracy was just being restored in our country, Argentina. The Grandmothers of Plaza de Mayo took a question around the world: can grandchildren be identified when their parents are missing? At that moment, it was probably the most difficult question in the history of genetics. The leading figures in genetics worldwide worked to establish the grandparent index, which made it possible to restore the identity of Paula Eva Logares on December 13, 1984, and of 139 more grandchildren to this day.

The Golden State Killer

Around that time, Joe DeAngelo murdered 18-year-old Janelle Cruz, his thirteenth victim. The killer had been a police officer and quickly learned that DNA samples were being taken from crime scenes to eventually identify the culprits. Once he found out, he stopped killing and was never caught.

Until.

Gordon Moore was one of the founders of Intel and, until his death in 2023 at the age of 94 in Hawaii, led a rather austere life. Well, he lived in a 30 million dollar house, but for a Silicon Valley billionaire, that was an austere life. The point is that Gordon is best known for the law that bears his name, Moore's Law.

It's not actually a law but an observation he made in 1965. He noticed that the number of transistors in integrated circuits was doubling regularly, and in 1975 he settled on a pace of approximately every two years. Gordon proposed that this trend would continue, which also means that the cost of a transistor would halve about every 24 months.

The law held true for several decades, although it has recently slowed down for most technologies that use transistors (basically all technologies). But there was one in particular that accelerated beyond what the law and any predictions could have anticipated: DNA sequencing.

Cost of sequencing a human genome, in dollars, compared with what Moore's Law would predict.

The National Human Genome Research Institute (NHGRI) in the United States has been calculating the costs associated with DNA sequencing performed at the Institute's sequencing centers for several decades. According to the NHGRI, in 2007, sequencing a complete human genome cost around 10 million dollars. If Gordon's law held, by 2014 the cost would be something like one million.

But in 2014, the cost was just over 4,000 dollars, about two hundred and fifty times less than what Gordon's law predicted. Costs continued to drop, and according to the institute, by the end of 2022, sequencing a human genome cost about 500 dollars. Additionally, the speed of sequencing increased dramatically. For example, the first sequenced human genome took more than a decade (and cost hundreds of millions of dollars). Today, it can be done in a few hours for a few hundred dollars.
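The comparison with Moore's Law can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the cost halves exactly every two years starting from the 2007 figure, and the function name `moore_projection` is mine, not something from the NHGRI.

```python
def moore_projection(start_cost, start_year, target_year, halving_years=2.0):
    """Cost expected in target_year if the cost halves every `halving_years` years."""
    elapsed = target_year - start_year
    return start_cost / 2 ** (elapsed / halving_years)

cost_2007 = 10_000_000  # approximate cost of a genome in 2007, from the text
actual_2014 = 4_000     # approximate actual cost in 2014, from the text

predicted_2014 = moore_projection(cost_2007, 2007, 2014)
ratio = predicted_2014 / actual_2014

print(f"Predicted 2014 cost under Moore's Law: ${predicted_2014:,.0f}")
print(f"Predicted cost / actual cost: {ratio:.0f}x")
```

With exact halving, the prediction comes out a little under one million dollars and the ratio a little over two hundred, which is the same order of magnitude as the rounded figures above.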

The secret behind these new technologies, known as next-generation sequencing, is that they can sequence millions of short DNA fragments in parallel, which are then analyzed using bioinformatics methods to reconstruct the entire genome being sequenced.
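To give an idea of what "reconstructing the genome from short fragments" means, here is a toy Python sketch that repeatedly merges the two fragments with the longest overlap. It is a deliberately simplified illustration: the sequence is made up, the reads have no errors, and real pipelines use far more sophisticated methods (and often align reads to a reference genome instead of assembling from scratch). The function names are my own.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is a prefix of `b` (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

genome = "ATGGCGTACGTTAGC"  # made-up sequence
reads = ["TACGTTA", "ATGGCGT", "GTTAGC", "GCGTACG"]  # overlapping fragments, shuffled

contigs = greedy_assemble(reads)
print(contigs)
```

In this small example the four shuffled fragments collapse back into the original sequence. With millions of fragments, repeated regions, and sequencing errors, the same idea needs much heavier machinery, which is where bioinformatics comes in.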

In the movie Gattaca, which I also mentioned in the previous note, we see (besides a humanity with access to genetic editing) the ability to obtain genetic data in just a few seconds. At the time of the film, before the first human genome was sequenced, that was just as dystopian as regularly traveling to space or genetically editing people. Thanks to next-generation sequencing, today we're very, very close.

In recent years, millions of complete genomes have been sequenced, and today we know the human genome and thousands upon thousands of molecular markers in incredible detail. This led to the emergence of hundreds of companies that, for a few dollars, provide you with fun information about your ancestors and, if you want (although it's prohibited in several countries), not-so-fun information about disease predisposition, since some markers are associated with diseases and other sensitive aspects.

23andMe, one of these companies that provide fun information, has genetic data from around 25 million people; deCODE, an Icelandic company doing the same, has data from two-thirds of that country's population.

In 2017, a distant cousin of DeAngelo took one of these tests, and his information went into the databases. Around that time, investigators were reviewing the case, and since they had samples from his latest crimes, they decided to see what happened when they compared them with one of these databases. There was a match: they found his distant cousin, began investigating the family, and traced it back to Joe. This is how one of the most notorious serial killers in U.S. history was caught.

Databases

Now, if we think about finding a serial killer who murdered 13 people thanks to a database, it would be hard to be against it. But considering that there are genetic databases covering millions of people with sensitive data (and knowing that if there are databases, there are leaks), things can get a bit concerning. In fact, news of this kind is quite common.

With genetic data from around 10% of a population, it would be possible, starting from any sample, to reach everyone in it. In other words, whereas in the 1980s it was very laborious to identify a grandchild from a grandparent, today it's quite easy to identify second or third cousins.

It's crazy.

Imagine if a private health insurance company wanted to know if a potential member has a higher predisposition to a disease whose treatment is quite expensive.

Or worse yet, think about non-democratic regimes governing countries with access to genetic data of individuals. I don't know about you, but all of this makes me a bit uneasy.
