Sunday, November 17, 2019

Gene Fact #7

46.  One orphan gene called NCYM, which is involved in cancer, is found only in humans and chimpanzees

My Initial Reaction
I've never heard of this gene, so I wasn't sure about the veracity of this claim…until the students and I started looking into it!

Student Responses
Almost all of the student responses cited the same manuscript, Suenaga et al. (2014). Others found subsequent manuscripts citing Suenaga et al. - and it seems that National Geographic had done the same, and made the same inaccurate interpretation of the original manuscript. You see…

Suenaga et al. mention in their abstract that "The NCYM gene is evolutionally conserved only in the taxonomic group containing humans and chimpanzees." However, this does not mean that it is found only in humans and chimpanzees, for two reasons. First, the interpretation of this quote depends on which "taxonomic group containing humans and chimpanzees" is being considered. This requires a little background on the relationships between primate species:





The closest related species to humans (genus Homo) is the chimpanzee (genus Pan). As shown above, these two genera (plural of genus) are the only two members of the tribe called Hominini. Thus, Hominini is one "taxonomic group containing humans and chimpanzees." However, if we go back farther in time, to the Homininae subfamily, then the "taxonomic group containing humans and chimpanzees" also includes the gorilla. Going farther back in time, to the Hominidae, we then include the orangutan. And so on.


Put bluntly, the authors did not write "in the taxonomic group only containing humans and chimpanzees" (which would indicate Hominini). Thus, other groups of species containing humans and chimps also exist.


To this point, some students pointed out that the same study later states the following: they "identified orthologs for a probable NCYM protein in olive baboons, chimpanzees and pigmy chimpanzees. From here on, we focused on the NCYM gene of the hominidae to investigate the function of the protein product." So, they spent the rest of their analysis on Hominidae…and this group (see above) includes orangutans, gorillas, chimpanzees, and humans.


The authors just mentioned they identified orthologs (versions of the same gene) in baboons and chimpanzees. They did not say they found this gene in orangutans and gorillas (which, besides chimps and humans, are the only other members of the Hominidae). Thus, it is perfectly acceptable that they "focused on the NCYM gene of the hominidae" (which they only found evidence for in humans and chimps). This also explicitly ignores the fact that the authors themselves said they found evidence for this gene in baboons, which (as shown above) are even more distantly related to humans than gibbons!


Put bluntly, there are two take-away lessons:

1) The authors have not deceived us, but we might deceive ourselves if we don't fully understand the terminology being used by the authors! and
2) Absence of evidence is not evidence of absence. This is one of my favorite quotes. It means that there are several potential reasons why the authors didn't detect the NCYM gene in closer relatives of humans than baboons. For example, it could be that the genome sequences of closer relatives (gorillas, orangutans) weren't of sufficient quality to identify the presence of this gene by computational methods. That wouldn't mean those species don't actually have the gene in question - it would just mean that we weren't yet able to find it!




Student Decision: Fact or Fiction?
Fiction





Literature Cited

Sunday, October 27, 2019

Gene Fact #6

99.  People who carry sickle cell anemia are better protected against malaria than those who are not carriers

My Initial Reaction
I've heard this in classes I took since I was an undergraduate (i.e. decades). However, I've never looked into the primary research literature about this. In other words, I've never checked for myself whether this is true! Here is a summary of what I've heard.

Humans are diploid, which means we have two (di-) copies of each gene. A benefit of diploidy is that if one copy of a gene sustains a DNA sequence change (a mutation) that prevents it from producing a normal protein, we have a second copy of the same gene. In many cases, perhaps, that remaining copy will suffice. If a person has one normal gene and one mutant copy, and if that person is OK, then that person is called a "carrier." The carrier seems normal, but they have a mutation that could be passed on to their children - and their children could manifest the effects of that mutation.


One genetic disease that follows this pattern is sickle cell anemia. In red blood cells, the protein hemoglobin (Hb) binds oxygen, and then the red blood cells deliver that oxygen to our body's tissues. One mutation in Hb causes that protein to take on an abnormal shape (affected red blood cells look like sickles under the microscope), so it is called the HbS mutation. When a human has one normal (denoted as +) and one HbS version of hemoglobin, the normal hemoglobin works fine, and the fact that the HbS hemoglobin does not function properly has no detectable effect on oxygen delivery. This is a carrier individual. They are referred to by geneticists as being +/HbS (one + or normal version, and one HbS mutant version, with the forward slash / being used to separate the two notations).


So, there are two other possible genotypes, +/+ and HbS/HbS: a person can have two normal versions of hemoglobin (which is the common situation), or they could have two mutant versions of hemoglobin. These latter individuals are the ones who develop a genetic disease called sickle cell anemia. The lack of normal hemoglobin causes the misshapen red blood cells to clump together in the bloodstream and cause excruciating pain.


In the 1900s, scientists detected a conundrum: if sickle cell anemia is a genetic disease, then why did so many people have it? Normally, mutations that elicit a disease are removed from the population of humans through natural selection. In other words, if somebody had sickle cell anemia, we might expect either that they would die before being able to reproduce and pass the mutation on to their children or at least that an individual who had sickle cell anemia would choose not to have children, since their children would definitely be carriers (if not afflicted).
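The genotype notation above (+/+, +/HbS, HbS/HbS) lends itself to a quick Punnett-square calculation. Here is a minimal Python sketch, assuming simple Mendelian inheritance in which each parent passes on either of their two versions of the gene with equal probability. The function name cross is my own invention for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Return the expected genotype proportions among children.

    Each parent is a pair of alleles, e.g. ("+", "HbS") for a carrier.
    Assumption: each allele is transmitted with equal probability.
    """
    counts = Counter()
    for from_p1, from_p2 in product(parent1, parent2):
        # Sort so that HbS/+ and +/HbS are tallied as the same genotype.
        genotype = "/".join(sorted((from_p1, from_p2)))
        counts[genotype] += 1
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}


carrier = ("+", "HbS")
affected = ("HbS", "HbS")
unaffected = ("+", "+")

# Two carriers: 1/4 +/+, 1/2 carriers, 1/4 HbS/HbS
print(cross(carrier, carrier))

# A parent with sickle cell anemia and a +/+ partner: every child is a carrier
print(cross(affected, unaffected))
```

Under these assumptions, two carrier parents are expected to have twice as many carrier children as children with sickle cell anemia, and every child of an HbS/HbS parent inherits at least one HbS copy.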


It appears that things are slightly more complicated. Yes, +/+ individuals are healthy, and HbS/HbS individuals have sickle cell anemia. But the story of the carriers is fascinating. Not only do carriers not manifest sickle cell anemia, but also they seem to be resistant to malaria. If true, then this would explain the prevalence of sickle-cell anemia mainly in regions of the world where malaria is also prevalent: a cost-benefit relationship. Carriers have an advantage in environments where malaria is common. They also occasionally produce children with sickle-cell anemia, but they will produce more carriers than sickle-cell children, so the mutation is maintained.

Student Responses

Not surprisingly, the genetic basis of malaria resistance has been a widely-studied topic, so there are lots of research papers to read (and the students found many). The oldest paper (Allison 1954) raises the same question as above: why is the HbS mutation not being eliminated from the population by natural selection? Allison noted several previous studies that had identified a correlation between the prevalence of sickle cell anemia and malaria, but apparently the potential for a causal relationship had not been tested. In his study, humans with and without sickle cells were purposefully exposed to malaria, and Allison observed that malaria almost never occurred in individuals with sickle cell anemia, whereas it was common in those without it. The paper concludes with the extrapolation that carriers might benefit from protection against malaria.

Subsequent studies have further tested this hypothesis. For example, Ferreira et al. (2011) found the same effect in mice: that the HbS mutation confers malaria tolerance. 


Finally, many students identified review articles that debate the mechanisms by which HbS confers malaria resistance - the answer is still unclear (but see, for example, the recent study by Archer et al.).


Student Decision: Fact or Fiction?

Fact




Literature Cited

Monday, September 30, 2019

Gene Fact #5


14.  Sometimes a mistake occurs when cells divide, causing errors in the number of chromosomes a person has

My Initial Reaction

"A person" probably means that we're discussing humans. "When cells divide" means we're talking about mitosis and/or meiosis. These are the names for two processes of cell division. Mitosis is the process by which one cell divides into two. You might be familiar with this process because it produced you. When you were a just-fertilized egg (a one-cell embryo), that cell had to divide to produce two cells. And two cells divided to produce four. And, pretty soon, you were you. Meiosis also involves cell division, but doesn't have the task of producing two genetically identical daughter cells. Instead, meiosis happens to a select few of our cells: those cell divisions produce our gametes (sperm from males, eggs from females). In general, meiosis differs from mitosis because meiosis involves two serial rounds of cell division.

Either way, cell division is critical to the study of genetics, because it is essential to ensure that the cells produced by division have the same number and composition of chromosomes. For example, humans have 23 different chromosomes, and our cells typically have two of each (we get one of each of the 23 chromosomes from our father at fertilization; one each of the 23 from our mother) = 46 chromosomes. Even seemingly minor changes in the number of chromosomes we inherit can have significant effects. Individuals that do not have two copies of each chromosome are referred to as "aneuploid."* For example, if you happen to get an extra copy of chromosome 21, then you probably have Down syndrome. For most of our 23 chromosomes, having one extra (3) or one too few (1) is not even compatible with life. It could also be that a common cause of spontaneous abortions is imbalance in chromosome number in a developing fetus.

So, I already knew that errors in the way chromosomes are distributed at cell division could cause such errors. The Fact is definitely true. But, that's my opinion. I just "know" this because all of my textbooks say so. Should I trust those sources? Are they, themselves, based on data and facts? Would undergraduate genetics students find research-based sources to support this Fact? Absolutely.

Student Responses
Of all responses, I'll focus on two. One showed that a problem in the equal division of chromosomes at cell division leads to aneuploidy in one cell type (Yang et al. 2003). A more recent study (Holubcová et al. 2015) made the striking discovery that a cellular mechanism overseeing chromosome sorting into human eggs is itself prone to errors.


Student Decision: Fact or Fiction?

Fact (overwhelmingly)

Nota bene
Amazingly, a particularly relevant research paper was published days after I assigned this Fact to my students for analysis. So, I'll briefly describe what we all just learned from this new publication.

For context: many of us have probably heard that there is a positive correlation between the age of a woman and the probability that she has a child with a genetic disorder. In other words, the older a woman is, the more likely she might be to miscarry or to have a child with aneuploidy (like Down syndrome).

Well, Gruhn et al. (2019) just showed that this pattern exists not just for older women but also for young women! They describe that "aneuploidy follows a U-curve." In other words, the eggs produced by both young and old women are more likely to have chromosome imbalances, with the mid- to late-20s appearing to be the ideal time to have children with the lowest probability of this type of chromosome imbalance. Moreover, they find that problems in different cellular mechanisms result in higher rates of aneuploid eggs in young women compared to aneuploid eggs of older women. That is, the Fact is true: not only do cell division mistakes cause errors in the number of chromosomes inherited by a person, but also the rate of such mistakes depends on the age of the mother. In all fairness, I'm not aware (yet) of studies that also look at the same issue in terms of the quality of sperm production by males, but there are biological reasons to think that cell divisions in males that produce sperm would not be subject to the same issues that might face females and the production of eggs.

In sum, the ability of our cells to produce eggs (and maybe sperm, too) that contain the correct number of chromosomes is not perfect.

* the sex chromosomes (X and Y) do not meet this strict definition of aneuploidy, because male humans typically have one X and one Y chromosome (hence they have one fewer X than females, who have two X chromosomes, and they have one more Y than females, who have zero).


Literature Cited




Thursday, September 26, 2019

Gene Fact #4


65. A small percentage of Europeans are HIV resistant because of genetic mutations caused by the plague of the Middle Ages

My Initial Reaction

We're now discussing the topic of mutations in my genetics class. Mutations are not all bad, and this Fact is a great example. A mutation is just a change in DNA sequence. Sometimes these changes are good for the recipient, sometimes they make no difference, and sometimes they are bad.

I've had my ear to the ground about HIV resistance since I was a research technician at Fred Hutchinson Cancer Research Center years ago. One research group was working on HIV transmission. Another lab was interested in host-virus co-evolution. For example, mutations in human DNA can reduce the ability of viruses to invade and infect our cells. But, I was unfamiliar with the claim in this Fact. So, I was eager to see what sources my students would uncover in their research.

Student Responses
Students identified a variety of peer-reviewed research literature that supported and contradicted this Fact. Not surprisingly, there is considerable interest in the topic of HIV resistance, and there is a lot of literature to wade through. So, I'll summarize:

If we abbreviated the published Fact to "A small percentage of Europeans are HIV resistant," then students would agree this is a fact supported by research. There is a mutation, named CCR5∆32, that involves the CCR5 protein, which is present on the surface of cells that are part of the human immune system. HIV can bind to the CCR5 protein as a first step in entering and infecting these cells. The ∆32 mutation (∆, or delta, meaning a deletion of a part of the DNA that encodes the CCR5 protein) prevents HIV from being able to stick to those immune cells, thus preventing infection.

However, the remaining phrase in the Fact, "because of genetic mutations caused by the plague of the Middle Ages," is its downfall.

First, if the CCR5∆32 mutation had been "caused by the plague," then we can reason this mutation would not have existed before the plague ravaged Europe in the Middle Ages. However, at least one study (Hummel et al. 2005) found that human remains from the Bronze Age (before the Middle Ages) also contained the same CCR5∆32 mutation. These data support the counter-argument that the plague did not cause this mutation, because the existence of the mutation preceded the plague.

Speaking of the plague, it is also not entirely clear what organism(s) caused the Middle Ages plague in Europe. Debate still exists over whether that plague was bacterial or viral in nature (Zajac 2018). Some suggest that the plague might have involved an early HIV exposure of the human population (Cohn and Weaver 2006). However, if the plague was caused by bacterial infections, then it would not make sense that the CCR5∆32 mutation, which prevents viral entry into immune system cells, would be related to exposure to the plague in the Middle Ages.

Other scientists have debated the identity of the virus (if any) that might be related to the prevalence of the CCR5∆32 mutation. As an example, it could be that smallpox, and not HIV, killed Middle-Ages Europeans that did not already have the CCR5∆32 mutation (Galvani and Slatkin 2003).

Finally, some students noted a critical semantic distinction. The Fact asserts that "mutations caused by the plague" conferred resistance. This is the most damning rebuttal. Indeed, few (if any) infectious agents cause mutations like CCR5∆32. Instead, evolutionary geneticists would argue: humans that already contained DNA mutations that helped them overcome viral infection (like CCR5∆32 preventing viral invasion of immune cells) would survive challenges like a viral plague. Humans that did not have such mutations would more likely die. Thus, although a viral plague might not cause (or induce) resistance mutations in humans, the exposure of a genetically diverse human population to a viral plague could cause the mass death of humans that did not already have the rare and randomly-occurring mutation (e.g. CCR5∆32) that would have made them less susceptible to viral invasion.
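To make this selection argument concrete, here is a minimal Python sketch of the standard textbook model of selection on one gene with two versions. The survival values (0.8, 0.9, 1.0), the starting frequencies, and the function names are hypothetical choices made only for illustration. They are not estimates from any of the studies cited here, and the sketch assumes random mating and ignores new mutation and chance.

```python
def next_frequency(q, w_none, w_one, w_two):
    """Frequency of a resistance mutation after one generation of selection.

    q      -- current frequency of the mutation in the population (0 to 1)
    w_none -- relative survival of people with no copies of the mutation
    w_one  -- relative survival of people with one copy
    w_two  -- relative survival of people with two copies
    Assumption: random mating, so genotypes start in p^2 : 2pq : q^2 proportions.
    """
    p = 1.0 - q
    mean_survival = p * p * w_none + 2 * p * q * w_one + q * q * w_two
    return (q * q * w_two + p * q * w_one) / mean_survival


def simulate(q0, generations, w_none, w_one, w_two):
    """Return the list of mutation frequencies, starting at q0."""
    freqs = [q0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_none, w_one, w_two))
    return freqs


# Hypothetical epidemic: the mutation starts rare (1%) but improves survival.
rare = simulate(0.01, 20, w_none=0.8, w_one=0.9, w_two=1.0)
print([round(q, 3) for q in rare])

# If the mutation is absent to begin with, the epidemic cannot create it.
absent = simulate(0.0, 20, w_none=0.8, w_one=0.9, w_two=1.0)
print(absent[-1])
```

In this toy model, a mutation that already exists becomes more common each generation when it improves survival, while a population that starts with none stays at zero. That is the distinction between an epidemic selecting for a mutation and an epidemic causing one.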


Student Decision: Fact or Fiction?
Fiction! (but, once again, this outcome resulted mainly from the wording of the Fact. The intent of the Fact is, as has often been the case, strongly supported by research)

Literature Cited




Saturday, September 21, 2019

Gene Fact #3


23.  Genes are fragile. A variety of things can damage them, including sunlight and how cells copy them.

My Initial Reaction

Since my genetics class is dealing with DNA replication now, and will next turn our attention to the topic of mutations, this is a prime time to have students check this Fact!

I (wrongly) guessed that, once again, discussion about biology would get mired in semantics. For example, I envisioned receiving many comments from students about whether it is "genes" that are fragile or DNA that is fragile? A gene is a part of a chromosome, so it is composed of DNA - and genes are, generally speaking, no more or less fragile than non-gene regions of chromosomes.

Student Responses

To my surprise, then, students quickly identified many different sources to support the Fact. Many citations were to review papers, not primary research literature. But, if you like, you can follow the references in the reviews to get at the published data.

Representative responses include the following pair:

“Genes become 'damaged' quite often--often enough that there are regulators whose main function are to fix these damaged DNA.”

“There’s some regions in DNA synthesis that may display certain breaks or constrictions that can lead genes to be fragile.”

Many students identified manuscripts, like the review by Rastogi et al. (2010), that only begin to survey the trove of data we have on how ultraviolet light from the sun can damage DNA and cause mutations.

Some, related to the second quote, noted that DNA is not only prone to damage by light, and by chemicals, but also by our own cellular processes, even those that are meant to accurately replicate our DNA when our cells divide so that the two daughter cells resulting from cell division each have a copy of each chromosome. As Ma et al. (2012) review, there are a number of so-called "fragile sites" on our chromosomes that are (relatively) frequently observed to break during cell division. Some of these chromosome breakage events happen when the failsafe mechanisms in our cells don't work properly and allow cell division to begin before our chromosomes are finished replicating.

I was also quite impressed by other students, who (as in the first quote) identified that the presence of DNA repair machinery in our cells is proof itself that DNA is fragile, such that our cells have evolved tools (in the form of enzymes) to seek out and repair mutations and chromosomal breaks. As we'll see shortly, some of these processes aren't perfect.

Related to DNA repair, I'll add a new Fact to the existing 100: most humans have the breast cancer predisposition gene BRCA1. I even have it. You know why? Because this gene encodes a protein that is a DNA repair enzyme. It is known as a breast cancer gene because mutation of this gene (that is: loss of its normal, protective function) makes people more likely (predisposed) to develop breast cancer because their cells are less able to repair DNA damage. Thus, it is only particular (mutant) versions of this gene that are more likely to lead to breast cancer. When people are genetically tested for breast cancer susceptibility, one of the targets of testing is to check the BRCA1 gene for such mutations. Actor Angelina Jolie famously elected to have a prophylactic double mastectomy after she learned that she had such a mutation.



Student Decision: Fact or Fiction?
This one was overwhelmingly fact.


Literature Cited




Monday, September 16, 2019

Gene Fact #2


74. Each strand of DNA replicates independently of every other strand

My Initial Reaction

This seems like a pretty factual-sounding statement. At this point in the semester, we're learning about DNA replication, so this is an obviously relevant fact to check. I assigned this at the end of my class where I discussed the Meselson-Stahl experiment. This was the elegant study showing that DNA replicates semi-conservatively: the two strands of the DNA double-helix each serve as the template for the construction of a new second strand, which duplicates the double-helix. I figured that the students would just cite this paper and we'd be done. But, as always, the students think outside the box!

Student Responses

The responses were more mixed than I had expected. Many students focused on the meaning of "independently" in the Fact. A representative response reads, "When DNA is being replicated the two individual strands are separated into what is called a replication fork and although both of the individual strands are being replicated separately by two separate, but similar, processes they occur simultaneously. This means that there is not one single strand of DNA that is replicated completely independently."

An example of a student that did agree with the Fact, as written, cites a line from the abstract of Graham et al. (2017), "Using real-time single-molecule analysis, we establish that leading- and lagging-strand DNA polymerases function independently within a single replisome."

Student Decision: Fact or Fiction?
I didn't even hold a vote. No raising of hands, electronic ballots, or tallying. The vast majority of students were hung up on the semantics of the way the Fact was stated.

However, the literature does support the essential genetic concept underlying the Fact: each strand serves as its own template for replication. In other words, we know that DNA replication does not occur by an alternative (hypothetical) mechanism, "conservative DNA replication," in which both strands would be duplicated at the same time by the same single enzyme.

Literature Cited



Sunday, September 8, 2019

Gene Fact #1

As it turns out, my genetics class was just starting to discuss genomes and genomics when this fact-checking project began, and the first fact presented in "A User's Guide: Your Genes - 100 Things You Never Knew" (a Time Inc. Specials publication, by National Geographic), happened to be relevant to that topic:

1. We humans share 99% of our genes with chimpanzees and bonobos

Initial Reaction

I knew that this putative Fact would be tricky to ask students to work on, because it combines an easily vetted numerical value with two subtle, perhaps trivial, issues of definition:

  • what does "share" mean?
  • what definition of "gene" should we use?

Also, I anticipated that this topic could be rejected automatically (without investigation or critical thinking) by creationists or others who don't want to consider whether we humans are most closely related to non-human primate species like chimpanzees.

Initial Student Responses

Of all of the ~60 written summaries by students:

A few cited a review paper by Khodosevich, Lebedev and Sverdlov, "Endogenous retroviruses and convergent evolution" (2002) Comp Funct Genom, which states in the second sentence of the introduction, "The average DNA sequence difference between human and chimpanzee is only 1.24% [7] and probably only 0.5% in active coding regions [9]," but some students didn't even make it that far, stopping at the first sentence of the abstract, "Humans share about 99% of their genomic DNA with chimpanzees and bonobos."

Another review, by Varki and Altheide, "Comparing the human and chimpanzee genomes: Searching for needles in a haystack" (2005) Genome Research, was often-cited to refute the Fact, perhaps because its abstract states, "The difference between the two genomes is actually not ∼1%, but ∼4%—comprising ∼35 million single nucleotide differences and ∼90 Mb of insertions and deletions."

Britten (2002) PNAS says it all in the title, "Divergence between samples of chimpanzee and human DNA sequences is 5%, counting indels."

In Ebersberger et al. (2002) "Genomewide Comparison of DNA Sequences between Humans and Chimpanzees" Am J Hum Genet., the authors studied about 1.9 million nucleotides and found an average sequence difference of 1.24%.

Most cited Prüfer et al. "The bonobo genome compared with the chimpanzee and human genomes" (2012) Nature, in which the authors state that, in the single-copy autosomal regions they analyzed, the average percent identity between human and bonobo genomes is 98.7%, with bonobo-chimpanzee percent identity at 99.6%.

My Feedback to the Students

If you'd like to hear (and see the slides I used) our in-class discussion on this process, please visit this link to YouTube

Much student discussion focused on whether it was reasonable to round 98.7% (e.g. from Prüfer et al.) up to 99% (as stated in the Fact). Although this is a relatively trivial point, whether it is critical for concluding that a statement is fact is not an easy conclusion to draw!

I raised one concern with using the Khodosevich paper as evidence supporting the Fact. This is a review paper, meaning that it didn't technically adhere to my requirement of a primary research article (such as Prüfer et al.) that would be the original source of data that could be used to support or refute a conclusion. If students had found, read, and directly cited reference [7] from the above quote from Khodosevich, then that would be more appropriate. This assumes, of course, that reference 7 really does provide evidence that supports the Fact. The practice of citing literature based only on its title, or perhaps after only reading the abstract, does occur in science. Hence, some practical skepticism is warranted! Don't blindly believe that a citation supports an author's position until you read the citation.

I also offered students two technical, genetics-related viewpoints.

The Fact, as stated, discusses the percentage of genes shared between the species. However, most of the students cited research that was not tallying genes shared in common, but rather nucleotides (the letters comprising a chromosome's sequence) that are shared in common, and it is not necessarily an obvious logical step to conclude that 98.7 percent identity at the DNA sequence level would result in 98.7% (or 99%, with rounding) of genes being shared between humans and chimpanzees.

Second, many of the genomics studies that students cited analyzed different regions of chromosomes. Some looked only at single-copy regions of chromosomes. These DNA sequences are easy to locate the identical (homologous) chimpanzee version of, so the two sequences can be directly compared to find percent identity.

ATACATAG (Human)
ATAGATAG (Chimp)
Of these eight nucleotides (which I invented for the purpose of this example), only one is different between humans and chimps, so 7/8 are identical (87.5% identity). This was the type of approach used by Ebersberger et al. and by Prüfer et al.

Other chromosomal differences exist, like insertions and deletions (indels), and duplications of large (or small) stretches of DNA. Scientists can interpret these sorts of differences in various ways.

ATAC-----ATAG (Human)
ATAGGGGGGATAG (Chimp)

Here, in the middle of the same eight nucleotides as the first example, there is a five-nucleotide insertion of G in chimpanzee (or, just as likely, a five-nucleotide deletion of G in human). If scientists align the two sequences like this, then they might only analyze the alignable sequences, which would result in the same calculation as above: 7 of 8 alignable nucleotides are identical, so 87.5% identity. However, if the structural variants (like indels) are included in the calculation, then above there are seven of thirteen total nucleotides that match (53.8% identity). This latter approach was employed by Varki and Altheide and also by Britten. Notably, Britten found 1.4% divergence in alignable nucleotides, with an additional 3.4% divergence based on indels. The sum is thus 4.8% (which was rounded to 5% in the manuscript title - probably not deceptively, but also not accurately!)
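The two ways of scoring the alignment above can be sketched in a few lines of Python. This is only an illustration using the sequences I invented for this post. The function name percent_identity and its count_gaps option are my own, and real genome comparisons rely on dedicated alignment software and more careful scoring than this.

```python
def percent_identity(seq_a, seq_b, count_gaps=False, gap="-"):
    """Percent identity between two already-aligned sequences.

    count_gaps=False: skip any position where either sequence has a gap
                      (only alignable nucleotides are compared).
    count_gaps=True:  count every gap position as a mismatch
                      (indels are included in the calculation).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            if count_gaps:
                compared += 1  # a gap position can never be a match
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no positions to compare")
    return 100.0 * matches / compared


human = "ATAC-----ATAG"
chimp = "ATAGGGGGGATAG"

print(percent_identity(human, chimp))                              # 87.5
print(round(percent_identity(human, chimp, count_gaps=True), 1))   # 53.8
```

The same pair of sequences yields 87.5% or 53.8% identity depending only on how the gap positions are treated.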

Thus, different studies can reach different conclusions because of the analysis methods used. And, of course, those subtle differences don't make it into the paper titles (and sometimes not into the abstracts) - and they definitely don't trickle down into popular science and media coverage of these types of studies!

Comparing research studies that arrive at apparently different conclusions

I made two final points for the students. I suggested that they might consider how much DNA sequence was analyzed in each study. This could be used to decide the relative importance of each study when arriving at a conclusion about which of the various calculations of human-chimp percent divergence (here we've seen citations reporting 0.5, 1, 1.24, 4, and 5%) is perhaps most accurate. Britten's analysis of 779,000 nucleotides resulted in a conclusion of 5% divergence, while Ebersberger et al. analyzed 1,900,000 nucleotides and found 1.24% divergence. 

I also suggested that they might consider the ages of the studies. It might be important, if you really want to be sure of your facts, that you not cite an old study that might have been conducted with perhaps less precise methods than we have at present, or that was since corrected in more recent studies. This is not to say that old studies using old approaches or tools are necessarily flawed and inherently worse than more current research, but some people (I understand) still think the earth is flat, citing really old literature, despite a plethora of more recent work that has pretty much eroded confidence in the flat-earth stance. When performing a literature review, it is best practice to read studies spanning the time when a particular topic has been researched. I know that my audience here doesn't necessarily have that sort of time, which is why I hope that this information literacy project will fill that need!

Student Decision: Fact or Fiction?

Ultimately, 30 of 51 students (58.8%) agreed that it is a fact that "We humans share 99% of our genes with chimpanzees and bonobos."

From my perspective, it seems like the top two aspects that caused less-than-overwhelming support for this Fact, as stated, were that

  • the specific value of 99% did not appear in research literature
  • the studies students found did not assess shared genes (as in the Fact) but instead shared DNA sequence

In other words, I think that this Fact didn't garner more support because of rounding and because the Fact, as written, misconstrues the actual basis of the research supporting it.

What resources did students use to find supporting research literature?

The week before this information literacy project began, I had started showing students in class how to use the NCBI PubMed literature database to find research publications. So, I also asked students to report how they identified the research literature they cited for this first fact-checking assignment.

29% PubMed
17% Google Scholar
29% Other form of search via web browser
2% EasyBib
2% Public Library of Science website
3% EBSCO

and, to my delight and surprise:

19% used some form of resource at the Henry Madden Library on our campus. I only teach upper-division and graduate classes, so I'm not familiar with how much exposure students get to using our library resources. I'm glad they're taking advantage of what our library has to offer!
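
In the spirit of fact-checking, the shares reported above can be given a quick sanity check. This is a minimal sketch in Python; the dictionary labels are shortened versions of the categories listed above. A total slightly different from 100% could come from rounding each share independently, or from students reporting more than one resource; the figures alone don't say which.

```python
# Shares reported above, in percent (already rounded).
shares = {
    "PubMed": 29,
    "Google Scholar": 17,
    "Other web search": 29,
    "EasyBib": 2,
    "Public Library of Science website": 2,
    "EBSCO": 3,
    "Henry Madden Library": 19,
}

total = sum(shares.values())
print(f"Reported shares sum to {total}%")
```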

Literature Cited




Fact-checking "100 Things You Never Knew About Your Genes"

If my goals are
  • to produce a curated bibliography for 39 of the DNA facts
  • to help demonstrate how to be information literate
then I owe it to readers to be transparent about the process. I think this is an important way to model information literacy: explain the process of fact-checking, particularly how decisions are made about how the copious information reviewed ultimately gets whittled down and passed on to the audience in digestible pieces.

Remember, this process is arguably the reason that scientific research conclusions get blown out of proportion or misconstrued, because the goals of media are to outcompete other outlets for audience and to give the audience information in a format that the audience wants. And, more and more, that means quick bites of information that necessarily exclude important details for understanding the assumptions, meaning, value and/or impact of a research study.

For example, if you only read the title of Beall and Tracy's 2013 paper in Psychological Science, "Women are more likely to wear red or pink at peak fertility," you might take this claim at face value. However, this study fails critical thinking and good information literacy practices in many ways, including that the title does not specify which women. You have to delve into the research paper itself to learn how many women were surveyed, and where in the world they live. This is where critical thinking is important. When I see the title of this study, I start thinking of questions like, "What do they mean by 'more likely' - how much more likely than women wearing other colors?" "Where in the world did these women come from - might there be a cultural bias in what colors are worn?" "When was the study conducted: was it around Christmastime, when it might be more likely that women are wearing red?" "What control experiment was conducted - what about men, for example? Or women of ages outside of the fertility range?" And, because I'm red-green color-blind, another really important question to ask: "Who decided what shades and hues count as red or pink? Did women self-report, or did the researchers take photographs of the clothing and objectively define and measure both red and pink?"

So, this is how the fact-checking will work in my class:

When we begin to study a topic that is relevant to a group of DNA facts, then I'll present that group to the students and ask each student to select one to scrutinize. As we study that topic, the students will be introduced to concepts and vocabulary that might be important for them to understand the research literature they'll access for fact-checking their chosen claim. By the end of the topic, each student finds one primary peer-reviewed research manuscript that contains evidence either supporting or refuting the fact. Using that single-source information, each student writes and sends me a short (~1-2 paragraph) summary of the information, along with the citation to the primary literature.

Then, on my end, I compile and read through the student summaries as well as many of the commonly-cited research studies. I look for common misconceptions in how some of my students might have misinterpreted data from the studies, and I identify whether multiple sources tend to agree or not on the fact. So that I can give my students feedback on their work (which they will hopefully use to improve their information literacy and critical thinking skills), I give a short in-class presentation the day after their summaries are due, which includes:
  • summarizing various student perspectives about whether the fact is supported or not
  • addressing misconceptions related to genetics concepts that were evident in the written summaries
  • highlighting strengths and weaknesses related to the credibility of sources that were used
Finally, after discussion, I put the matter to a yes or no vote:

"Based on the information found in published research literature, is it reasonable for this claimed fact to be called a fact?"

Saturday, September 7, 2019

4. The maiden voyage project

I've thought for many years about teaching a course (like Calling Bull) or starting an institute (like the Institute for Media and Public Trust) on information literacy and critical thinking. I already incorporate these elements into all of the classes that I teach. Then, I came across a very opportune way for me to more visibly help people practice skepticism while improving their information literacy and critical thinking skills.

I'm teaching an undergraduate genetics class this term, as I often do. Last week, when I was at the grocery store waiting to check out, I saw a National Geographic publication called, "Your Genes: A User's Guide. 100 things you never knew." For $14.99 (plus tax), I took home a copy. After all, I figured, I should probably learn the 100 things before going back to class the following day.


I looked through the 100 facts, and made a not entirely surprising discovery: I was familiar with many, but there were some I didn't know. Most surprisingly to me, none of the facts were presented with citations to the original research. At the back of the issue, a page is devoted to all of the photo credits throughout the magazine, but there were no references to the facts! So, we have an issue (pun intended): a newsstand piece is offering facts about genetics, but it doesn't support information literacy because the reader has no easy way to independently verify the veracity of the claims!

My mind was immediately made up: I'd ask my genetics students to participate in an assignment to create a bibliography of credible scientific literature to support or refute each of the facts. I usually have students practice using mobile devices in the same way that scientists do (e.g. looking up genetic information online - using trusted sources, of course!), so this assignment supports one of my student learning outcomes, "Efficiently find and use quality information relevant to genetics." I should also mention that this project is feasible because I teach a class in which all of my students are supported by my university in having a mobile device for in-class use if they do not already own one: the DISCOVERe Mobile Technology Program. One goal of this program is for students to master digital literacy skills.

Before starting, I wanted to know if some facts would be really difficult for my students to fact-check. So, I did what I'm good at: I made another spreadsheet. I typed out the 100 facts, and then I categorized them:

  • 81 of the facts relate to topics I teach about in my genetics course
  • 42 of those 81 immediately fail fact-checking for a number of reasons. Here is an example or two from several categories:

Semantics

#33.  "The bacterium E. coli can replicate 1,000 nucleotides per second." The real problem is the word "can." To the critical thinker, this could imply that in some trivial, artificial scenario, a bacterium is technically capable of such an act. If so, then would this be a fact? I suppose it would adhere to the technical definition of "fact," but would it be relevant to genetics in the sense of a valuable fact to know about biology and how cells work?

#5.  "All of us get three feet (1 m) of DNA from our father and three feet from our mother" At the outset, there are two potential problems with fact-checking this. First, one of the key warning flags for spotting untruths is the absolute, like "all of us." In biology, there is almost always an exception to every rule, so in this situation, it is almost certain that an exception to this purported fact could be found. Also, a finer point: DNA is a molecule, and its length, like a rope, depends in part on how much it is stretched. This number, as presented, probably assumes some common knowledge of the chemical structure of DNA, but without that assumption explicitly accompanying this fact, there is no clear way forward for fact-checking it.

Ambiguity: vague or imprecise wording

#8.  "More than 800 genes are involved in cell division." This fact seems likely true, but "more than 800" is vague. Additionally, "involved in cell division" is also imprecise. One could argue that DNA polymerase, the enzyme that replicates DNA, is involved in cell division because without it, cells wouldn't contain the 799 other genes otherwise involved in cell division. It is a slippery slope, and so the way this fact is worded makes it impossible to fact-check, because the answer depends on who is defining what "involved in cell division" means.

#85.  "Although James Watson and Francis Crick used x-ray data to visualize DNA, the real proof came in 1982 when the B-form of the molecule – the right-handed side of the helix – was crystallized." The real proof of what? We can't fact-check a claim when we don't know what we're trying to find evidence for.

Opinion

#76.  "Rosalind Franklin’s x-ray photographs of crystallized DNA fibers are hailed as the 'most beautiful x-ray photographs of any substance ever taken.'" We could probably find a source for who said this, and maybe it is a fact that this opinion was expressed, but that's irrelevant to genetics.

#62.  "The most useful stem cells come from human embryos." Most useful to whom and for what?

#31.  "Many scientists believe antisocial behavior is a function of genetics, as multiple genes work together." What "many scientists" (how many? a tiny minority?) have concluded is relatively difficult to fact-check, but more critically, a factual statement cannot contain the word "believe." What scientists believe (think) is not relevant to fact (what we know).

Definition (no real need to fact-check)

#53. "The entire genome of an organism is found in a donor, or somatic, cell." This isn't quite true, but (almost) every somatic cell does have a nucleus and so does contain the entire genome of the organism.

#39.  "Each of us has 22 nonsex chromosomes." This is true, by definition: humans have 22 nonsex chromosomes and then the 23rd (the X and Y, or sex, chromosomes). Whether a definition constitutes a fact is debatable, I suppose?

Predictive or otherwise not-fact-checkable

#79.  "In less than 10 years, scientists will be able to sequence an entire genome in just a few hours." I guess we'll know in, at most, ten years! This also, like many of the facts, falls into multiple categories, like "Ambiguous." We can already sequence the entire genome of many species in just a few hours; presumably this fact is meant to be about sequencing a human-sized genome…

#13.  "About 1 percent of the total DNA carries instructions to make proteins; the rest is so-called junk DNA." The first clause might be true (but, like #79, it is ambiguous: "the total DNA" of what organism?). Also, the second clause is arguable. Some used to call the non-protein-coding part of our genome "junk DNA," but that term has fallen out of fashion. We now better appreciate that sections of our DNA that don't contain genes still can have important functions. So, on one hand, many scientists would disagree that this is a fact, but the existence of the few who might still refer to it as "junk DNA" means that, technically (the way it is written), this is an indisputable fact.

Unimportant facts or definitions to be aware of (granted, "unimportant" is my opinion)

#98. "Aboard the International Space Station is a digitized copy of Stephen Hawking’s genome, along with that of Stephen Colbert and others." It is called the Immortality Drive - look it up on Wikipedia

#45. "'Quaternary marriages' are when identical twins marry identical twins."

The remaining 39 of the 81 facts that are relevant to my genetics class are definitely in need of fact-checking, and my students and I will meet that challenge! Our findings will be shared here.
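
The tally above is simple arithmetic, but it is the kind of thing worth double-checking before sharing. A minimal sketch in Python, using only the counts stated in this post (the variable names are mine):

```python
total_facts = 100
relevant = 81          # facts related to topics in the genetics course
fail_immediately = 42  # relevant facts that fail fact-checking as worded

not_relevant = total_facts - relevant
to_fact_check = relevant - fail_immediately

print(f"{not_relevant} facts fall outside the course topics")
print(f"{to_fact_check} facts remain to be fact-checked")
```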

3. About the Author

Dr. Joseph Ross
As I'm espousing skepticism, I would be remiss not to point out that you should first be skeptical about who I am!

I am an extremely objective, evidence-based decision-maker. I am a scientist. I'm about as analytical as a person can get. You should marvel in wonder at the spreadsheets that I generate when I'm shopping for auto loans.

A formative time when I was a child was around my late elementary school years, when I got obsessed with categorizing things. I think this story might resonate with many scientists. First, it was fruits and vegetables. My father being a botanist, I knew from an early age the key difference between fruits and vegetables, but as I got older, I realized that there were so many fruits and vegetables! So I started making lists. First on paper, but then I wanted to alphabetize the lists so I could more easily tell if I had already read (usually in an encyclopedia) about that food before. So, I moved to a spreadsheet format. And then I realized that herbs and spices are also plant products, but not necessarily classified as fruits or vegetables, and that spawned another spreadsheet or two.

And then I moved on to geography and, in my case, seas. Mind you, this was all before Wikipedia, so, to my knowledge, I was the only person poring over atlases and maps of all kinds and trying to make a comprehensive list of all of the seas. The inland ones (like the Salton Sea), the big ones (Mediterranean), and the obscure ones (like the Moluccan Sea). A huge resource for this project was the collection of National Geographic maps. Every so often, an issue of National Geographic (which my parents had subscribed to for decades - there were piles of them all over the place) came with a separate foldable map of some part of the world - either a country or a region with a few countries. I kept these in a huge stack in a cupboard and regularly pored over them. And then, of course, I had to know how seas and bays and coves and oceans and other bodies of water are defined. And then I realized that the definitions of these were all relatively arbitrary, so I went on a permanent project hiatus.

So, in some ways, I think I ultimately became a practicing geneticist partly because I'm intrigued at least a little bit by definitions and also by the real impact on effective communication of developing and sharing precise definitions.

After that, I earned a B.A. degree in Biochemistry, then a Ph.D. in Molecular and Cell Biology, and since then have practiced genetics and molecular biology, running an academic research group with graduate and undergraduate students and teaching classes in, among other topics, genetics. My trainees and I have published in reputable scientific journals. Through these efforts, my objective-compulsive perspective has been even more finely honed as I've been increasingly tasked with helping others learn how to be productive scientists.

None of this means you should trust me or my opinions. But, at least now you know a little bit about me. And, in my opinion, one of the most important things we can do to battle misinformation is to get to know people well. It is one thing to read things that somebody posts on social media, but if you've never met that person (if it is a person!) and gotten to understand them, their proclivities, habits, and motivations, then you might want to place less weight on what they say.

2. Objectives and Outcomes

On this blog, I have one clear objective and one associated outcome:

Objective
To improve information literacy and critical thinking skills (hopefully in an engaging way)

"Information literacy" and "critical thinking" are, and have been, educational buzzwords for years. With the expansion of the reach of the internet, though, efforts to improve these skills in as many people as possible have seemed even more important of late. There are many variations of the definitions of each of these terms.

In my vernacular, information literacy includes (but is not limited to) the ability of a person to identify how credible a source is, and thus also to know multiple ways that sources can come to be trusted. I trust an information literate person to have developed a rigorous threshold for themselves that they would be able to articulate. So, if I wanted to have a productive conversation with that person about whether or not a piece of information was true, we could at least understand each other's standard of truth. Thus, perhaps, we could arrive at an "agree to disagree" conclusion instead of debating pointlessly for hours without real hope of convincing the other that one's own conclusion is more accurate.

For example, I think an information literate person would use a web search engine to research a particular fact, but they would prioritize efforts looking for peer-reviewed research articles published in scholarly journals instead of an unfamiliar news website (or social media post) that provides no references to the sources for the fact. Other aspects of the source might also be scrutinized, like how old (and potentially out-of-date) the source is, and the extent of independent agreement on the fact.

To me, a critical thinker is always dubious of a claim and performs further evaluations and fact-checking before reaching a conclusion. One aspect of critical thinking is the urge to ask, "does this fact make sense?" In some situations, a claim might seem reasonable at face value, but the skeptic will delve deeper. A potentially poor example, but a favorite of mine, involves foods that are labeled "99% Fat Free." Of course the manufacturer wanted to put the words "Fat Free" on the label, but is the product? Of course not. 99% fat free is the same as printing "1% fat," which might not sell as well.

Another facet of critical thinking, which can be difficult to develop, is the ability to initially reject a proposed hypothesis or explanation and to come up with an alternate explanation that you think explains an observation or fact just as well (or better).

Another potential aspect at the intersection of critical thinking and information literacy is considering the potential motivations of the source(s) of the information. In my domain, that can include inquiring about who funded a particular research project, to look for potential financial conflicts of interest.

So, as we explore facts, I will emphasize the warning signs to watch for that might signal innocent or deceptive information, as well as techniques and methods for supporting or refuting that initial skeptical stance.

Outcome
The main product of each post here will be a curated list of references that, together, either support or refute a published piece of information.

1. Don't believe everything you read

There are both innocuous and also nefarious reasons that you might encounter untruths on the internet and in other media outlets. Some lies might be nefarious; some are inadvertently misleading. The latter is often the case with science reporting.

Scientific topics are frequently in the news (ancestry testing, disease risk, global warming and climate change and their potential impacts on species extinctions, and so on). Yet, many journalists are not trained as scientists, and many scientists are not trained in effective communication to non-specialist audiences (i.e., the public).

These two parties involved in the assembly of public-facing reports of scientific discoveries also serve different stakeholders and have different motives. For publicity, career advancement, and self-esteem purposes (among others), a scientist and a journalist want their work to be as publicized and disseminated as possible. This pressure, called a conflict of interest, could provide temptation to oversimplify and/or overstate the conclusions, relevance, and potential future applications of even the most incremental of advances in science.

As a practicing scientist, I know, because I've been there. In fact, for full disclosure, I'm spending my personal time right now beginning this blog for selfish reasons. If I'm successful in attracting you to read more, then this effort might help me gain visibility among colleagues and peers whose opinions I value. Maybe I'll be asked to travel to hold workshops or present seminars on improving information literacy. Maybe I'll get a book deal, or be on TV! Ahh, the allure of the potential for fame.

More seriously, I'm investing my time in this project because I think that what I'm about to tell you is inherently interesting and important. I'm interested in helping develop an informed and skeptical populace because I think that society functions best when everybody is able to make decisions based on truths and not just on what the loudest voices in the room are saying. And this benefits us all.

Thus, while here, I encourage you to practice skepticism: don't believe anything you read unless you are convinced by factual statements supported by objective evidence. The goal here is to objectively analyze and either support or debunk information that has been presented (innocently or deceptively) as fact.