
What about 23andMe's SNP test gives it such bad efficacy as a diagnostic tool?


The recent news about the FDA stopping the Google-backed 23andMe service from selling any more kits got me thinking. I understand the company may have been marketing it as a medical tool prematurely, but what is it about SNP analysis that makes it so poor at predicting disease predisposition? Even other SNP companies have similar problems with accuracy and can't agree on disease predisposition (Tonellato et al., 2011). Why might their predictions differ from 23andMe's?


The long and short of it is that genetic variation is actually not very predictive in comparison to "environmental" effects such as lifestyle.

Only a quarter of the variation in lifespan between twins is attributable to inherited factors (including genetics and epigenetics) [1] - the rest is environmental, from lifestyle to air quality.

Most genetic variants (such as those measured on the chip used by 23andMe) have a very small impact on outcomes such as "Risk of disease X, Y or Z". Many risk loci have been identified, but without a full understanding of what this risk means, the results of the 23andMe analysis can be misleading.

A few rare genetic variants have a larger impact on diseases (such as early-onset diabetes), but these fall under the "monogenic" category of genetic variation, rather than the common variants typically observed in large populations.

Basically, genetic risk scores are, at the moment, only indicative of an individual's risk, and to get the real picture a lot more information is required. For instance, smoking is a better predictor of early mortality than any "common" (>=5% frequency in a population) genetic variant; likewise, a high-fat diet is more indicative of cardiovascular disease.

  1. Skytthe A, Pedersen NL, Kaprio J, et al. Longevity studies in GenomEUtwin. Twin Res. 2003;6(5):448-454.

Collecting Thoughts on the FDA vs. 23andMe

I’ve been reading many of the news articles and blogposts about the 23andMe / FDA controversy, and marking interesting points, agreements and disagreements, various perspectives. These are some of my favorite bits from the pieces that have come out so far. I have a distinct personal bias regarding this extremely complicated topic, but also have good friends on “the other side,” as well as having a professional commitment to attempt to be unbiased. In this collection of quotes and highlights, I try to not let my own bias interfere too much, and to fairly represent opinions both pro-FDA and pro-23andMe. Please note that, on both sides, there are extremists, who I found typically didn’t clearly express the complexity of the issues, thus most of the quotes excerpted below are from more moderate authors.


Thoughts on the FDA

“The Food and Drug Administration’s recent directive to the company 23andMe to stop marketing its genetic tests directly to consumers is a shortsighted, heavy-handed, double-standard act of paternalism.”

“I am deeply frustrated by the simplistic narrative of OMG FDA BIG GUBBERMINT SILENCING DARING ENTREPRENEUR. It’s not that simple.”

Wilbanks, John. FDA’s Culture Is Mendelian Dominant Over 23andme’s Business Model. DEL-FI November 30, 2013. http://del-fi.org/post/68560843111/fdas-culture-is-mendelian-dominant-over-23andmes

“The outrage over the FDA’s treatment of 23andMe is the wrong response. We should be holding 23andMe accountable for the claims they make in marketing their product. Even a product with such great potential should have to support its claims with valid evidence.”

Curtiss, Chase. Here’s what health entrepreneurs can learn from 23andMe. VentureBeat November 29, 2013 5:11 PM. http://venturebeat.com/2013/11/29/heres-what-health-entrepreneurs-can-learn-from-23andme/

“This incident highlights the tension between the paternalistic medical establishment that arose to deal with the dangers of 19th-century quack medicine, and a “techno-populist” element of American society pioneering personal health assessment and decision-making by leveraging new information technologies.”

“If the F.D.A. indeed insists on making 23andMe prove beyond doubt the validity of every single correlation, no genetic-testing service will be able to economically deliver medically relevant genetic information directly to consumers. It will destroy the industry and leave medical genetics in the hands of a medical establishment that has already failed to give people an easy way to obtain and use the elemental information in their own spit.”

“Years ago, the FDA used the same argument against selling blood glucose meters to patients with diabetes. I believe that the FDA is wrong in saying that data in the hands of patients will do damage.”

Bartlett, Ann. Genetic Testing and the FDA. Health Central Wednesday, November 27, 2013. http://www.healthcentral.com/diabetes/c/9993/164554/genetic-testing-fda?ap=2008

“FDA acted properly in view of 23andMe’s cavalier attitude toward its regulatory obligations and its failure to meet past commitments. However, it would be a setback for science if 23andMe were not allowed to proceed. For its research model to deliver, it needs more people, far more people in its database. A campaign to sign up a million customers is a good start, and not losing momentum is essential. Perhaps 23andMe and FDA can find an accommodation — such as a consent order — that allows 23andMe to move forward while catching up on its overdue obligations, under threat of financial penalties or even perhaps the licensing of its database.”

“It reads like the letter of a jilted lover,” Misha Angrist, a former genetic counsellor who writes about personal genomics and teaches at Duke University, said. “ ‘We went on fourteen dates! We exchanged all these e-mails! We held hands in the park! Now you’re telling me, “F*** you,” and kicking me to the curb.’ ”

“Though the FDA talks up progress, there is a risk that it may slow it down. The agency is weighing regulations on test kits sold directly to consumers, laboratory tests and software that analyses raw genetic data. It is clear that 23andMe is not the only testing firm in its sights.”

“Is the FDA and the rest of the medical establishment too conservative about innovation and health data that consumers can get directly? Well… is the Pope Catholic?”

MacManus, Richard. Thoughts On 23andMe & The FDA. November 27, 2013. http://ricm.ac/2013/11/27/thoughts-on-23andme-the-fda/

“I asked Dr. Hamburg if she were to have any power that FDA currently lacks, what would it be? The central thesis of her reply was: “I also think we need to find a way – maybe it’s just completely unrealistic – where we can have more flexibility in the system so that every time there’s a crisis or a recognition of a need to do more…we don’t have to go through the process of seeking new legislation.””


Thoughts about 23andMe

“Either 23andMe is deliberately trying to force a battle with the FDA, which I think would potentially win points for the movement the company represents but kill the company itself, or it is simply guilty of the single dumbest regulatory strategy I have seen in 13 years of covering the Food and Drug Administration.”

“On the Twitterz, I wrote that 23andMe’s attorneys should be disbarred for letting things reach this point. Interestingly, it appears that General Counsel left the company several weeks ago (and no replacement has been found). I’m not always a big fan of the FDA (they still haven’t really figured out how to approve new antibiotics), but the reality is that the FDA is like those humongous tractors used to move space rockets: they’re slow, but crush everything in their path. You can’t bullshit these guys–they just keep coming.”

Mike. Some Thoughts on the FDA Action Against 23andMe.com. Mike the Mad Biologist November 26, 2013. http://mikethemadbiologist.com/2013/11/26/some-thoughts-on-the-fda-action-against-23andme-com/

“The consequences of mistakes by 23andMe can be deadly serious. If it reports a “false positive” for a major disease, that can alter someone’s whole life (though I’m rather sure that any medical professional would obtain results from another service to confirm positive results).”

“But as the FDA frets about the accuracy of 23andMe’s tests, it is missing their true function, and consequently the agency has no clue about the real dangers they pose. The Personal Genome Service isn’t primarily intended to be a medical device. It is a mechanism meant to be a front end for a massive information-gathering operation against an unwitting public.”

Seife, Charles. 23andMe Is Terrifying, But Not for the Reasons the FDA Thinks. Scientific American November 27, 2013. http://www.scientificamerican.com/article.cfm?id=23andme-is-terrifying-but-not-for-reasons-fda

“Some experts claim that the risk of ailments like Type 2 diabetes can only be partially calculated based on genetic information. A credible diagnosis would require understanding a lot more about the person’s lifestyle and health history. Promoting a do-it-yourself culture when talking about serious health concerns like cancer and heart disease might have major downsides. However, the opinion in favor of 23andMe is that a dangerous double standard is at work. Today a doctor can sell many types of genetic tests to a patient, at a much higher cost. Only a few have received FDA approval. So why penalize the direct-to-consumer model?”

Kaushik, Preetam. FDA vs. 23andMe: The DNA of a disagreement. All Voices Nov 29, 2013 at 8:35 PM PST. http://www.allvoices.com/contributed-news/16062954-fda-vs-23andme-the-dna-of-a-disagreement

“However, according to one expert, the accuracy of the test is not the biggest issue. The company’s testing methods have been found to meet federal standards for lab testing, called Clinical Laboratory Improvement Amendments (CLIA), said Amy Sturm, a genetic counselor at The Ohio State University Wexner Medical Center. A greater problem is that the results provide “a very incomplete view” of a person’s risk for a given disease, Sturm said.”

“Unfortunately, due to all the complex interactions between the markers, this full unravelling is impossible. The number of interactions is probably so high that every patient will have his or her own unique complex cause of disease. And what has never happened cannot be identified or predicted by big data. Advances in genome science will improve what tests offer, but these improvements will be small. While the hope is based on big data, the reality is that most diseases are simply not genetic enough. Other risk factors such as diet, body weight, smoking, exercise and stress are too important. And big data cannot change the biology of diseases – it will not make them more genetic.”

Janssens, Cecile. It is game over for 23andMe, and rightly so. Pando Daily November 27, 2013. http://pando.com/2013/11/27/it-is-game-over-for-23andme-and-rightly-so/

“It’s not all 23andMe’s fault. In my book research, I’ve read a lot about personal genomics. And the more I read up on genetics, epigenetics, etc., the more I see that the scientific community still has very little clue about what actually causes disease.”

MacManus, Richard. Thoughts On 23andMe & The FDA. November 27, 2013. http://ricm.ac/2013/11/27/thoughts-on-23andme-the-fda/

“THAT is the future. The DTC SNP chip era is ending. 23andMe’s two main competitors, Navigenics and deCODEme, already left the market. 23andMe’s SNP chips are slipping rapidly into the past. And 23andMe knows it.”

“As to what will happen now is actually very simple. 23&Me will have to either file as a diagnostic and go through medical approval (which will cost millions of dollars) or start to only offer their service through a medical practitioner, more than likely through a clinical geneticist that can walk a patient through the intricacies of genetics and disease. The problem though is that according to the US News, there are only 358 clinical geneticists currently practicing in the United States. Measure the knowledge dissemination of a clinical geneticist, who usually only sees patients through referral, to the marketing power of a Google backed company like 23&Me, that was gearing up for a television marketing campaign: the outcome is a nail in the proverbial coffin for recreational genomics. As for 23&Me, its problems will only worsen as it will struggle to validate itself through the FDA since genomics is still in its infancy, thus: RIP the only viable, scalable consumer genetics company in the world.”

Pablo, Juan. What is next for Direct to Consumer Genetics. 1EQ Nov. 27, 2013. http://1eq.me/blog/?p=182

“We’re ignoring the bigger issue! The real reason 23andMe can’t test for my mutation is the company who formerly held a patent on BRCA1 and BRCA2 still has a proprietary database of our genetic mutations, and they aren’t sharing it with anyone! We need to fix THIS. We need for the government to help us figure out how data can be shared rather than be treated as a trade secret.”

Andrea. FDA B*tchslap of 23andMe: A BRCA Previvor’s Perspective. Brave Bosom November 26, 2013. http://www.bravebosom.com/fda-btchslap-of-23andme-a-brca-previvors-perspective/

“23andMe is simply doing what the Internet does best: forcing old dogs to learn new tricks. That’s what the fight between Uber and taxicab commissions is about. Same for AirBnB and hotel regulators. The only profession slower to change how they do things than doctors is bureaucrats. So the FDA’s reaction is understandable — but misguided.”

Szoka, Berin. FDA Just Banned 23andMe’s DNA Testing Kits, and Users Are Fighting Back. Huffington Post 11/26/2013 7:46 pm. http://www.huffingtonpost.com/berin-szoka/fda-just-banned-23andmes-_b_4339182.html


Thoughts about Regulation of Personal Genetic/Genomic Services

“We need DTC screening. It helped me. It’ll help many others. But until the FDA learns how to deal with Bayes’s rule and its discomforts – and until DTC companies figure out a business model that isn’t based on massive loss leadership – we’re going to keep coming back to this clash of culture and business models. Both sides need to make some changes if we’re going to avoid doing this over, and over, and over.”

Wilbanks, John. FDA’s Culture Is Mendelian Dominant Over 23andme’s Business Model. DEL-FI November 30, 2013. http://del-fi.org/post/68560843111/fdas-culture-is-mendelian-dominant-over-23andmes

“23andMe embodies a generation preoccupied with itself. Our right to know has superseded our ability to understand. Empowerment has evolved as data, information, knowledge and wisdom are almost seen as one in the same. Whether 23andMe’s reporting is actionable is to miss the point. When you’ve got your data, what more do you need, really? Epigenetics…what epigenetics?”

Vartabedian, Bryan. 23andMe – Why Our Big Government is Right. 33 Charts November 28, 2013. http://33charts.com/2013/11/23andme-government-is-right.html

“As a citizen, I expect corporate transparency for any new health product. As a patient, I think the risks of taking the test outweigh the benefits for my health. As a doctor, I have my concerns for people with distress or misinformation from results of an unproven genomic test. As a human being, I worry about misuse and unintended social consequences of our genetic heritage.”

“Our society has increasing information and public access to information. While it is difficult for me to think that this isn’t a good thing overall, we have to thoughtfully consider the possible unintended negative consequences. This case is part of a larger pattern of sacrificing quality-control filters for the sake of open access. This increasingly puts the burden on the public to make sense of sometimes complex and technical information. Everyone, now, can be their own geneticist.”

Novella, Steven. The FDA and Personalized Genetic Testing. Science Based Medicine November 27, 2013. http://www.sciencebasedmedicine.org/the_fda_and_personalized_genetic_testing/

“Community forums and news sites across the web exploded with debate, with most people rallying to 23andMe’s defense. The company’s ample support-base claims that the Food and Drug Administration is over-regulating, and is stifling innovation. However, the majority of geneticists and medical professionals I’ve spoke with have sided with the Food and Drug Administration, arguing that many patients require genetic counseling after receiving DNA test results that point to a high risk of cancer and other life-threatening conditions.”

Farr, Christina. Here’s why the FDA is targeting 23andMe. MedCityNews November 26, 2013 10:00 am http://medcitynews.com/2013/11/heres-fda-targeting-23andme/

“When 23andMe sent us our results, we followed their advice: we asked our doctor to talk about them. Most doctors didn’t know where to begin. But the more of us ask about 23andMe, the more the medical profession is catching up. Slowly but surely, they’re brushing up on genomics, taking the time to understand the site, and talking to us about our results and what, if anything, to do about them.”

Szoka, Berin. FDA Just Banned 23andMe’s DNA Testing Kits, and Users Are Fighting Back. Huffington Post 11/26/2013 7:46 pm. http://www.huffingtonpost.com/berin-szoka/fda-just-banned-23andmes-_b_4339182.html

“If you scare somebody into believing they’re high risk, they could take actions that hurt their health,” says Gutierrez. Not only is the data on some genetic links inconclusive, he adds, it’s well-chronicled that patients can push their doctors into authorizing unnecessary procedures. “Doctors do a lot of double mastectomies because of fear.”

“Should this third party be a doctor, as some (mostly doctors) are arguing? There are certainly doctors out there who have a great grasp of human genetics. But there aren’t a lot of them. And even the doctors who do know the world of human genetics inside and out aren’t in a position to help people navigate every nook and cranny of their genome. This is a job for software, not for people.”

Eisen, Michael. FDA vs. 23andMe: How do we want genetic testing to be regulated? November 26, 2013. http://www.michaeleisen.org/blog/?p=1480

“This is a broad cautionary tale,” says Quackenbush. “We need to be careful about how we define phenotypes, such as whether a patient is likely to respond to a drug or have an adverse event, because if we don’t do it well, we’re not going to have good tools for advancing personalized medicine.”


The quadriceps muscle group is composed of the rectus femoris, vastus medialis, vastus lateralis, and vastus intermedius. The rectus femoris originates at the ilium, thus crossing both the hip and knee joint along its course. This anatomy allows for hip flexion and knee extension. The remaining muscles originate on the femur and function solely as knee extensors. Innervation of these muscles is by the femoral nerve. The quadriceps are primarily active in kicking, jumping, and running.

Acute strain injuries of the quadriceps commonly occur in athletic competitions such as soccer, rugby, and football. These sports regularly require sudden forceful eccentric contraction of the quadriceps during regulation of knee flexion and hip extension. Higher forces across the muscle–tendon units with eccentric contraction can lead to strain injury. Excessive passive stretching or activation of a maximally stretched muscle can also cause strains. Of the quadriceps muscles, the rectus femoris is most frequently strained [1–5]. Several factors predispose this muscle and others to more frequent strain injury. These include muscles crossing two joints, those with a high percentage of Type II fibers, and muscles with complex musculotendinous architecture [1, 2, 6, 7]. Muscle fatigue has also been shown to play a role in acute muscle injury [8].


Best integration of DNA analysis and historical research

AncestryDNA

Founded in Utah in the 1990s, Ancestry.com -- the parent company of AncestryDNA -- started out as a publishing and genealogy company. Since then, it has had a somewhat tumultuous corporate existence, having been bought, sold, publicly traded and then purchased by private equity groups.

The company's basic DNA kit service, currently on sale for $59, provides you with an "ethnicity estimate" derived from its proprietary sequencing techniques. It's noteworthy that the company's genetic testing, which is outsourced to Quest Diagnostics, is distinct from most other companies that use paternal Y chromosome and/or maternal mitochondrial DNA methodologies, and less is known about the particular criteria it uses.

That noted, AncestryDNA says its database contains more than 18 million profiles, making it the largest of all of the testing services. The company also maintains a powerful tool for searching through hundreds of historical document databases -- but any substantive research will quickly bring you to a paywall. Ancestry's databases are further bolstered by its partnership with FamilySearch.org, a genealogical records site run by the Mormon church.

An entry-level membership, which provides access to more than 6 billion records in the US, costs $99 for six months or $25 per month, after a free two-week trial. The "World Explorer" membership, for $40 per month, broadens your access to the company's 27 billion international records, and the "All Access" tier, starting at $50 per month, includes unlimited access to Ancestry's historical and contemporary database of more than 15,000 newspapers and military records from around the world.

AncestryDNA offers a personalized health report with "actionable insights," access to genetic counseling resources, an online tool to help you map your family's health over generations and, starting in August 2020, a next-generation sequencing service for screening your genetic risk for heart disease, some cancers and blood disorders. Still, the results are not diagnostic -- though the test result must be approved by one of the company's physicians -- and the service does not have FDA approval. For now, 23andMe maintains the advantage when it comes to introductory DNA testing for health risks and genetic screening. But AncestryDNA's service is particularly well-suited for leveraging an introductory DNA analysis into deep historical research to build out a family tree.

AncestryDNA allows you to download your full DNA results profile and upload the raw data into other tools, and it provides reasonably good control over your privacy preferences, though the options are not as granular as others.


3. Traditional performance measures

We briefly consider some of the more traditionally used performance measures in medicine, without intending to be comprehensive ( Table 1 ).

Table 1

Characteristics of some traditional and novel performance measures

Aspect | Measure | Visualization | Characteristics
Overall performance | R², Brier | Validation graph | Better with lower distance between Y and Ŷ. Captures calibration and discrimination aspects.
Discrimination | C statistic | ROC curve | Rank order statistic. Interpretation for a pair of patients with and without the outcome.
 | Discrimination slope | Box plot | Difference in mean of predictions between outcomes. Easy visualization.
Calibration | Calibration-in-the-large | Calibration or validation graph | Compare mean(y) versus mean(ŷ). Essential aspect for external validation.
 | Calibration slope | | Regression slope of linear predictor. Essential aspect for internal and external validation. Related to ‘shrinkage’ of regression coefficients.
 | Hosmer-Lemeshow test | | Compares observed to predicted by decile of predicted probability.
Reclassification | Reclassification table | Cross-table or scatter plot | Compare classifications from 2 models (one with, one without a marker) for changes.
 | Reclassification calibration | | Compare observed and predicted within cross-classified categories.
 | Net Reclassification Index (NRI) | | Compare classifications from 2 models for changes by outcome for a net calculation of changes in the right correction.
 | Integrated Discrimination Index (IDI) | Box plots for 2 models (one with, one without a marker) | Integrates the NRI over all possible cut-offs. Equivalent to difference in discrimination slopes.
Clinical usefulness | Net Benefit (NB) | Cross-table | Net number of true positives gained by using a model compared to no model at a single threshold (NB) or over a range of thresholds (DCA).
 | Decision curve analysis (DCA) | Decision curve |

Overall performance measures

The distance between the predicted outcome and actual outcome is central to quantify overall model performance from a statistical modeler’s perspective 32 . The distance is Y −Ŷ for continuous outcomes. For binary outcomes, with Y defined 0 – 1, Ŷ is equal to the predicted probability p, and for survival outcomes it is the predicted event probability at a given time (or as a function of time). These distances between observed and predicted outcomes are related to the concept of ‘goodness-of-fit’ of a model, with better models having smaller distances between predicted and observed outcomes. The main difference between goodness-of-fit and predictive performance is that the former is usually evaluated in the same data while assessment of the latter requires either new data or cross-validation.

Explained variation (R²) is the most common performance measure for continuous outcomes. For generalized linear models, Nagelkerke’s R² is often used 1,33 . This is a logarithmic scoring rule. For binary outcomes Y, we score a model with the logarithm of predictions p: Y*log(p) + (1 – Y)*log(1 – p). Nagelkerke’s R² can also be calculated for survival outcomes, based on the difference in –2 log likelihood of a model without and a model with one or more predictors.

The Brier score is a quadratic scoring rule, where the squared differences between actual binary outcomes Y and predictions p are calculated: (Y – p)² 34 . We can also write this similar to the logarithmic score: Y*(1 – p)² + (1 – Y)*p². The Brier score for a model can range from 0 for a perfect model to 0.25 for a non-informative model with a 50% incidence of the outcome. When the outcome incidence is lower, the maximum score for a non-informative model is lower, e.g. for 10%: 0.1*(1 – 0.1)² + (1 – 0.1)*0.1² = 0.090. Similar to Nagelkerke’s approach to the LR statistic, we could scale Brier by its maximum score under a non-informative model: Brier_scaled = 1 – Brier / Brier_max, where Brier_max = mean(p)*(1 – mean(p)), to let it range between 0% and 100%. This scaled Brier score happens to be very similar to Pearson’s R² statistic 35 .
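The Brier score and its scaled version can be sketched in a few lines of Python. This is a minimal illustration for binary outcomes only; the function names and the toy data are assumptions made for the example, not part of the source.

```python
def brier_score(y, p):
    """Mean squared difference between binary outcomes y (0/1) and predictions p."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, p)) / len(y)


def scaled_brier(y, p):
    """Brier score rescaled by its maximum under a non-informative model.

    Uses Brier_max = mean(p) * (1 - mean(p)), as given in the text.
    """
    mean_p = sum(p) / len(p)
    brier_max = mean_p * (1 - mean_p)
    return 1 - brier_score(y, p) / brier_max


# Toy data (made up): 10% incidence, every subject predicted at 0.1,
# i.e. a non-informative model.
y = [1] + [0] * 9
p = [0.1] * 10

print(brier_score(y, p))   # approximately 0.09, matching the 10% example in the text
print(scaled_brier(y, p))  # approximately 0 for a non-informative model
```

A model that predicts every outcome perfectly would give a Brier score of 0 and a scaled Brier score of 1.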

Calculation of the Brier score for survival outcomes is possible with a weight function, which considers the conditional probability of being uncensored during time 36,37,3 . We can then calculate the Brier score at fixed time points, and create a time-dependent curve. It is useful to use a benchmark curve, based on the Brier score for the overall Kaplan-Meier estimator, which does not consider any predictive information 3 . It turns out that overall performance measures comprise two important characteristics of a prediction model, discrimination and calibration, each of which can be assessed separately.

Discrimination

Accurate predictions discriminate between those with and those without the outcome. Several measures can be used to indicate how well we classify patients in a binary prediction problem. The concordance (c) statistic is the most commonly used performance measure to indicate the discriminative ability of generalized linear regression models. For a binary outcome, c is identical to the area under the Receiver Operating Characteristic (ROC) curve, which plots the sensitivity (true positive rate) against 1 – specificity (false positive rate) for consecutive cutoffs for the probability of an outcome.

The c statistic is a rank order statistic for predictions against true outcomes, related to Somers’ D statistic 1 . As a rank order statistic, it is insensitive to systematic errors in calibration such as differences in average outcome. A popular extension of the c statistic with censored data can be obtained by ignoring the pairs that cannot be ordered 1 . It turns out that this results in a statistic that depends on the censoring pattern. Gonen and Heller have proposed a method to estimate a variant of the c statistic which is independent of censoring, but holds only in the context of a Cox proportional hazards model 7 . Furthermore, time-dependent c statistics have been proposed 6,38 .
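For a binary outcome without censoring, the rank-order nature of the c statistic can be illustrated by counting concordant pairs directly. The sketch below is a brute-force version for small data; the function name and toy data are assumptions, and the censored-data and time-dependent variants mentioned above are not covered.

```python
def c_statistic(y, p):
    """Proportion of (event, non-event) pairs in which the event has the higher prediction.

    Ties in the predictions count as half a concordant pair.
    """
    concordant = 0.0
    pairs = 0
    for yi, pi in zip(y, p):
        if yi != 1:
            continue
        for yj, pj in zip(y, p):
            if yj != 0:
                continue
            pairs += 1
            if pi > pj:
                concordant += 1.0
            elif pi == pj:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one subject with and one without the outcome")
    return concordant / pairs


# Toy data (made up for illustration)
y = [1, 1, 0, 0]
p = [0.9, 0.4, 0.5, 0.1]
print(c_statistic(y, p))  # 3 of the 4 pairs are concordant

# Halving every prediction ruins calibration but leaves the ranking,
# and therefore the c statistic, unchanged.
print(c_statistic(y, [pi / 2 for pi in p]))
```

The second call shows why c says nothing about calibration: any monotone transformation of the predictions gives the same value.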

In addition to the c statistic, the discrimination slope can be used as a simple measure for how well subjects with and without the outcome are separated 39 . It is calculated as the absolute difference in average predictions for those with and without the outcome. Visualization is readily possible with a box plot or a histogram, which will show less overlap between those with and those without the outcome for a better discriminating model. Extensions of the discrimination slope have not yet been made to the survival context.
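The discrimination slope as defined above is simple enough to compute by hand. A minimal sketch, with a hypothetical function name and made-up data:

```python
def discrimination_slope(y, p):
    """Absolute difference in mean prediction between subjects with and without the outcome."""
    with_outcome = [pi for yi, pi in zip(y, p) if yi == 1]
    without_outcome = [pi for yi, pi in zip(y, p) if yi == 0]
    mean_with = sum(with_outcome) / len(with_outcome)
    mean_without = sum(without_outcome) / len(without_outcome)
    return abs(mean_with - mean_without)


# Toy data (made up): mean prediction is 0.65 with the outcome, 0.30 without
y = [1, 1, 0, 0]
p = [0.9, 0.4, 0.5, 0.1]
print(discrimination_slope(y, p))  # approximately 0.35
```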

Calibration

Calibration refers to the agreement between observed outcomes and predictions 29 . For example, if we predict a 20% risk of residual tumor for a testicular cancer patient, the observed frequency of tumor should be approximately 20 out of 100 patients with such a prediction. A graphical assessment of calibration is possible with predictions on the x-axis, and the outcome on the y-axis. Perfect predictions should be on the 45° line. For linear regression, the calibration plot is a simple scatter plot. For binary outcomes, the plot contains only 0 and 1 values for the y-axis. Smoothing techniques can be used to estimate the observed probabilities of the outcome (p(y=1)) in relation to the predicted probabilities, e.g. using the loess algorithm 1 . We may however expect that the specific type of smoothing may affect the graphical impression, especially in smaller data sets. We can also plot results for subjects with similar probabilities, and thus compare the mean predicted probability to the mean observed outcome. For example, we can plot observed outcome by decile of predictions, which makes the plot a graphical illustration of the Hosmer-Lemeshow goodness-of-fit test. A better discriminating model has more spread between such deciles than a poorly discriminating model. We note however that such grouping, though common, is arbitrary and imprecise.
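The grouped comparison of mean predicted probability and observed outcome frequency can be sketched as follows. This is an illustrative helper under assumed names; it only tabulates the points one would plot and does not compute the Hosmer-Lemeshow test statistic itself.

```python
def calibration_by_group(y, p, groups=10):
    """Return (mean predicted, observed frequency, group size) per group of sorted predictions."""
    ordered = sorted(zip(p, y))
    n = len(ordered)
    rows = []
    for g in range(groups):
        chunk = ordered[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # more groups than subjects
        mean_pred = sum(pi for pi, _ in chunk) / len(chunk)
        observed = sum(yi for _, yi in chunk) / len(chunk)
        rows.append((mean_pred, observed, len(chunk)))
    return rows


# Toy data (made up): two risk groups in which predictions match observed frequencies
p = [0.2] * 5 + [0.8] * 5
y = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
for mean_pred, observed, size in calibration_by_group(y, p, groups=2):
    print(round(mean_pred, 2), round(observed, 2), size)
```

With well-calibrated predictions the plotted points lie near the 45° line; the number of groups is an arbitrary choice, as the text notes.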

The calibration plot can be characterized by an intercept a, which indicates the extent that predictions are systematically too low or too high (‘calibration-in-the-large’), and a calibration slope b, which should be 1 40 . Such a recalibration framework was already proposed by Cox 41 . At model development, a=0 and b=1 for regression models. At validation, calibration-in-the-large problems are common, as well as b smaller than 1, reflecting overfitting of a model 1 . A value of b smaller than 1 can also be interpreted as reflecting a need for shrinkage of regression coefficients in a prediction model 42,43 .
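One common way to obtain a and b for binary outcomes is to regress the observed outcome on the logit of the predicted probability with a logistic model. The sketch below does this with a small hand-written Newton-Raphson fit in plain Python. The source does not spell out the estimation procedure, so the joint estimation of a and b, the function names, and the toy data are all assumptions for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit P(y=1) = 1 / (1 + exp(-(a + b*x))) by Newton-Raphson; return (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu            # score for the intercept
            g1 += (yi - mu) * xi     # score for the slope
            h00 += w                 # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step_a = (h11 * g0 - h01 * g1) / det
        step_b = (h00 * g1 - h01 * g0) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


def calibration_intercept_slope(y, p):
    """Intercept a and slope b of the outcome regressed on logit(predicted probability)."""
    return fit_logistic([logit(pi) for pi in p], y)


# Toy validation data (made up): predictions of 10% and 90%,
# but observed frequencies of only 25% and 75%, i.e. predictions too extreme.
p = [0.1] * 4 + [0.9] * 4
y = [1, 0, 0, 0, 1, 1, 1, 0]
a, b = calibration_intercept_slope(y, p)
print(a, b)  # a close to 0, b close to 0.5
```

In this toy example the slope comes out near 0.5, the pattern the text associates with overfitting: the predictions are more extreme than the observed frequencies warrant.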


DNA tests we’d avoid

HomeDNA

HomeDNA sells testing kits under a number of brands, including DNA Origins, and has a retail presence at Walmart, CVS, Rite Aid and Walgreens. The company’s tests claim to combine genetic research and “ancestral tracking” techniques that can identify the town or village where your ancestors originated with a high degree of accuracy. Many experts dispute these claims.

The company offers a range of ancestry testing services starting at $69. That’s the price point for the maternal and paternal lineage kits and the “Starter Ancestry Test,” which uses DNA markers to develop an estimate of your origins in Europe, Indigenous America, East Asia and Sub-Saharan Africa, and shows you the modern population groups that share your DNA. The $124 “Advanced Ancestry Test” expands the analysis to 80,000 autosomal genetic markers, 1,000 reference populations and 41 gene pools.

I’ll note that the HomeDNA test kit contained no warning about not eating or drinking for any period of time prior to taking the test — unlike every other kit I used. And of the four swabs the company sent, one broke. The test kit just didn’t seem as rigorously hygienic as the others.

For $199, HomeDNA claims that the Asian Edition of its GPS Origins Ancestry Test can analyze 17 Asia-specific gene pools and hundreds of Asia-specific reference populations. In addition to a $164 paternity kit, the company also sells a variety of specific kits to determine your sensitivities to particular animals and foods, one to help you achieve a healthy weight, and another that promises to “unlock your skin’s full potential.”

For $39, the company will allow you to upload a raw data file from another DNA testing service and pinpoint your origin to a particular town or city. There are also kits to help you screen your dog or cat for genetic diseases and traits.

But this company doesn’t have a sterling reputation in the genetic genealogy world. When we recently spoke with Debbie Kennett, a genetic genealogist from University College London, she referenced the company’s notoriety for delivering “bizarre results” and expressed doubt about the efficacy of its specialized tests for particular ethnic groups. HomeDNA did not respond to CNET’s inquiry about its testing process or results.

And the HomeDNA reports don’t stack up particularly well against those returned by other companies. Results are summarized on a single webpage, though you also get a PDF that certifies that you’ve “undergone DNA testing” and shows the continents and countries where your DNA originates. The company also throws in a boilerplate 20-page explainer about DNA science and technology. HomeDNA does not offer access to any matching databases — so there’s no obvious next step or any actionable data that comes with your results. Given this, I’d recommend choosing a different DNA testing service.

African Ancestry

Claiming to have the most comprehensive database of African lineages, African Ancestry promises to trace its customers’ ancestry back to a specific country and identify their “ethnic group origin.” But a number of experienced genealogists have cited issues with this company’s marketing claims and science.

Unlike most other companies, African Ancestry doesn’t offer an autosomal DNA test. Instead, it offers an mtDNA test or a Y-DNA test (for males only). In contrast to your standard DNA analysis, African Ancestry’s report doesn’t provide the percentage of DNA that’s likely to have originated across a range of regions. Instead, African Ancestry claims to trace your DNA to a specific region of Africa.

According to experts, however, African Ancestry’s DNA tests come up short. As explained in a blog post by African American genetic genealogist Shannon Christmas, the company’s methodology simply doesn’t analyze a sufficient number of DNA markers to deliver on its marketing promises.

Furthermore, he writes, “Ethnicity is a complex concept, a concept not as rooted in genetics as it is in sociopolitical and cultural constructs. There is no DNA test that can assign anyone to an African ethnic group or what some refer to as an ‘African tribe.'” African Ancestry isn’t the only company that claims to be able to determine your ethnicity or “ethnic group of origin.” But its claim to narrow things down to a single “tribe” of origin is overblown, as any African tribe would ostensibly contain multiple haplogroups.

In an email to CNET, African Ancestry responded: “African Ancestry makes it clear that ethnic groups are social and cultural groupings, not genetic ones. However, based on extensive genetic research of African lineages performed by African Ancestry’s co-founder and Scientific Director (who holds a Ph.D. in Biology and specializes in human genetics), we find that contrary to laymen’s beliefs, there are ethnic groups that share genetic lineages. Our results pinpoint genetic lineages that share the same genetics as our test takers. Given the vast number of lineages in our African Lineage Database, we are able to provide the ethnic groups of the people with that shared lineage.”

The company’s PatriClan Test analyzes eight Y-chromosome STRs and the YAP, which it says is a critical identifier for African lineages, while the MatriClan Test analyzes three regions of the mitochondrial DNA: HVS1, HVS2 and HVS3. But though these tests offer lower-resolution results than others, African Ancestry’s services are considerably more expensive. The company’s Y-DNA and mtDNA tests cost $299 each, or you can take them both, and get an eight-pack of “certificates of ancestry” and a four-pack of t-shirts, for $679.

On the plus side, African Ancestry says that it does not maintain a database of customer information and that it will not share or sell your DNA sequence or markers with any third party — including law enforcement agencies. The company’s terms and conditions run to just over 2,200 words, making them considerably more concise than the disclosure statements of most other companies we included in this roundup. And African Ancestry promises to destroy your DNA sample after your test results are delivered.

That said, even if you accept the company’s take on tribal and ethnic genetic markers, African Ancestry remains too expensive to recommend at its current price.


A lot of AI is already being utilized in the medical field, ranging from online scheduling of appointments, online check-ins in medical centers, digitization of medical records, reminder calls for follow-up appointments and immunization dates for children and pregnant females to drug dosage algorithms and adverse effect warnings while prescribing multidrug combinations. Summarized in the pie chart [ Figure 1 ] are the broad applications of AI in medicine.

Applications of artificial intelligence in health care

Radiology is the branch that has been the most upfront and welcoming to the use of new technology.[6] Computers were initially used in clinical imaging for administrative work such as image acquisition and storage; with the advent of the picture archiving and communication system, they have become an indispensable component of the work environment. The use of CAD (computer-assisted diagnosis) in screening mammography is well known. Recent studies have indicated that CAD offers little diagnostic aid, based on positive predictive values, sensitivity, and specificity. In addition, false-positive diagnoses may distract the radiologist, resulting in unnecessary work-ups.[7,8] As suggested by a study,[6] AI could provide substantial aid in radiology not only by labeling abnormal exams but also by quickly identifying negative exams in computed tomographies, X-rays, and magnetic resonance images, especially in high-volume settings and in hospitals with fewer available human resources.

A decision support system known as DXplain was developed by the University of Massachusetts in 1986. It gives a list of probable differentials based on the symptom complex, and it is also used as an educational tool for medical students, filling the gaps not explained in standard textbooks.[9] Germwatcher is a system developed by the University of Washington to detect and investigate hospital-acquired infections.[10] An online application in the UK known as Babylon can be used by patients to consult a doctor online, check for symptoms, get advice, monitor their health, and order test kits. Apart from that, the spectrum of AI has expanded to provide therapeutic facilities as well. AI-therapy is an online course that helps patients treat their social anxiety using the therapeutic approach of cognitive behavior therapy. It was developed from a program, CBTpsych.com, at the University of Sydney.[11]

The Da Vinci robotic surgical system developed by Intuitive Surgical has revolutionized the field of surgery, especially urological and gynecological surgeries. The robotic arms of the system mimic a surgeon's hand movements with better precision, and the system has a 3D view and magnification options which allow the surgeon to perform minute incisions.[3] Since 2018, Buoy Health and Boston Children's Hospital have been working together on a web interface-based AI system that provides advice to parents for their ill child by answering questions about medications and whether symptoms require a doctor visit.[12] The National Institutes of Health (NIH) has created the AiCure app, which monitors the patient's use of medications via smartphone webcam access and hence reduces nonadherence rates.[13]

Fitbit, Apple, and other health trackers can monitor heart rate, activity levels, and sleep levels, and some have even launched ECG tracings as a new feature. All these new advances can alert the user regarding any variation and let the doctor have a better idea of the patient's condition. The Netherlands uses AI for healthcare system analysis, detecting mistakes in treatment and workflow inefficiencies in order to avoid unnecessary hospitalizations.

Apart from the inventions which already exist, there are certain advances in various phases of development which will help physicians be better doctors. IBM's Watson Health is a prime example; it will be equipped to efficiently identify symptoms of heart disease and cancer. Stanford University is developing a program called AI-assisted care (PAC). PAC has an intelligent senior wellbeing support system and smart ICUs, which will sense any behavioral changes in elderly people living alone[14] and in ICU patients,[15] respectively, via the use of multiple sensors. PAC is also extending its projects to Intelligent Hand Hygiene support and Healthcare conversational agents. Hand hygiene support uses depth sensors and refined computer vision technology to achieve perfect hand hygiene for clinicians and nursing staff, reducing hospital-acquired infections.[16] The healthcare conversational project analyzes how Siri, Google Now, S Voice, and Cortana respond to mental health, interpersonal violence, and physical health questions from mobile phone users, allowing patients to seek care earlier. Molly is a virtual nurse that is being developed to provide follow-up care to discharged patients, allowing doctors to focus on more pressing cases.


23andMe offered to test nearly 600K SNPs on their V4 chip for around $90/person. When I researched them, they were offering both genealogy and health testing analysis, with emphasis on the latter. In this respect, they were in the vanguard of private direct-to-consumer (DTC) genetic testing and analytics companies involved in health research. I perceived them to be reputable and to provide a product with reasonable cost/benefit. This was just before the FDA ordered them to cease marketing/producing their health analytics product (see below).

23andMe obtains and associates genotype and phenotype from its customers. It gets a DNA sample from the client, and via its web site, asks the client to identify personal traits through answers to a long list of questions. One set of such data associations does not provide useful information. Only by analyzing thousands of such pairs can they begin to identify meaningful association patterns.

Ordering, Waiting, First Blush Results

After we ordered the testing and sent in our samples, the company advised that the FDA was shutting down the 23andMe health analytics product, leaving only their genealogy results in the final deliverable. That was disappointing, apparently a new chapter in the same old story. The AMA (major lobbyist to FDA) wants to keep health analytics on a prescription-only basis. Their reasons are touted as consumer safety concerns, but one suspects that money and related turf protection are yet again the root of such overreach.

We were offered our money back, but declined. I discovered that 23andMe would still give us our raw data as a download, and that there were third party packages that could provide health analytics on this data. Our game was still on. Take that, FDA and AMA.

It took five weeks to get the raw testing results back for Debby, and nearly another two weeks to get mine (since they were mailed on the same day, it seems possible mine required some re-processing as part of QA).

23andMe is not positioning themselves as a major player in ancient (prehistoric) ancestry research. Their non-recombinant genealogy tests, upon which such research depends, are very basic (15 years behind by today’s standard). In my paternal line, they provide no historical resolution, nor any data relevant to the last 15K years. Further investment in genetic genealogy likely will not happen at 23andMe, for human ancestry analytics likely does not represent a core business interest, but rather a marketing tool. Thus, the FDA decision may deliver a bigger blow to the near term 23andMe business plan than one might suspect.

Since I already have much more deep ancestry resolution than their Y-chromosome and mtDNA genealogy testing provides, that portion of the testing has only vague corroborative utility for me. I had never before been explicitly tested for Y-DNA SNPs, but have inferred a detailed SNP lineage via another type of DNA marker. Debby had never been tested, however, so she would learn some basic facts about her maternal ancestry (V7a).

In the broader picture, the real and unique value of ancestry testing at 23andMe is the autosomal (recombinant) DNA matching service, which puts clients in touch with one another based on identified shared DNA segments. This is invaluable for extending one’s research beyond paper trails, and for validating paper trails that have been established. It takes us orders of magnitude beyond the discoveries possible from matching others in the non-recombinant DNA databases.

How do autosomal (recombinant) and non-recombinant genealogy differ? Functionally, they address different time frames. Autosomal genealogy can address about the same period as does paper (historical) genealogy, perhaps the last 500 years. Non-recombinant DNA covers earliest history and all pre-history.

That this is so can be deduced from the differing DNA processes. Non-recombinant DNA can be recognized by its unique type over many millennia. It is unchanged over generations except for a few random mutations. Autosomal DNA gets sliced and diced at every generation, so that after a few generations, there no longer remain distinct segments long enough to be identifiable as any specific type. Understanding that the bulk of human DNA is basically the same for everyone, one needs a critical mass of unique information in a segment to characterize it as belonging to some known type.
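A rough way to see why the autosomal signal fades so quickly: on average, the share of one's autosomal DNA inherited from any single ancestor halves with each generation back. The sketch below only computes that average expectation; the function name is made up for illustration, and actual inherited shares vary around these values because recombination is random.

```python
def expected_autosomal_share(generations_back):
    """Average fraction of autosomal DNA inherited from one ancestor
    who lived the given number of generations ago (1 = a parent)."""
    return 0.5 ** generations_back


for generations in (1, 2, 5, 10):
    ancestors = 2 ** generations  # ancestor slots in that generation
    share = expected_autosomal_share(generations)
    print(f"{generations:>2} generations back: {ancestors:>4} ancestors, "
          f"average share {share:.4%} each")
```

At ten generations back there are 1,024 ancestor slots, each contributing on average under 0.1% of the genome, which is consistent with the point above that distinct segments soon become too short to characterize.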

23andMe’s ancestry categories, identified by autosomal testing, are expressed as geographical regions containing a significant percentage of matching DNA profiles. For Europe, the designations are Europe-wide, then North, South, and East regional distinctions, then specific sub-regions like the British Isles, Germany and France, Scandinavia, and a non-specific catch-all. There are three levels of confidence that can be requested: conservative, standard, and speculative. The difference is in how hard each level tries to make sense of any ambiguous segments.

Through the speculative filter, I am 99.9% European and 94.5% Northern European. Debby was similarly 99.7% European, but her sub-classification was 96.7% Ashkenazi. When it says we are X% this category or that, it means that X% of our ancestors were likely living in that place in the year 1500CE (before global travel was readily accessible to the commoner).

Ambiguity will likely arise in some areas. Two tiny fragments of my DNA resisted easy classification, but speculation resolved one to Native American and the other to North African, together contributing less than one thousandth of my DNA. There may have been some problems reading/interpreting these small DNA areas that account for some of the confusion.

In addition to geographical origins, the user is shown a list of relatives, other people who tested as having some measurable identifiable shared DNA. In the first day after my results were made available to me, I sent messages to my 25 closest identified genetic matches from among 23andMe customers.

I received a reply the next day from a 3rd cousin I had not identified before. I was able to provide him much detailed information regarding our common G-G-grandparents that he did not have. I made another contact to my closest match, a second cousin on my mother’s side. These two contacts genetically validated the paper trail for a significant part of my near ancestry, my father’s lineage and my mother’s father’s lineage, both back to nearly 1800. I now know my mother’s father’s haplogroup. I am who I think I am for this portion of my ancestry. This is genealogy nirvana.

I have since added a few more contacts to my original 25, relatives who have listed a surname matching one of my ancestral surnames. Only three contacts to date have seemed interested in discussing ancestry, a 10% success rate. This is sad, but not unexpected, since most of the early adopters at 23andMe were seeking health data; genealogy was not their passion, and in the minds of many it seems to open one to unnecessary privacy invasions.

There are practical limits to the efficacy of the relative finder facility. The period shortly after 1800 becomes problematic for most of us tracking ancestors in the USA, because the great westward expansion had begun, invariably disrupting any corroborating paper trail. Yet the current relative finder process loses utility at this time. The relative-finder algorithm doesn’t identify shared DNA segments much prior to 1800, since the autosomal evidence for such distant shared ancestry is too weak. Also, virtually none of the customers whose primary interest is health data have researched their ancestor surnames back that far, since it takes a real genealogy motivation to do so. Without names, learning how two relatives are related is an impossibility.

Shared DNA identification seems such a useful tool that 23andMe should attempt to promote it more, educating customers about its potential. Perhaps they could offer rewards for people who provide surnames for their known ancestry, as well as geographical details. Some of us have done years of research and it’s all available to others if they will just talk to us when we contact them. Advocacy and new contributing methods may be needed to realize the potential of this service.

Since I have spent some time with the site, I have more expectations of genealogy successes here. There is a lot of data, and some useful ways for accessing it. For example, under:

  • Family and Friends : DNA Relatives :
    • be notified of relatives and their expected relationship
    • correspond with relatives and share genome data
    • search DNA relatives by ancestry surnames or haplogroup

    Global Relative-Finder and Phased Genomes

    23andMe was a pioneer in ancestry genomic analysis, but now there are several competitors. Staying within a single vendor’s database will be limiting going forward.

    Enter GEDmatch, a site for pulling all relatives together. One just uploads one’s raw data to the GEDmatch database. Then one can query genomes from all the various genetic genealogy vendors. Further, GEDmatch offers state-of-the-art tools for analyzing genetic relations based on shared genetic segments. And by uploading GEDCOM data, complete lists of ancestors can accompany one’s genome, enabling identification by name of hitherto unknown cousins who share segments of our genome.

    Finding real relatives by genomic segment matching is a difficult proposition if just one’s own sample is the comparison base. To the rescue, GEDmatch supports phased kits as a basis of comparison. If one’s parents’ genomes are also available, then it can be known which ancestral genetic segments one did not inherit. A phased kit thus has more ancestral data to work with, narrowing choices during analysis. (Full understanding of phased kits is still beyond my pay grade. If you know of a good description of the process, please leave a comment. Thanks.)

    Having no reason to dwell on the now vacuous health portion of the 23andMe site, I grabbed our raw data files and headed to the Internet. This is now a DIY project.

    I first decided to try my hand at analyzing Debby’s DNA for specific genes, namely BRCA1 and BRCA2. I found on the Internet a description of SNPs within these genes that were implicated in cancer. I discovered that 23andMe had tested about 70% of these, and for these tested SNPs, Debby possessed normal alleles. That was an encouraging result. But clearly such a manual process was not going to get me far.

    I researched third party software and downloaded the Promethease package. This is a simple web-based, Javascript application that works in conjunction with a web site called SNPedia. SNPedia, also launched in 2006, is a wiki for cataloging the functional consequences of human genetic variation, as published in peer-reviewed studies.

    Promethease apparently goes through the user-supplied raw data file of tested SNP alleles and looks up each SNP in SNPedia (keying by rsid), processing the information found there for that allele, sorting it by positive/negative/neutral impact, assigning to it the SNPedia-determined importance factor, and noting how many publications reference it. A link to the SNPedia entry is returned to facilitate user access to the relevant literature.
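The lookup described above can be sketched roughly as follows. This is not Promethease's actual code. It assumes the raw download is a tab-separated text file with `#` comment lines and four columns (rsid, chromosome, position, genotype), and it uses a small in-memory dictionary as a stand-in for SNPedia. All rsids, genotypes, magnitudes, and summaries below are invented placeholders.

```python
def parse_raw_genotypes(lines):
    """Parse raw-data lines into {rsid: genotype}.
    Assumed layout: tab-separated rsid, chromosome, position, genotype;
    lines starting with '#' are comments."""
    genotypes = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rsid, _chromosome, _position, genotype = line.split("\t")
        genotypes[rsid] = genotype
    return genotypes


def build_report(genotypes, annotations):
    """Look up each (rsid, genotype) pair in the annotation table and
    return the hits sorted by magnitude, most important first."""
    report = []
    for rsid, genotype in genotypes.items():
        entry = annotations.get((rsid, genotype))
        if entry is not None:
            report.append({"rsid": rsid, "genotype": genotype, **entry})
    report.sort(key=lambda row: row["magnitude"], reverse=True)
    return report


# Invented sample data; a hypothetical stand-in for a raw file and for SNPedia.
RAW = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "rs0000002\t2\t2000\tCC\n"
    "rs0000003\t3\t3000\tTT\n"
)
ANNOTATIONS = {
    ("rs0000001", "AG"): {"effect": "negative", "magnitude": 1.5,
                          "summary": "placeholder finding"},
    ("rs0000003", "TT"): {"effect": "positive", "magnitude": 4.0,
                          "summary": "placeholder finding"},
}

report = build_report(parse_raw_genotypes(RAW.splitlines()), ANNOTATIONS)
for row in report:
    print(row["rsid"], row["genotype"], row["effect"], row["magnitude"])
```

Keying the annotation table by the pair (rsid, genotype) mirrors the idea that the same SNP can carry different findings for different alleles.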

    I ran the program against our raw data. In Debby’s negative issues report segment, there were no entries referencing BRCA, confirming my initial manual observation that she was ‘normal’ for these genes. In each of our reports, there were just a few interesting findings. The remainder were a large number of SNPs with small (importance 2/10 or less) statistical evidence of an association with some trait, each likely having virtually zero predictive value by itself, and with no way to establish functional correlations among them.

    The means of integrating all these minor statistical suggestions into meaningful health hypotheses remains far from our current grasp. Meanwhile, we can avail ourselves of the referenced documentation behind each study finding and see the directions of current research.

    Technical Aside: Making Sense of Sense

    When manually comparing SNP alleles as above, one must be aware of the sense (+, -) of the tested base. DNA occurs in two mirror-matched strands, the positive (aka sense) mRNA-like strand, and the negative (aka anti-sense) mRNA transcription strand. An allele may be derived from either strand during the test, so the strand sense must be known as well as the allele base.

    All 23andMe raw allele data is conveniently expressed relative to the positive strand. But comparison data from SNPedia can be relative to either strand. Therefore, one needs to check the orientation reported by SNPedia and, if it is minus, convert each base to its complement (swap C with G and A with T) before comparing it to a raw 23andMe allele.
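As a minimal sketch of that conversion, assuming the orientation is given as the string "plus" or "minus" (the function names here are made up for illustration):

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def to_plus_strand(genotype, orientation):
    """Return the genotype expressed on the plus strand.
    Bases are complemented only when the orientation is 'minus';
    symbols other than A, C, G, T are passed through unchanged."""
    if orientation == "minus":
        return "".join(COMPLEMENT.get(base, base) for base in genotype)
    return genotype


def same_genotype(raw_genotype, reference_genotype, reference_orientation):
    """Compare a plus-strand raw genotype with a reference genotype,
    ignoring allele order (AG and GA are treated as the same)."""
    flipped = to_plus_strand(reference_genotype, reference_orientation)
    return sorted(raw_genotype) == sorted(flipped)


print(to_plus_strand("AG", "minus"))        # complemented to TC
print(same_genotype("CT", "AG", "minus"))   # same genotype after flipping
print(same_genotype("AG", "AG", "minus"))   # differs once flipped
```

One caveat: for SNPs whose two alleles are complements of each other (A/T or C/G), flipping yields the same pair of letters, so the strand cannot be inferred from the bases alone and the stated orientation has to be trusted.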

    Customers may wonder what to make of their results. I have discussed and tempered expectations in my DNA testing overview article, but one still asks ‘Where’s the meat?’

    Part of the meat is the metric of actionable information quality of each report item. SNPedia authors use a magnitude scale of 1-10 to rank informational quality. For example, at the top of the scale, BRCA1 and BRCA2 alleles of bad repute are assigned a 10, the level of the most important findings with largest potential impact on health outcomes. In our common experience, informational quality 4 is the most important finding for each of us. We each have a couple of these, and have brought them to the attention of our physicians just as FYI.

    Other meaty results provide information on one’s genetic tolerance for and sensitivity to medications, interesting information that should be shared with one’s physician. In the case of predicted high toxicity, the information could be life-changing. Mostly, it is not definitive enough to be actionable. One still might want to opt for the best medication for the circumstance, even if there is indication of reduced efficacy for one’s genotype.

    Beyond the alleles of highly-ranked importance, there is scant meat on the bones. It becomes an educational exercise, with some expertise in medical genetics research being required to get much out of it.

    In spite of the lack of meat, this is exciting stuff. The first viewing of my results was a great step up on my lifetime quest for knowledge. I have no experience with the company’s results viewer, since it is no longer available to me. But my raw data, as processed by third party software together with SNPedia look-up, provided detailed and prioritized information to the level of current state of the art.

    I approached this as a general quest for knowledge, and not from a potential medical intervention perspective. I realized from all I’ve discovered that medically-actionable information would most likely be scarce or non-existent in the current time frame. But much of the information piqued my interest. I love learning such things about myself and can’t wait to digest as much as possible. Finally, they are talking about me, the real me. This is who I am.

    This sport is in its infancy. Possibly only 100 of my SNPs had any ascribed importance ranked above 2. That’s out of a half million SNPs. We have so much to learn. Some of my SNPs predict diagnoses I have already received. Some showed extra risk for things I expect I will never experience. Others offered potential explanations for personality traits I had never considered as having a logical explanation. There are intriguing genetic correlations.

    For me, living on the older side of town, my results are most interesting because they confirm rather than predict who I am. It is their explanatory nature that makes them most relevant to me. A younger person might have a different perspective. My life experience can corroborate some predictive elements inherent in my data, and thus perhaps provide additional data points by which to judge the general utility of genetic testing for health prediction.

    My testing revealed two findings of importance rank 4/10, my most important findings in SNPedia’s grading scale. Both predict health status that is characteristic of my current health state. There were other findings of lesser import which also seem accurate in their predictive potential.

    The great bulk of other associations of SNPs with various types of condition are decidedly not actionable, but many seem interesting and I will learn more about them. Nothing else in my report matches anything of my current health history, which remains overall pretty uneventful.

    One of the arguments used against DTC genetic health testing is that family history is more relevant and actionable. My family medical history tells me the exact same story as one of the major findings, but this is the only story that history consistently tells. Thus family history, while a valid proxy for some related genetic conditions, fails to divulge anything close to a complete story. The mechanics of autosomal DNA reinforce this family history limitation. Each parent could be heterozygous for a deleterious allele, and I could be homozygous for that allele. Typically, this means whatever condition might result from that allele will be magnified as a personal risk factor for me.
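The heterozygous-parents case can be made concrete with a small Punnett-square enumeration. This sketch assumes simple Mendelian inheritance of a single locus with two alleles, written here as "A" (normal) and "a" (deleterious); those labels and the function name are illustrative only.

```python
from collections import Counter
from itertools import product


def offspring_genotype_probabilities(parent1, parent2):
    """Enumerate the four equally likely allele combinations from two
    parents (each given as a two-letter genotype such as 'Aa') and
    return the probability of each resulting child genotype."""
    counts = Counter(
        "".join(sorted(allele1 + allele2))
        for allele1, allele2 in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: count / total for genotype, count in counts.items()}


# Both parents are unaffected carriers (heterozygous).
probs = offspring_genotype_probabilities("Aa", "Aa")
for genotype, probability in sorted(probs.items()):
    print(genotype, probability)
```

With two carrier parents, a quarter of the possible outcomes are homozygous for the deleterious allele, even though neither parent need show the condition. That is why a clean family history does not rule out such a result in the child.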

    There are lots of studies related to genotypes and reactions to medicines. Three well-researched associations applied directly to me, where two drugs were significantly more efficacious for my genotype and one was significantly less. I notified my cardiologist about the lessened response to one of my primary drugs, because it potentially leaves me open to bad events. We decided not to change course, since my time on the medication demonstrated useful efficacy for me, either by luck, or because the dosage permits some latitude.

    My physicians had never mentioned to me a genetic variation in response to medicine. As far as current practice can distinguish, one dosage still fits all, even if the FDA notes the dangers on the packaging. They argue the data is complex and ambiguous, so it may be a while before protocols can be established. And since genetic information on most patients is unavailable, those protocols will be long in coming. Wide availability of genetic testing in utero likely will be necessary to jump-start personalized medicine.

    I sense that 23andMe is now beginning to struggle with the research/commercial transition, evidenced by the recently obtained patent for a process for detecting DNA underpinnings of Parkinson’s. This trial balloon may have difficulty withstanding challenge as patent rules are slowly being modernized.

    It appears their DTC testing is partly a marketing tool. 23andMe needs to generate sufficient funds and attract sufficient clients to keep them going at their primary task, collecting user-supplied health trait information to augment the client genotype information that supports their research. Once they have populated a sufficiently large database, their research, augmented by independent GWAS efforts, may aim to produce marketable, genetic-based health diagnostics and perhaps eventually genetic-based disease interventions. Or maybe they can just sell the raw information to big pharma. Large profit streams will potentially accrue from such products, services, and raw data.

    23andMe reportedly is looking to regain FDA approval for its health analytics, perhaps with some modification to their marketing claims. Meanwhile, the FDA action may make their interim finances a shaky proposition. Fortunately, there are deep pockets behind their enterprise.

    Even though they can no longer share health information explicitly with the client, they still collect client phenotypes, indicating their strategic goals remain in play. But the AMA, the dinosaur lurking in the shadows, may prove to be a bigger wrench than they anticipate. They may need stronger FDA lobbyists.

    They are unlucky to be a start-up in a start-up industry that has raw edges, unanswered questions, yet unfulfilled promise, and a reactionary regulator. Rather than helping them shape the new industry into maturity, the regulator has chosen to shut them down. What’s behind such harsh judgement? Their mistake may have been failure to court the FDA/AMA from the earliest stages of business development. Now, seven years later, it is too late to avoid the current head-butting.

    Improvement Suggestions (Aka Grousing)

    Initially, before our data became available for presentation, the 23andMe web site was a mishmash of information representing different states of user data collection, sample data display, signup, registration, and support options. There should be process feedback to inform the user site experience, so that only information and user options relevant to one’s current state are presented. Our account home pages were still requesting us to register the kit even though it was registered before submission. The initial user site states could be: overview information and sales pitch; test drive; sample preparation and registration; test status (received, testing progress and expected completion date).

    Their site became a slightly improved experience after our results passed final QA and became fully reflected on our site pages. A To-Do box was annoying when there are no to-do items (perhaps since removed?). It occupies prime real estate on the screen. The thinness of the information, absent health data, remained disappointing.

    There is no tracking of sample status on their site other than the binary status: incomplete-complete. Just keep checking back, I guess. We submitted two samples together, and got an email when the first had completed processing. This was misleading, since the data was not yet through QA and available for presentation. That happened the following day, then it took another ten days before the other sample passed QA, and no email was sent. I finally sent them a status query. Then magically my results appeared the same day.

    Amazingly, the easiest way I found for getting in touch with customer service requires leaving their site and performing a web search for 23andMe customer service. That’s yet another measure of how bad their site is, but they do not yet seem the least embarrassed about it.

    One box on the site requests information from the client in the guise of quick questions, but offers no explanation of how much information is needed or what it will be used for. I abandoned one dialog after it had proceeded for several minutes (not my definition of quick), and on a subsequent visit it seemed to revert to question 1 as if it had forgotten all that I had entered. Again, status feedback would help.

    Ultimately, we early donors of genotype/phenotype information may hope to gain some reward from the companies we assist with our data. We are after all a crowd-funding source, supplying both fees and valuable data. To make us more willing to supply information, perhaps DTC companies such as 23andMe would consider giving us each a proportionally small piece of the company. Coupons for additional services, or a share of stock, would be a more meaningful reward than a fancy chart saying, for example, we share a heritage with 95% of other Europeans. We might even be tempted to pay a little more up front.

    Additional testing capability would be welcomed, perhaps Y-chromosome SNPs. FTDNA has done extensive Y-chromosome mapping through their Big-Y tests. Perhaps it would be possible for 23andMe to offer a set of SNP tests specific to the customer’s Y-DNA haplogroup, based on SNPs discovered by Big-Y. Reward coupons could be used to obtain such results.

    The Promethease software provider puts a button on their page that offers the best version of their app for an incentive fee of $2. Else, they warn, the user will receive a sabotaged version that increases the run processing time by two orders of magnitude. (Note to provider: Perhaps it’s more productive to put an unconditional Donate button on your site and then provide the user with the best experience you can offer.) The Mac version of the software is several iterations behind other platforms and the authors note no plan for more Mac updates. Further compromising the user experience, the final HTML report from Promethease has embedded Google ads.

    Since Promethease otherwise does a workmanlike job, we use the software, remove the ads, ignore the snark, and donate nada, thus compensating ourselves for the insult.


    What is a genome?

    An organism's complete set of DNA is called its genome. Virtually every single cell in the body contains a complete copy of the approximately 3 billion DNA base pairs, or letters, that make up the human genome.

    With its four-letter language, DNA contains the information needed to build the entire human body. A gene traditionally refers to the unit of DNA that carries the instructions for making a specific protein or set of proteins. Each of the estimated 20,000 to 25,000 genes in the human genome codes for an average of three proteins.

    Located on 23 pairs of chromosomes packed into the nucleus of a human cell, genes direct the production of proteins with the assistance of enzymes and messenger molecules. Specifically, an enzyme copies the information in a gene's DNA into a molecule called messenger ribonucleic acid (mRNA). The mRNA travels out of the nucleus and into the cell's cytoplasm, where the mRNA is read by a tiny molecular machine called a ribosome, and the information is used to link together small molecules called amino acids in the right order to form a specific protein.

    Proteins make up body structures like organs and tissue, as well as control chemical reactions and carry signals between cells. If a cell's DNA is mutated, an abnormal protein may be produced, which can disrupt the body's usual processes and lead to a disease such as cancer.
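    The two steps described above (copying a gene into mRNA, then reading the mRNA three letters at a time to chain amino acids together) can be sketched in a few lines of Python. This is a toy illustration, not a bioinformatics tool: the function names are made up for this example, the codon table is only a small part of the real 64-codon genetic code, and the input is assumed to be the coding strand, so transcription reduces to replacing T with U.

```python
# Partial codon table for illustration only; the real genetic code has 64 codons.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read the mRNA one codon (three letters) at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError for codons not in the toy table
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


mrna = transcribe("ATGTTTGGCTAA")
print(mrna)             # AUGUUUGGCUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly']
```

    A single-letter change in the DNA string can change one codon, and with it one amino acid, which is the sense in which a mutation may yield an abnormal protein.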


    It admits its results are "statistical estimates" January 11, 2020 1:24 PM

    These SNPs can differ by population types. Some subsets of a population type might suffer a disease and appear to have SNPs that others do not. So-called GWAS studies (discussed below) work at associating SNP distributions with diseases or other phenotypic traits.

    You send some spit to 23andme, they tell you who you are likely related to and what diseases or other physical attributes you may have, based on patterns of these SNPs.

    23andme uses Illumina chips for figuring out what genotypic variants you have in the spit sample you send to them. Illumina is a company that manufactures "chips" that react to the presence or absence of markers in the DNA sample you send to 23andme. Each chip in this platform calls different sets of markers.

    These markers often have identifiers associated with them, each called an rsID.

    They also map variants that do not use rsIDs, and instead use some other internal identifier.

    Whatever the identifier, GWAS — Genome-Wide Association Studies — are research projects that analyze populations of people in different ethnicities or other cohorts, looking for over- or under-presence of various markers, compared with the larger human population.

    For instance, here is such a GWAS study that is used by 23andme to indicate risk factors for Parkinson's disease, in sample donors who have these variants. There are other such studies for schizophrenia, for autoimmune diseases, and so on.

    The power of these studies to make a positive, correct association depends on various factors, but the product 23andme sells is ultimately a packaged summary of a lot of work by researchers around the world.

    Getting to the substance of the post, these same markers — SNPs — are used by 23andme to guess at your ancestry. And that's the problem when getting samples from identical twins, because a unique SNP pattern imprints after the fertilized egg splits:

    They sucked his brains out!: The real issue is consumer education, to know what you are buying (and, likewise, what you are not buying).

    One thing they discuss more in the video than in the article is the tone of advertisements. What they say they're selling in the ads and what they say they're selling in the fine print are two different things.
    posted by clawsoon at 2:15 PM on January 11, 2020 [12 favorites]

    I have two thoughts -- send in your spit to the same company several different times to see how varied the results are.

    Also, there are police all across this country using genetic information from crime scenes to compare to genetic information on genealogical databases to try to find "relatives" of unidentified suspects in order to track them down.

    I had thought the whole gene testing thing was more accurate than this, but apparently it's largely guesswork beyond just getting the chain of base pairs. Not quite living as much in the future as I thought.
    posted by hippybear at 2:23 PM on January 11, 2020 [1 favorite]

    Every time I think about these genetic ancestry tests I think of my friend Peter Cho, who had quite the life changing experience, twice, with what 23andMe was telling him. It's not complete garbage, but it's close, and it is misleading.

    Thank you for the detailed technical explanation of what's going on, They sucked his brains out. But you make it sound like it's somehow the consumer's fault that this product that's advertised to tell people their genetic ancestry does not, in fact, tell people their genetic ancestry. There's a lot of problems here but the casual consumer is not the one to blame.

    I cancelled my own 23AndMe account recently, on the back of yet more news that your genetic data is not private in an American company's database. If you're curious, here are my notes about how to archive and delete your record. Note you can never truly delete your data; copies will continue to exist for at least a decade, out of your control.

    Between the threat to privacy, the dubious original value, and the apparent inconsistency of the results I think these home genetics tests should not be sold. SNPs and genetic science are fascinating, but the way it gets packaged into a $100/yr service is terrible.
    posted by Nelson at 2:24 PM on January 11, 2020 [19 favorites]

    They sucked his brains out!: Each twin has a unique SNP pattern. So this result is not so surprising, if you know the technical details of what product these companies sell.

    If the tiny differences between SNPs in twins leads 23andme to conclude that one twin has ancestors from Ireland and none from England, while the other twin has ancestors from England and none from Ireland, would it be fair to guess that there's some overfitting going on in 23andme's ancestry algorithms?
    posted by clawsoon at 2:24 PM on January 11, 2020 [13 favorites]

    hippybear: Also, there are police all across this country using genetic information from crime scenes to compare to genetic information on genealogical databases to try to find "relatives" of unidentified suspects in order to track them down.

    As I understand it, the ability to connect close relatives using DNA is much, much stronger than the ability to trace ethnic ancestry. With close relatives, it's, "Do 25% or 50% of your SNPs match?", while with ancestry it's more, "Can we pick up a signal 1% or 2% above the background noise?"
    posted by clawsoon at 2:30 PM on January 11, 2020 [5 favorites]

    What they say they're selling in the ads and what they say they're selling in the fine print are two different things.

    So, business as usual. Always read the fine print. (I don't want to defend these companies but rather to condemn the system.)
    posted by hat_eater at 2:53 PM on January 11, 2020 [3 favorites]

    This question of "picking signal out of noise" and the adjacent problem of over-fitting is really worrying me in several fields all at once.

    Here, with 23andMe, it's all the reports issued to consumers who don't understand how uncertain they are. In the adjacent field of archeology there's a lot of amazing data being done correlating a very thin historical DNA record (from fossils) to modern population. The science as summarized in David Reich's book sounds amazing. It also sounds incredibly tenuous, reliant on the most delicate of statistical methods. How many of their results will turn out to be incorrect?

    More broadly in the sciences we have replication crisis, often brought about by mis-application of p-testing and other statistical measures that are correct but easy to misuse. And then broadly speaking we've got the entire computer industry in a rush to fit machine learning models to datasets to make predictions like "I recognize that face" or "I know what ad you want to see next" or "this citizen is likely to become a social criminal". Nothing wrong with machine learning per se but it's very, very easy to screw it up and overfit (or underfit) the data and build a bad predictor with errors either gross or subtle.

    It feels to me like the science of statistics needs to come up with some simple post-hoc tests that can be applied to the output of subtle statistical work to test it for plausibility. "Do identical twins test for identical genetic ancestry?" is one such test (albeit not statistical). Benford's law is another.
    posted by Nelson at 2:53 PM on January 11, 2020 [14 favorites]
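    As a rough illustration of the kind of post-hoc plausibility check mentioned in the comment above, here is a minimal Python sketch of a Benford's-law comparison. Benford's law says the leading digit d of many naturally occurring numbers appears with probability log10(1 + 1/d). The function names are invented for this example, and the powers of two are just a convenient data set known to follow the law closely; nothing here comes from 23andMe.

```python
import math
from collections import Counter


def benford_expected(digit: int) -> float:
    """Expected share of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit_shares(values):
    """Observed share of each leading digit among the positive values."""
    # Scientific notation puts the leading digit first, e.g. 1.234000e+02.
    digits = [int(f"{v:e}"[0]) for v in values if v > 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


powers = [2 ** n for n in range(1, 101)]
observed = leading_digit_shares(powers)
for d in range(1, 10):
    print(d, round(observed[d], 2), round(benford_expected(d), 2))
```

    A large gap between the observed and expected columns does not prove anything is wrong, but it is a cheap signal that a data set deserves a closer look.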

    Also, there are police all across this country using genetic information from crime scenes to compare to genetic information on genealogical databases to try to find "relatives" of unidentified suspects in order to track them down.

    While its use in criminal cases is concerning, forensic genealogy has proven to be a godsend for cases like this.
    posted by Fukiyama at 2:59 PM on January 11, 2020 [1 favorite]

    Adoptees have found that both 23andMe and Ancestry results are crucial - CRUCIAL - to resolving issues of personal genetic birth identity. As an adoptee, I *implore* you to test at both companies and keep your results public.

    Analyses which cast aspersions on consumer genetic testing often both misunderstand and misinterpret the product that these companies are selling and contribute to the erasure of the human rights of many adoptees. Please take this into consideration both before and after you participate in consumer DNA testing. The tests are accurate enough to materially contribute to the resolution of impeded questions of birth identity for many of us.
    posted by mwhybark at 6:05 PM on January 11, 2020 [8 favorites]

    Sorry - it may be tough not to know about your birth parents - but there is no way that I am ever letting any commercial business near my DNA. I do not trust them to do any analysis properly; I do not trust them to use the analysis properly; I do not trust them - full stop.

    I don't like having my photo taken; I don't like giving my date of birth; I incinerate all of my personal papers and financial documents; I don't let my passport out of my physical possession. Maybe I am just being paranoic - or maybe it is the result of working with technology for more than 40 years. I cannot conceive of any way that I would voluntarily provide my DNA.
    posted by Barbara Spitzer at 6:38 PM on January 11, 2020 [24 favorites]

    forensic genealogy has proven to be a godsend for cases like this.

    I'm failing to grasp the "godsend" angle there. We're all supposed to submit to a national government database because of one oddball case a decade ago of a person who apparently wanted to remain anonymous?
    posted by JackFlash at 6:58 PM on January 11, 2020 [6 favorites]

    As an adoptee, I *implore* you to test at both companies and keep your results public.

    Yes, but for someone on the other side of the adoption that may also be a personal catastrophe. You may one day log in and find you have a half brother in your DNA relatives. No explanation or context for that, just surprise! Hope the revelation of a long-held secret doesn't destroy your family!

    I personally lived this experience, with one important caveat. I had set my profile to private, so 23AndMe didn't volunteer the unknown half-brother. He found me via other means and then we confirmed the connection via 23AndMe. I'm grateful for that experience and am glad to know him. But if our mother were still alive it would have been much more complicated. Triply so if it had shown up as a surprise one day on a fucking web form.

    I think we humans aren't really prepared for impersonal genetic reality vs. family stories. This must be a regular and tragic story, given estimates that 2-15% of people's fathers aren't who they think they are. (Interestingly, the rate varies by culture.) I'm not sure 23AndMe is a good way to learn something as potentially family-wrecking as that your father isn't the man whose genes you inherited.
    posted by Nelson at 7:34 PM on January 11, 2020 [8 favorites]

    If the tiny differences between SNPs in twins leads 23andme to conclude that one twin has ancestors from Ireland and none from England, while the other twin has ancestors from England and none from Ireland, would it be fair to guess that there's some overfitting going on in 23andme's ancestry algorithms?

    But that's not really what they're saying. The absence of evidence is not evidence of absence: the fact that for one twin, the detected SNPs didn't provide sufficiently robust evidence to resolve English ancestry, while for the other twin they did, doesn't mean 23andMe is saying the first twin doesn't have English ancestry. It just means they couldn't resolve that.

    I don't know about other testing agencies, but I've done 23andMe, and they're fairly careful about how they talk about their ancestry results. There's also quite a bit of literature they provide to help you interpret them that I suspect most people skip without reading. I was surprised when I read the article because it opened by making it sound like the two twins tested had completely different results, but when they then presented their testing results, they looked to me to be perfectly compatible. I think a lot of it has to do with the way people interpret terms like "broadly European".

    From these sentences, I think people are reading "broadly European" as something other than French and German ancestry. I suppose this is because 23andMe (and other testing agencies, I guess) report ancestry percentages that sum to 100%, giving the impression that if one twin has less "broadly European" and more "French and German" ancestry, they're getting very different results. But that's not really how it works. It's more like nesting boxes. Any genetic markers that are French and German are also "broadly European," but 23andMe resolves them to the most specific region that it can give with some level of statistical confidence. As CheeseDigestsAll says, 23andMe lets you see your results with varying levels of statistical confidence, which is kind of nice. At the 50% level, it may give you quite specific regions for your ancestry results, but all it's saying is that these are more likely than not. If you ask for higher confidence, that specificity is traded for generality, and only markers that are very closely linked with certain ancestral regions will count towards specific areas, with the rest getting lumped into increasingly generic regions like "Western European" or "Broadly European".

    I don't know exactly how 23andMe's algorithms work, of course, but from looking through the information on their site it's pretty clear to me that they don't just look at which SNPs you have to calculate your ancestry: they also look at how those SNPs co-occur in your chromosomes. When people produce sperm and eggs, genetic information from their paired sets of chromosomes is scrambled together. But this doesn't happen with perfectly uniform randomness. Different parts of the chromosome undergo "crossing over" (scrambling) at different rates, and the closer together two SNPs are on a chromosome, the lower the probability that they will be separated from each other from one generation to the next. I don't know, but I suspect that 23andMe uses this as an additional source of information to try to assign genetic ancestry to specific geographical regions.

    A consequence of this is that you might have two SNPs that individually provide only evidence for "broadly European" ancestry, but if they co-occur on the same genome might provide relatively good evidence for, say, Irish ancestry. Because of the way 23andMe reports its results, this means that the evidence from these SNPs is allocated from the "broadly European" percentage and into the "Irish" percentage. However, this means that estimates for more specific geographical regions are much more sensitive to small differences in the detected SNPs. If either one of these two SNPs is missing in the test results, then the "Irish" evidence provided by both SNPs vanishes. Conversely, the more general "broadly European" category is relatively insensitive to small differences in the detected SNPs, so will easily "soak up" noise in the genetic signal. And in fact if 23andMe looks at co-occurrences not just pairwise but in larger combinations, the sensitivity could be extremely high: intuitively I think it would be on the order of N! but I haven't done the math.

    So to me it's not at all surprising that when the two twins have SNP panels that differ by only 0.4%, you can see large differences in the more specific regional assignments. But what you're not seeing is radical differences in the overall regional map, only differences in specificity between the two. Honestly I'm kind of surprised that the computational biologist that CBC spoke to thinks that this is surprising. Maybe my intuitions are radically off, but I don't think so.
    posted by biogeo at 9:58 PM on January 11, 2020 [11 favorites]
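    The "nesting boxes" reading in the comment above can be sketched in Python. This is only an illustration of the idea that each stretch of DNA is reported at the most specific region that clears a chosen confidence level. The hierarchy, the confidence numbers, and the function names are all invented for the example, and nothing here is claimed to match 23andMe's actual algorithm.

```python
from collections import Counter

# Hypothetical hierarchy, most specific first. The labels and numbers are
# invented for illustration; they are not 23andMe's reference populations.
LEVELS = ["Irish", "Northwestern European", "Broadly European"]


def assign_label(confidences: dict[str, float], threshold: float) -> str:
    """Pick the most specific label whose confidence clears the threshold."""
    for label in LEVELS:
        if confidences.get(label, 0.0) >= threshold:
            return label
    return "Unassigned"


def ancestry_breakdown(segments, threshold):
    """Percentage of equally sized segments assigned to each label."""
    counts = Counter(assign_label(seg, threshold) for seg in segments)
    return {label: 100 * n / len(segments) for label, n in counts.items()}


segments = [
    {"Irish": 0.85, "Northwestern European": 0.95, "Broadly European": 0.99},
    {"Irish": 0.55, "Northwestern European": 0.80, "Broadly European": 0.97},
    {"Irish": 0.30, "Northwestern European": 0.60, "Broadly European": 0.92},
    {"Irish": 0.10, "Northwestern European": 0.40, "Broadly European": 0.75},
]
print(ancestry_breakdown(segments, 0.5))
print(ancestry_breakdown(segments, 0.9))
```

    With the same toy segments, the 50% threshold reports half of them as "Irish", while the 90% threshold reports none as "Irish" and pushes them into the broader boxes. The two reports look different but are compatible, which is the point made above about twins whose results differ mainly in specificity.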

    You may one day log in and find you have a half brother in your DNA relatives.

    Half-sister, or sister in my preferred usage, but yes, that is the desired outcome. Nelson, we’ve corresponded cordially in the past, I’ll take that up on Sunday, with love and good intent, and with respect to your experience.
    posted by mwhybark at 1:19 AM on January 12, 2020 [1 favorite]

    Do all/any of the companies give you access to their findings at a data level, or just to their conclusions? It seems like it would be more useful to do test-retest or just compare the overall result between companies, to test their accuracy at least.

    Then the question of who has the best algorithm is more of business decision?
    posted by fizban at 5:41 AM on January 12, 2020 [1 favorite]

    Do all/any of the companies give you access to their findings at a data level,

    23andMe lets you download a spreadsheet with a list of chromosome segments and how they're classified at each confidence level.
    posted by CheeseDigestsAll at 7:05 AM on January 12, 2020
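    For the test-retest idea raised above, a comparison at the data level could look something like the following Python sketch. It assumes a raw genotype export that is tab-separated with four columns (marker ID, chromosome, position, genotype), "#" comment lines, and "--" for markers that could not be called. That layout is an assumption to check against whatever file a company actually provides, and the function names and toy data are invented for this example.

```python
def parse_raw_genotypes(text: str) -> dict[str, str]:
    """Map marker ID to genotype, skipping comments and no-calls."""
    calls = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        marker, _chrom, _pos, genotype = line.split("\t")[:4]
        if "-" not in genotype:  # "--" is assumed to mean no call
            calls[marker] = "".join(sorted(genotype))  # treat AG and GA alike
    return calls


def concordance(a: dict[str, str], b: dict[str, str]) -> float:
    """Fraction of markers called in both samples with the same genotype."""
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("no markers called in both samples")
    return sum(a[m] == b[m] for m in shared) / len(shared)


# Toy data standing in for two runs of the same person's sample.
run_1 = "# toy data\nrs1\t1\t100\tAG\nrs2\t1\t200\tCC\nrs3\t2\t300\tTT\nrs4\t2\t400\t--\n"
run_2 = "# toy data\nrs1\t1\t100\tGA\nrs2\t1\t200\tCT\nrs3\t2\t300\tTT\nrs4\t2\t400\tAA\n"
print(concordance(parse_raw_genotypes(run_1), parse_raw_genotypes(run_2)))
```

    Comparing raw calls this way separates two questions that the thread tends to mix: how repeatable the lab measurement is, and how stable the ancestry interpretation built on top of it is.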

    This is relevant to my interests. I got an eighth of Ashkenazi Jewish lineage from my mom and a quarter from my dad, coming to, you know, three-eighths, or 37.5 percent. Ancestry and 23andme have at varying times told me I'm 33 to 41 percent Ashkenazi. It's never been a surprise to me that it's an estimate, but I did have to learn to squint a bit at these numbers. I can, however, understand why plenty of people misinterpret the results, taking them at face value and expressing shock when they're not entirely accurate. Perhaps more warning labels are needed.

    The real strength of one's own DNA testing is in genealogy, not ethnic guessing. Sometimes the latter helps (I have a couple drops of blood from Scandinavian ancestors?! Oh, that was from the fun and games in, oh, the sixteenth century among the German bunch) but the distinct value for the individual user is in finding and correlating/confirming family tree matches.
    posted by Jubal Kessler at 7:54 AM on January 12, 2020 [1 favorite]

    Jubal Kessler: Perhaps more warning labels are needed.

    I wonder if something as simple as percentage ranges would help. Like, why not just tell you "33%-41%", instead of "33%" or "41%"?
    posted by clawsoon at 8:35 AM on January 12, 2020
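    The arithmetic in the two comments above is small enough to write out. The Python sketch below uses only the numbers quoted there (an eighth plus a quarter, and reported values of 33 and 41 percent); the helper name is made up. Note that the pedigree figure is itself an expectation, since the share of DNA actually inherited from each ancestor varies somewhat around the textbook fraction.

```python
from fractions import Fraction

expected = Fraction(1, 8) + Fraction(1, 4)  # pedigree expectation: 3/8
reported = [33, 41]                         # percent, as quoted in the thread


def as_range(estimates):
    """Format a set of percentage estimates as a low-high range."""
    return f"{min(estimates)}%-{max(estimates)}%"


print(float(expected) * 100)  # 37.5
print(as_range(reported))     # 33%-41%
print(min(reported) <= expected * 100 <= max(reported))  # True
```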

    hippybear: Also, there are police all across this country using genetic information from crime scenes to compare to genetic information on genealogical databases to try to find "relatives" of unidentified suspects in order to track them down.

    As I understand it, the ability to connect close relatives using DNA is much, much stronger than the ability to trace ethnic ancestry. With close relatives, it's, "Do 25% or 50% of your SNPs match?", while with ancestry it's more, "Can we pick up a signal 1% or 2% above the background noise?"

    To build on clawsoon's response to hippybear, US police use the CODIS system, which looks at short tandem repeats (STRs), not SNPs. This is the same type of test used for paternity cases, and is handy for building family trees since STRs follow standard rules of heritability (i.e. every individual will get one copy at random from one parent, and another copy at random from the other). SNPs are heritable, obviously, but they are also more likely to mutate in individuals (a one base pair change is easier to "slip through" DNA replication proofreading). There are commercial DNA tests using STRs instead of SNP-chips, such as Family Tree DNA. Not endorsing them, just mentioning that the market has alternatives. I work at an academic molecular biology lab that used to do the lab work for Family Tree years ago but no longer does. I have no pony in this race, but if anyone wants to get a broad genetic profile, I recommend volunteering for Genes For Good. Answer a bunch of health questions, send in your spit, and get like half a million SNPs for your own enjoyment.
    posted by lizjohn at 10:15 AM on January 12, 2020 [6 favorites]

    The fact that 23AndMe used my data to help develop a drug doesn't bother me at all. In fact, that was one of their marketing pitches for why you should sign up. "Get lots of personalized reports. Also, help science by contributing your anonymized data to drug research." They've always been transparent about the fact that was part of their business plan.

    What bothered me enough about 23AndMe to delete my account was that a judge decided that a warrant was sufficient to force a company like 23AndMe to share genetic data, even if both they and the genetic donors explicitly don't want to participate in law enforcement searches. 23AndMe isn't the bad guy in this story, they've always tried to keep users' data private from law enforcement, but apparently now they legally can't. Why would I voluntarily join a database that could later help police identify me or some distant relative of mine as involved in a crime? Sure I want to catch the bad guys too, but in no way do I trust American police to do the science correctly nor to be reasonable about what they try to pull a warrant for.
    posted by Nelson at 7:38 AM on January 13, 2020 [5 favorites]

    Yes, but for someone on the other side of the adoption that may also be a personal catastrophe. You may one day log in and find you have a half brother in your DNA relatives. No explanation or context for that, just surprise! Hope the revelation of a long-held secret doesn't destroy your family!

    I don't really get this as an argument against DNA testing. It seems to me that it's an argument against cheating on your spouse, and/or keeping long term secrets from them. "I have a child from a previous marriage/relationship" seems like the sort of thing that /should/ come up.

    It seems like the sort of thing that happens when someone tells their friend that their significant other is cheating on them, and then they get blamed for ruining the relationship. The friend did not ruin the relationship, the cheating partner did. Likewise: DNA reporting isn't ruining your family. Your family is ruining your family.
    posted by Zudz at 10:06 AM on January 13, 2020

    Also, help science by contributing your anonymized data to drug research

    Anonymized data can be used to identify you and family members (cite, cite, cite). Gaps in genomic privacy laws have real consequences.
    posted by They sucked his brains out! at 5:24 PM on January 13, 2020 [1 favorite]

    I know things have come quite a way in the 20+ years since Richard Lewontin's NYRB columns, but the questions he was asking and critiques he offered of genetic testing remain pretty interesting, and quite germane to this conversation.

    Does anyone know of a more contemporary evolutionary biologist with that degree of informed macro-level analysis? I feel like everything I've read about 23andMe & co. tends to either be fairly surface-level like the linked piece, or focused on the legal issues rather than the scientific. I have ready access to the purely scientific stuff, but what I'm really looking for is something more like a literature review for the informed layperson.

    (Happy to take this to ask if that's more appropriate).
    posted by aspersioncast at 11:09 AM on January 14, 2020

    > "I have a child from a previous marriage/relationship" seems like the sort of thing that /should/ come up.

    Marriage / relationship / rape / who knows what. There are all kinds of reasons why someone might have had a child, not raised it, and not told future partners.
    posted by The corpse in the library at 12:55 PM on January 15, 2020 [1 favorite]

    I get what you're saying, mwhybark, but even though I've always wanted to know what my actual background is, I still would never do one of these things. I simply don't trust businesses with information like this. I did do a genetic test when I was diagnosed with cancer two years ago, because my surgeon wanted to know if there was a genetic component to my twin sister's death from ovarian cancer and if so, whether we should look at taking out my ovaries while I was in surgery already. My sister and I were adopted together because it was a private adoption, and back then they would have split us up at an agency, so I have a lot of complicated feelings about this issue.

    And in my state, they notify a big research center when you have cancer, and they contact you for participation in a study, which I did, although I was pretty annoyed when I found out they were potentially selling cells to pharmaceutical companies, because I trust them not at all and it infuriates me that they profit off of people that way. So my information is out there, with companies that swear everything I gave them is private, but I have little faith in that when the guvmint comes knocking. Yet I still am not interested in giving my information to a for-profit company that sells these goofy background stories.

    Would I like to know what my background really is? Sure. What little I know I can't confirm with anyone; as I understand it, my birth mother is deceased and my parents are both gone, and from what little my parents knew about him, bio-dad was a piece of shit, so that's not a relationship I'd ever want even if he was alive. It's just... really complicated, and as much as I would like to know if I really am Welsh and Scandahoovian of some sort, it doesn't sound like they'd be able to tell me in any useful way anything that's really true. I don't know. I just am not sold on letting this stuff out there for anyone to do what they want with it.
    posted by kitten kaboodle at 12:31 PM on January 16, 2020 [3 favorites]

    I've been corresponding with Nelson about this and it has helped me to clarify what I am trying to say above.

    First, the top-line article, calling into question the scientific accuracy of the genetically-derived ethnicity data that is reported by these sites to the user, is... not wrong, exactly, but misleading. What's particularly at issue with this specific generic style of piece is that it plays toward users' mistaken assumptions about how the consumer sites develop their ethnicity estimates. These estimates change *all the time* as more data is incorporated into the models they use.

    Secondly, the tests are not admissible evidence in court for a whole bunch of reasons, the most important of which is that in order to reduce the consumer cost, the tests are non-comprehensive. Also, while presumably conducted under reasonable standards of lab cleanliness, the submitted sample can be problematic due to a variety of possible issues, anything from accidental cross-contamination by the submitters, to issues stemming from certain transplants, to lab techs accidentally swapping results.

    Nelson's concerns about the use of consumer DNA samples by investigative agencies are worthwhile, I think. The primary popularizer of this use is a person called CeCe Moore, who is a reality-show host and who is the lead mod for a large Facebook group called DNA Detectives, as is her show. By far the majority of the users in the FB group, which has 127319 members, are people attempting to assess the accuracy of the consumer DNA ethnicity estimates, and there are a lot of confused people there. After that group, there is a large contingent of people trying to use consumer DNA data to answer questions regarding their immediate biological families - people like me, as an adoptee, but also people whose parents split up, or whose parent did not disclose their partner, or people whose parent misrepresented their paternity. This last group, people that were conceived via what in the group is known as an NPE, an abbreviation for a sort of clumsy term, "Non-Paternity Event," is larger than any of the other groups, based on the posts I see as they scroll by.

    As we move gradually toward a time in which the majority of states in the US have repealed the closed-records status quo for adoptees in closed adoptions, using DNA triangulation to identify one's biological family is an important tool for adoptees who are reluctant to contact their birth families. In many closed-record states, the onus is on the adoptee to initiate the request for contact and the people who hold the power in that relationship are the courts, the involved parties, usually an agency but by no means always, the relinquishing mother, that mother's inheritors of rights should she be unavailable to grant permission, and in some cases the named father on the original birth certificate. This information is not always recorded in the original, and it is not always recorded accurately.

    Placing adoptees in the situation of literally having to ask for information which is ours, which fundamentally constitutes who we are, reduces adoptees to second-class citizenship. It does so by permanently consigning us to a juvenile status, one in which we literally have to ask our mother for permission, one in which the court retains custody of our birth identity, one in which we are denied full access to participate in the life of the nation as independent adults. I think donor-conceived people also have a right to full information about the circumstances of their conception, and that access to that information should not be contingent on the donor seeking contact, for essentially the same reasons. Any tool we have that can be used to get around these restrictions is one that I think is important to use.

    In the case of people whose paternal and/or maternal origins are simply unknown, due to circumstance or catastrophe, the same potential benefit applies: accessible consumer DNA records provide at least the possibility of identifying one's parents without the requirement of requesting contact.

    Requesting contact is hugely fraught. It implies directly the possibility that an adoptee will, in effect, be rejected for a second time by our mothers. We are told all our lives that we have not been rejected by our mothers, that the decision to relinquish was not easy, that it was for the best, and so forth. Some of that is undoubtedly true in many cases, but even if the relinquishment was involuntary or coerced, we still experience it as a rejection. It is different than being orphaned, although one supposes that losing one's mother to suicide could feel quite similar. Thus, many adoptees are fearful of initiating contact, and, when contact is denied, it can be catastrophic emotionally for the adoptee.

    Consumer DNA testing with open records access has been extremely useful to me, post-reunion, in that it allowed me to independently confirm each of the facts of my conception and both sides of my natal families. I am engaged in developing my familial relationship with my maternal family and it is going well. I had no idea that my mother and her family would interpret my request for contact as a request to be integrated into the family and I was deeply taken aback by their expectations at first, but determined that approaching this with openness and an assumption of goodwill was the only sensible course of action, and I am glad I did so.

    Finally, apologies and thanks to Nelson for indulging my correspondence. I said a whole bunch of this to him in a couple of emails, less concisely, and with many eddies and didoes in my prose. I very much enjoy our conversations and am indebted to him.
    posted by mwhybark at 1:32 PM on January 16, 2020 [2 favorites]

    Someone in my FB feed posted this 2018 Atlantic story about DNA testing revealing unexpected paternity results and how Facebook has produced self-organized support groups to help people deal with the fallout.

    The story looks at how challenging the discovery can be for the person who has had their identified parentage suddenly change, how threatening and unwelcome the news can be seen as from both sides, and how the group helps their membership overcome these concerns.

    (In locating a clean link for this story, I happened to notice that The Atlantic appears to have a long-standing editorial interest in covering the larger story of consumer DNA testing. This was one of at least ten stories covering aspects of this published since 2017.)
    posted by mwhybark at 11:49 AM on January 18, 2020 [1 favorite]