Cohen comments on genealogy big data project

New York Times article reports on "world's largest family tree" research

Writing in Science in March 2018, Yaniv Erlich and collaborators discuss their "Quantitative analysis of population-scale family trees with millions of relatives." The project involved compiling and validating 86 million public profiles, from which the researchers generated 5 million family trees.

Reporting on the project in the New York Times, Steph Yin drew on insights from Faculty Associate Philip Cohen. “It’s very impressive as a data collection and harmonization effort, and of course they have only scratched the surface of what it might have to offer,” said Dr. Cohen, who was not involved in the research. He pointed out that such data sources have selection biases. “It’s like reading a Jane Austen novel — it gives you lots of great insights, but you have to be careful about generalizing to society as a whole,” he noted.

"While these studies have limitations," Yin writes, "they also offer opportunities. Biomedical research that marries family trees with genetic and health information could lead to new discoveries about heritable disease risk. In social sciences, genealogies merged with census and tax records could yield findings on things like inequality."

“As soon as you start linking these things, your analytical power goes way up — as do privacy risks,” Dr. Cohen said.

With any of these efforts, he added, it’s important to insist “that the norms of science, as far as transparency and openness, still apply.”

See the article in Science

See the New York Times report