Will Big Data DNA analysis herald new era in medicine?

Computer graphic of a DNA autoradiogram and a human head.

Often, you are barely aware of it, but hop on a train, spend some time in the shops, watch a movie or a match or visit your GP, and the chances are you will have contributed to half a dozen collections of mass data.

Government and companies now collect, store and analyse as much information as they can about the way we interact with them.

Their goal is the pursuit of efficiency, and to find ways to save, or make, money. There is even a phrase for it - "big data".

The idea is not just to collect this data, but to analyse it.

Take healthcare. In December 2012, the government announced a big data plan for perhaps our most intimate data: the DNA read-out of 100,000 people with rare diseases and cancer.


  • Big data involves the gathering and analysis of data on a large scale
  • The data can come from our purchase records, digital photos, social media posts, mobile phone GPS signals etc
  • Companies can use it to help predict who is facing a divorce, planning a baby, looking to move house or change jobs
  • Supermarkets use big data to send money-off vouchers and offers for products they know their customers will like
  • In March 2012 the White House set up a $200m Big Data Research and Development Initiative to explore how it could help address problems facing the government
  • US police departments use big data to predict crime hot spots and deploy officers before crimes happen

It is a colossal sequencing effort. Not only does each patient have a unique DNA code, but so do their cancer tumours. And some patients will respond to certain drugs better than others, depending on the genetic variants they carry.

The claim is that a mass DNA database could herald a new era in medicine, and make the nation richer too. Aside from highlighting British innovation and attracting investment, the initial focus is to help people who are already sick.

For the rest of us, the argument goes, if enough people are on the database, trends will become clear.

So we could be more confident that our personal DNA read-out can be checked against those trends. It might warn us that we are more at risk of certain diseases, so that we can do something about it, like changing our lifestyle or getting screened.

We might also be able to avoid drugs known to be toxic in people who carry a similar genetic make-up to our own.

Prof Sir John Bell is one of the government-appointed "champions" for the Life Sciences industry, and chair of the government's Human Genomics Strategy Group. He sees genomics in the NHS as a vital tool and said it is quite a "dramatic change in the way that medicine is likely to evolve".

A graphic of DNA. The structure of DNA was discovered in 1953

The big data at the heart of this is the DNA double-helix.

It is made of four chemicals - essentially a code with four letters. The string of letters that spells out a human being is huge - it took about eight years, and cost billions of dollars, to unravel the first human genome.

But now, the computer technology that made that possible is far more powerful, and cheaper.

These days, it takes a little over a day to unravel the DNA sequence of a single individual. And though it is not yet possible, there is talk of a £60 price tag.

Aside from cheaper, more powerful technology, it is also scale that brings the real power.

If the plan takes off, then the sheer numbers of patients involved will allow researchers, both public and private, to ask all sorts of questions of the dataset.

The NHS already has big data projects in place, notably a system that enables scientists to carry out research on our clinical information, once anonymised, and smaller-scale genetics research databases such as UK Biobank. What is new is the idea of bringing all of this together.

Genetic testing A national gene database might aid epidemiology

"The great thing about the UK, and particularly the English NHS, is 50 million people and it's at that scale that you're probably going to have the power to detect all kinds of things that are very powerful in terms of the management of disease, and have quite a profound impact," said Prof Bell.

He said he does not stand to make any money from the project himself, though he told us he sits on the board of Roche and Genentech, pharmaceutical companies which may benefit from genomics being applied more widely in healthcare in the UK.

There is some agreement that having genetic information from somebody who is already sick can help to find the best treatment for them.

What is less clear is how much the entire genetic read-out of a healthy person can tell you about the illnesses they might get in the future.

Just because someone carries a particular change in their genes, in most cases, it is far from definite that they will go on to become ill.

We are into the realm of probability and risk, which are notoriously difficult to assess and convey.

Identification by data

There is also the issue of privacy.

Ross Anderson, professor of security engineering at the University of Cambridge, has been asked by the Nuffield Council on Bioethics to join a team that will examine the pros and cons of big data and genomics in the NHS.

The government says the information in the new database will be anonymised. But if it is linked to medical records, that will bring new risks, according to Prof Anderson.

Prof Anderson said medical data is especially hard to protect because it is so rich in information. His primary concern is that individuals, and their data, could be identified by a process of triangulation: "If you look at the typical person's medical record they may have some things that are known to their friends and family, such as that you broke your leg on the 17 January 1991, and some things that you don't want all your friends and family to know, such as that you had a treatment for depression.

"The problem is that if you make de-identified medical records available, then everyone from whom the subject wants privacy knows part of the record - namely the leg break, which is enough to identify that record out of many records - and they can therefore get access to the sensitive information, namely the treatment for depression."
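The triangulation Prof Anderson describes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record layout, the pseudonyms, the dates and the `find_matches` helper are invented and do not describe any real NHS system or dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    pseudonym: str   # hypothetical stand-in for the removed name and NHS number
    events: tuple    # (ISO date, description) pairs; all values are invented


def find_matches(records, known_event):
    """Return every record that contains the publicly known event."""
    return [r for r in records if known_event in r.events]


# Hypothetical de-identified release: no names, only pseudonyms.
records = [
    Record("p-001", (("1991-01-17", "leg fracture"),
                     ("2003-05-02", "treatment for depression"))),
    Record("p-002", (("1991-01-17", "appendectomy"),)),
    Record("p-003", (("1994-08-30", "leg fracture"),)),
]

# What friends and family already know about the subject.
known = ("1991-01-17", "leg fracture")

matches = find_matches(records, known)

# A unique match means the known event singles out one record,
# so everything else in that record is exposed.
exposed = []
if len(matches) == 1:
    exposed = [e for e in matches[0].events if e != known]

print(len(matches), exposed)
```

In this toy dataset, removing names is not enough: a single well-known fact is shared by only one record, so it acts as an identifier for the rest of that record.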

Prof Bell said there are already robust methods in place to protect people's privacy in medical research which rely in part on limiting access to the data to trusted research partners.

"You probably can't get around the issue that no data in any setting is absolutely anonymised and secure," he said.

"But I think the constraints in the system that have already been thought about for other types of clinical data are probably pretty secure."

That is not enough for Prof Anderson, who wants the government to make details public.

"What we actually need is for anonymisation mechanisms to be open to the public, so that we can work out for ourselves whether the protection is adequate.

"I want to be able to test them. I want to be able to kick the tyres, and if the government's lying, I want to expose them, and embarrass them for it."

Bioethicist Stuart Hogarth, of King's College London, said he is not sure people are ready: "The 'grand bargain' that the government is offering us is that if we give them our DNA then they are going to revolutionise healthcare.

"Well it's not clear in fact that we need so much genomic data to understand the genetic basis of health and disease.

"It's not clear that the government has the capacity to put in place the large-scale IT project of the sort that would be necessary to do this, and it's not clear that the British public is willing to accept that bargain."

Watch Susan Watts' full Newsnight report on Big Data and the DNA database




This entry is now closed for comments

  • Comment number 189.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 188.

    While there are undoubted benefits from a DNA database, there is equally the danger that it leads to a situation where the best jobs (or other benefits) go to the people with the perceived "best" genes. It's eugenics by the back door.

  • Comment number 187.

    186. spam spam spam spam
    Trouble is, even now cutbacks are damaging so many "policing" agencies which is the MAIN reason why we have horse burgers, part policing of food has been passed to local councils who do not have resources to police the industry.
    Ironically without DNA testing the horse meat would never have been discovered........

  • Comment number 186.

    Trouble is, even now cutbacks are damaging so many "policing" agencies which is the MAIN reason why we have horse burgers, part policing of food has been passed to local councils who do not have resources to police the industry.

    Food is just one, what about care homes etc.

    CANNOT TRUST establishment whose present policy is to outsource all & sundry

  • Comment number 185.

    "#162 Robintheboywonder. Elaborate on your illegal obtaining of information comment? I'm not sure what you mean? If you're suggesting an insurer would covertly hire a hacking expert..."

    My point is that if an insurer came into possession of DNA data which could influence premiums, they would use it.

    Who needs a hacker?
    Data security is poor in this country - that is a fact.

  • Comment number 184.

    Two words spring to mind. Capita.

    I know, that's one word. But by the time Capita got their hands on it...

  • Comment number 183.

    The paranoia that runs through society makes me giggle.
    Our government have problems controlling illegal immigrants, hiding their expenses from the press.
    Our NHS has difficulty controlling the Influenza virus and can't pass on records.
    Yet people believe that a DNA database will be used for "James Bond" type espionage and a New Global Order.
    Get real people.

  • Comment number 182.

    As someone who works in cancer research I can see how this would help us find out, not only the genes which predispose us to develop disease, but also those that determine whether we respond to particular types of treatment or if we are likely to relapse or not. All the data I use is anonymised; I'm not interested in the individual but the common ground between groups. This would be no different.

  • Comment number 181.

    #162 Robintheboywonder. Elaborate on your illegal obtaining of information comment? I'm not sure what you mean? If you're suggesting an insurer would covertly hire a hacking expert to break the code of redacted DNA files in the hope of tying the info to an applicant, I don't think that is likely. In life & health insurance, premium is key for competition. Premiums have steadily decreased for years

  • Comment number 180.

    #178 Curly hair is hardly unique to Africans. Even if it was a 'genetic bullet' that targeted that gene wouldn't kill them... it would probably just straighten their hair! My wife would kill for that!

    Seriously why go to such crazy lengths when we have so many highly effective weapons already? If you want to misuse DNA science GM a virus & immunise your own people before releasing it.

  • Comment number 179.

    This is how insurance will get round this. "Sign here for £1000 premium or sign here to allow access to your DNA records for £100 premium".
    MI5 will have their own method of unrestricted access, GCHQ will hack in and pass everything on to their masters in USA who in turn will sell on to big business. The Polis will just get a rubber stamped court order.

  • Comment number 178.

    166. Joe: I suspect this MAY be at least a concept, eg; why do Africans have curly hair, where is the preponderance of sickle cell anaemia etc., it's DNA. As a concept, could this not suggest some form of 'genetic bullet'?

  • Comment number 177.

    As someone that has worked on data mining projects for Insurance companies, I can assure you that they are not evil geniuses with incredible secret algorithms that can identify your eye colour from your tone of voice - they're mostly a closed shop of old boy bumbling fools.

  • Comment number 176.

    The NHS could make so much more and better use of IT.

    From telesurgery through to paper-free communications and robotised cleaning and in-hospital transport.

    They didn't get the basics achieved under Labour's big plan, so maybe they ought to do that first, before even considering this idea?

  • Comment number 175.

    "Insurance companies"
    Why should someone whose apparent risk factors indicate probable lower cost to an insurer subsidise someone of probable greater cost? You don't ask to pay the motor insurance of a single 17 year old essex boy, nor to withhold your asl or marital status. Arguing against DNA profiling by insurers is nonsensical unless you expect to profit, in which case it's dishonest.

  • Comment number 174.


    Uh...I didn't say it did. I said the nuts would come out in full force and demand that I be sterilized to save the NHS money. Read what I wrote before attempting to undermine my statement.

  • Comment number 173.

    There seems to be some confusion between coding and non-coding DNA. Only a few % of our DNA does anything (Genes) and our genes are virtually identical in all humans (mutated genes = cancer and worse). Police DNA databases record non-coding DNA which varies between almost all individuals & can be used to identify individuals.

  • Comment number 172.

    I am not impressed to think that people have tabs on everything I am, all my genetics. It seems very daunting, to say the least.

  • Comment number 171.

    To save money, they could integrate the NHS DNA database with the Police DNA database. Christmas would have come early for the boys in blue. Anonymity? A court order to find someone whose DNA matches this would suffice.

  • Comment number 170.

    As long as there are constitutional guarantees against data exploitation I quite like the idea. This is where we are most likely to find cures for cancer, AIDS, and aging.

    Fountain of youth? Yeah, why not?.... Sign me up.


Copyright © 2015 BBC.