Will Big Data DNA analysis herald new era in medicine?

Computer graphic of a DNA autoradiogram and a human head.

Often, you are barely aware of it, but hop on a train, spend some time in the shops, watch a movie or a match or visit your GP, and the chances are you will have contributed to half a dozen collections of mass data.

Government and companies now collect, store and analyse as much information as they can about the way we interact with them.

Their goal is the pursuit of efficiency, and to find ways to save, or make, money. There is even a phrase for it - "big data".

The idea is not just to collect this data, but to analyse it.

Take healthcare. In December 2012, the government announced a big data plan for perhaps the most intimate of our data: the DNA read-out of 100,000 people with rare diseases and cancer.


  • Big data involves the gathering and analysis of data on a large scale
  • The data can come from our purchase records, digital photos, social media posts, mobile phone GPS signals etc
  • Companies can use it to help predict who is facing a divorce, planning a baby, looking to move house or change jobs
  • Supermarkets use big data to send money-off vouchers and offers for products they know their customers will like
  • In March 2012 the White House set up a $200m Big Data Research and Development Initiative to explore how it could help address problems facing the government
  • US police departments use big data to predict crime hot spots and deploy officers before crimes happen

It is a colossal sequencing effort. Not only does each patient have a unique DNA code, but so do their cancer tumours. And some patients will respond to certain drugs better than others, depending on the genetic variants they carry.

The claim is that a mass DNA database could herald a new era in medicine, and make the nation richer too. Aside from highlighting British innovation and attracting investment, the initial focus is to help people who are already sick.

For the rest of us, the argument goes, if enough people are on the database, trends will become clear.

So we could be more confident that our personal DNA read-out can be checked against those trends. It might warn us that we are more at risk of certain diseases, so that we can do something about it, such as changing our lifestyle or getting screened.

We might also be able to avoid drugs known to be toxic in people who carry a similar genetic make-up to our own.

Prof Sir John Bell is one of the government-appointed "champions" for the Life Sciences industry, and chair of the government's Human Genomics Strategy Group. He sees genomics in the NHS as a vital tool and said it is quite a "dramatic change in the way that medicine is likely to evolve".

A graphic of DNA: The structure of DNA was discovered in 1953

The big data at the heart of this is the DNA double-helix.

It is made of four chemicals - essentially a code with four letters. The string of letters that spells out a human being is huge - it took about eight years, and cost billions of dollars, to unravel the first human genome.

But now, the computer technology that made that possible is far more powerful, and cheaper.

These days, it takes a little over a day to unravel the DNA sequence of a single individual. And though it is not yet possible, there is talk of a £60 price tag.

Cheaper, more powerful technology aside, it is scale that brings the real power.

If the plan takes off, then the sheer numbers of patients involved will allow researchers, both public and private, to ask all sorts of questions of the dataset.

The NHS already has big data projects in place, notably a system that enables scientists to carry out research on our clinical information once it has been anonymised, and smaller-scale genetics research databases such as UK Biobank. What is new is the idea of bringing all of this together.

Genetic testing: A national gene database might aid epidemiology

"The great thing about the UK, and particularly the English NHS, is 50 million people and it's at that scale that you're probably going to have the power to detect all kinds of things that are very powerful in terms of the management of disease, and have quite a profound impact," said Prof Bell.

He said he does not stand to make any money from the project himself, though he told us he sits on the board of Roche and Genentech, pharmaceutical companies which may benefit from genomics being applied more widely in healthcare in the UK.

There is some agreement that having genetic information from somebody who is already sick can help to find the best treatment for them.

What is less clear is how much the entire genetic read-out of a healthy person can tell you about the illnesses they might get in the future.

Just because someone carries a particular change in their genes, in most cases, it is far from definite that they will go on to become ill.

We are into the realm of probability and risk, which are notoriously difficult to assess and convey.

Identification by data

There is also the issue of privacy.

Ross Anderson, professor of security engineering at the University of Cambridge, has been asked by the Nuffield Council on Bioethics to join a team that will examine the pros and cons of big data and genomics in the NHS.

The government says the information in the new database will be anonymised. But if it is linked to medical records, that will bring new risks, according to Prof Anderson.

He said medical data is especially hard to protect because it is so rich in information. His primary concern is that individuals, and their data, could be identified by a process of triangulation: "If you look at the typical person's medical record they may have some things that are known to their friends and family, such as that you broke your leg on the 17 January 1991, and some things that you don't want all your friends and family to know, such as that you had a treatment for depression.

"The problem is that if you make de-identified medical records available, then everyone from whom the subject wants privacy knows part of the record - namely the leg break, which is enough to identify that record out of many records - and they can therefore get access to the sensitive information, namely the treatment for depression."
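
The triangulation Prof Anderson describes is often called a linkage attack. As a rough illustration only, the Python sketch below uses invented records and hypothetical field names (`id`, `events`) to show how one detail known to an acquaintance can single out a record in a de-identified dataset. It assumes a toy dataset of three records and does not describe any actual NHS system.

```python
# Hypothetical de-identified records: names removed, dated clinical events kept.
# All data here is invented for illustration.
records = [
    {"id": "r1", "events": [("leg fracture", "1991-01-17"),
                            ("depression treatment", "1994-03-02")]},
    {"id": "r2", "events": [("leg fracture", "1989-06-30")]},
    {"id": "r3", "events": [("asthma review", "1991-01-17")]},
]


def matching_records(dataset, known_event):
    """Return every record that contains the event the observer already knows."""
    return [r for r in dataset if known_event in r["events"]]


def other_events(record, known_event):
    """Return the events in a record that the observer did not already know."""
    return [e for e in record["events"] if e != known_event]


# What a friend or relative already knows: the leg break and its date.
known = ("leg fracture", "1991-01-17")

matches = matching_records(records, known)
exposed = []
if len(matches) == 1:
    # A unique match re-identifies the record and reveals the rest of it.
    exposed = other_events(matches[0], known)

print(matches[0]["id"] if len(matches) == 1 else "no unique match")
print(exposed)
```

In this toy dataset the date and the injury together match exactly one record, so the remaining entries in that record are exposed even though no name was ever stored.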

Prof Bell said there are already robust methods in place to protect people's privacy in medical research which rely in part on limiting access to the data to trusted research partners.

"You probably can't get around the issue that no data in any setting is absolutely anonymised and secure," he said.

"But I think the constraints in the system that have already been thought about for other types of clinical data are probably pretty secure."

That is not enough for Prof Anderson, who wants the government to make details public.

"What we actually need is for anonymisation mechanisms to be open to the public, so that we can work out for ourselves whether the protection is adequate.

"I want to be able to test them. I want to be able to kick the tyres, and if the government's lying, I want to expose them, and embarrass them for it."

Bioethicist Stuart Hogarth, of King's College London, said he is not sure people are ready: "The 'grand bargain' that the government is offering us is that if we give them our DNA then they are going to revolutionise healthcare.

"Well it's not clear in fact that we need so much genomic data to understand the genetic basis of health and disease.

"It's not clear that the government has the capacity to put in place the large-scale IT project of the sort that would be necessary to do this, and it's not clear that the British public is willing to accept that bargain."

Watch Susan Watts' full Newsnight report on Big Data and the DNA database



This entry is now closed for comments

  • Comment number 89.

    Big Data? More like Big Nose.

  • Comment number 88.

    Most of the arguments against this idea have been political, not scientific. If you genuinely don't trust the government with this data stop voting for the same people every election!
    This will undoubtedly be the future of medicine, if you are in a position where you will deny yourself advances in healthcare because of a distrust of the government, then it's time for a new government.

  • Comment number 87.

    And the down side...BIG, BIG, BIG, BIG Brother watching you, watching me.

  • Comment number 86.

    The insurance industry would just love it - only insuring people that will never claim, and wrapping the rest of us in exclusions.. and the pensions folk could work out their profit on you before you had actually died.
    No thanks; I like the financial services sector to accept SOME of the risks in life

  • Comment number 85.

    I change my DNA every 6 months to stop the risk of fraud. Good luck storing my info NHS.

  • Comment number 84.

    It's just ID cards by the back door!!
    First they collect the information and then they start insisting that we can produce the evidence. Before you know it we'll be swiping a card to buy beer and cigarettes.
    Oh and newborn babies have rights! ( and that legal challenge could make someone very rich)

  • Comment number 83.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 82.

    Big Data is an annoying new buzz word that is being trumpeted around everywhere at the moment. Judging by your definition in this article it should be called "Machine Learning" which is the proper Computer Science name.

  • Comment number 81.


    "Have you any clue how much resource it would take to monitor emails and phone calls for anyone but targetted individuals?"

    I work in IT and I can tell you that it's easily physically possible using artificial intelligence. You underestimate the power of data mining and pattern recognition in modern computing.

  • Comment number 80.

    @ 69.Mister Point

    "the 'evidence' is irrefutable."

    Don't be a doofus. All sorts of evidence can be planted, not just DNA. I think they just might factor that possibility in...

  • Comment number 79.

    I knew the usual cries about privacy would be raised in the comments! Sad to see.

    The US company 23andMe have been doing this sort of thing for a while and have made some great breakthroughs via analysing data (from both genes and question answers). There was an article in wired a while back which showed how much faster it is:

  • Comment number 78.

    69. Mister Point

    Spot on. Always a good idea to collect some hair off the floor of the barbers and trail it behind you.

  • Comment number 77.

    Whilst fit and healthy it is ok to be paranoid about data security.

    If you're diagnosed with cancer (a significant portion of us will be) and the data could have helped towards a cure I suspect people will be a bit less fussy about data security!

  • Comment number 76.

    #72 Stick to planes. We have about 40,000 genes. They're identical between individuals because variation would mean they stop working. There are no 'black' genes, simply slightly increased levels of melanin etc which happen post-DNA. The genes a black man has for making blood are identical to yours otherwise they wouldn't work

    Believe whatever conspiracy cr** you want though. It's a free country

  • Comment number 75.

    5 johval

    The Government's record on protecting data is poor.

    Remember the DVLA actually sold on our data for profit. I'm sure the government have already wined & dined potential buyers for this information.

  • Comment number 74.

    This has the potential to do so much good and yet is also extremely hazardous. I think I trust the NHS but I do not trust the "establishment" or the private sector to put my DNA data to good use.

    Rest assured that the data will eventually get into the hands of people we never intended it to. The prospects of selection through DNA, DNA weaponry and DNA breeding are particularly worrying.

  • Comment number 73.

    So in 40 years' time will there be sales people trying to sell us assisted suicides based on what our DNA predicts? It'll beat Parkie and his free pen.

  • Comment number 72.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 71.


    "(The government) want to monitor our emails and phone calls"

    For goodness sake - how many times do people like you have to be told? Have you any clue how much resource it would take to monitor emails and phone calls for anyone but targetted individuals?

    It's not possible and let's face it, 99.9% of us don't do anything worthy of monitoring.

  • Comment number 70.

    No.15 There is good genetic reason for me to be discriminated against, because I am an oriental woman. Stereotype goes against me. Many government projects lack three-dimensional thinking. All diseases are caused by multiple factors, not genetics alone. They should tackle problems from different directions.


BBC © 2014 The BBC is not responsible for the content of external sites.