Will Big Data DNA analysis herald new era in medicine?

Computer graphic of a DNA autoradiogram and a human head.


Often, you are barely aware of it, but hop on a train, spend some time in the shops, watch a movie or a match or visit your GP, and the chances are you will have contributed to half a dozen collections of mass data.

Government and companies now collect, store and analyse as much information as they can about the way we interact with them.

Their goal is the pursuit of efficiency, and to find ways to save, or make, money. There is even a phrase for it - "big data".

The idea is not just to collect this data, but to analyse it.

Take healthcare. In December 2012, the government announced a big data plan for perhaps the most intimate of our data: the DNA read-out of 100,000 people with rare diseases and cancer.


  • Big data involves the gathering and analysis of data on a large scale
  • The data can come from our purchase records, digital photos, social media posts, mobile phone GPS signals etc
  • Companies can use it to help predict who is facing a divorce, planning a baby, looking to move house or change jobs
  • Supermarkets use big data to send money-off vouchers and offers for products they know their customers will like
  • In March 2012 the White House set up a $200m Big Data Research and Development Initiative to explore how it could help address problems facing the government
  • US police departments use big data to predict crime hot spots and deploy officers before crimes happen


It is a colossal sequencing effort. Not only does each patient have a unique DNA code, but so do their cancer tumours. And some patients will respond to certain drugs better than others, depending on the genetic variants they carry.

The claim is that a mass DNA database could herald a new era in medicine, and make the nation richer too. Aside from highlighting British innovation and attracting investment, the initial focus is to help people who are already sick.

For the rest of us, the argument goes, if enough people are on the database, trends will become clear.

So we could be more confident that our personal DNA read-out can be checked against those trends and might warn us we are more at risk of certain diseases, so that we can do something about it, such as changing our lifestyle or getting screened.

We might also be able to avoid drugs known to be toxic in people who carry a similar genetic make-up to our own.

Prof Sir John Bell is one of the government-appointed "champions" for the Life Sciences industry, and chair of the government's Human Genomics Strategy Group. He sees genomics in the NHS as a vital tool and said it is quite a "dramatic change in the way that medicine is likely to evolve".

The structure of DNA was discovered in 1953

The big data at the heart of this is the DNA double-helix.

It is made of four chemicals - essentially a code with four letters. The string of letters that spells out a human being is huge - it took about eight years, and cost billions of dollars, to unravel the first human genome.

But now, the computer technology that made that possible is far more powerful, and cheaper.

These days, it takes a little over a day to unravel the DNA sequence of a single individual. And though it is not yet possible, there is talk of a £60 price tag.

Aside from cheaper, more powerful technology, it is also scale that brings the real power.

If the plan takes off, then the sheer numbers of patients involved will allow researchers, both public and private, to ask all sorts of questions of the dataset.

The NHS already has big data projects in place, notably a system that enables scientists to carry out research on our clinical information, once anonymised, and smaller-scale genetics research databases such as UK Biobank. What is new is the idea of bringing all of this together.

A national gene database might aid epidemiology

"The great thing about the UK, and particularly the English NHS, is 50 million people and it's at that scale that you're probably going to have the power to detect all kinds of things that are very powerful in terms of the management of disease, and have quite a profound impact," said Prof Bell.

He said he does not stand to make any money from the project himself, though he told us he sits on the board of Roche and Genentech, pharmaceutical companies which may benefit from genomics being applied more widely in healthcare in the UK.

There is some agreement that having genetic information from somebody who is already sick can help to find the best treatment for them.

What is less clear is how much the entire genetic read-out of a healthy person can tell you about the illnesses they might get in the future.

Just because someone carries a particular change in their genes, in most cases, it is far from definite that they will go on to become ill.

We are into the realm of probability and risk, which are notoriously difficult to assess and convey.

Identification by data

There is also the issue of privacy.

Ross Anderson, professor of security engineering at the University of Cambridge, has been asked by the Nuffield Council on Bioethics to join a team that will examine the pros and cons of big data and genomics in the NHS.

The government says the information in the new database will be anonymised. But if it is linked to medical records, that will bring new risks, according to Prof Anderson.

He said medical data is especially hard to protect because it is so rich in information. His primary concern is that individuals, and their data, could be identified by a process of triangulation: "If you look at the typical person's medical record they may have some things that are known to their friends and family, such as that you broke your leg on the 17 January 1991, and some things that you don't want all your friends and family to know, such as that you had a treatment for depression.

"The problem is that if you make de-identified medical records available, then everyone from whom the subject wants privacy knows part of the record - namely the leg break, which is enough to identify that record out of many records - and they can therefore get access to the sensitive information, namely the treatment for depression."
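Prof Anderson's leg-break example can be sketched in a few lines of Python. This is an illustration only, not a description of any real NHS system: the three records, every date other than 17 January 1991, and the names `Event`, `DeidentifiedRecord` and `reidentify` are invented for the purpose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    description: str
    date: str  # ISO date, e.g. "1991-01-17"


@dataclass(frozen=True)
class DeidentifiedRecord:
    record_id: int  # arbitrary number; no name or address is stored
    events: tuple


def reidentify(records, known_event):
    """Return the other events in the single record containing known_event.

    Returns None when zero or several records match, i.e. when the
    known fact is not enough to single out one person.
    """
    matches = [r for r in records if known_event in r.events]
    if len(matches) != 1:
        return None
    return [e for e in matches[0].events if e != known_event]


# Hypothetical "anonymised" dataset (all values made up for illustration).
records = [
    DeidentifiedRecord(101, (Event("broken leg", "1991-01-17"),
                             Event("treatment for depression", "1994-06-02"))),
    DeidentifiedRecord(102, (Event("broken leg", "1991-03-05"),
                             Event("asthma review", "1993-09-12"))),
    DeidentifiedRecord(103, (Event("appendectomy", "1991-01-17"),)),
]

# What a friend or relative already knows about the subject.
known = Event("broken leg", "1991-01-17")

exposed = reidentify(records, known)
if exposed is not None:
    for event in exposed:
        print(f"Exposed: {event.description} ({event.date})")
```

No name appears anywhere in the data, yet one fact known to an acquaintance is enough to pick out a single record and read everything else in it. In a real dataset the chance of a unique match would depend on how many people share the known fact.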

Prof Bell said there are already robust methods in place to protect people's privacy in medical research which rely in part on limiting access to the data to trusted research partners.

"You probably can't get around the issue that no data in any setting is absolutely anonymised and secure," he said.

"But I think the constraints in the system that have already been thought about for other types of clinical data are probably pretty secure."

That is not enough for Prof Anderson, who wants the government to make details public.

"What we actually need is for anonymisation mechanisms to be open to the public, so that we can work out for ourselves whether the protection is adequate.

"I want to be able to test them. I want to be able to kick the tyres, and if the government's lying, I want to expose them, and embarrass them for it."

Bioethicist Stuart Hogarth, of King's College London, said he is not sure people are ready: "The 'grand bargain' that the government is offering us is that if we give them our DNA then they are going to revolutionise healthcare.

"Well it's not clear in fact that we need so much genomic data to understand the genetic basis of health and disease.

"It's not clear that the government has the capacity to put in place the large-scale IT project of the sort that would be necessary to do this, and it's not clear that the British public is willing to accept that bargain."

Watch Susan Watts' full Newsnight report on Big Data and the DNA database




This entry is now closed for comments

  • Comment number 69.

    So someone steals something with your DNA on it (clothing, something for hygiene) and places it at the scene of a crime. The police come along, track you down through DNA and say that, as it's your DNA, you dunnit: the 'evidence' is irrefutable.
    The whole idea is simply ludicrous to me.

  • Comment number 68.

    As the younger generation are increasingly willing to upload data about themselves onto the internet, be it images or personal traits and 'likes', all of which is used and stored by big business, I think that the genuine privacy concerns most of us feel are paramount regarding sensitive data will be lost on the majority going forward.

  • Comment number 67.

    #64 I've got a genetics degree. Do you? Tay-Sachs isn't unique to Jews. It's more prevalent in some Jewish communities because of marriage between cousins.

    Most of these 'find out what tribe you came from' tests are utter guff. My ancestors come from France, Ireland, Scandinavia and probably further east than that originally. The idea anyone is a pure anything is scientific nonsense.

  • Comment number 66.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 65.

    For the last 5 years I have regularly asked if I could communicate with the NHS (hospital wards, GP, district nurses) by email and the answer is always the same. No.

    Perhaps the NHS could get the simple technology things right first before moving on to mass DNA databases for the whole population.

  • Comment number 64.

    All this user's posts have been removed.

  • Comment number 63.

    Local government, DVLA, local councils etc all sell on information that we would prefer to be private.
    It’s a lucrative side income for them.
    It’s eventually sold into the ether & manifests as email/phone scams.
    I’d hate to think what they would do with DNA info.

  • Comment number 62.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 61.

    What's to stop the details being "sold" off to any company with the money to pay for the DNA?

  • Comment number 60.

    54. The BBC rustles my jimmies
    Would Adolf Hitler have found a national DNA database useful?
    No. It's almost impossible to even tell a black man's DNA from a white man's. There's certainly no 'Jewish Genes'... although if there were, Hitler might have found more than a few of his henchmen not quite as Aryan as they'd have liked (Erhard Milch, head of Luftwaffe procurement, for starters).

  • Comment number 59.

    Will 'terrorists' look to harvest DNA data to create biological DNA specific weapons in the future?

    Biological weapons that only affect one person? Yes that's something 'the terrorists' would do.

  • Comment number 58.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 57.

    Possibly in the medium to long term it could herald a new era of medicine, but at the moment our knowledge of genetics & especially the upcoming area of epigenetics is far too limited.

    Currently it could be used to help identify those more at risk of developing inheritable conditions, but that data could potentially be open to so much abuse that the whole area would need very strict regulation.

  • Comment number 56.


    "Stop confusing risk factors with genuine cause and effect and do some real research, please."

    Few diseases have one cause. Most are *caused* by the cumulative effects of multiple factors. It is therefore 'real research' to identify risk factors - so long as the secondary questions of 'how are they risk factors?' and 'how do they interact with the other risk factors?' are addressed.

  • Comment number 55.

    As someone who works in genetics, this type of database is essential for research. Most people don't appreciate that, for many diseases, there is not a single gene that is responsible. There may be multiple genes implicated that have small effects. To detect these small effects you need to have large numbers of individuals. Also, we work with numeric IDs, not individual names. Anonymity is guaranteed.

  • Comment number 54.

    All this user's posts have been removed.

  • Comment number 53.

    All I can say is, enough is enough. Please no more government sponsored data collection, even if it is supposedly for our own good. Freedom & liberty are worth so much more than any benefits that might come from this.

    Also, it is disingenuous to compare this sort of thing with supermarkets analysing our shopping habits to send us vouchers.

  • Comment number 52.

    The government already have enough information and statistics on us to be worrying. They want to monitor our emails and phone calls, we are recorded and reduced to statistics everywhere we go. I say no way. I want my privacy, and that includes my DNA.

  • Comment number 51.

    After having worked for a large multi-national company for so many years, I just don't trust anyone any more. No thanks. Count me out.

  • Comment number 50.

    @48. Paranoia maybe.
    Plus, if a company does grow a DNA-specific limb for you and wants to charge you for it, therefore holding you to ransom, how is this worse than nobody growing a DNA-specific limb that you may require in the future? I'd rather be held to ransom than given a death sentence.







BBC © 2014 The BBC is not responsible for the content of external sites. Read more.
