This page has been archived and is no longer updated. Find out more about page archiving.
  Ask About English   
Ask about English
Latin nouns and common mistakes

Data storage

A question from Hailu.

Do we say "Data is" or "Data are"?


Listen to the audio
Download the audio (3mb)

Professor Michael Swan answers:
Ok, well I say data and um I say both I say "Data is" or "Data are" but probably mostly "is". Originally, data was a plural noun, it comes from a Latin word that means things which are given, and that’s plural. The singular of that is datum or datum (different pronunciation). But English speaking people mostly don’t know Latin and so not everybody recognised the word was supposed to be plural.

It looks singular to an English speaker, so more and more people came to use it as a singular and now that’s quite normal. At the beginning "The data is" was definitely a mistake but it’s so widely used now that it’s no longer possible to say that it’s a mistake. It’s become part of the language. This is actually quite a common reason for language change. People make mistakes and the mistakes are repeated by other people, and finally they no longer count as mistakes. It happens a lot with vocabulary.

100 years ago, the word "oblivious", for example, meant "forgetful of". If you were oblivious of things that had happened, you’d forgotten them. You might say she was completely oblivious of her early childhood. But then people started using oblivious to mean unconscious of, unaware of, not paying attention to. So they might say someone was oblivious of their surroundings. Not paying attention to what was going on. This was criticised of course.

People wrote to the newspapers complaining that "oblivious" wasn’t being used correctly, the language was going to the dogs, no-body knew how to speak correct English anymore and so on you know the kind of thing. The fact remains, in modern English "oblivious" means unconscious, not paying attention, and that has become its modern, correct use.

The same thing’s happening today with the expression "a concerted effort". It really means an effort by a lot of people working together – concerted means just that; people working together – but a lot of people now use "a concerted effort" to mean "a big effort". For example, somebody might say "I’m going to make a concerted effort to give up smoking." Because they don’t really think of the real meaning of concerted as working together. For the moment, I think that’s more or less a mistake, but if it goes on everybody will end up using the expression to mean a big effort and we won’t be able to say it’s a mistake anymore. It’ll just be part of the language.

So, data is or data are, both ok.

Download the script (59k pdf)
Try the quiz
^^ Back to top Back to Index >>