The web enables pages to be published on the internet, but why was it invented and how does it work? We explain how web pages, web servers and web browsers are linked and how ‘spiders’ play a part in the process.
The web is a system for publishing pages of information on the internet, and for connecting those pages to one another with hyperlinks.
Anyone can publish a page by uploading it to a web server. Anyone can read that page by typing its address into a web browser. This makes it very easy for people to share documents even if they are using what would otherwise be incompatible computers. In fact, that’s exactly why Tim Berners-Lee invented the World Wide Web, proposing it in 1989 and building the first web server and browser in 1990.
The web would be useful but annoying if you had to type the precise address, or URL (uniform resource locator), for every page you wanted. Fortunately, you don’t have to. Web pages can include embedded links, or ‘hyperlinks’, so simply clicking a link will take you to the page it points to. Following a trail of links is called ‘web surfing’.
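To see what a URL is made of, here is a minimal sketch in Python that splits an address into its parts using the standard library. The address itself is made up for illustration and does not come from the article.

```python
from urllib.parse import urlsplit

# Hypothetical address, used only for illustration.
url = "http://www.example.com/music/cello.html"

parts = urlsplit(url)
print(parts.scheme)  # the protocol used to fetch the page: http
print(parts.netloc)  # the name of the web server: www.example.com
print(parts.path)    # the page on that server: /music/cello.html
```

A hyperlink simply stores an address like this one, so that the browser can fetch the page without the reader typing anything.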
Click on the link
The web is based on the idea of ‘hypertext’. This means that texts can be embedded inside other texts, perhaps to provide fuller explanations or background material. Web pages can also embed multimedia, so that, for example, clicking the word ‘cello’ could play some cello music.
How do you know where to click? Traditionally, links were shown as underlined blue text, and they changed colour after they’d been clicked. However, icons, images, videos and other items can also be used as links. Today, most people recognise the conventions for which items are links, so many web developers leave users to find them with the mouse pointer. When the pointer turns into a small hand, there’s a clickable link.
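Behind the scenes, a link is an HTML tag that carries the address of the page it leads to. As a rough sketch, the Python below reads a scrap of HTML and lists every address it links to, whether the link is shown as text or as an image. The page content and the names LinkFinder and find_links are invented for this example.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collects the address held in every <a href="..."> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_links(html):
    """Return the addresses of all links found in a piece of HTML."""
    finder = LinkFinder()
    finder.feed(html)
    return finder.links


# Hypothetical page: one text link and one image used as a link.
page = (
    '<p>Listen to the <a href="/music/cello.html">cello</a> or go '
    '<a href="http://www.example.com/"><img src="home.png" alt="Home"></a>.</p>'
)

print(find_links(page))
```

Both links are found in the same way, because the browser cares only about the tag and its address, not about what the reader sees on screen.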
The web is also traversed by ‘spiders’ (or software robots) that follow links and collect information that can be used by search engines such as Google. Not all web users are human!
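A spider’s basic routine can be sketched in a few lines of Python: read a page, note its links, then visit each link that has not been seen before. This is only an illustration, not how any particular search engine works. To keep it self-contained it crawls a made-up miniature web held in memory (TINY_WEB) rather than real sites, and the names LinkFinder and crawl are assumptions of this example.

```python
from collections import deque
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collects the address held in every <a href="..."> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)


# Hypothetical miniature web: address -> HTML of the page at that address.
TINY_WEB = {
    "/home": '<a href="/news">News</a> <a href="/music">Music</a>',
    "/news": '<a href="/home">Home</a>',
    "/music": '<a href="/cello">Cello</a> <a href="/home">Home</a>',
    "/cello": "No links here.",
    "/orphan": '<a href="/home">Home</a>',  # nothing links to this page
}


def crawl(start):
    """Follow links outward from `start`, visiting each page once."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue:
        address = queue.popleft()
        visited.append(address)
        finder = LinkFinder()
        finder.feed(TINY_WEB.get(address, ""))
        for link in finder.links:
            if link not in seen:  # avoid going round in circles
                seen.add(link)
                queue.append(link)
    return visited


print(crawl("/home"))  # ['/home', '/news', '/music', '/cello']
```

Notice that the page at /orphan is never visited: a spider can only discover pages that some other page links to.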
Web pages and web servers
Web pages are written as plain text in Hypertext Markup Language (HTML), then uploaded to a host computer running web server software, such as Apache or Microsoft’s IIS (Internet Information Services).
The web server sends out pages when they are requested by a web browser, such as Microsoft Internet Explorer, Mozilla Firefox or Google Chrome. The server (the host) and the browser (the client) communicate using an agreed ‘language’ called HTTP (Hypertext Transfer Protocol). This is why web page addresses begin with http://, or with https:// when the connection is encrypted.
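The exchange between browser and server can be tried out on a single computer. The sketch below, which uses only Python’s standard library, starts a tiny web server on the local machine, asks it for a page over HTTP in the way a browser would, and then shuts it down. It is a toy under stated assumptions, not a production server: the page content, the /index.html address and the names PageHandler and fetch_once are all invented for this example.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page that the server hands out.
PAGE = b"<html><body><h1>Hello, web</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the same HTML page."""

    def do_GET(self):
        self.send_response(200)  # 200 means "OK, here is the page"
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Start a local server, request one page from it, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        client = HTTPConnection("127.0.0.1", port, timeout=5)
        client.request("GET", "/index.html")  # what a browser sends
        response = client.getresponse()       # what the server sends back
        result = (response.status, response.getheader("Content-Type"), response.read())
        client.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, content_type, body = fetch_once()
    print(status, content_type)
    print(body.decode("utf-8"))
```

The request names the page that is wanted, and the reply carries a status code, a description of the content and then the HTML itself, which a real browser would go on to display.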
Today, many web pages are not written in advance, but created dynamically in response to someone’s input. This happens with answers to search-engine queries and, for example, on shopping sites where people search for products within specific price ranges.
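As a simple illustration of a dynamically created page, the Python sketch below builds the HTML for a results page only when someone asks for products in a given price range. The catalogue, its prices and the function name render_results are made up for this example, and a real shopping site would draw on a database rather than a short list.

```python
from html import escape

# Hypothetical catalogue: (product name, price).
PRODUCTS = [
    ("Student cello", 450),
    ("Cello bow", 80),
    ("Concert cello", 3200),
    ("Rosin", 12),
]


def render_results(min_price, max_price):
    """Build an HTML page listing the products within the price range."""
    matches = [(name, price) for name, price in PRODUCTS
               if min_price <= price <= max_price]
    items = "".join(f"<li>{escape(name)}: {price}</li>" for name, price in matches)
    return (f"<html><body><h1>{len(matches)} results</h1>"
            f"<ul>{items}</ul></body></html>")


print(render_results(50, 500))
```

No file containing this page exists on the server beforehand. A different price range produces a different page from the same code.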
The web’s inventor, Sir Tim Berners-Lee, is also leading the attempt to develop a ‘semantic web’ that includes metadata which can be read by other computers.
Metadata is data about data, such as whether a string of words is the title of a book. On a semantic web, it would be possible to distinguish between 1984 (a number), 1984 (a date), 1984 (a film starring John Hurt) and Nineteen Eighty-Four (a novel by George Orwell).
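The idea can be sketched in Python by attaching a label to each value, so that a program can tell apart items that look identical to the eye. The field names used here (value, type, starring, author) and the function find_by_type are assumptions made for this example. They are not taken from any real semantic-web standard.

```python
# Hypothetical labelled records: the same characters, four different meanings.
items = [
    {"value": "1984", "type": "number"},
    {"value": "1984", "type": "date"},
    {"value": "1984", "type": "film", "starring": "John Hurt"},
    {"value": "Nineteen Eighty-Four", "type": "novel", "author": "George Orwell"},
]


def find_by_type(records, wanted_type):
    """Return only the records whose metadata says they are of the wanted type."""
    return [record for record in records if record["type"] == wanted_type]


print(find_by_type(items, "film"))
print(find_by_type(items, "novel"))
```

Without the labels, a program searching for 1984 would have no means of knowing whether a match referred to the number, the date or the film.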