A Short History of Project Gutenberg and Distributed Proofreading

Project Gutenberg is an effort to digitally reproduce books that are no longer under copyright. Started in 1971 by Michael Hart, it now archives over 6000 etexts.

Michael Hart wanted everyone to be able to access the books and documents that the Project archived. To that end, he chose to store the documents as ASCII text. He also wanted to start with documents that people would want to read and have access to, so the first nine documents were all of American historical significance.

The first document added to the archive was the US Declaration of Independence, in December 1971. A year later the United States Bill of Rights was added. By the end of 1979, all nine documents were in the archive.

It was then that the real test began. Through the 1980s the Project worked to create an etext of the Bible. Both testaments were completed and uploaded in August 1989.

During 1991, 12 more etexts were loaded including Alice in Wonderland, Paradise Lost, and Aesop's Fables.

Each year during the '90s the number of books added grew exponentially, with over 350 new etexts added in 1999. Some of this increase can be attributed to scanners with optical character recognition (OCR) software, which eliminated the need for countless hours of typing, but much of it has come from the growth in the number of volunteers working on the Project.

Up to that time the process of proofreading and preparing an etext would have been undertaken by one person or a small group of like-minded individuals. In late 2000 a change occurred. A website came online for a different project, an attempt to use the massive numbers of people on the Internet to proofread OCR documents. Called Project Gutenberg's Distributed Proofreaders, the group designed software allowing a person, using only a web browser, to download a single page of OCR text and its matching page image, make changes to the text, and save it. After two such passes the pages are returned to a project manager who fixes any problems noted by the proofers and then submits the etext to Project Gutenberg.
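
The page-by-page workflow described above can be illustrated with a minimal Python sketch. It is not the Distributed Proofreaders software: the names `Page`, `proofread` and `assemble`, the image file names, and the idea of modelling a proofer as a function are all assumptions made for illustration. The only details taken from the description are that each page pairs OCR text with its page image, that each page gets two passes, and that problems noted by proofers go to a project manager.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Page:
    """One scanned page: OCR text plus the name of its page image."""
    number: int
    image_name: str  # hypothetical file name of the page scan
    text: str        # OCR output, replaced by each pass's corrected text
    notes: List[str] = field(default_factory=list)  # problems noted by proofers
    passes_done: int = 0


# Assumption: a proofer is modelled as a function that takes the current
# text and returns (corrected_text, notes_for_the_project_manager).
Proofer = Callable[[str], Tuple[str, List[str]]]


def proofread(page: Page, proofer: Proofer) -> None:
    """Apply one proofreading pass to a single page and save the result."""
    corrected, notes = proofer(page.text)
    page.text = corrected
    page.notes.extend(notes)
    page.passes_done += 1


def assemble(pages: List[Page], required_passes: int = 2) -> str:
    """Join the pages in order once every page has had enough passes."""
    unfinished = [p.number for p in pages if p.passes_done < required_passes]
    if unfinished:
        raise ValueError(f"pages still in proofreading: {unfinished}")
    ordered = sorted(pages, key=lambda p: p.number)
    return "\n".join(p.text for p in ordered)
```

In this sketch, the project manager would call `assemble` only after both passes, then review each page's `notes` before submitting the combined text. Because every page is an independent unit, many volunteers can work on the same book at once, which is the point of the distributed approach.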

As of March 2003, the Distributed Proofreaders website has posted over 1100 etexts to Project Gutenberg, has over 400 more in process, and has become the main source of new etexts for the Project Gutenberg archive.

Michael Hart's original goal of 10,000 books on Project Gutenberg has not yet been achieved but, with the continued exponential increase in the number of volunteers and the number of etexts being added, the goal should be surpassed by the end of 2004.


Entry Data
Entry ID: A1037936 (Edited)

Written and Researched by:
Raukodraug - Keeper of the Fullmoon Smiley [(2*(-1)+8-0)*(3+4)=42]

Edited by:
The h2g2 Editors


Date: 20 May 2003



Referenced Guide Entries
Alice in Wonderland - the Literary Character
The Declaration of Independence
Aesop and his Fables
ASCII Art


Referenced Sites
Project Gutenberg
Project Gutenberg's Distributed Proofreaders

Please note that the BBC is not responsible for the content of any external sites listed.

Community Artist Dr Deckchair Funderlik

Most of the content on this site is created by h2g2's Researchers, who are members of the public. The views expressed are theirs and unless specifically stated are not those of the BBC. The BBC is not responsible for the content of any external sites referenced. In the event that you consider anything on this page to be in breach of the site's House Rules, please click here to alert our Moderation Team. For any other comments, please start a Conversation below.
 

