A BRIEF HISTORY OF PROJECT GUTENBERG

by Michael S. Hart, Founder

This history is the result of several requests and suggestions from friends, colleagues, media personalities, etc. on the eve of the 35th anniversary of Project Gutenberg, which takes place a few days from now on July 4, 2006.

From my own perspective, I prefer to concentrate on the future, as that is where even larger changes will take place, and even on the present, as that is when the changes DO take place, but I realize that there is a place for such a history, and now in my 60th summer, having just completed my 60th spring, and part of my 60th winter, I fully realize that I have been here now a lot longer than I will be in the future, and would have to get to the age of 95 to even last as long in the future as Project Gutenberg has lasted in the past.

Thus this history.

A General Overview

Table of Contents

The Project Gutenberg Timeline
My Own Expectations
The Five Information Ages and The Five Copyright Laws Passed To Stifle Them
Money

*

The Project Gutenberg Timeline

The high points are easy to map out:

The U.S. Declaration of Independence

The first eBook, done burning the midnight oil July 4, 1971.

The History of Democracy Series

[These works were displayed on the walls of countless schools and malls around the country during the 1970's]

One added each year of the 1970's with help from an anonymous set of volunteers that even _I_ could never identify, persons who just seemed to know there was a need, and the work popped up on various local computers.

The Bible

Our 10th eBook was again provided by anonymous volunteers, as much of the Project Gutenberg library has been. The Bible accounted for all of our successful work in the 1980's except for the preliminary editions of Alice in Wonderland. We were working on a Complete Shakespeare, but the copyright laws had been changed with so little publicity that we didn't find out about it for years, and thus a huge amount of labor was lost.
No Direct Positive Feedback for the First 17 Years

The total lack of interest by the world at large, other than a lot of this anonymous help, for the first 17 years.

The Complete Works of Shakespeare

The 100th eBook, again done burning the midnight oil; December 10, 1993, on the 4th anniversary of my Dad's death: and done with the aid of a handful of volunteers around the world so I could complete the book in my father's honor. Also a debt of gratitude to The World Library, who allowed the use of their copyrighted edition as source material.

[Please note: technically speaking the days of completion of the books done in all nighters would be the following day; in the case of honoring my father, the person who completed that last piece of Shakespeare was in Hawaii so it was technically still December 10th, but given the lengthy times it took for file assembly and file transfer in those days it was obviously December 11th before anyone had the file.]

Dante's Divine Comedy

Our 1,000th book, released in both the original Italian and a few English translations in August, 1997.

For some time book titles in other languages were intentionally featured at book numbers that were multiples of 500 and 1,000 to encourage a language expansion so Project Gutenberg could provide books to a wider and wider portion of the world population.

These included

Don Quixote in Spanish as #2,000
Siddhartha in German as #2,500
A L'Ombre Des Jeunes Filles en Fleurs, by Proust as #3,000
The Entire French Immortals Series as #4,000
The Notebooks of Leonardo Da Vinci as #5,000
Ironia Pozorow, by Maciej hr. Lubienski in Polish as #6,000
The Kalevala, The Finnish National Epic, was #7,000

We went a bit further with #8,000 to do in Africa's honor, The Slave Trade, Domestic And Foreign, Why It Exists, And How It May Be Extinguished

The Story Of Vishnu, in Sanskrit, was #9,000

And then back to our roots of The History of Democracy for The Magna Carta, for our 10,000th eBook, in Latin.
As I write this particular paragraph of this history, it is a month, to the day, until the 35th anniversary of placing that first eBook online, the first step in encouraging the creation and distribution of eBooks that was to become the hallmark of Project Gutenberg in our mission statements:

"Encourage the Creation and Distribution of eBooks"
"Help Break Down the Bars of Ignorance and Illiteracy"
"Give As Many eBooks to As Many People As Possible"

At this time it still has not been officially decided what is going to be the 20,000th Project Gutenberg eBook, hopefully a book that we will complete and have online by July 4, 2006.

*

Note on the numbering and dating of Project Gutenberg eBooks:

During parts of Project Gutenberg's history, I would schedule all the books to be done in the following year, creating some index files in which to enter the titles and authors, etc. A huge surge in our production rate in the earlier days of this led to several years in which we were ahead of that schedule, and you will find books with actual release dates ahead of an official release date that was planned well in advance. This practice ended with the achievement of our 10,000th eBook.

However, you should also be well advised that the eBooks from Project Gutenberg of Australia and Project Gutenberg of Europe-- independent organizations--each have their own number systems, and thus when we add their nearly 1,000 eBooks to the totals, you will find that the 20,000 grand total only includes about 19,000 in the original numbering system. We included the new Project Gutenberg PrePrints section in the grand 2006 totals, and those are counted internally, but only receive their final catalogue numbers when they are no longer PrePrints, at least that's the way it is right now. Given the numbers of changes to our cataloguing systems over the years, I wouldn't want to bet that things won't change again, and for the better.
In addition, I should mention The Project Gutenberg Consortia Center, a collection of collections donated by various others in eBook production around the world. This site has 75,000+, which will yield an overall grand total of ~100,000 eBooks on July 4, 2006.

[My apologies, but John Guagliardo, who runs Consortia Center sites for us, insists on being very conservative in estimates of how many eBooks are at http://gutenberg.cc and he will not let me state for the record the exact number.]

*

I should also mention The World eBook Fair, co-sponsored by a combination of Project Gutenberg, The World eBook Library and several other eBook publishers you will be able to find via a sponsor's link on the main page at http://worldebookfair.com

The World eBook Fair is an effort to present one month a year of completely free access to all the eBooks we can manage. These eBooks are coming from at least 100 collections, around the world, on the following schedule:

1/3 million eBooks in 2006
1/2 million eBooks in 2007
3/4 million eBooks in 2008
ONE million eBooks in 2009

and we hope to make that level of progress with or without an eBook project continuing from Google, Yahoo or The Library of Congress, though we would certainly encourage them to continue to redouble their efforts year after year.

However, in all honesty, I must point out that it will take a lot more than doubling every year or every 18 months:

Let us presume that Google manages to reach 100,000 scans, on their public servers, by June 14, 2006, 18 months after their big billion dollar multimedia publicity blitz of December 14, 2004, to reach 1% of their stated goal of 10 million eBooks.
As per Moore's Law, that would yield the following table:

Moore's Law Growth / Google

eBook Totals    Dates           Doublings   Years

        00      Dec 14, 2004        0         0
    50,000      Jun 14, 2006        1         1.5
   100,000      Dec 14, 2007        2         3
   200,000      Jun 14, 2009        3         4.5
   400,000      Dec 14, 2010        4         6
   800,000      Jun 14, 2012        5         7.5
 1,600,000      Dec 14, 2013        6         9
 3,200,000      Jun 14, 2015        7        10.5
 6,400,000      Dec 14, 2016        8        12
12,800,000      Jun 14, 2018        9        13.5

Or, if they doubled every year, instead of every 1 1/2 years:

Moore's Law Growth / Google

Totals          Dates           Doublings   Years

        00      Dec 14, 2004        0         0
    50,000      Dec 14, 2005        1         1
   100,000      Dec 14, 2006        2         2
   200,000      Dec 14, 2007        3         3
   400,000      Dec 14, 2008        4         4
   800,000      Dec 14, 2009        5         5
 1,600,000      Dec 14, 2010        6         6
 3,200,000      Dec 14, 2011        7         7
 6,400,000      Dec 14, 2012        8         8
12,800,000      Dec 14, 2013        9         9

However, some people, even those who say that Moore's Law was never meant to apply to human powered projects, say that their progress will actually multiply by 10 instead of 2 in Moore's Law periods, as follows for their 6 year plan:

10x Moore's Law Growth

eBook Totals    Dates

        00      Dec 14, 2004
   100,000      Jun 14, 2006
 1,000,000      Dec 14, 2007
10,000,000      Jun 14, 2009

leaving 18 months to spare in the 6 years to December 14, 2010.

I'm sure that Google, with over $100 billion, aligned with a cadre of multibillion dollar libraries such as The New York Public Library, Oxford, Harvard, Stanford and Michigan, would certainly be able to scan 10 million books in that period if they really wanted to.

However, given their track record for their first 18 months, I worry that they are not taking this project seriously, and are giving short shrift to what could actually give truth to their public statement of December 14, 2004:

"This is the day the world changes."

I should also add that there are many people who are pleased to a much greater degree with their progress than I am and a similar number of people who seem to be less pleased. Given my own 35 year history of making eBooks I can only say that I had both hoped and expected better results.
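The doubling schedules above are simple enough to regenerate mechanically. The following is only a sketch under stated assumptions: the function names `add_months` and `doubling_schedule` are hypothetical, chosen for illustration, and the starting figure of 50,000 and the announcement date of December 14, 2004 are taken directly from the tables.

```python
from datetime import date


def add_months(d, months):
    """Return the date `months` later with the same day of the month.

    Safe here because the 14th exists in every month (illustrative helper).
    """
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)


def doubling_schedule(start, first_total, period_months, doublings):
    """Build rows of (total, date, doubling number, years elapsed).

    Row 0 is the start date with a total of 0, as in the tables;
    row 1 holds `first_total`, and each later row doubles the previous one.
    """
    rows = [(0, start, 0, 0.0)]
    total = first_total
    for n in range(1, doublings + 1):
        months = n * period_months
        rows.append((total, add_months(start, months), n, months / 12))
        total *= 2
    return rows


if __name__ == "__main__":
    # 18-month periods; pass 12 instead of 18 for the yearly-doubling case.
    for total, when, n, years in doubling_schedule(date(2004, 12, 14), 50_000, 18, 9):
        print(f"{total:>12,}  {when:%b %d, %Y}  {n:>2}  {years:g}")
```

Changing the period from 18 to 12 months gives the yearly-doubling schedule; either way it takes nine doublings from 50,000 to pass the 10 million mark.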
Yes, I admit it, my hopes and expectations are high.

My Own Hopes and Expectations

My own ideal of eBooks is what you would get if you sat down at your computer and typed in a few pages every day, week by week, until you finished the book, then proofread it several times to get it to a 99.9% level of accuracy, and then sent the book off to a few of your friends to proofread with eyes not your own to find the errors you missed more than once. This should result in an eBook that exceeds the standard set by The Library of Congress for eBooks in the 1990's.

Obviously there was no other way to create eBooks in 1971, when the whole process of creating eBooks readable by humans and computers began, because there were no automated tools in place to assist us as we have today.

Personally, I think it is now time to increase the standards to perhaps 99.97% or 99.99%, not a big change, and certainly within the realms of reality.

Given these expectations that I harbored for several decades before eBooks finally struck the fancy of others, you might, just possibly, be able to understand my reaction to Google's announcement of December 14, 2004.

I was elated!

Here, finally, was going to come the $100 billion new enterprise that would be able to finish what we had started!

Obviously Google was going to create an online library where anyone in the world could read any of 10 million books, from some of the greatest library collections in the world.

However, it didn't work out that way.

It didn't work out that way several times over.

1. We never got any public announcement of books being done, catalogues made available, or anything else of the like, so a person who wanted to see what Google was doing had nothing in the way of the same kind of opportunity as with a "bricks and mortar library."

2. The books we did manage to find were not available for an easy download to our own systems.
In fact, upon examination, it appeared that Google had actually taken actions to PREVENT people from downloading the books.

3. It turned out that Google Print Library books were not in one single computer file, but that the text you search was in one file, while the pages you actually could read were in one or more other files, often one file per page, and with a kind of invisible file overlaying them, so that if you downloaded, or tried to download, the page, you got a blank file.

4. The text you could search was available only in "snippet" form. . .literally a couple lines of a book were all you got, and then when you wanted to look at the book, you were sent a whole different route to different files containing images of the words you had just seen in plain computer text, words you could previously have cut and pasted into your own computer-- but which were now in a non-portable format.

[Note: it turns out that many of the names used are misnomer after misnomer. . .Google's Print Library could not print and was not a library. . .Adobe's "Portable Document Format" files would turn out not to be very portable at all, but certainly more so than a graphical representation in .gif or .jpg format, etc.]

So, my "Great Expectations" of an eBook library that could be written on a single DVD and snailmailed for one stamp via the first class mail system [or even less at book rate], were not coming true via The Google Print Library as announced in that December 14, 2004 media blitz, nor from the renamed version, Google Book Search, which was immediately relabeled by Google:

"Google Book Search is a means for helping users discover books, not to read them online and/or download them."

Thus we had both the reality that Google was not providing an electronic library the like of which they had described via their billion dollar publicity campaign, and the official words quoted above, which now defined it that way.
I won't go into detail here about the authors and publishers who took Google to court for concentrating on the copyrighted materials instead of public domain materials, or how Google's books have changed even more, as per public pressure, to keep at least one foot in the door of eBooks as I envisioned them, other than to say that I continue to hope Google, Yahoo, etc. will finally come to the point of realizing that publicity of the kind they received from their early efforts might be a lot less than the kind they would receive if they actually set an electronic library free to be downloaded with their blessing.

Yes, I understand the Masters of Business Administration logic of wanting to force everyone to come to Google every time the need arises to look something up in one of these books, and a certain amount of advertising revenue is generated whenever a "hit" takes place. . .that's lots of money.

But I also understand that the just plain "good will" that is generated by giving these books away is even greater.

Just as Project Gutenberg once got credit for giving away ALL of the eBooks in the world [we constantly received the errors of all the other publishers], Google could easily have made a similar impact, by outproducing Project Gutenberg, many times over, and getting credit from the masses for all eBooks, even without making any statements to that effect.

It could literally have been on the order of the Carnegie library efforts that were so big exactly 100 years ago today. For an awfully long time, millions of people thought that Carnegie's public library program had built every public library.

THAT is the way to promote books and libraries, not lock them up in ways that prevent people from having access.

By the way, Google wasn't the first to make this mistake.

eBooks for the Elite. . .or for Humanity at Large?
Oxford made the same mistake with The Oxford Text Archive, in which they attempted to collect all the world's eBooks in one collection that was as exclusive as possible, and to claim it as free to anyone who wanted to archive their eBooks there.

However, even for something as freely available as Gutenberg's Alice in Wonderland they charged $45 and limited distribution to 9-track reel to reel computer tape, just to keep the books out of the hands of the riffraff.

In the end they were so exclusive that they priced themselves right out of the eBook marketplace, when they should have had the whole thing to themselves, if they hadn't been so greedy.

"Those who do not study history, are condemned to repeat it."

The Oxford Text Archive wanted to be something the common man or woman could not and would not use; they got their wish. This is what happens when you try to be too elitist.

My own Project Gutenberg efforts are just the opposite, with a priority to bring books and literacy to the masses just as in the days of Johannes Gutenberg's press, when numbers of books owned by the average person rose above 0 for the first time-- my goal is to raise the number of libraries owned by the same average person above 0 for the first time.

The average public library has about 30,000 books.

We are currently building a single-sided, single-layered DVD, and right now it has 20,000 books on it, and is only 2/3 full of books, so the final edition should have as many books as a modern public library has.

These common single-sided, single-layered DVDs were available for just 17 cents the last time I bought them in a store, and, together with a book rate stamp for each one, that means I can mail 100 of these to 100 separate addresses for $50.

However, I should add that if you use heavier envelopes, the weight is going to go over one ounce, and the stamps will cost more.
In a few years I predict there will be any number of DVDs out there containing 30,000 eBooks, and even more as the standard becomes dual-sided and dual-layered DVDs that can hold eBooks at four times the number of pages.

Can you imagine a single DVD containing 100,000 eBooks?!?!?!?

And mailing these to anyone you know for just 50 cents?

*

That's the kind of thing _I_ had in mind when I started doing eBooks, and it is possible right now, though the dual-layered dual-sided DVDs are more expensive right now per side, so the prediction I am making is still for a few years from now.

However, right now blank DVDs are about the same price as CDs, and that will eventually happen with the dual-sided & layered DVDs as well, particularly when the new higher density DVD is in the major marketplaces.

Then DVDs containing 100,000 eBooks will indeed be common.

*

There you have my general overview of "The History of Project Gutenberg," and I will fill in more details as time goes on, a little more complete, with some perspective on why the initial items produced were so small by comparison, as most readers of this are probably unaware of just how limited computer drives and bandwidth were back in the 1970's, and also some mention of why our first efforts at very large books like a Complete Works of Shakespeare were counteracted by the voiding of the copyright law we had started out with, and the replacement of that law with a new law that eliminated one million books from the total we could use in Project Gutenberg.

Obviously my own view of putting entire libraries in the hands of anyone who wants them has been countermanded by legalistic corporate lobbying as well as the machinations of elitists of the nature of Oxford, Harvard, Stanford, etc.
My own goal is to bring all this to the masses, and research for just a few minutes will tell you of centuries of efforts by a cartel of publishers starting with The Stationers Guild, later known as The Stationers Company, all the way to their modern-day heirs, The World Intellectual Property Organization.

I close this introduction with a comparison of the first four Information Ages to the current Information Age of today.

For those who want this in more detail, please see my blog at http://pglaf.org/hart where one of the main essays is on The Five Information Ages.

The Five Information Ages and The Five Copyright Laws Passed To Stifle Them

Obviously Project Gutenberg is named after the inventor of an elementary fundamental building block of Information Ages for all time, Herr Johannes Gutenberg, inventor of moveable type, the foundation of both the printing revolution, and the later Scientific Revolution, and thus The Industrial Revolution.

However, what most of you probably didn't know before Project Gutenberg essays on the subject is that the first copyrights were created simply to stop The Gutenberg Press and to renew, or restore, the monopoly such scribes and stationers had had, all the way back to time immemorial.

For those of us in the United States the political process of counteracting technological breakthroughs via manipulation of copyright laws might be much more obvious.

In 1830, the first high speed steam printing press patent was issued right alongside an expiration of the 1st United States copyright ever issued, starting back around 1790.

For those who had enjoyed a monopoly of publications in the early days of the United States, this was totally unacceptable, and a huge effort was mounted to change the copyright laws before any new publishers could buy these new printing presses in an effort to create inexpensive reprints of these books.
Thus the rights of the public to the public domain were dead, for a second time, stifling the effects of the steam presses, just as thoroughly as the first United Kingdom copyrights had reduced the number of titles available from 6,000 to 600 back when The Stationers Company first outlawed all such Gutenberg presses that were outside their purview.

The U.S. Civil War interrupted this process, or it might have continued, as the 1830 Copyright Act allowed the copyrights to eventually expire, but with the advent of electric presses at the turn of the century, the legalistic maneuverings would be revived and the 1909 copyright act served the same purpose a third time, yet again stopping an impending revolution in a world of inexpensive republishing of public domain works.

The Xerox machine received the same treatment in 1976 and the Internet was responded to via the 1998 U.S. Copyright Act.

In just the 89 years from 1909 to 1998 the U.S. Copyright Act was amended again and again and again to kill off revolutions in publishing from electric presses, Xerox machines, and then finally the Internet, as copyrights were extended from average periods of about 30 years in 1900 to 95 years in 2000.

The odds of living long enough to legally copy anything later in life that you had seen originally published in childhood virtually dropped to 0% for children today, from the expectation a child could legitimately and legally have had a century ago.

In plain terms, even without any further copyright extension, the average child born today, even in the countries with high life expectancies, will never live long enough to republish a single example of anything published in their lifetime.

We are now legally cut off from republishing the culture published during our own lives, from birth to death. . .it will all be in the hands of Big Brother.
*

Money

At least one of my most vocal critics has tried to convince a number of people that I do eBooks for the money and that I am manipulating the Project Gutenberg volunteers for a profit.

Of course, that very same person has publicly said that money should be the top priority of Project Gutenberg.

At the current time I have not received a PGLAF paycheck for my services to Project Gutenberg for over three years.

This selfsame critic has also criticized John Guagliardo, who has run the Project Gutenberg Consortia Center on his own, in terms of both his own time and his own money, for years.

I am not sure the total amount of money either of us has made for our years of labor would equal a median national income for the same period. . .but. . .if either John or I ever do get a fortune in return for our work, I think we will deserve it.

On the same note, I should add that receiving just a penny in return for every eBook given away would generate enough money to buy out Donald Trump.

The world population is coming up on 2/3 of 10 billion.

1.5% of that total is 100 million people.

Thus if the average one of the Project Gutenberg eBooks reaches 1.5% worldwide, there would be a total of 2 trillion copies given away just of those 20,000 eBooks that Project Gutenberg has published.

2 trillion pennies is 20 billion dollars.

If we add in all the eBooks Project Gutenberg has republished from other eBook publishers, we approach 100,000 eBooks and a total of 10 trillion eBooks given away if the average one gets to 1.5% of the population.

10 trillion pennies is 100 billion dollars.

And this is only counting up to July 4, 2006, and again only counting the eBooks Project Gutenberg made or republished.

If we count the 1/3 million World eBook Fair books presented on July 4, 2006, and if those reach just 1.5% of the world population, that would be 33.3 trillion copies.

33.3 trillion pennies is 1/3 trillion dollars.
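The penny arithmetic above can be checked with a short script. This is only a sketch: the constant and function names are illustrative assumptions, and the 100 million readers per eBook is the rounded figure used in the text (1.5% of roughly 6.67 billion people).

```python
# Rounded figure from the text: 1.5% of "2/3 of 10 billion" people.
READERS_PER_EBOOK = 100_000_000
CENTS_PER_COPY = 1  # one penny per copy given away


def giveaway_value(ebook_count):
    """Return (copies given away, value in whole dollars) for a collection.

    Hypothetical helper: it assumes every eBook reaches the same
    READERS_PER_EBOOK audience, as the estimate in the text does.
    """
    copies = ebook_count * READERS_PER_EBOOK
    dollars = copies * CENTS_PER_COPY // 100
    return copies, dollars


if __name__ == "__main__":
    collections = [
        ("Project Gutenberg's own eBooks", 20_000),
        ("Including republished eBooks", 100_000),
        ("World eBook Fair 2006", 1_000_000 // 3),
    ]
    for label, count in collections:
        copies, dollars = giveaway_value(count)
        print(f"{label}: {copies:,} copies, ${dollars:,}")
```

Integer arithmetic is used throughout so that the trillions of copies convert to dollars without floating-point rounding.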
And that is just for the First World eBook Fair; just wait a bit for the next three to take place and those figures above will translate to:

1/3 million eBooks in 2006 yields 1/3 trillion dollars
1/2 million eBooks in 2007 yields 1/2 trillion dollars
3/4 million eBooks in 2008 yields 3/4 trillion dollars
ONE million eBooks in 2009 yields ONE TRILLION DOLLARS

just valuing each copy at one penny and reaching just a 1.5% portion of the world population with the average eBook.

*

However, I have tried to keep money out of the equation with respect to the reality of Project Gutenberg, as it is a very important issue to prove that eBooks can be created en masse without any real financial support.

This keeps eBooks in an entirely free realm, from start to finish, without any need for multibillion dollar corporations such as Google, Yahoo!, or even governments; though it would certainly be nice if an assortment of these, or of schools, actually acted on vision statements that SAID that they are interested in the general welfare of the population as a whole.

A Few More Comments On Money

While on the topic of money, I should mention that our money isn't really based on anything much more sophisticated, from an overall point of view, than the monetary system of the Indians to whom a Dutchman gave "$24 worth of beads and trinkets" in exchange for Manhattan Island.

Our own "beads and trinkets" consist of gold and silver from the old days, along with platinum and iridium, etc., from an assortment of newer items on the list.

Let's face it, any spacefaring ship that landed here is more than likely to have come across some number of gold, silver, platinum and iridium asteroids in its travels, and its crew would thus be able to buy anything they wanted from us in exchange for a few "beads and trinkets" they picked up along the way.

It is fairly common knowledge that all of the heavy elements, including even the iron in our blood, were created in novas, the explosions of stars billions of miles away.
I should add that some of the iron is actually made in stars before they reach the nova stage, and thus probably we could find examples of some heavier elements in very dense stars.

But the fact is that we still value shiny objects such as an assortment of precious metals and precious stones, still the basic "beads and trinkets" of primitive money.

*

A Brief History of Project Gutenberg

The First Step to eLibraries, July 4, 1971

To put it simply:

I was the right person in the right place at the right time with the right friends.

There was a LOT of luck involved in making Project Gutenberg and I would warn you to be quite suspicious of anyone who is claiming to have started such a project without lots of luck being a major factor.

I grew up in a house full of books and electronics: luck.
My brother's best friend was a computer operator: luck.
MY best friend was an operator on the same one: more luck.
This was one of the first computers on the Internet: luck.
I just happened to be there when it happened: more luck.
I already knew how to run a TeleType machine: more luck.
I figured out how to run the computer by hanging around.
They gave me my own computer account July 4, 1971: luck.
A copy of The Declaration of Independence in my bag: luck!

When I wondered what I could do to repay the $100 million in my new computer account, I stopped to eat while I thought and thought about what I should do with all that computer power, and when I opened my bookbag and shook out my munchies I saw the copy of The Declaration of Independence tossed in by the grocery bagger at the store: luck!

The proverbial light bulb went off over my head. . .!!!!!!!

And the rest, along with a LOT more luck, and a lot of work, as they say. . .is history.

*

The first 17 years of my work on eBooks were boring.

The first years of the Internet were boring, too.
Anyone who says they were exciting must have been insiders-- with a totally different perspective--as there was nothing-- literally nothing--of interest to the general public.

Therefore I am skipping over all those years in which I made some little efforts at making eBooks without a single word-- not one--in response to the idea of eBooks other than that I was crazy to want to put Shakespeare in a computer.

Remember, this is my own perspective; I didn't know all that was going on in other parts of the world, though I will make an effort to fill in those gaps to some degree here.

Thus we jump from the early 1970's to the second half of the 1980's with just a heartfelt nod to Steves Jobs and Wozniak, the inventors of the personal computer, and Bill Gates, IBM, and the IBM cloners who brought PCs to the mainstream.

Here is the briefest of histories of that period.

My own first computers were cheapies like Ataris, and a few 4040 chips that I never got around to building into anything at all, and then, finally, a couple of second hand CP/M machines that turned out to actually be useful in that I could call a lot of other computers with them, though at great expense, as I learned when I received my first phone bill afterwards.

In addition I got WordStar with each of those machines, thus starting my career of banging away at my own keyboard to see what I could create on my own computers.

I eventually built my first "gray market" IBM machine, for a whopping total of $3300, even using nearly all scrap parts-- and the conversion to DOS from CP/M really got me hooked, as I could write my own batch files to automate nearly all the processes I used on a daily or weekly basis.
In addition I got Word Perfect, which turned out to have the best phone support in the world, in addition to being one of the best programs in the world, and I was off to the races-- so to speak--as eBooks suddenly became much easier to make-- via the macro commands you could write in Word Perfect and a host of other features that made creating eBooks a breeze up to the point of doing the final save. . .that was slow.

In addition, around the same time I became a BBS SysOP for a somewhat famous BBS [Bulletin Board System] and thus learned a LOT about communicating with people around the world long before I got back on the Internet [see the next section].