5 projects that have changed access to information
The way people access information has changed greatly over the last few decades due to the incredible growth of the Internet. Today, people who are interested in any kind of science can easily find explanations that were previously accessible only to academics. They can also take advantage of new communication channels to learn about different points of view and take part in discussions with other people who share their interests.
In this context, some projects stand out for providing the community with tools to organize access to large amounts of information and to ensure the quality and reliability of that information.
Here I want to briefly introduce five of those projects: Wikipedia, the Internet Archive, Khan Academy, GitHub and Reddit.
Wikipedia
Wikipedia is a multilingual online encyclopedia created and maintained as an open collaboration project by a community of volunteer editors using a wiki-based editing system. A wiki is a knowledge base website on which users collaboratively write content directly from a web browser.
Wikipedia was created in 2001 by Jimmy Wales and Larry Sanger to complement another of their projects, Nupedia, an online encyclopedia whose articles were written only by experts and went through an extensive peer-review process before publication. The idea was that Wikipedia would provide articles and ideas for Nupedia, but writing content for Nupedia was extremely slow due to the strict review system, while Wikipedia reached over 20,000 encyclopedia entries in its first year of existence. Wikipedia quickly overtook Nupedia, becoming a global project in multiple languages. Since 2003, Wikipedia has been owned and supported by the Wikimedia Foundation, a non-profit organization that raises money, distributes grants, controls the servers, develops and deploys software, and does outreach to support Wikimedia projects, including the English Wikipedia.
The development of Wikipedia is based on the following pillars:
- Articles should be of general interest and ephemeral information should be avoided. Wikipedia is not an advertising platform, a newspaper, or a collection of source documents.
- Articles should be written from a neutral point of view. Editors’ personal experiences, interpretations, or opinions must be avoided.
- All texts can be edited and redistributed as long as the source is indicated when reusing them and the same Creative Commons (CC) license is used. No editor owns an article.
- Users must respect each other. Tolerance, cordiality and good faith are fundamental to the project.
- Wikipedia has policies and guidelines, but no firm rules. The principles and spirit matter more than literal wording.
Jimmy Wales often talks in his interviews about the importance of informal learning, which does not have a goal in terms of degrees or professional growth, but helps people better understand the world we live in. Wikipedia is a perfect tool for this kind of learning.
Wikipedia is so popular that almost any Google search for an event, a person or simply a movie has a link to a Wikipedia article among its first results. As of April 2020, it is ranked by Alexa as one of the 15 most popular websites. There are currently 309 language editions of Wikipedia, the largest being the English Wikipedia, which includes more than 6 million articles and averages 584 new articles per day.
The Internet Archive
The Internet Archive, also known as archive.org, is a non-profit organization that provides free public access to collections of digitized materials, including websites, software applications/games, music, movies/videos, images, and books. It is funded through donations, grants, and by providing web archiving and book digitization services. The Internet Archive advocates a free and open Internet.
Brewster Kahle founded the Internet Archive in 1996, at around the same time that he started the for-profit web traffic analysis company Alexa Internet. He realized that, unlike newspapers, no one was worried about saving the ephemeral content published on websites, so he decided to use crawler programs to store copies of millions of websites and their associated data (images, source code, documents, etc.) for future use. The archived content has been available since 2001 through the Wayback Machine, a service that can be used to see what previous versions of websites looked like, or to visit websites that no longer exist.
The Internet Archive has grown over time in terms of media and technology, and has developed practical services for users. In 2005 it started a program to digitize books, and today the Internet Archive operates scanning centers in many countries, digitizing about 1,000 books a day. It also hosts a number of other projects, such as the NASA Images Archive and Open Library, a digital lending library and book information site.
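To make the idea of visiting an old version of a website more concrete, here is a minimal sketch of how a Wayback Machine address can be put together. It assumes the commonly seen address pattern `https://web.archive.org/web/<timestamp>/<original URL>`, where the timestamp is written as year, month, day, hour, minute and second; the function name `wayback_url` is my own and only for illustration. The sketch just builds the address and does not contact the archive.

```python
from datetime import datetime

# Assumed address prefix used by the Wayback Machine for archived pages.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, when: datetime) -> str:
    """Build the address of the archived copy of a page around a given date.

    Hypothetical helper: it only formats a string, it does not check
    whether a snapshot actually exists for that date.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


print(wayback_url("http://example.com/", datetime(2005, 1, 1)))
# prints https://web.archive.org/web/20050101000000/http://example.com/
```

When such an address is opened in a browser, the service typically shows the snapshot closest to the requested date, if one was captured.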
Below are some numbers to get an idea of the magnitude of the project (April 2020):
- 424 billion web pages
- 24.7 million books and texts
- 10.6 million audio recordings
- 5.7 million videos
- 3.4 million images
- 532,000 software programs
Many of the digitized documents and photos are old and have gone into the public domain, or have been provided by partners. This includes, for example, books published prior to 1920, magazine collections, video recordings of university classes, and historical photo collections. From the point of view of someone who is fond of history and science, it is like having at hand the research material available in hundreds of libraries and museums.
Khan Academy
Khan Academy is a non-profit organization created by Salman Khan with the purpose of providing free online tools that help educate students of all ages, regardless of their social status or where they live. Through its website, the organization provides a learning platform based mainly on short lessons in the form of videos and supplementary texts. The platform also includes practice exercises and interactive materials for self-evaluation. In addition, everything is organized and classified so that students can learn at their own pace, in the time they have available.
Salman Khan studied computer science and mathematics at the Massachusetts Institute of Technology (MIT), and also holds a Master of Business Administration from Harvard Business School. In 2004, he began remotely tutoring some members of his family. In 2006, he decided to start recording videos and posting them on YouTube so that his relatives could watch them at their own pace, and the videos ended up attracting worldwide interest from both students and non-students. In 2008, Salman Khan took a step forward and founded Khan Academy as a non-profit organization. One year later he decided to quit his hedge fund job and work full time on his project. Since then, Khan Academy has received large grants from well-known institutions such as the Bill and Melinda Gates Foundation, the Carlos Slim Foundation, AT&T and Google.
At the beginning, Khan Academy focused on mathematics, but now there are many courses available that can be classified into five areas: mathematics, science & engineering, computing, arts & humanities and economics. There is more content in English than in other languages, but the organization strives to translate its content into different languages. Khan Academy also develops online tools for teachers and collaborates with schools to improve their teaching methods.
Among Salman Khan’s educational proposals, what he calls the “inverted class” stands out: students prepare the topics at home using materials such as videos, which they can pause and repeat whenever they want in order to learn at their own pace (the lectures), and use the time in class with the teacher to solve exercises and practical cases (the homework). He insists on studying to master a topic, not only to pass an exam, and highlights the importance of continuous learning. Today we can no longer believe that obtaining a diploma is enough to have our professional life solved; it is necessary to acquire new skills and knowledge constantly.
Khan Academy is an outstanding example of the current global trend to provide quality online courses. Other platforms worth knowing are edX, created in 2012 by the Massachusetts Institute of Technology and Harvard University, and Udacity, which provides free courses in collaboration with companies as famous as Google.
GitHub
GitHub is a collaborative online development platform for creating group projects. It was designed to be used by computer programmers, and is really popular among Open Source software developers. However, something that makes GitHub especially interesting is that over time people have found multiple ways to leverage the platform that go beyond software development. At the core of the platform is the Git version control system, the same system used by the Linux kernel developers to control changes made to their source code. Although Git was designed to track changes in the code of a computer program, it can also be used to track changes in any type of text file, such as a book or technical documentation. In addition to the version control functionality of Git, GitHub provides several collaboration features such as access control, bug tracking, feature requests, task management, and wikis for every project.
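To give a rough idea of what tracking changes in a text file means, here is a small sketch that compares two versions of a book chapter line by line. It uses Python's standard `difflib` module, not Git itself, and Git's internal storage works differently; the file name `chapter1.txt` and the function name `describe_change` are made up for the example. The output resembles the kind of diff that version control tools show, with removed lines marked `-` and added lines marked `+`.

```python
import difflib


def describe_change(old_text: str, new_text: str, name: str = "chapter1.txt") -> str:
    """Return a unified diff between two versions of a text.

    Illustration only: this mimics the look of a version control diff
    using the standard library, it is not how Git stores history.
    """
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(diff)


old_version = "It was a dark night.\nThe end.\n"
new_version = "It was a dark and stormy night.\nThe end.\n"
print(describe_change(old_version, new_version))
```

The unchanged line ("The end.") appears only as context, which is why this approach works as well for prose or documentation as it does for program code.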
GitHub, Inc. was founded by Chris Wanstrath, P. J. Hyett, Tom Preston-Werner and Scott Chacon in 2008 in San Francisco, California. At that time, there were no commercial Git hosting options, and GitHub gave developers a way to host code securely and manage commits properly. The adoption of GitHub for managing Open Source projects grew rapidly, which made paid Git hosting a viable option, and paid subscriptions made the project profitable. On October 26, 2018, Microsoft acquired GitHub for $7.5 billion. As of January 2020, GitHub reports having over 40 million users and more than 28 million public repositories, making it the largest host of source code in the world.
Today many technology companies host their Open Source projects on GitHub, including Google, Facebook, Twitter, and even Microsoft. Some of GitHub’s most popular code repositories are TensorFlow, Electron, Flutter, and Kubernetes. Having all that code publicly available is in itself an invaluable resource for learning programming, and an opportunity for developers to demonstrate their skills by making contributions. Furthermore, GitHub has an education program that gives students and teachers free access to popular development tools and services.
As I said before, GitHub’s platform has uses that go beyond software. Some people use GitHub repositories to collaboratively write documents on various topics, or to create collections of links to tutorials or papers. There are legal document templates, cooking recipes, cheat sheets, books… It’s just a matter of imagination.
Reddit
Reddit is a discussion website where users submit content such as links, text posts, or images, and other users can add comments about this content in what is essentially a bulletin board system. Posts are organized by subject into user-created boards called subreddits, which cover a variety of topics like science, movies, video games, music, books, history, or economics. Each subreddit is a community with its own focus and moderators. It is possible to find wide-ranging discussions, for example about books of a particular genre (r/printSF), and more specific discussions, for example about a particular book (r/Neuromancer). Users can cast positive or negative votes for each post, and the number of upvotes and downvotes determines the post’s visibility on the site, so the most popular content is displayed to the most people.
Reddit was created by University of Virginia roommates Steve Huffman and Alexis Ohanian in 2005. The original idea was described as the “front page of the Internet”. The site was acquired by Condé Nast Publications in 2006. In 2009, Huffman and Ohanian left Reddit. As of April 2020, Reddit ranks as the No. 7 most visited website in the U.S. and No. 21 in the world, according to Alexa Internet. The site is especially popular in the U.S., with around 40% of its user base coming from that country. There are 430 million monthly active users making comments in more than 1.2 million subreddits.
Reddit is known for its open nature and for the diverse user community that generates its content. The site administrators defend the freedom of expression and privacy of their users.
I think the best way to understand the kind of information that can be found on Reddit is to include here a short list of subreddits I consider interesting:
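The voting idea can be sketched in a few lines of code. This is a simplified illustration, not Reddit's actual ranking method, which is more elaborate: here I simply assume that a post's score is its upvotes minus its downvotes, and that posts with a higher score are shown first. The names `Post` and `rank_posts` and the sample posts are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    upvotes: int
    downvotes: int

    @property
    def score(self) -> int:
        # Assumed scoring rule for this sketch: net votes.
        return self.upvotes - self.downvotes


def rank_posts(posts: list[Post]) -> list[Post]:
    """Order posts from highest to lowest net score."""
    return sorted(posts, key=lambda post: post.score, reverse=True)


posts = [
    Post("Favorite cyberpunk novels?", upvotes=10, downvotes=2),
    Post("New exoplanet discovered", upvotes=50, downvotes=5),
    Post("Off-topic rant", upvotes=3, downvotes=9),
]
for post in rank_posts(posts):
    print(post.score, post.title)
```

With these sample numbers, the exoplanet post comes first and the heavily downvoted rant sinks to the bottom, which is the effect the voting system is meant to have.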
- r/science (23.9m users): This community is a place to share and discuss new scientific research. Read about the latest advances in astronomy, biology, medicine, physics, social science, and more.
- r/askscience (19.0m users): Ask a science question, get a science answer.
- r/history (14.8m users): This community is a place for discussions about history.
- r/askhistorians (1.2m users): The Portal for Public History.
- r/economics (800k users): News and discussion about economics, from the perspective of economists.
- r/programming (2.6m users): News and discussion about Computer Programming.
- r/sciencefiction (133k users): This community is a place for fans and creators of Science Fiction and related media in any form.