Tag Archives: wikipedia

Why China (and other countries) are banning Wikipedia

Whenever people have sought to exercise their freedom of expression, censorship wasn’t too far behind. The word itself can be traced to the office of censor established in Rome in 443 BCE — the Roman Republic thought good governance included shaping the people’s character. More than 2,000 years later, access to information is still not free in all places of the world despite the fact that we’re now living in the digital age of the internet. And like Rome before it, China is also adamant about what kind of information it allows and, more importantly, what it doesn’t allow its citizens to access.

There are hundreds if not thousands of websites blocked in China, including Google, YouTube, Facebook, and, as of recently, Wikipedia.

The Wikimedia Foundation released a statement on May 17, 2019, confirming that Wikipedia was “no longer accessible in the People’s Republic of China—impacting more than 1.3 billion readers, students, professionals, researchers, and more who can no longer access this resource or share their knowledge and achievements with the world.”

Censoring knowledge in broad strokes

Banning encyclopedias isn’t a modern occurrence. In 1752, the French royal court ordered the distribution of the Encyclopédie to be immediately ceased on grounds that it was “destroying royal authority and encouraging a spirit of independence and revolt.” Denis Diderot and other authors of the French encyclopedia were charged as heretics for suggesting that observation and reason are the sources of knowledge rather than religious authorities.

China banned Wikipedia without warning, just as it had previously blocked thousands of other websites to enforce its ‘Great Firewall’, a strictly controlled ‘Chinanet’ where only party-approved values and opinions are publicly allowed.

But despite China’s silent approach to banning Wikipedia, we can’t say that it was surprising given the history of the two entities.

Wikipedia has been blocked intermittently since 2004 over certain Wiki pages that the Chinese Communist Party considers controversial, such as those on the 1989 Tiananmen Square protests, Mao Zedong, and Taiwan.

Most sensitive political articles remained blocked in mainland China for both the English and Chinese versions of Wikipedia until June 2015. Up until then, Wikipedia had been using the non-secure HTTP version, which allowed individual articles to be selectively blocked. But after that date, Wikipedia switched to HTTPS for its entire site, thereby making encryption mandatory for all users and making it impossible to block individual pages. In response, China simply blocked the whole domain.
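To make the difference concrete, here is a minimal sketch of what an on-path filter can read in each case. It assumes a simplified model in which plain HTTP exposes the host and the page path, while HTTPS exposes only the hostname (for example through DNS lookups or the TLS handshake). The function names and the blocklist entries are hypothetical and only serve the illustration.

```python
from urllib.parse import urlsplit


def visible_to_filter(url: str) -> str:
    """Return the part of a request a network filter can read (simplified model)."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        # Plain HTTP: host and page path travel unencrypted.
        return parts.netloc + parts.path
    # HTTPS: the path is encrypted, so only the hostname is exposed.
    return parts.netloc


def is_blocked(url: str, blocked_pages: set, blocked_hosts: set) -> bool:
    """Block a request if its visible part matches a page rule or its host is banned."""
    host = urlsplit(url).netloc
    return visible_to_filter(url) in blocked_pages or host in blocked_hosts


# Hypothetical per-page blocklist entry.
pages = {"en.wikipedia.org/wiki/Sensitive_topic"}

print(is_blocked("http://en.wikipedia.org/wiki/Sensitive_topic", pages, set()))   # True
print(is_blocked("http://en.wikipedia.org/wiki/Cat", pages, set()))               # False
print(is_blocked("https://en.wikipedia.org/wiki/Sensitive_topic", pages, set()))  # False
print(is_blocked("https://en.wikipedia.org/wiki/Cat", pages, {"en.wikipedia.org"}))  # True
```

Under this model, a per-page rule stops matching once the site is HTTPS-only, so the only rule that still works is one on the whole host, which takes every article down with it.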

The Chinese government banned Wikipedia for all languages, not just Mandarin, as it was aware of advances in translation software that can enable anyone to read content from other encyclopedia editions.

Of course, there’s always the option of using a VPN in order to browse Wikipedia from China — but do so at your own risk. On 24 October 2020, a Chinese citizen from Zhoushan, Zhejiang province, came under police attention for “illegally visiting Wikipedia”. The extent of the VPN crackdown is not clear, but news reports have documented police arresting VPN users in at least Hunan and Guizhou.

“There’s no doubt that similar things are happening much more often this year,” Mo Shaoping, a prominent Chinese human-rights lawyer, told The Globe and Mail. “Restrictions for online communities and activities have become increasingly strict. The number of cases in which people are held legally accountable is also growing,” even for reposting controversial foreign content domestically.

The only other country besides China to have banned all editions of Wikipedia is Turkey. Turkish authorities demanded that Wikipedia “remove content by writers supporting terror and of linking Turkey to terror groups.” In December 2019, the Constitutional Court of Turkey ruled that the block of Wikipedia was unconstitutional, and since 15 January 2020 the website has once again been accessible in Turkey. However, Wikipedia remains fully blocked in China, and it seems likely to stay that way for the foreseeable future.

Meet the Internet’s unsung heroes: Wikipedia’s human collaborators

It would be impossible to imagine the world today without Wikipedia — a fact that students around the world can gratefully attest to.

Although editor bots often steal the limelight in conversations about this great resource, a lot of people have been putting in a lot of work to make Wikipedia what it is today.

Wikipedia actually has a pretty interesting page dedicated to tracking the most prolific authors on the site. All contributors are equal in the eyes of the site and its userbase, so this list isn’t about giving anyone bragging rights.

It serves to acknowledge the people putting considerable time and effort into creating this unique repository of knowledge that we all use daily (and mostly take for granted). In my eyes, they’re the unsung heroes, the ‘real MVPs’ of the internet, and a list commemorating their work is the least of what we should do for them.

But let’s get to know who they are so that we know who to be thankful to while scrambling to meet that paper deadline in the wee hours of the morning.

Steven Pruitt / Wiki user Ser Amantio di Nicolao

Steven Pruitt is an editor from Virginia, USA, with over three million edits and more than 35,000 written articles under his belt. Pruitt is hands down the most prolific human Wikipedia editor and publisher — not a bad accolade to hold.

Time magazine seems to agree, as it named Pruitt as one of the 25 most important influencers on the Internet in 2017.

Pruitt works as a contractor for U.S. Customs and Border Protection but also finds the time to edit, flesh out, and create material for the online encyclopedia.

“It started in 2001,” he explained in a Reddit AMA (ask me anything) thread in 2019.

“I matriculated college in 2002. I remember watching it climb in the Google search results, from the bottom of the first page to about two or three from the top. Honestly, I didn’t think it was going to take off…but it kept showing up, and one day I thought, ‘What the hell?’, and jumped in. I’m not sure I believed the ‘anyone can edit’ part of it until I became part of ‘everyone’.”

Pruitt is also one of the leading forces that helps shine a light on the achievements of women throughout history (my personal favorite is Hypatia), having written 212 new articles detailing the lives and achievements of influential women when the Time Magazine piece was published. He is also part of the Women in Red initiative, which is “focused on improving content systemic bias in the wiki movement”.

However, he doesn’t focus solely on this or any other topic. His primary sources of information, according to the AMA thread, are “books, mostly encyclopedias”, alongside material on the web or other sources “as long as they pass a small test :)”.

As to why he does it, it’s the oldest reason in the book — “it’s a hobby”.

“I have my moments, I think everyone does,” he said when asked whether he ever felt like he’s putting too much time and effort into Wikipedia. “But then I look back on some of the articles I’ve written […] and it feels good. That wonderful feeling of having made something useful. That’s what keeps me going, often as not.”

Pruitt adds that he has been approached with offers to write Wikipedia articles for pay by “a couple of people” and only said yes once because “I genuinely felt the subject deserved an article, and would pass the notability test”, but didn’t accept payment for it.

“I know it sounds cheesy, but I’ve come to believe that we, collectively, are changing the world and the way the world thinks about knowledge. That’s an amazing thing to think about, and it still blows my mind.”

It’s safe to say that without Pruitt, Wikipedia — and maybe the internet — wouldn’t be the same.

Justin Knapp / Wiki user Koavf

Knapp was the top contributor from April 18, 2012, to November 1, 2015, when Pruitt took the title. While Justin may no longer be the most prolific contributor to Wikipedia by sheer number of edits and posts, he will forever remain the first to reach one million edits on the site. As of March 2020, he has performed over 2 million edits and doesn’t seem to be losing any steam.

We have to keep in mind the dedication and the workload people such as Knapp and Pruitt put themselves through for our collective benefit. By the time he reached 1 million edits in 2012, Knapp had submitted an average of 385 edits a day, every day, for seven years (starting in 2005).
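As a quick sanity check on those figures, the arithmetic can be written out. This is a back-of-the-envelope sketch that ignores leap days; the variable names are just for illustration.

```python
EDITS_PER_DAY = 385   # average quoted for Knapp
YEARS = 7             # 2005 to 2012
DAYS_PER_YEAR = 365   # leap days ignored for simplicity

total_edits = EDITS_PER_DAY * DAYS_PER_YEAR * YEARS
print(f"{total_edits:,} edits")  # 983,675 edits
```

That lands just under the one-million mark, so the quoted daily average is consistent with the milestone.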

To be fair, he does have a perk most of us don’t: a degree in philosophy (and one in political science, but it’s harder to make unemployment jokes with that one) from Indiana University – Purdue University Indianapolis.

“Being suddenly and involuntarily unemployed will do that to you,” he wrote on his personal page, according to The Telegraph.

His work didn’t go unnoticed in the community, with Wikipedia co-founder Jimmy Wales congratulating him and declaring that April 20 would be Justin Knapp Day. He says that he doesn’t have a fixed routine with regard to his editing and that his “go-to edits are small style and typo fixes”. Philosophy, politics, religion, history, and popular culture are some of the categories he works on most.

In his day-to-day life, Knapp has had several odd jobs, including delivering pizza, working at a grocery store, and serving as an operator for a crisis hotline. He also owns a magnificent beard.

I find it quite infuriating that the work of people such as Knapp goes unrecognized by the vast majority of society, despite all the immense benefits it brings. His is a prime example of why careers aren’t a true reflection of an individual’s worth and merits. We all are liable to look down on someone for being “just a pizza delivery guy,” or “just a cashier lady,” with the unstated but implied belief that if they’d only work hard and educate themselves as we do — perhaps on Wikipedia — they would deserve the quality of life we enjoy.

But that pizza delivery guy and that girl working her fingers to the bone behind the counter might just be the person who wrote and corrected the Wiki page you used in your dissertation at school or high-stakes presentation at work. And they may be working “up to 16 [hours] a day” to allow anyone, anywhere, including you, access to the sum of human knowledge.

Wiki user BrownHairedGirl (BHG)

Gráinne Ní Mháille (Grace O’Malley), the pirate queen of Ireland, whom BHG calls “one of her heroes”.
Image modified after Wikipedia.

BHG is the definition of an unsung hero, because she wants to “neither tell [her] life story or reveal all sorts of interesting details about [herself].”

All we know of the elusive BHG comes from her user page. She’s currently living in Connacht, Ireland, was “expelled from the University of Life”, her eye color is “working”, and she seems to be in a romantic triangle with Gráinne and Finn McCool, two figures from Irish mythology. Which, one would assume, makes her Mrs. McCool.

Beyond her sense of humor, BHG is also a major contributor to Wikipedia, having performed close to 1.8 million edits. She’s one of the scarier admins of the site, with close to 12,000 page deletions and 245 user bans under her belt in her 14 years and 4 months’ worth of work on the site. But she also has a nurturing side, restoring over 1,100 pages back to the Wiki.

BHG spawned BHGbot in 2007, a bot that tags “talk pages of articles and categories [to] identify the articles as being within the scope of a particular WikiProject”, although we are not informed of whether she did so by Finn McCool or someone else. Tragically, BHGbot’s life was cut short in 2009.

Her work further includes quite a bit of heavy lifting behind the scenes, which has to do with streamlining the way Wikipedia’s automated systems handle category indexing. I won’t pretend to fully understand what that means, but it involves coding.

BHG mostly works “on Irish topics, especially politics,” she explains on her user page.

“I have also done a lot of Scottish politicians and judges, and on Westminster MPs from across the UK. Plus Irish and UK constituencies and by-elections. I created the page Families in the Oireachtas [the Irish legislature], built and developed a large chunk of the articles on UK constituencies, and built the [navigation boxes] which unify navigation between the constituencies in Ireland of 5 different parliaments and assemblies.”

She cites the British Newspaper Archive (BNA) as her “most useful source” of information for her work on Wikipedia, noting that “sadly, the BNA no longer offers free access to Wikipedia contributors” but that she plans to one day take out a subscription and “start writing articles again”.

Richard Farmborough / Wiki user Rich Farmborough

A 53-year-old Richard Farmborough.
Image credits SKBalchal / imgur.

Another one of Wikipedia’s mysterious contributors (hard facts and mystery: this site has it all), Rich has made close to 1.7 million edits since he joined in March of 2015 (according to his user page).

His user page is quite cryptic. It tells us that Richard is skilled in technical and general editing and that he likes “to welcome, facilitate, and enable new editors”.

I had to do some poking around to find out more about the man behind the Wiki user Rich, but from what I’ve found, he seems like a genuinely cool guy. Writing in a blog post for Wikimedia, Syed Muzammiluddin explains that Rich “developed a passion for English Wikipedia the moment he discovered the project as early as 2004,” especially given his previous experience with and interest in bulletin boards. He considers himself to have been a “full-time Wikipedian” ever since March 2012 “with some gaps”.

According to the same source (it also has a photo of a younger Richard), he was born and brought up in Enfield (a London borough) and holds a degree in Mathematics. As far as careers go, he has worked as a professional in car insurance, in e-commerce, and in academia.

“The English Wikipedia should be considered a storehouse of resources,” he told Muzammiluddin. “Given the ubiquity of the language, anyone with even a passable command of English can make a valuable contribution to Wikipedias in other languages. Not just in articles, policies, and guidelines, but also in the wide reuse of templates—saving thousands of hours.”

British tabloid The Sun describes Rich as “retired project manager Richard, from Stamford, Lincs”, so make of his bio what you will. However, they do have some interesting quotes detailing Richard’s experience with the encyclopedia and what drives him to contribute:

“When I first found Wikipedia I started jumping in and editing as I read, adding bits here and there. If I see something that needs doing, I will do it. It might mean writing a few sentences, but it could be as simple as fixing a typo,” he explains.

“It seems like a lot of time, but what else would I be doing? Watching videos of cats on YouTube? At least this is productive.”

He further explains that it is important to him “that knowledge is accessible to all,” and volunteers like him “are making that possible — one edit at a time.” His advice to everyone out there is to “be bold, be patient, and be kind”.

That being said, though, I am very partial to cat videos on YouTube.

Wiki user BD2412

“This editor is a Grandmaster Editor First-Class and is entitled to display this Mithril Editor Star with the Neutronium Superstar hologram”: this is the message that, in a bright yellow box alongside a picture of said Star, greets you upon accessing BD2412’s profile page.

This is the best picture I’ve been able to find of BD2412 — conveniently supplied by himself.

BD2412 joined Wikipedia in December 2005, making him one of the longest-serving Administrators of the site.

Among the few tidbits of information we get from his page is that BD2412 is a lawyer. Perhaps unsurprisingly, his profile lists “law” and “people in law” as some of his main areas of contribution. Wikipedia lists him as having made in excess of 1.5 million edits, and BD2412 states that he has edited about 14.25% of the articles on Wikipedia, no mean feat by any measure.

“If you have edited more than seven articles, there is probably an article that both you and I have edited,” he adds.

BD is also “an administrator on English Wiktionary; and on English Wikisource; and an admin and bureaucrat on English Wikiquote,” and judging by the pins on his page, has published peer-reviewed articles in academic journals.

I also like these two quotes he has on his profile — one about what Wikipedia is for and how it should function, the other about cautioning that the source of information you’re using can shape the data that you find:

“Wikipedia is not an experiment in democracy: its primary method of finding consensus is discussion, not voting. That is, majority opinion does not necessarily rule in Wikipedia. Various votes are regularly conducted, but their numerical results are usually only one of several means of making a decision. The discussions that accompany the voting processes are crucial means of reaching consensus.”

“Be aware of Google bias when testing for importance or existence: bear in mind that Google will be biased in favor of modern subjects of interest to people from developed countries with Internet access, so it should be used with some judgment.”


These are just 5 of Wikipedia’s contributors — granted, they are the 5 most prolific ones as measured by the number of their edits and posts, but they’re by no means the only ones. They’re just 5 out of a page listing over 5000 people. And there’s a second page dedicated to yet another 5000 (presumably to keep the lists navigable).

It’s a testament to these people, their work, and the mind-boggling wealth of data that Wikipedia encapsulates, that with very few exceptions (i.e., one Reddit thread, a quote, and a picture), everything I’ve written here is available on the platform.

We tend to take it for granted. There’s absolutely no shred of doubt, at least to most of us in the West, that if an internet connection is available, we can access data on virtually anything; on everything. In 30 seconds we can use our smartphone to say “see, I was right” in an argument with our friends just by typing “wi” and hitting enter on Wikipedia when it comes up as the first suggestion.

For any of our ancestors, Wikipedia would be nothing short of a miracle. We know it’s not; it’s just a system constructed on electricity forced into silicon chips. But systems are only as powerful as the people who build them allow them to be. In that light, the work contributors such as these five here do, without asking for recognition, fame, praise, or fortune, often refusing it voluntarily, is nothing short of a modern miracle.

All images used in this post, unless otherwise stated, are sourced from Wikipedia.

Even bots have arguments. Some Wikipedia bots can undo each other for years before settling an edit

Credit: prisonplanet.com.

Computer scientists from the University of Oxford and the Alan Turing Institute in the UK analyzed how both human editors and bots interact on Wikipedia. Anyone can edit an entry on Wikipedia, which is why human volunteers and bots specially made to check facts and edits are so important. Sometimes editors will make back-and-forth changes before reaching a final decision. The same goes for bots, the study found. Interestingly, bots on some pages of Wikipedia ‘argued’ for years before finally settling.

Robot debate

You might be surprised to learn that most of the internet’s traffic isn’t created by humans. Overall, bots are responsible for 52 percent of web traffic, according to the security firm Imperva. Most of these bots are harmful, designed to attack websites by scraping content, spamming comments, brute-forcing passwords or injecting malicious content such as malware. But there are many bots that serve a wide range of functions, most notably crawling. Facebook’s feed fetcher, by itself, accounted for 4.4 percent of all website traffic.

Wikipedia has its own bots and, frankly, the website couldn’t run without them. These bots edit millions of pages every year and are responsible for the bulk of mindless jobs such as formatting sources and links. Some can even start new pages with minimal content — entries known as stubs — to get the conversation going.

To get a sense of the scale involved, Thomas Steiner from Google Germany monitored bot activity across all 287 language versions of Wikipedia and on Wikidata. In 2014, he found that human and bot edits were split roughly 50-50 across all language versions of Wikipedia, but with huge variation from language to language. Only 5 percent of the edits to the English-language version of Wikipedia were made by bots, in contrast to 94 percent of the edits to the Vietnamese version. And despite this huge bot activity, Wikipedia remains over 95% accurate, beating most textbooks.

Professor Taha Yasseri from the Oxford Internet Institute and colleagues wanted to see how these bots interact, given that some overlap in their tasks. The team looked at Wikipedia entries across 13 different languages over ten years (2001 to 2010). The study suggests there are many instances where the bots interacted with unpredictable consequences.

Remarkably, the bots behaved more like humans than expected, changing their behavior depending on the cultural context. Bots on the German edition, for instance, had the fewest conflicts, with each undoing another’s edits 24 times on average. Bots on the Portuguese edition, however, quarreled over edits as many as 185 times on average over the ten-year monitoring period. For the English edition, there were 105 back-and-forth edits on average, or three times the rate of human reverts.

Some of these fights could rage on for years. In some situations, there were deadlocks as the bots would repeatedly undo one another.
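To picture what such a back-and-forth looks like in data, here is a minimal sketch that tallies reverts between pairs of bots and keeps only the pairs where each side has undone the other at least once. It is not the method used in the study; the log format, the function name, and the bot names are all made up for illustration.

```python
from collections import Counter


def count_mutual_reverts(revert_log):
    """Tally reverts per pair of bots, keeping only pairs that reverted each other.

    revert_log is a list of (reverter, reverted) tuples, one per revert.
    Returns {(bot_a, bot_b): total reverts in both directions}, with bot_a < bot_b.
    """
    directed = Counter(revert_log)  # (reverter, reverted) -> number of reverts
    mutual = {}
    for (a, b), n in directed.items():
        # Visit each unordered pair once, and only if the revert was reciprocated.
        if a < b and (b, a) in directed:
            mutual[(a, b)] = n + directed[(b, a)]
    return mutual


# Hypothetical revert log.
log = [
    ("BotA", "BotB"),
    ("BotB", "BotA"),
    ("BotA", "BotB"),
    ("BotC", "BotA"),  # one-sided, so not counted as a fight
]
print(count_mutual_reverts(log))  # {('BotA', 'BotB'): 3}
```

A pair whose count keeps growing over the years without either side backing down would correspond to the deadlocks described above.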

“We find that bots behave differently in different cultural environments and their conflicts are also very different to the ones between human editors. This has implications not only for how we design artificial agents but also for how we study them. We need more research into the sociology of bots,” said Dr Milena Tsvetkova, from the Oxford Internet Institute.

The findings published in PLOS ONE should come as a warning to developers who write code for artificial intelligence systems ranging from autonomous driving to cyber security to managing social media. Developers should be aware of the bots’ different cultural contexts — or rather the cultural context of the human designer.

“The findings show that even the same technology leads to different outcomes depending on the cultural environment. An automated vehicle will drive differently on a German autobahn to how it will through the Tuscan hills of Italy. Similarly, the local online infrastructure that bots inhabit will have some bearing on how they behave and their performance. Bots are designed by humans from different countries so when they encounter one another, this can lead to online clashes. We see differences in the technology used in the different Wikipedia language editions and the different cultures of the communities of Wikipedia editors involved create complicated interactions. This complexity is a fundamental feature that needs to be considered in any conversation related to automation and artificial intelligence,” Yasseri said.

This paper is also one of the few that studies ‘bot sociology.’ As more and more AI systems interact with one another, as well as with humans, it will be very important that they do so only when there’s a consensus, to avoid unintended, possibly catastrophic consequences.

French computer scientist turns Wikipedia into a universe of knowledge. Literally

What’s life worth if you don’t sometimes waste a whole afternoon on Wikipedia, chain-reading entries? Not much.

But with so much information available, I sometimes have difficulty staying focused on one topic and then I start shotgunning articles left and right. Thankfully, Wikiverse comes to help put order into the chaos by displaying all the articles on Wikipedia as a tiny universe of information for you to navigate. Which is awesome.

Interconnected topics form clusters of stars, each one a single article (that will load up right in the interface if you click on it). Each star is visually connected to related topics through colored loopy lines, so you can hop around like you would on the actual Wikipedia website. Zoom out to see how it all fits together, then zoom in for the actual information.

Wikiverse is the latest update of a 2014 Chrome experiment called WikiGalaxy, which sadly never truly took off. The software is designed by Owen Cornec, a French computer scientist who wanted to make Wikipedia more engaging. He initially tried to color-code star clusters by category, but there was just too much information and he ran out of colors.

So he just made different clusters stand out from each other and used colors to indicate whether an entry belonged to one cluster or another. Wikiverse also runs more smoothly than the older WikiGalaxy, even on browsers other than Chrome (I had a lot of fun on it and I run Mozilla).

So if it’s been a long week and all you want to do is unwind, there’s now a whole universe (of information) you can explore.

The 15 most edited pages on Wikipedia

Wikipedia celebrates 15 years of feeding eager minds with knowledge and helping undergraduates turn in reports on time. To mark the occasion, the website was gracious enough to post some interesting stats, including its most edited entries. Eight years since he left office, George W. Bush still tops the list with 45,862 edits as of last week, when Wikipedia compiled the list, suggesting he’s maybe the most controversial public person in recent history.

The former U.S. president’s Wikipedia entry has been edited tens of thousands of times. A controversial figure for sure.

Bush’s Wiki page has roughly 1.6 times as many edits as the closest historical figure/celebrity to make the list: Michael Jackson (28,152), followed by Jesus (28,084), Barack Obama (24,708), Adolf Hitler (24,612) and Britney Spears (23,802).

The second most edited English Wiki page of all time is the List of WWE personnel. Does the controversy stem from the fact that wrestling isn’t actually a real competitive sport, or from people being so deeply passionate about it? Worth looking into, maybe.

The most edited page of last year (2015) was Deaths in 2015, which compiles the deceased personalities of the year.

With its ups and downs, Wikipedia to this day remains the go-to source for quick access to knowledge, references and hours-long rabbit-hole incursions. While anyone can edit an entry in Wikipedia, that doesn’t make it less trustworthy. The website has been remarkably adept at moderating articles through its impressive network of editors and contributors. For instance, a study found the accuracy of drug information on Wikipedia was 99.7%±0.2% when compared to the textbook data.

Here’s the full list:

  1. George W. Bush (45,862 edits)

  2. List of WWE personnel (42,863)

  3. United States (35,742)

  4. Wikipedia (33,958)

  5. Michael Jackson (28,152)

  6. Jesus (28,084)

  7. Catholic Church (26,421)

  8. List of programs broadcast by ABS-CBN (25,188)

  9. Barack Obama (24,708)

  10. Adolf Hitler (24,612)

  11. Britney Spears (23,802)

  12. World War II (23,739)

  13. Deaths in 2013 (22,529)

  14. The Beatles (22,399)

  15. India (22,271)

Study shows Wikipedia Accuracy is 99.7%

Wikipedia is a resource used by people everywhere, from middle school students to college students (and it’s safe to say that researchers also use it from time to time). But the blessing and the curse of Wikipedia is that everyone can edit it. That means that a massive number of articles can be written and managed, but it also means that inaccurate information can easily sneak into articles. A group of German researchers set out to test that and see just how accurate Wikipedia really is.

“[The] lack of a formal editorial review and the heterogeneous expertise of contributors often results in skepticism by educators whether Wikipedia should be recommended to students as an information source. In this study we systematically analyzed the accuracy and completeness of drug information in the German and English language versions of Wikipedia in comparison to standard textbooks of pharmacology”, researchers write.

Image via Wiki Commons.

Indeed, it seems like a good place to start. They analyzed articles on drugs, extracting every piece of relevant information, as well as references, revision history and readability. Their conclusion is that the accuracy of drug information on Wikipedia was 99.7%±0.2% when compared to the textbook data. However, even though the articles were very accurate, they weren’t fully complete. The researchers rated the completeness of articles at 83.8±1.5%, with huge variation between articles, ranging from 68.0% to 91.0%. This difference shows that Wikipedia is not always the best resource to draw complete information from, but in this sample it always provided over two thirds of the whole story. Furthermore, of the drug information missing from Wikipedia, 62.5% was rated as didactically non-relevant in a qualitative re-evaluation study.
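To see what those two figures imply together, here is a rough back-of-the-envelope calculation. It assumes, as a simplification that the study itself does not state, that the 62.5% non-relevant share applies evenly to the average completeness gap.

```python
completeness = 83.8         # mean completeness reported in the study, in percent
non_relevant_share = 0.625  # share of missing information rated didactically non-relevant

# Everything the textbooks cover that the average article lacks.
missing = 100 - completeness

# The part of that gap that was still considered relevant for teaching.
relevant_missing = missing * (1 - non_relevant_share)

print(f"missing overall: {missing:.1f}%")
print(f"missing and relevant: {relevant_missing:.1f}%")
```

Under that assumption, only around 6% of the textbook material was both missing and judged relevant, which puts the completeness score in a kinder light.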

This is crucial especially in areas which change a lot, such as pharmacology. The fact that you have this huge resource from which you can draw massive amounts of information is remarkable. The fact that it is open source, ad free, community driven (though moderated) and still manages to have an almost perfect accuracy is simply amazing! The only problem I have with this study is the sample size. Of course, it’s a tough analysis to conduct, but 100 drugs is still not enough to draw definite conclusions.

Journal Reference: Jona Kräenbring, Tika Monzon Penza, Joanna Gutmann, Susanne Muehlich, Oliver Zolk, Leszek Wojnowski, Renke Maas, Stefan Engelhardt, Antonio Sarikas. Accuracy and Completeness of Drug Information in Wikipedia: A Comparison with Standard Textbooks of Pharmacology. DOI: 10.1371/journal.pone.0106930

This author edits 10,000 Wikipedia entries a day

Photo: prisonplanet.com

Sverker Johansson could be the definition of prolific. The 53-year-old Swede has so far edited 2.7 million articles on Wikipedia, or 8.5% of the entire collection. But there’s a catch: he did this with the help of a bot he wrote. Wait, you thought all Wikipedia articles were written by humans?

A good day’s work

“Lsjbot”, Johansson’s prolific bot, writes around 10,000 Wikipedia articles each day, mostly cataloging obscure animal species, including butterflies and beetles, as well as towns in the Philippines. About one-third of his entries are uploaded to the Swedish Wikipedia, while the rest are written in two versions of Filipino, his wife’s native tongue.

Judging from this master list, there are a myriad of Wikipedia bots, like the famous rambot, which is used to generate articles on U.S. cities and counties. In fact, half of all Wiki entries are written by bots, and Lsjbot is the most prolific of them all.

So, how does the bot write anything that a human can remotely understand? Well, computer semantics have come a long way, and Johansson, who holds degrees in linguistics, civil engineering, economics and particle physics, did a pretty good job, to his credit. His algorithm pulls out information from credible sources, rehashes the information and arranges any figures, important numbers or categories in a predefined narrative. Don’t imagine the bot writes a whole novel, though.
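To make the idea of a “predefined narrative” more concrete, here is a minimal sketch of template-based stub writing. This is not Johansson’s actual code: the record fields, the sentence templates and the example species are all made up for illustration. It only shows the general approach of slotting structured data into fixed sentences and skipping whatever is missing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeciesRecord:
    """Hypothetical structured record, e.g. one row pulled from a species database."""
    name: str
    kind: str                           # e.g. "beetle" or "butterfly"
    family: str
    described_by: Optional[str] = None  # optional fields may be absent in the source
    year: Optional[int] = None
    region: Optional[str] = None


def write_stub(rec: SpeciesRecord) -> str:
    """Slot the record's fields into a fixed narrative, skipping missing ones."""
    sentences = [f"{rec.name} is a species of {rec.kind} in the family {rec.family}."]
    if rec.described_by and rec.year:
        sentences.append(f"It was described by {rec.described_by} in {rec.year}.")
    if rec.region:
        sentences.append(f"It is found in {rec.region}.")
    return " ".join(sentences)


if __name__ == "__main__":
    # "Examplea fictiva" and "A. Author" are invented placeholders, not real data.
    example = SpeciesRecord(
        name="Examplea fictiva",
        kind="beetle",
        family="Carabidae",
        described_by="A. Author",
        year=1900,
        region="the Philippines",
    )
    print(write_stub(example))
```

Feed a loop like this a database with thousands of rows and you get thousands of short, uniform entries, which is presumably why this kind of output ends up bland but consistent.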

Sverker Johansson can take credit for 2.7 million Wikipedia articles. Most were created using a computer program, or ‘bot,’ that he made. Ellen Emmerentze Jervell/The Wall Street Journal

Lsjbot’s entries are categorized by Wikipedia as stubs – pages that contain only the most important, basic bits of information. This is why his bot works so well for animal species or towns, where it can make sense to automate the process. In fact, if Wikipedia has a chance of reaching its goal of encompassing the sum of all human knowledge, it needs bots. It needs billions of entries, and this is no task a community of humans can achieve alone, not even one as active and large as Wikipedia’s.

Some people are against this sort of approach, like 41-year-old Achim Raschka, who says he spends whole days writing a single in-depth article about a plant.

“I am against production of bot-generated stubs in general,” he said. He is particularly irked by Mr. Johansson’s Lsjbot, which prizes quantity over quality and is “not helping the readers and users of Wikipedia.”

Johansson himself admits the entries are … bland at best, but he maintains that this doesn’t mean they provide no value. For instance, Basey, a city of about 44,000 in the Philippines, was devastated by Typhoon Yolanda. The Swedish Wikipedia entry for Basey was edited by Lsjbot, and contained information like coordinates, population and other details. Many people accessed the page to learn more. Moreover, Johansson stresses that his bot only writes stubs, which give other contributors a basic starting point to come in and fill the gaps.

Criticism

Lsjbot also gives Johansson a way to combat the lack of articles on more obscure subjects, at least on the Swedish Wikipedia, he says. For instance, there are more than 150 articles on characters from “The Lord of the Rings,” and fewer than 10 about people from the Vietnam War.

“I have nothing against Tolkien and I am also more familiar with the battle against Sauron than the Tet Offensive, but is this really a well-balanced encyclopedia?”

“It saddens me that some don’t think of Lsjbot as a worthy author,” he said. “I am a person; I am the one who created the bot. Without my work, all these articles would never have existed.”

via WSJ

The Internet’s response to the Japanese earthquake and tsunami

An early CNN.com report on the Japanese quake and tsunami

In the wake of Japan’s most devastating recorded earthquake to date, the land of the rising sun is still in shock. Hundreds were killed, many more were left homeless, the financial damage is untold and entire cities were left without electricity, and it might even get a heck of a lot worse. Another big issue is the telecom failure, which makes phone communication practically impossible; this is when people turn to the internet.

In Tokyo alone, Twitter reports that 1,200 tweets are being sent per minute, providing an insightful overview of how the event is escalating. On Facebook, you can imagine things are even more intense, but due to its private nature it provides little to no insight. Ushahidi built a database to help those offering aid connect with those in need.

Within a few hours of the calamity, Google launched applications that try to direct, connect and help people near the disaster zone, as well as their loved ones overseas. The first useful app is a special Google Maps crisis report that shows the exact positions of affected locations, shelters in Tokyo, the earthquake’s epicenter and more. The Google-owned YouTube has a channel up called CitizenTube, where you can watch raw footage of the earthquake and tsunami in Japan. Google also launched Person Finder, a tool that allows users both to report a missing person and to enter any information they have on a missing person. It displays in English and Japanese, and at the moment I’m writing this it has more than 58,000 records in its database. In the app you can specify whether you’re looking for someone or posting information about a missing person. A lot of worried family members in the States, for example, used this to get in contact with loved ones based in Japan after finding them unreachable by phone.

On Wikipedia, a group of contributors opened a page detailing important information relating to the Japan earthquake. Twelve hours after the earthquake hit, that page had been edited more than 500 times and was rife with information, including other affected areas and the international response.

Ten years ago, this wouldn’t have been possible, but now, with the help of the internet’s social media, people can not only get informed, they can also help and be helped through it. The internet never ceases to amaze me.

Wikipedia raises $16 million and remains ad free (shorties)

Say what you want about some articles or about the whole site in general, but Wikipedia is absolutely awesome; and the fact that they are ad free only contributes to their awesomeness. As you might know, in order to remain ad free, they appeal to people for donations. Founder Jimmy Wales said this year’s (actually last year’s) fundraising campaign was the shortest ever. Over 500,000 people donated, roughly double the number from the year before. Donors from 140 countries gave an average of $22, and $13.7 million of the donations came via the Internet, while the rest came from direct checks.

“This outpouring of support by hundreds of thousands of ordinary people from all walks of life is a testament to the spirit of the Wikimedia movement. Wikipedia is a public resource created and maintained by hundreds of thousands of volunteers, relied on by over 400 million people and paid for by half a million donors. It’s truly user-created, supported and maintained,” concluded Wales.

Life Encyclopedia; too popular for its own good?

The idea of a comprehensive encyclopedia was launched a long time ago, and the internet was the perfect way to make it possible, promote it and make it popular. The internet did its job too well, though: the computers hosting the encyclopedia were overwhelmed and couldn’t keep it alive when it debuted Tuesday.

I tried to give it a look, but it was down at the time. Organizers said this would be fixed after the 1 million page Encyclopedia of Life crashed on its very first day. They sought help from Wikipedia, because the massive interest was just overwhelming! Hopefully this will be just a temporary problem.

“We’ve been overwhelmed by traffic,” encyclopedia founding chairman Jesse Ausubel said. “We’re thrilled.”

The encyclopedia’s Web site logged 11.5 million hits over 5 1/2 hours, including two hours of down time, according to organizers. This happened even though Tuesday’s unveiling included only limited Web pages for 30,000 species.

What promises to be the greatest attraction is the “exemplar pages” that go into more depth, with photos, video, scientific references, maps and text, for 25 species ranging from the common potato to the majestic peregrine falcon to a relatively recently discovered, obscure single-celled marine organism called Cafeteria roenbergensis. Eventually, they will have all 1.8 million species on the Web. Strangely enough, the most popular of the species for Web searches is the poisonous death cap mushroom, which may say something about people’s homicidal intentions, joked Ausubel.