IIT-Guwahati researchers give Wikipedia a trust upgrade
Wikipedia is maintained by volunteers worldwide, and the developed method can help editors identify hidden typos and linking errors that might otherwise remain unnoticed for years.
The researchers' solution is a multilingual method that uses mathematical frequency patterns to detect and correct subtle errors in Wikipedia, with the aim of making the information more reliable for both human readers and the artificial intelligence systems trained on it. The method was showcased at the India AI Impact Summit 2026.
Wikipedia is a cornerstone of digital knowledge, maintained by volunteers worldwide. But it isn’t flawless. A study by the IIT-Guwahati team found that 3–6% of all links contain mistakes — typos, misspellings, or extra words in the text that links one page to another.
These “Surface Name Errors” may look trivial, but they quietly erode trust. For readers, they reduce credibility. For AI, which often uses Wikipedia as a training dataset, they can distort learning and weaken performance.
The team built a three-step process. First, every link is broken down into four parts: the page it appears on, the page it points to, the word used, and the surrounding text. Next, the method applies a frequency test: a name is considered valid only if it appears at least 10 times and makes up at least 5% of all links to that page. Finally, flagged errors are classified as either simple typos, like “Gawahati” instead of “Guwahati”, or span errors, where extra or wrong words creep in.
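The three steps can be illustrated with a minimal Python sketch. This is not the researchers' code. The field names, the helper functions, the 0.8 similarity cut-off and the word-containment rule for span errors are all assumptions made for illustration; only the two frequency thresholds come from the description above, and the 5% figure is read here as a minimum share.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher
from typing import NamedTuple


class Link(NamedTuple):
    """Step 1: the four parts of a link (field names are illustrative)."""
    source_page: str   # page the link appears on
    target_page: str   # page the link points to
    surface: str       # the word(s) used as link text
    context: str       # surrounding text


MIN_COUNT = 10     # from the article: appears at least 10 times
MIN_SHARE = 0.05   # from the article: 5% of all links to that page
TYPO_RATIO = 0.8   # assumption: similarity cut-off for calling it a typo


def valid_names(links):
    """Step 2: map each target page to the surface names that pass the frequency test."""
    counts = defaultdict(Counter)
    for link in links:
        counts[link.target_page][link.surface] += 1
    accepted = {}
    for target, counter in counts.items():
        total = sum(counter.values())
        accepted[target] = {
            name for name, n in counter.items()
            if n >= MIN_COUNT and n / total >= MIN_SHARE
        }
    return accepted


def _contains(haystack, needle):
    """True if the word list `needle` occurs contiguously inside `haystack`."""
    size = len(needle)
    return any(haystack[i:i + size] == needle
               for i in range(len(haystack) - size + 1))


def classify(surface, names):
    """Step 3 (heuristic): label a surface name as valid, span, typo or unknown."""
    if surface in names:
        return "valid"
    words = surface.split()
    # Span error (assumed rule): a valid name plus extra words.
    for name in names:
        name_words = name.split()
        if len(words) > len(name_words) and _contains(words, name_words):
            return "span"
    # Typo (assumed rule): character-level similarity to a valid name.
    for name in names:
        if SequenceMatcher(None, surface, name).ratio() >= TYPO_RATIO:
            return "typo"
    return "unknown"


# Made-up sample data: 100 links pointing to the page "Guwahati".
links = (
    [Link("Assam", "Guwahati", "Guwahati", "sample context")] * 95
    + [Link("Tea", "Guwahati", "Gawahati", "sample context")] * 3
    + [Link("Rivers", "Guwahati", "Guwahati city", "sample context")] * 2
)
names = valid_names(links)["Guwahati"]
for surface in ("Guwahati", "Gawahati", "Guwahati city"):
    print(surface, "->", classify(surface, names))
```

In this sketch the span check runs before the typo check, so a link text that contains a valid name plus extra words is not mistaken for a misspelling. A real system would also need to handle span errors with missing or wrong words, which this simple rule does not cover.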
Speaking about the real-world application of the method, Prof Awekar said, “This work shows us that we should not be trusting the data from the web blindly, both for human use and training AI models. Good data is the beginning of any good AI model and downstream application.”
Even more striking, the Wikipedia community accepted over 99% of the manual corrections suggested by the researchers, the institute said.
For everyday users, this means cleaner articles. For AI, it means stronger models built on trustworthy data. And for Wikipedia’s volunteer editors, it offers a scalable way to catch mistakes that might otherwise remain hidden for years.