
Everything about Google News Technology

Google News made its first public appearance in December 2001, sourcing news from only 100 websites. Today, Google News scans 4,500 different websites in real time, determines which news stories are related and then groups them by importance. No journalists work on the service; Google News is managed entirely by computer programs.

Google News started as a small demo that one Google engineer built over a weekend, after he grew frustrated trying to follow the news after the September 11 attacks. More Googlers started using it to read their news. Google then assigned three people (one of them a UI designer) and one PM to work on it. The rest is history.

Google News includes articles that have appeared within the past 30 days.

The inventor of Google News:

Krishna Bharat, a Google Principal Scientist, is the brain behind Google News.

John Burke covered the WEF conference where Krishna Bharat described his own baby, Google News, as "immune to local biases and constraints," like a website "hosting a conversation about stories in the news" where "all newspapers are invited" and the host "won't take sides" and "won't have a point of view."

Krishna further says, "Our relationship with newspapers is symbiotic; we send traffic directly to the content provider and we do not have special relationships, and we amplify the amount of news being read." "I am hoping," he says, "to have all newspapers participate with Google News; for our readers we want an interesting debate that makes them think."

Krishna, how does Google News plan to deal with rumours, humour and false stories? Axis of Logic took the top spot on Google News when it reported that George Bush had been arrested in Ottawa.

List of Google News sources:

Though Google remains silent about the details of its news sources, Newsknife.com has been tracking Google News since early 2004 and recently released a list of about 3,000 news sources that feature on Google News. Private Radio, another Google News tracking site, released a list of some 5,136 news sources grouped by country and frequency of update.

Is Google News Biased?

At PrivateRadio, JohnnyMC reasons from the rankings of China's Xinhua and Voice of America that some human editors are working behind the scenes. At the very least, he argues, Google's coders are weighting the results to favor certain sites. Eric Ulken of Online Journalism Review studied the 2004 US election and concluded that Google News results suggest a political bias, and that several relatively obscure, online-only news sources (read: weblogs) figure in the Google News sources list.

Google Patent TrustRank

Google is filing a patent for TrustRank, a technology that aims to sort Google News results by quality rather than simply by "date" and "relevance" to search terms.

Google's database will be built by continually monitoring the number of stories from each news source, along with average story length, the number of stories with bylines and the number of bureaux cited. It will also track how long the source has been in business, the number of staff it employs, the volume of internet traffic to its website and the number of countries accessing the site.

Google will take all these parameters, weight them according to formulae it is constructing, and distil them down to create a single value. This number will then be used to rank the results of any news search.
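The weighting step described above can be sketched in a few lines of Python. This is only an illustration: Google has not published its formulae, so every weight, every cap and every name below (WEIGHTS, CAPS, source_score, rank_sources and the two example sources) is a made-up assumption, and a plain weighted sum stands in for whatever Google actually uses.

```python
# Hypothetical weights for each signal named in the text; they sum to 1.
WEIGHTS = {
    "stories_per_month": 0.15,
    "avg_story_words": 0.10,
    "bylined_fraction": 0.15,
    "bureaus": 0.10,
    "years_in_business": 0.15,
    "staff": 0.10,
    "monthly_visits": 0.15,
    "countries_reached": 0.10,
}

# Hypothetical caps used to scale every raw value into the range 0..1.
CAPS = {
    "stories_per_month": 3000,
    "avg_story_words": 1000,
    "bylined_fraction": 1.0,
    "bureaus": 50,
    "years_in_business": 100,
    "staff": 1000,
    "monthly_visits": 50_000_000,
    "countries_reached": 150,
}


def source_score(stats):
    """Collapse a source's signals into one value between 0 and 1."""
    score = 0.0
    for name, weight in WEIGHTS.items():
        # Missing signals count as zero; values above the cap are clipped.
        scaled = min(stats.get(name, 0) / CAPS[name], 1.0)
        score += weight * scaled
    return score


def rank_sources(sources):
    """Return source names ordered from highest to lowest score."""
    return sorted(sources, key=lambda name: source_score(sources[name]), reverse=True)


# Two invented sources, purely for illustration.
sources = {
    "established-daily.example": {
        "stories_per_month": 2400, "avg_story_words": 700,
        "bylined_fraction": 0.9, "bureaus": 20, "years_in_business": 80,
        "staff": 600, "monthly_visits": 20_000_000, "countries_reached": 120,
    },
    "new-weblog.example": {
        "stories_per_month": 60, "avg_story_words": 300,
        "bylined_fraction": 0.2, "bureaus": 0, "years_in_business": 1,
        "staff": 2, "monthly_visits": 40_000, "countries_reached": 10,
    },
}

print(rank_sources(sources))
```

Clipping each signal at a cap keeps one very large number, such as traffic, from drowning out the others. A real system would need to choose the caps and weights far more carefully than this.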

There are some interesting and valid points in the research paper Combating Web Spam with TrustRank, written by researchers at Stanford and Yahoo!

Q. How is TrustRank different from applying a weighting to PageRank?

A. It attempts to detect clusters of pages that have few inbound links, while also propagating "trust" scores to all other sites through their linking structure. For sites that have many inbound links (high-scoring in PageRank), the authors claim this modification tends to classify spam and reputable sites differently.

Q. What if such an owner decides to link to a page of commercial or spam links?

A. The paper suggests using only highly reputable organizations with long-term stability for the seed pages, such as government organizations, universities and very well-known companies.
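The idea of propagating trust from a few reputable seed pages can be sketched as a PageRank-style iteration in which the "restart" share goes only to the seeds. This is a simplified illustration, not the paper's full method: the function name trust_rank, the damping value, the iteration count and the tiny link graph are all assumptions made for this example.

```python
def trust_rank(links, seeds, damping=0.85, iterations=20):
    """Propagate trust from seed pages along outbound links.

    links maps each page to the list of pages it links to.
    seeds is the set of hand-picked, reputable pages.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Only seed pages receive the restart share of trust.
    seed_share = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_share)

    for _ in range(iterations):
        nxt = {p: (1.0 - damping) * seed_share[p] for p in pages}
        for page, targets in links.items():
            if not targets:
                continue
            # A page splits its damped trust evenly among the pages it links to.
            share = damping * trust[page] / len(targets)
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust


# Invented link graph: a trusted seed, two pages downstream of it,
# and two spam pages that link only to each other.
links = {
    "university.example": ["newspaper.example"],
    "newspaper.example": ["blog.example"],
    "blog.example": [],
    "spam-a.example": ["spam-b.example"],
    "spam-b.example": ["spam-a.example"],
}
seeds = {"university.example"}

scores = trust_rank(links, seeds)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph the two spam pages end up with no trust at all, because no path leads to them from the seed, while trust fades with each hop away from the seed. That also shows why the choice of seeds matters so much: a seed that links to spam would pass its trust straight on.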


Digital Inspiration

Digital Inspiration is a popular tech blog by Amit Agarwal. Our popular Google Scripts include Gmail Mail Merge (send personalized emails with Gmail), Document Studio (generate PDFs from Google Forms) and File Upload Forms (receive files in Google Drive). Also see Reverse Image Mobile Search, Online Speech Recognition and Website Screenshots, the most useful websites on the Internet.
