In 2009 a new flu virus was discovered. Combining elements of bird flu and seasonal flu, the strain, dubbed H1N1, spread quickly, much like the Spanish flu of 1918, which infected half a billion people and killed tens of millions. Worst of all, there was no vaccine, and the authorities could only try to slow its spread.
The United States Centers for Disease Control and Prevention (CDC) asked doctors to report new flu cases, but the picture of the pandemic that emerged was always a week or two out of date, which left public health agencies blind at the most crucial time. Around that time, engineers at the internet giant Google published a remarkable paper in the scientific journal Nature that created a splash among health officials and computer scientists.
Google receives more than three billion search queries every day and saves all of them, so it had plenty of data to work with.
Google took the 50 million most common search terms that Americans type and compared the list with CDC data. The idea was to identify areas infected by the flu virus from what people searched for on the internet, but no one other than Google had so much data-processing power. Their system looked for correlations between the frequency of certain search queries and the spread of the flu over time and space. They processed almost 450 million different mathematical models to test the search terms, comparing the predictions with actual CDC data. Finally they found a combination of 45 search terms that, when used together in a mathematical model, produced predictions that correlated strongly with the official figures. Unlike the CDC, though, they could tell where the flu had spread in real time, not a week or two after the fact.
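The core idea of scoring search terms by how closely their frequency tracks official flu figures can be illustrated with a small sketch. This is not Google's actual method, which is not described in detail here; it is a minimal example assuming a plain Pearson correlation, and the function names (pearson, rank_terms), the search terms, and all the weekly numbers are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no signal
    return cov / sqrt(var_x * var_y)


def rank_terms(term_freqs, flu_rate):
    """Sort search terms by how strongly their weekly frequency tracks flu_rate."""
    scores = {term: pearson(freqs, flu_rate) for term, freqs in term_freqs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up weekly figures, for illustration only (not real CDC or search data).
flu_rate = [1.0, 1.5, 2.5, 4.0, 3.0, 1.8]
term_freqs = {
    "flu symptoms": [10, 14, 26, 41, 29, 17],
    "cough medicine": [20, 24, 30, 38, 35, 25],
    "football scores": [50, 48, 52, 49, 51, 50],
}

ranked = rank_terms(term_freqs, flu_rate)
top_terms = [term for term, _score in ranked[:2]]  # keep the best-correlated terms
```

With this toy data the two flu-related terms rank well above the unrelated one. A real system would also have to combine the selected terms into a predictive model and check it against weeks of data it had not seen.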