Google looks smart and its people are smart, but that doesn't mean its algorithms are smart. Machine learning works well on images but struggles with language. Google's dirty little secret is that its algorithms are quite dumb and have trouble understanding what they see and read.
Take Google's recent announcement that its search algorithm will be trained to highlight original news stories, such as scoops and investigative pieces...
Marc Tracy in The New York Times reports[1]:
"After weeks of reporting, a journalist breaks a story. Moments after it goes online, another media organization posts an imitative article recycling the scoop that often grabs as much web traffic as the original. Publishers have complained about this dynamic for years…"
This has been a problem since Google News launched in September 2002. Finally, the head of Google News, Richard Gingras[2], has responded:
"An important element of the coverage we want to provide is original reporting, an endeavor which requires significant time, effort, and resources by the publisher. Some stories can also be both critically important in the impact they can have on our world and difficult to put together, requiring reporters to engage in deep investigative pursuits to dig up facts and sources."
Foremski's Take:
Why has it taken Google more than 17 years to deal with this? Why does Google's algorithm need thousands of "raters" to help train it to recognize original news?
Gingras said that Google has updated its manual