Google admits massive document leak related to search algorithm is authentic (2024)

Thomas Barrabi

Google has confirmed that a massive leak of some 2,500 internal documents related to its search engine is authentic – and one expert said the trove shows that “Google tells us one thing and they do another” when it comes to its mysterious algorithms.

The tech giant has been secretive about how its search engine works even as it has wielded outsize influence over the flow of information, traffic and ad revenue online.

Some details appeared to contradict past public statements by Google employees regarding which factors are and are not used to calculate rankings.

For example, a Google Search employee said in 2016 that the company doesn’t “have a website authority score.”

The company has also explicitly denied using Chrome data in search rankings.

Information in the documents, however, suggests that Google considers click rates, data from its Chrome web browser, website size and a factor called “domain authority” – a measure of a website’s importance or relevance on a particular subject – to guide rankings.

“The main takeaway here is Google tells us one thing and they do another,” iPullRank CEO Michael King, who published the first analysis of the trove, told The Post.

“These documents give us clarity on that,” King added. “We don’t have the recipe that Google is using for search, but we now have a really clear indication of what the ingredients are.”

Some experts, including the trade publication Search Engine Land, have noted the documents mention modules that suggest Google implements “whitelists” for certain topics, including searches related to elections (IsElectionAuthority) and the COVID-19 pandemic (IsCovidLocalAuthority).

King said the references are likely Google’s attempt to identify “quality sources” on a given subject.

Details about how the whitelists may operate are scant, but Google has faced allegations of exhibiting a left-wing bias for years. A recent analysis by media company AllSides found that 63% of articles on Google News were from left-leaning outlets, compared to just 6% from right-leaning sources.

An analysis by right-leaning watchdog Media Research Center detailed 41 alleged instances of “election interference” at the online search giant since 2008.

The report cited data from Dr. Robert Epstein, who once testified to the Senate Judiciary Committee that “biased search results generated by Google’s search algorithm” shifted “at least 2.6 million votes to Hillary Clinton.”

Google has long denied it is biased against conservative viewpoints and has said Epstein’s research is “widely debunked.”

The leaked search documents allegedly contain more than 14,000 ranking factors that Google considers when organizing websites – from news outlets like The Post to small business owners and beyond.

The internal data reportedly surfaced on the online code repository GitHub in March, but it did not receive public scrutiny until search engine optimization (SEO) experts Rand Fishkin and King obtained and posted separate breakdowns.

Google tacitly confirmed that the documents are real – though it warned that they lacked important context and shouldn’t be used by the public to glean any insights about how search works.

“We would caution against making inaccurate assumptions about Search based on out-of-context, outdated or incomplete information,” Google spokesperson Davis Thompson said in a statement.

“We’ve shared extensive information about how Search works and the types of factors that our systems weigh, while also working to protect the integrity of our results from manipulation,” the statement added.

Google also warned that the documents are not a comprehensive, relevant or up-to-date view of its Search ranking algorithm.

It’s still unclear if Google has actually implemented any of the ranking factors detailed in the documents or was merely testing or experimenting with them. Some may have never been used at all.

Even if they were in use, it’s essentially impossible to assess how important they are in crafting what users see in search results.

The documents did not reveal how the ranking features are weighted.

The leaked documents provide an interesting, yet incomplete view of the company’s inner workings on search, according to Barry Schwartz, a prominent SEO expert and owner of the web consultancy RustyBrick.

Schwartz said the documents are best seen as a signal of “what Google is thinking about” as it relates to online search.

“How Google does that around certain factors like links and content quality and authority and authors – all of that’s in there,” Schwartz said. “The question is, we don’t know what they’re weighted, how important are these signals, are they used at all. That’s the issue with this.”

Nevertheless, the documents amount to “the biggest leak that we’ve ever seen come out of Google for search,” according to King.

“This is the biggest, most transparent that we’ve ever seen into how Google functions,” King said.
