Why Did Google Gemini “Leak” Chat Data?

Feb 14, 2024 | SEO News Feeds | 0 comments

Why Did Google Gemini Leak Chat Data.png

SEO Content Writing Service

It only took twenty four hours after Google’s Gemini was publicly released for someone to notice that chats were being publicly displayed in Google’s search results. Google quickly responded to what appeared to be a leak. The reason how this happened is quite surprising and not as sinister as it first appears.

@shemiadhikarath tweeted:

“A few hours after the launch of @Google Gemini, search engines like Bing have indexed public conversations from Gemini.”

They posted a screenshot of the site search of gemini.google.com/share/

But if you look at the screenshot, you’ll see that there’s a message that says, “We would like to show you a description here but the site won’t allow us.”

By early morning on Tuesday February 13th the Google Gemini chats began dropping off of Google search results, Google was only showing three search results. By the afternoon the number of leaked Gemini chats showing in the search results had dwindled to just one search result.

Screenshot of Google's search results for pages indexed from the Google Gemini chat subdomain

How Did Gemini Chat Pages Get Created?

Gemini offers a way to create a link to a publicly viewable version of a private chat.

Google does not automatically create webpages out of private chats. Users create the chat pages through a link at the bottom of each chat.

Screenshot Of How To Create a Shared Chat Page

Screenshot of how to create a public webpage of a private Google Gemini ChatScreenshot of how to create a public webpage of a private Google Gemini Chat

Why Did Gemini Chat Pages Get Indexed?

The obvious reason for why the chat pages were crawled and indexed is because Google forgot to put a robots.txt in the root of the Gemini subdomain, (gemini.google.com).

Attorney Websites For Sale 4ebusiness Media Group

A robots.txt file is a document for controlling crawler activity on websites. A publisher can block specific crawlers by using commands standardized in the Robots.txt Protocol.

I checked the robots.txt at 4:19 AM on February 13th and saw that one was in place:

Google Gemini robots.txt fileGoogle Gemini robots.txt file

I next checked the Internet Archive to see how long the robots.txt file has been in place and discovered that it was there since at least February 8th, the day that the Gemini Apps were announced.

Screenshot From Internet Archive

Screenshot of Google Gemini robots. txt from Internet Archive showing it was there on February 8, 2024.Screenshot of Google Gemini robots. txt from Internet Archive showing it was there on February 8, 2024.

That means that the obvious reason for why the chat pages were crawled is not the correct reason, it’s just the most obvious reason.

Although the Google Gemini subdomain had a robots.txt that blocked web crawlers from both Bing and Google, how did they end up crawling those pages and indexing them?

Two Ways Private Chat Pages Discovered And Indexed

  • There may be a public link somewhere.
  • Less likely but maybe possible is that they were discovered through browsing history linked from cookies.

It’s likelier that there’s a public links.

I asked Bill Hartzer (@bhartzer) about it and he discovered a public link for one of the indexed pages:

Public link to a Google Gemini shared chat pagePublic link to a Google Gemini shared chat page

So now we know that it’s highly likely that a public link caused these Gemini Chat pages to be crawled and indexed.

Bill Hartzer offered this observation:

“Even though the Gemini URL is being blocked in the robots.txt file, there is a link to the Gemini URL in a blog comment, so that Gemini URL is getting indexed.

This just goes to show that Google will still index URLs that are blocked from crawling in the robots.txt file.

If Google really wanted to make sure that Gemini URL is not indexed, they would ALLOW crawling in the robots.txt file and add a noindex meta tag on the pages. Maybe Google should follow it’s own advice here?”

Why Did Chat Pages Begin Dropping Out Of Search Results?

But if there’s a public link then why did Google start dropping chat pages altogether? Did Google create an internal rule for the search crawler to exclude webpages from the /share/ folder from the search index, even if they’re publicly linked?

Insights Into How Bing and Google Search Index Content

Now here’s the really interesting part for all the search geeks interested in how Google and Bing index content.

The Microsoft Bing search index responded to the Gemini content differently from how Google search did. While Google was still showing three search results in the early morning of February 13th, Bing was only showing one result from the subdomain. There was a seemingly random quality to what was indexed and how much of it.

Why Did Gemini Chat Pages Leak?

Here are the known facts:

  • Google had a robots.txt in place since the February 8th.
  • Both Google and Bing indexed pages from the gemini.google.com subdomain.
  • Both Google and Bing may have discovered links to the chats and subsequently indexed them.
  • The search engines indexed the content regardless of the robots.txt and then began dumping them.

That brings us back to the question of why these pages started dropping off of the search results of both Google and Bing. My guess is that the Google Gemini chat pages are low quality webpages that are not worth showing for what are essentially longtail searches (site:gemini.google.com/share/). There’s really no useful reason to surface these pages in the search results.

Content that is blocked by Robots.txt can still be discovered, crawled and end up in the search index and if the pages are useful they can also rank, unless they are not useful. I think this may be the case.


Source link

Anxiety Stress Management

Live a Life of Contentment eBook We all want to be satisfied, even though we know some people who will never be that way, and others who see satisfaction as a foreign emotion that they can’t hope to ever feel.

Newspaper Ads Canyon Crest CA

Click To See Full Page Ads

Click To See Half Page Ads

Click To See Quarter Page Ads

Click To See Business Card Size Ads

If you have questions before you order, give me a call @ 951-235-3518 or email @ canyoncrestnewspaper@gmail.com Like us on Facebook Here

You May Also Like


Submit a Comment

Your email address will not be published. Required fields are marked *

Contact Us

Contact Us

Personal Injury Attorney

Websites For Sale Personal Injury Attorneys

Criminal Defense Attorneys

Websites For Sale Criminal Defense Attorney

Bankruptcy Attorneys

Websites For Sale Bankruptcy Attorneys

General Practice Attorneys

Websites For Sale General Practice Attorneys

Family Attorneys

Websites For Sale Family Attorneys

Corporate Attorneys

Websites For Sale Corporate Attorneys

Home Privacy Policy Terms Of Use Anti Spam Policy Contact Us Affiliate Disclosure Amazon Affiliate Disclaimer DMCA Earnings Disclaimer