An AI 'Spy' on My Website? Results of a Real-time Undercover Investigation of the Big 4 AI Bots

AI Summary

A researcher gave unique links to the four major AI bots and monitored server logs, revealing significant differences in how each AI collects information and its level of 'honesty'.

Imagine you’ve created a secret room containing very precious information and sent invitations with different name tags to four friends. You are hiding behind the door, secretly watching to see who actually enters the room and what name tag they are wearing. What if an invited friend takes off their name tag and sneaks in, or doesn’t enter the room at all but lies, saying, “I’ve seen everything inside”?

Recently, a researcher ran this exact scenario in the digital world. The targets were the ‘Big 4’ AI assistants we use every day: ChatGPT, Claude, Perplexity, and Gemini. (Source: AI Traffic from Chatbots: HN Experiment - PromptZone)

The experiment checked what happens when we tell an AI, “Go to this link and summarize the content”: does it truly visit the site in real time, or does it just pull out old information it stored previously? The results of this ‘undercover investigation’ might change how we view AI.

Why does this matter?

We often ask AI to summarize the latest news, this morning’s stock prices, or a blog post that was just published. If the AI doesn’t visit the website in real time, you risk treating month-old information as if it were today’s.

In simple terms, the experiment checks whether the AI is a ‘capable detective who actually goes out for field research’ or a ‘librarian who only flips through old newspaper scrapbooks.’ This difference directly affects the accuracy and freshness of information. Especially in 2026, the era of powerful AIs like GPT-5.2 and Gemini 3 Pro, ‘transparency’ in how they fetch information has become central to trust in the technology. (Source: ChatGPT vs Claude vs Gemini vs Perplexity: 2026… - Y Build)

Understanding Easily: Tracking the AI’s ‘Footprints’

The researcher used Nginx logs as a ledger (Nginx is a server program that records website visits). Just as we sign an entry log when we go to a restaurant, a website server meticulously records who came in, when, and through what path. ([AI traffic vs referral traffic: what nginx logs prove | SurfacedBy](https://surfacedby.com/blog/nginx-logs-ai-traffic-vs-referral-traffic))

1. Assigning unique name tags

The researcher didn’t just give the AIs a link; they added a special code to the end of the URL.

ChatGPT was given an address containing /?ai=chatgpt
Claude was given an address containing /?ai=claude

This way, one can tell at a glance which AI's link was fetched just by looking at the ‘footprints’ left in the server records. No matter how advanced Transformer technology (the core structure of AI that understands meaning by grasping the context of sentences) becomes, a request for the tagged address still leaves a line in the server's ledger, whatever name tag the visitor chooses to wear.
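The tagging idea can be sketched in a few lines of Python. This is only an illustration, not the researcher's actual tooling: the base URL `https://example.com/`, the helper names, and the assumption that the server writes Nginx's default "combined" log format are all hypothetical.

```python
import re
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical site; the original article does not name the real domain.
BASE_URL = "https://example.com/"


def tagged_url(bot: str) -> str:
    """Build the link handed to one assistant, e.g. .../?ai=chatgpt."""
    return f"{BASE_URL}?{urlencode({'ai': bot})}"


# Assumes Nginx's default "combined" log format.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tag_from_log_line(line: str):
    """Return the value of the ai= tag in a log line, or None if absent."""
    match = LOG_RE.match(line)
    if match is None:
        return None
    query = urlsplit(match.group("target")).query
    values = parse_qs(query).get("ai")
    return values[0] if values else None
```

Calling `tagged_url("claude")` produces a link ending in `/?ai=claude`, and `tag_from_log_line` recovers `claude` from any log line that records a request for it, regardless of the User-agent the visitor sent.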

2. “No old records allowed!”

To check whether the AIs were answering by recycling records of previous visits (technically called a ‘cache hit’), the researcher re-ran the prompts multiple times and watched the logs in real time to see whether the AIs went through the trouble of fetching the page anew every single time. ([AI traffic vs referral traffic: what nginx logs prove | SurfacedBy](https://surfacedby.com/blog/nginx-logs-ai-traffic-vs-referral-traffic))
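One way to automate that check is to compare the times the prompts were sent with the times the tagged URL appears in the log. The sketch below is an assumption about how this could be done, not the researcher's method: the function names and the 60-second window are hypothetical, and the timestamp format assumed is the one Nginx writes by default.

```python
from datetime import datetime, timedelta


def parse_nginx_time(stamp: str) -> datetime:
    """Parse a default Nginx timestamp such as '10/Jan/2026:12:00:05 +0000'."""
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def fetched_after_each_prompt(prompt_times, hit_times,
                              window=timedelta(seconds=60)):
    """For each prompt, report whether a log hit followed within the window.

    The 60-second default is an arbitrary assumption; a False entry suggests
    the answer to that prompt was served without a fresh visit.
    """
    return [
        any(sent <= hit <= sent + window for hit in hit_times)
        for sent in prompt_times
    ]
```

A run of `True` values means every re-run triggered a new request, while a `False` after an earlier `True` is the pattern one would expect from a cached answer.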

Investigation Results: Who visited honestly?

The experimental results were quite shocking. In particular, Google’s Gemini and Anthropic’s Claude showed completely different attitudes.

Gemini’s ‘Stealth’ Mode

Google’s pride, Gemini, is a smart assistant that helps with everything from writing to scheduling (Source: Google Gemini). However, in this experiment, Gemini showed an unexpected side: when visiting websites, it did not wear a ‘User-agent’ name tag (a string containing the visitor’s identification information) that clearly identified it. ([I prompted ChatGPT, Claude, Perplexity, and Gemini and watched my Nginx logs | Hacker News](https://news.ycombinator.com/item?id=47835646))

To use an analogy, it’s like a customer entering a restaurant with their face completely covered, sitting down without a name tag, eating, and leaving. The researcher raised deep questions about why Google collects information while hiding its identity and whether this is an intentional ‘stealth’ act.
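Spotting a visitor without a name tag can be sketched as a simple lookup: the URL tag says which assistant was prompted, and the User-agent either declares an AI vendor or it does not. The token list below is illustrative only and should be checked against each vendor's published crawler documentation; the function names are hypothetical.

```python
# Illustrative User-agent tokens; an assumption, not an exhaustive
# or authoritative list. Verify against each vendor's documentation.
KNOWN_TOKENS = {
    "ChatGPT-User": "OpenAI",
    "GPTBot": "OpenAI",
    "Claude-User": "Anthropic",
    "ClaudeBot": "Anthropic",
    "Perplexity-User": "Perplexity",
    "PerplexityBot": "Perplexity",
}


def identify_agent(user_agent: str):
    """Return the vendor a User-agent string declares, or None."""
    lowered = user_agent.lower()
    for token, vendor in KNOWN_TOKENS.items():
        if token.lower() in lowered:
            return vendor
    return None


def is_unlabelled_visit(tag, user_agent: str) -> bool:
    """True when a tagged AI link was fetched without a declared AI name tag."""
    return tag is not None and identify_agent(user_agent) is None
```

A hit on the `/?ai=gemini` link whose User-agent matches none of the tokens would be flagged by `is_unlabelled_visit`, which is the kind of entry the article describes as ‘stealth’.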

Claude’s ‘Honest’ Confession

On the other hand, Claude received the opposite evaluation. Its creator, Anthropic, has emphasized from the beginning that Claude was trained to be ‘safe, honest, and secure.’ (Source: Claude)

According to actual user experiences, when Claude encounters content it doesn’t know, it honestly confesses, “I’m sorry, but I don’t know much about that part,” rather than forcing a fabricated answer. (Source: I cancelled my ChatGPT, Perplexity, and Gemini subscriptions for Claude — and I should have sooner)

While other AIs might act like ‘people-pleasers’ by making up fake information to satisfy the user, Claude plays the role of an honest friend who knows how to say it doesn’t know what it doesn’t know. This honesty is a powerful weapon that makes people choose Claude in business or research fields.

Current Situation: The AI Bot Warring States Period

As of 2026, the artificial intelligence market is a veritable battlefield. Giant models like GPT-5.2, Claude Sonnet 4.6, and Gemini 3 Pro are competing by pouring out new features every month. (Source: ChatGPT vs Claude vs Gemini vs Perplexity: 2026… - Y Build)

As performance improves, the side effects are also significant. Tools like ZeroGPT, which detect AI-written text, have already secured millions of users and become essential services (Source: AI Detector - Trusted AI Checker for ChatGPT, GPT5 & Gemini). For us to truly believe AI answers, how and from where they fetch information must be disclosed more transparently.

Meanwhile, Perplexity, an AI specialized in search, remains a powerful tool, but it has faced criticism for leaving some technical issues unaddressed for over a year. This shows that there are clear differences in reliability and technical maturity among AI services. (Source: r/AIAssisted on Reddit: Chat GPT vs Grok vs Gemini vs Claude vs Perplexity)

What will happen next?

In the future, AIs will navigate the web in even more sophisticated and crafty ways. Some AIs will try to become ‘shadows’ that skim information behind the owner’s back, while others will try to become ‘confident guests’ who identify themselves openly before taking information.

As users, what we need to do is clear. Rather than simply being impressed that an answer is fast and fluent, we must constantly ask, “Did this AI really verify the information at this very moment?” ‘Grassroots monitoring’ activities, like this experiment where individuals directly monitor AI behavior through server records, are expected to become even more important in the future.

Is your AI assistant really out there in the rough internet field for you at this very moment? Or is it deceiving you by merely repeating old memories in a warm room?

AI Perspective: MindTickleBytes AI Reporter’s View

The way AI explores the web is much like how we borrow books from a library. Some AIs leave a transparent checkout record, while others might sneak in and just take photos of the book’s contents. As technology becomes more sophisticated, the transparency of the source—the ‘how did you know’ rather than the ‘what do you know’—will become the most important metric determining that AI’s value.

References

[I prompted ChatGPT, Claude, Perplexity, and Gemini and watched my Nginx logs | Hacker News](https://news.ycombinator.com/item?id=47835646)
AI Traffic from Chatbots: HN Experiment - PromptZone - Leading AI Community for Prompt Engineering and AI Enthusiasts
[AI traffic vs referral traffic: what nginx logs prove | SurfacedBy](https://surfacedby.com/blog/nginx-logs-ai-traffic-vs-referral-traffic)
I cancelled my ChatGPT, Perplexity, and Gemini subscriptions for Claude — and I should have sooner
r/AIAssisted on Reddit: Chat GPT vs Grok vs Gemini vs Claude vs Perplexity
Google Gemini
ChatGPT vs Claude vs Gemini vs Perplexity: 2026… - Y Build
AI Detector - Trusted AI Checker for ChatGPT, GPT5 & Gemini
Claude
Practical guide to choosing between ChatGPT, Claude…



Test Your Understanding

Q1. What method did the researcher use in this experiment to distinguish between different AI bots?

A. Asked the AI for its name
B. Gave each AI a link containing a unique query string (/?ai=...)
C. Tracked the AI's IP address

Answer: B. The researcher gave each AI assistant a prompt containing a unique query string (e.g., /?ai=chatgpt) to distinguish them in the server logs.

Q2. Which AI was found to not leave clear 'User-agent' information identifying itself when visiting the website?

A. ChatGPT
B. Claude
C. Gemini

Answer: C. According to the experiment results, Google's Gemini was reported not to use a distinct User-agent string to identify itself when accessing the website.

Q3. What is one of the key characteristics of Claude as evaluated by reviewers?

A. Speaks as if everything is the absolute answer
B. Is more likely to admit not knowing what it doesn't know
C. Always provides the longest answers

Answer: B. Claude is evaluated as being more likely to say it doesn't know rather than forcing a fabricated answer when asked questions beyond its knowledge or capabilities.