A quick, plain-English look at which automated visitors we welcome on the real estate sites we host — and which ones we turn away. If you’re an agent or broker wondering why “bots” keep coming up, start with the companion article: Bot management for real estate websites.
Roughly half of all web traffic today is automated. Some of it is essential — the search engines that help buyers find you. A lot of it is just noise, and some of it is actively trying to copy your listings, hammer your forms, or probe for a way in. So every site we host sorts arriving automated traffic into two buckets: a short allow list of trusted bots, and everything else, which is challenged or denied.
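For the technically curious, the two-bucket sort described above can be pictured with a small sketch. It is only an illustration under stated assumptions, not our production logic: the `Verdict` names, the `ALLOW_LIST_TOKENS` sample, and the `sort_automated_request` function are all hypothetical. Matching on the user-agent string alone is easy to spoof, so a real system would also verify that a bot is who it claims to be.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    ALLOW = "allow"
    CHALLENGE_OR_DENY = "challenge_or_deny"


# Hypothetical sample: a few names taken from the allow list in this article.
ALLOW_LIST_TOKENS = ("googlebot", "bingbot", "duckduckbot", "applebot")


def sort_automated_request(user_agent: Optional[str]) -> Verdict:
    """Sort a request already identified as automated into one of two buckets."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in ALLOW_LIST_TOKENS):
        return Verdict.ALLOW
    # Default bucket: anything automated that is not on the allow list.
    return Verdict.CHALLENGE_OR_DENY
```

The design point is the default: nothing has to be recognized as "bad" to be turned away. Only a match against the short allow list changes the outcome.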
✓ Allow list — trusted bots we let through
Search & index
- Googlebot — Google Search (incl. Image/Video/News/Mobile)
- bingbot — Microsoft / Bing
- DuckDuckBot — DuckDuckGo
- Applebot — Apple / Siri / Spotlight
- YandexBot, Baiduspider — Yandex, Baidu
Google & Microsoft ecosystem
- Google-InspectionTool, GoogleOther, Google-Site-Verification, AdsBot-Google, Mediapartners-Google, Storebot-Google, Google-Safety — Search Console, Shopping, ads & quality, abuse scanning
- Google-Extended — the AI-training opt-in/out token (Gemini)
- adidxbot — Microsoft Advertising / Bing Ads
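Google-Extended differs from the other names in this group: it is a robots.txt control token rather than a crawler with its own user-agent string, so the opt-in or opt-out is expressed in robots.txt. As a minimal sketch, the hypothetical robots.txt below (not our configuration; `example.com` and the listing path are placeholders) opts out of Google-Extended while leaving ordinary crawling alone. Python's standard `urllib.robotparser` is used only to show how such rules are read.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, allow everyone else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/listings/123"  # placeholder URL

# Each call asks: may this product token fetch this URL under the rules above?
google_extended_allowed = parser.can_fetch("Google-Extended", url)
googlebot_allowed = parser.can_fetch("Googlebot", url)
print(google_extended_allowed, googlebot_allowed)
```

Under these rules the Google-Extended token is disallowed while Googlebot is still allowed, which is why the token can be switched independently of Search visibility.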
AI search & assistants
- ClaudeBot, Claude-SearchBot, Claude-User, anthropic-ai — Anthropic
- GPTBot, ChatGPT-User, OAI-SearchBot, OAI-AdsBot — OpenAI
- PerplexityBot, Perplexity-User — Perplexity
- DuckAssistBot, Amazonbot, Amzn-SearchBot, Mistral, CCBot (Common Crawl), PetalBot — other AI / training
Social & link previews
- facebookexternalhit / Meta crawlers, Twitterbot (X), LinkedInBot, Pinterestbot, WhatsApp, Slack, Discordbot, TelegramBot, TikTok, Skype/Teams — so shared links show a proper preview
Platform & archive
- IDX embed wrappers, Iframely, Embedly — embeds
- Zapier — integration & automation webhooks
- Internet Archive (Wayback) — archiving
- Cloudflare — our own edge/network services
✕ Deny list — what we challenge or block
Content scrapers & data harvesters
- Bots whose purpose is to copy listings, photos, or market data wholesale — including aggressive SEO/backlink crawlers we don’t permit (e.g. DataForSeoBot).
Security scanners & exploit tools
- Vulnerability and intrusion tools such as sqlmap, nikto, nmap, masscan, nuclei, and zgrab.
Credential-stuffing & brute-force agents
- Automated traffic hammering login and form endpoints to guess passwords or submit spam.
Anonymous & unidentified automation
- Requests with no real browser identity, missing or empty user-agents, and generic scripted clients that aren’t on the allow list.
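As a rough illustration of the deny-side categories above, and not our actual rules, the sketch below flags three of them from the user-agent string alone. It assumes the allow-list check has already run. `deny_reason` and both constant names are hypothetical, the scanner names come from the list above, and the scripted-client prefixes are assumed examples of default user-agents sent by common HTTP libraries and command-line tools.

```python
from typing import Optional

# Scanner names mentioned in this article.
SCANNER_TOKENS = ("sqlmap", "nikto", "nmap", "masscan", "nuclei", "zgrab")

# Assumed examples of default user-agent prefixes from common HTTP
# libraries and CLI tools; these are not taken from the article.
SCRIPTED_CLIENT_PREFIXES = ("python-requests/", "curl/", "go-http-client/")


def deny_reason(user_agent: Optional[str]) -> Optional[str]:
    """Return a short reason to challenge or block, or None if no rule matched."""
    ua = (user_agent or "").strip().lower()
    if not ua:
        return "missing or empty user-agent"
    if any(token in ua for token in SCANNER_TOKENS):
        return "security scanner signature"
    if ua.startswith(SCRIPTED_CLIENT_PREFIXES):
        return "generic scripted client"
    return None
```

For example, `deny_reason("curl/8.4.0")` returns `"generic scripted client"`, while an ordinary browser user-agent returns `None`. Because a user-agent string can be set to anything, checks like these only catch the least careful automation.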
Everything else automated, by default
- If a visitor is clearly a bot and isn’t on the allow list above, it is challenged or blocked. Verified search and AI bots are the exception, not the rule.
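"Verified" matters because anyone can type "Googlebot" into a user-agent string. One widely documented way to check a search crawler is forward-confirmed reverse DNS: look up the hostname for the visiting IP, confirm it ends in a domain the search engine publishes, then look that hostname up again and confirm it points back to the same IP. The sketch below is an assumption about how such a check could look, not a description of our own verification. `is_verified_crawler` and its helper names are hypothetical, the suffix list is a small sample, and the forward lookup shown handles IPv4 only.

```python
import socket
from typing import Callable, List

# Sample of hostname suffixes that Google and Bing document for their crawlers.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")


def _reverse_lookup(ip: str) -> str:
    # PTR lookup: IP address -> primary hostname.
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname: str) -> List[str]:
    # A-record lookup: hostname -> list of IPv4 addresses.
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_crawler(
    ip: str,
    reverse: Callable[[str], str] = _reverse_lookup,
    forward: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Forward-confirmed reverse DNS: IP -> hostname -> same IP."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
        if not hostname.endswith(TRUSTED_SUFFIXES):
            return False
        return ip in forward(hostname)
    except OSError:
        # Lookup failures (socket.herror, socket.gaierror) count as unverified.
        return False
```

The lookups are passed in as parameters so the logic can be exercised without touching the network. A visitor claiming to be a search bot but failing this kind of check would fall into the deny bucket.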
We intentionally don’t publish our exact detection rules — that would just hand a roadmap to the bots we’re keeping out.
Related reading
- Bot management for real estate websites — the friendly, fuller explainer.
- The busiest page on your site is the one that doesn’t exist — why dead listing URLs draw so much bot traffic.
- We built your website a second website — just for the robots — how we keep bots off the server real buyers use.
- Signal & Noise — the full series on bots, 404s, and protecting your site.
Want your site’s bot handling reviewed, or have a tool you need allowed? Get in touch — or email support@virtualresults.net. VR members can request a hardening review at no additional cost.