Robots Crawl Policy

A code-defined robots.txt via Next.js MetadataRoute that opens public routes, blocks /admin and /api, names the AI crawler user-agents explicitly (GPTBot, ClaudeBot, OAI-SearchBot and peers), and points to the sitemap — making crawl and AI-access policy reviewable in version control.

HMX Zone

Timeline: 1-2 days
Complexity: Low
Tools: Next.js robots (MetadataRoute), TypeScript

Verified HMX-owned system

System facts

Robots Crawl Policy belongs to the Full-Stack Websites category. It is a code-defined robots.txt, implemented in TypeScript as app/robots.ts on top of Next.js MetadataRoute, with an explicit control path: every change to crawl or AI-access rules goes through code review.

Outcome

Clear, versioned control over what search and AI crawlers may access, with private paths blocked and the sitemap advertised.

Main risk

An overly broad disallow accidentally deindexes public pages, or sensitive paths stay crawlable.

Prevention

Explicit allow/disallow lists reviewed in PRs, with /admin and /api always disallowed and public paths verified.
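
One way to back the PR review with an automated check is sketched below. It is a minimal illustration, not the project's actual test: the RobotsRule type is a local mirror of the rule shape in Next.js MetadataRoute.Robots, PUBLIC_PATHS is a hypothetical list of routes, and checkPolicy is a name introduced here. The matching is a deliberately conservative prefix test that ignores wildcards and Allow precedence.

```typescript
// Local mirror of a single rule in Next.js MetadataRoute.Robots.
type RobotsRule = {
  userAgent?: string | string[];
  allow?: string | string[];
  disallow?: string | string[];
};

// Paths the policy must always block, per the prevention rule.
const REQUIRED_DISALLOWS = ["/admin", "/api"];

// Assumption: hypothetical public routes; replace with the site's real ones.
const PUBLIC_PATHS = ["/", "/services", "/blog"];

const toList = (value?: string | string[]): string[] =>
  value === undefined ? [] : Array.isArray(value) ? value : [value];

// Returns human-readable problems; an empty array means the policy passes.
function checkPolicy(rules: RobotsRule[]): string[] {
  const problems: string[] = [];
  for (const rule of rules) {
    const agents = toList(rule.userAgent ?? "*").join(", ");
    const disallowed = toList(rule.disallow);

    // Every group must block the private paths explicitly.
    for (const path of REQUIRED_DISALLOWS) {
      if (!disallowed.includes(path)) {
        problems.push(`${agents}: missing Disallow ${path}`);
      }
    }

    // No public path may fall under a disallowed prefix.
    for (const path of PUBLIC_PATHS) {
      const blocker = disallowed.find((d) => d !== "" && path.startsWith(d));
      if (blocker !== undefined) {
        problems.push(`${agents}: Disallow ${blocker} blocks public path ${path}`);
      }
    }
  }
  return problems;
}
```

Run in CI, a non-empty result from checkPolicy would fail the build before an overly broad disallow reaches production.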

Fallback

On any generation error, serve a minimal allow-public/deny-admin policy plus the sitemap reference.

System architecture

Robots Crawl Policy Architecture (6 nodes)

01 Implement app/robots
app/robots.ts returns rules that allow / and disallow /admin and /api for all agents.
02 Explicit user-agent groups
Add explicit user-agent groups for AI crawlers (GPTBot, ChatGPT-User, ClaudeBot, OAI-SearchBot, Claude-SearchBot).
03 Next.js
Next.js robots (MetadataRoute) serves the policy defined in app/robots.ts as the site's robots.txt.
04 TypeScript
The typed policy references the absolute sitemap URL so crawlers discover the full URL set.
05 Fallback path
On any generation error, serve a minimal allow-public/deny-admin policy plus the sitemap reference.
06 Outcome
Clear, versioned control over what search and AI crawlers may access, with private paths blocked and the sitemap advertised.

How it is built

01 Implement app/robots.ts returning rules that allow / and disallow /admin and /api for all agents
02 Add explicit user-agent groups for AI crawlers (GPTBot, ChatGPT-User, ClaudeBot, OAI-SearchBot, Claude-SearchBot)
03 Reference the absolute sitemap URL so crawlers discover the full URL set
04 Keep the policy in code review so any change to AI access or blocked paths is auditable
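
The build steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production file: the RobotsRule and RobotsPolicy types are local mirrors of the shape Next.js exposes as MetadataRoute.Robots (so the sketch runs without the framework), SITE_URL is a placeholder origin, the /sitemap.xml path is assumed, and buildRobots is a name introduced here. In a real app/robots.ts the function would be the file's default export, typed with MetadataRoute.Robots imported from "next".

```typescript
// Local mirrors of the shape Next.js exposes as MetadataRoute.Robots.
type RobotsRule = {
  userAgent?: string | string[];
  allow?: string | string[];
  disallow?: string | string[];
};
type RobotsPolicy = {
  rules: RobotsRule | RobotsRule[];
  sitemap?: string | string[];
};

// Assumption: placeholder origin; replace with the production domain.
const SITE_URL = "https://example.com";

// Private paths that stay blocked for every group.
const PRIVATE_PATHS = ["/admin", "/api"];

// AI crawler user-agents named in the build steps.
const AI_CRAWLERS = [
  "GPTBot",
  "ChatGPT-User",
  "ClaudeBot",
  "OAI-SearchBot",
  "Claude-SearchBot",
];

function buildRobots(): RobotsPolicy {
  return {
    rules: [
      // Default group: public routes open, private paths blocked.
      { userAgent: "*", allow: "/", disallow: PRIVATE_PATHS },
      // Explicit AI crawler group, so AI access is visible in every diff.
      { userAgent: AI_CRAWLERS, allow: "/", disallow: PRIVATE_PATHS },
    ],
    // Absolute URL; the /sitemap.xml path is an assumption.
    sitemap: `${SITE_URL}/sitemap.xml`,
  };
}
```

The private paths are repeated in the AI crawler group on purpose: a crawler that matches a named group follows only that group and does not inherit the rules of the * group.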

Tools

Workflow surface

Next.js robots (MetadataRoute)
TypeScript

Experience layer: Implement app/robots.ts returning rules that allow / and disallow /admin and /api for all agents
Server layer: Add explicit user-agent groups for AI crawlers (GPTBot, ChatGPT-User, ClaudeBot, OAI-SearchBot, Claude-SearchBot)
Database layer: None required; the policy is static code, and Next.js robots (MetadataRoute) serves it as a route.
Automation layer: TypeScript handles routine steps, while explicit allow/disallow lists are reviewed in PRs, with /admin and /api always disallowed and public paths verified.
Measurement layer: Clear, versioned control over what search and AI crawlers may access, with private paths blocked and the sitemap advertised.

Data flow

01 Implement app/robots.ts returning rules that allow / and disallow /admin and /api for all agents
02 Add explicit user-agent groups for AI crawlers (GPTBot, ChatGPT-User, ClaudeBot, OAI-SearchBot, Claude-SearchBot)
03 Reference the absolute sitemap URL so crawlers discover the full URL set
04 Keep the policy in code review so any change to AI access or blocked paths is auditable
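
To make the end of this flow concrete, the sketch below turns a set of rules into robots.txt text, which is handy for previewing a change during review. It is a hand-written approximation, not the serializer Next.js uses, and its exact output formatting may differ from the framework's. RobotsRule is a local mirror of the rule shape, renderRobotsTxt is a name introduced here, the example.com URL is a placeholder, and only two of the five AI crawlers are shown for brevity.

```typescript
// Local mirror of a single rule in Next.js MetadataRoute.Robots.
type RobotsRule = {
  userAgent?: string | string[];
  allow?: string | string[];
  disallow?: string | string[];
};

const toList = (value?: string | string[]): string[] =>
  value === undefined ? [] : Array.isArray(value) ? value : [value];

// Renders one block per rule group, then the sitemap line.
function renderRobotsTxt(rules: RobotsRule[], sitemap: string): string {
  const groups = rules.map((rule) =>
    [
      ...toList(rule.userAgent ?? "*").map((agent) => `User-agent: ${agent}`),
      ...toList(rule.allow).map((path) => `Allow: ${path}`),
      ...toList(rule.disallow).map((path) => `Disallow: ${path}`),
    ].join("\n")
  );
  // Blank lines separate groups, as robots.txt parsers expect.
  return [...groups, `Sitemap: ${sitemap}`].join("\n\n") + "\n";
}

const preview = renderRobotsTxt(
  [
    { userAgent: "*", allow: "/", disallow: ["/admin", "/api"] },
    { userAgent: ["GPTBot", "ClaudeBot"], allow: "/", disallow: ["/admin", "/api"] },
  ],
  "https://example.com/sitemap.xml" // assumption: placeholder URL
);
console.log(preview);
```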

Controls and fallbacks

Main risk: An overly broad disallow accidentally deindexes public pages, or sensitive paths stay crawlable.
Prevention: Explicit allow/disallow lists reviewed in PRs, with /admin and /api always disallowed and public paths verified.
Fallback: On any generation error, serve a minimal allow-public/deny-admin policy plus the sitemap reference.
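
The fallback can be sketched as a small wrapper around whatever function generates the policy. This is a minimal illustration under assumptions: RobotsPolicy is a local mirror of Next.js MetadataRoute.Robots, SITE_URL and the /sitemap.xml path are placeholders, and withFallback and FALLBACK_POLICY are names introduced here. The fallback also blocks /api, on the assumption that the "always disallowed" rule applies to the minimal policy too.

```typescript
// Local mirrors of the shape Next.js exposes as MetadataRoute.Robots.
type RobotsRule = {
  userAgent?: string | string[];
  allow?: string | string[];
  disallow?: string | string[];
};
type RobotsPolicy = {
  rules: RobotsRule | RobotsRule[];
  sitemap?: string | string[];
};

// Assumption: placeholder origin; replace with the production domain.
const SITE_URL = "https://example.com";

// Minimal policy: public routes open, private paths blocked, sitemap kept.
const FALLBACK_POLICY: RobotsPolicy = {
  rules: [{ userAgent: "*", allow: "/", disallow: ["/admin", "/api"] }],
  sitemap: `${SITE_URL}/sitemap.xml`,
};

// Runs the generator and serves the minimal policy if it throws.
function withFallback(build: () => RobotsPolicy): RobotsPolicy {
  try {
    return build();
  } catch (error) {
    // Log the failure so it shows up in monitoring instead of passing silently.
    console.error("robots policy generation failed; serving fallback", error);
    return FALLBACK_POLICY;
  }
}
```

With this wrapper, a failure while building the full policy still leaves crawlers with a valid robots.txt that keeps /admin blocked.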

Research basis

System path inside the website build

Full-stack websites for service businesses and operators: route architecture, service pages, lead capture, metadata, proof boundaries, blog/database paths, analytics, and deployment checks.

01 Route map (active, progress 72%): Service architecture. Clear service routes.
02 Lead capture (active, progress 86%): Form and context flow. Lead capture that saves context.
03 Public metadata (active, progress 64%): SEO and schema layer. SEO and schema on public pages.
04 Launch QA (active, progress 91%): Analytics and deployment checks. Analytics events tied to CTAs.