get_content_types Architecture

Discover the full data model of the source site — content types, field schemas, and relationships

Overview

Part of the Architecture Toolkit. After pages are discovered, DoneDone.Run uses LLM analysis to identify the structured content types present on the site, such as Product, Blog Post, Review, and FAQ. For each type it reports a field schema (field names, data types, sample values), the pages associated with it, and its relationships to other types (e.g. Product has_many Reviews).

How It Works

  1. After route extraction completes during discovery, the content type extraction pipeline fires automatically in the background.
  2. An LLM analyzes all discovered pages, their URLs, titles, and page types to identify distinct content types.
  3. For each content type, the LLM extracts a field schema with field names, data types (text, number, image_url, rich_text, array, boolean, date), and sample values from actual pages.
  4. Inter-type relationships are identified (has_many, belongs_to, has_one) to build a complete data model.
  5. Results are stored in the database and linked to the pages and route patterns they were extracted from.
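The data model described in steps 3 and 4 can be pictured with a few Python dataclasses. This is only an illustrative sketch: the class and attribute names (FieldSchema, ContentType, Relationship, sample_values, page_urls) are assumptions made for this example and are not the tool's documented response format. Only the seven field data types and the three relationship kinds come from the steps above.

```python
from dataclasses import dataclass, field
from enum import Enum


class FieldType(Enum):
    # The seven field data types listed in step 3.
    TEXT = "text"
    NUMBER = "number"
    IMAGE_URL = "image_url"
    RICH_TEXT = "rich_text"
    ARRAY = "array"
    BOOLEAN = "boolean"
    DATE = "date"


class RelationKind(Enum):
    # The three relationship kinds listed in step 4.
    HAS_MANY = "has_many"
    BELONGS_TO = "belongs_to"
    HAS_ONE = "has_one"


@dataclass
class FieldSchema:
    # Hypothetical shape: one field of a content type.
    name: str
    data_type: FieldType
    sample_values: list[str] = field(default_factory=list)


@dataclass
class ContentType:
    # Hypothetical shape: a content type plus the pages it was found on.
    name: str
    fields: list[FieldSchema] = field(default_factory=list)
    page_urls: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    # Hypothetical shape: a directed link between two content types.
    source: str
    kind: RelationKind
    target: str


# Illustrative data only, not taken from a real extraction.
product = ContentType(
    name="Product",
    fields=[
        FieldSchema("title", FieldType.TEXT, ["Example product"]),
        FieldSchema("price", FieldType.NUMBER, ["19.99"]),
    ],
)
review = ContentType(name="Review", fields=[FieldSchema("body", FieldType.RICH_TEXT)])
relationships = [Relationship("Product", RelationKind.HAS_MANY, "Review")]
```

Using enums for the data types and relationship kinds means an unexpected value (for example, FieldType("video")) fails immediately instead of propagating into a downstream schema.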

Input Parameters

Parameter   Type     Required   Description
job_id      string   required   The clone job ID returned by discover_all_pages
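Since job_id is the only parameter, a call is simple to assemble. The sketch below validates the ID and builds an arguments payload. The envelope shape ({"tool": ..., "arguments": ...}), the helper name build_tool_call, and the placeholder ID "job_123" are assumptions for illustration; the actual transport depends on how your agent invokes DoneDone.Run tools.

```python
import json


def build_tool_call(job_id: str) -> dict:
    """Build a hypothetical call payload for get_content_types."""
    # job_id is the one required parameter; reject missing or blank values early.
    if not isinstance(job_id, str) or not job_id.strip():
        raise ValueError("job_id is required and must be a non-empty string")
    return {"tool": "get_content_types", "arguments": {"job_id": job_id}}


# "job_123" stands in for the ID returned by discover_all_pages.
payload = build_tool_call("job_123")
print(json.dumps(payload, indent=2))
```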

What You Get Back

Example Use Case

An agent calls get_content_types to learn that airbornesensor.com has 9 content types including Product Page (5 fields), Blog Post (9 fields), and Contact Page (4 fields), with 12 inter-type relationships. The agent uses this data model to design the WordPress/WooCommerce schema for the cloned site.
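As one possible first step in that schema design, an agent could derive candidate WordPress post type keys from the content type names. The helper below is a sketch, not part of DoneDone.Run: to_post_type_key is a hypothetical name, and the only WordPress rule it relies on is that post type keys are limited to 20 characters and use lowercase letters, digits, dashes, and underscores.

```python
import re


def to_post_type_key(content_type_name: str) -> str:
    """Derive a candidate WordPress post type key from a content type name."""
    # Lowercase, then collapse every run of other characters into one underscore.
    key = re.sub(r"[^a-z0-9]+", "_", content_type_name.lower()).strip("_")
    # Post type keys may not exceed 20 characters; trim and drop a dangling underscore.
    return key[:20].rstrip("_")


# Content type names taken from the example above.
names = ["Product Page", "Blog Post", "Contact Page"]
keys = {name: to_post_type_key(name) for name in names}
print(keys)
```

The derived keys are only candidates. Before registering them, check for collisions with post types that already exist on the target site, including the built-in ones such as post and page.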

Tips

Run after discover_all_pages completes — content types are extracted automatically after route discovery.
Content types are linked to route patterns from Phase 2, giving you both URL structure and data model.
Sample values show real data from the source site, useful for understanding field semantics.
Relationships help you set up WordPress taxonomies, custom fields, and post type connections.
Re-running discovery will re-extract content types with fresh LLM analysis.

Related Tools