get_page_templates Architecture
Derive page templates on-the-fly from existing architecture facets — no new pipeline step
Overview
Part of the Architecture Toolkit. Returns the set of page templates for a clone job, derived on each call by joining facets already in the database: route patterns, design system clusters, dominant content types, and ordered shared page elements. The tool is read-only and stateless: it adds no DB rows and makes no LLM calls. Because results are derived live, outputs may shift after any underlying facet is re-extracted. Template IDs are deterministic hashes (tpl_<16-hex>), so an ID stays the same only while the signature behind it is unchanged. Labels are auto-synthesized, not curated. Missing facets degrade gracefully and are reported in extraction_status.missing_inputs.
How It Works
- For each page, the tool joins the deepest matching route pattern, the page's design system cluster (canonical preferred), the dominant content type for the page's route (highest page-count association), and the ordered list of shared page element IDs the page is associated with.
- These four facets are concatenated into a deterministic signature, hashed to produce a stable template_id (tpl_<16-hex>).
- Pages with identical signatures are grouped into one template. Missing facets degrade gracefully — they appear as empty slots in the signature and the template is still returned.
- Each template gets a synthesized label preferring '<content type label> — <route pattern>', falling back to the route pattern, then to a hash-suffixed placeholder.
- Templates are returned sorted by page_count descending. A representative sample of pages is always included (up to representative_pages_limit, default 50); pass include_pages=true or set template_id to get the full page list, capped by pages_per_template_limit.
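The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the hash function (SHA-256), the `|` slot delimiter, the per-page dictionary keys, and the placeholder label format are all assumptions, since the text only states that the signature is deterministic and that the ID has the form tpl_<16-hex>.

```python
import hashlib
from collections import defaultdict


def template_id(route_pattern_id, design_system_id, content_type_id, shared_element_ids):
    """Build the four-slot signature and hash it (SHA-256 is an assumption)."""
    slots = [
        route_pattern_id or "",        # a missing facet becomes an empty slot
        design_system_id or "",
        content_type_id or "",
        ",".join(shared_element_ids),  # element order is kept as given
    ]
    signature = "|".join(slots)        # hypothetical delimiter
    digest = hashlib.sha256(signature.encode("utf-8")).hexdigest()
    return "tpl_" + digest[:16]


def synthesize_label(content_type_label, route_pattern, tpl_id):
    """Apply the fallback chain: content type plus route, then route, then placeholder."""
    if content_type_label and route_pattern:
        # "\u2014" is the dash separator used in the label format described above.
        return f"{content_type_label} \u2014 {route_pattern}"
    if route_pattern:
        return route_pattern
    return f"Template {tpl_id[-6:]}"   # hypothetical placeholder format


def group_pages(pages):
    """Group pages by template ID and sort groups by page count, largest first."""
    groups = defaultdict(list)
    for page in pages:
        tid = template_id(
            page.get("route_pattern_id"),
            page.get("design_system_id"),
            page.get("content_type_id"),
            page.get("shared_element_ids") or [],
        )
        groups[tid].append(page["url"])
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
```

Because missing facets map to empty slots rather than raising, a page with no route pattern or content type still lands in a template; it simply groups with other pages that are missing the same facets.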
Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| job_id | string | required | Cloning job ID |
| include_pages | boolean | optional | If true, include the full page list per template (capped by pages_per_template_limit). Default false. |
| pages_per_template_limit | number | optional | Max pages per template when include_pages or template_id is set. Default 100, hard max 10000. |
| template_id | string | optional | Optional: return only this template with its full page list. |
| representative_pages_limit | number | optional | Max pages in representative_pages per template (always returned). Default 50, hard max 200. |
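As a rough sketch of how a caller might assemble these arguments, the snippet below encodes the documented defaults and hard maximums in a small dataclass. The class name `GetPageTemplatesArgs` and the `to_payload` helper are hypothetical, and clamping on the client side is an assumption: the text does not say whether the tool clamps or rejects values above the hard max.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Hard maximums taken from the parameter table above.
PAGES_PER_TEMPLATE_MAX = 10_000
REPRESENTATIVE_PAGES_MAX = 200


@dataclass
class GetPageTemplatesArgs:
    """Hypothetical argument holder; field names and defaults mirror the table."""
    job_id: str
    include_pages: bool = False
    pages_per_template_limit: int = 100
    template_id: Optional[str] = None
    representative_pages_limit: int = 50

    def to_payload(self) -> dict:
        if not self.job_id:
            raise ValueError("job_id is required")
        payload = asdict(self)
        # Clamp to the documented hard maximums (client-side clamping is an assumption).
        payload["pages_per_template_limit"] = min(
            self.pages_per_template_limit, PAGES_PER_TEMPLATE_MAX
        )
        payload["representative_pages_limit"] = min(
            self.representative_pages_limit, REPRESENTATIVE_PAGES_MAX
        )
        # Omit template_id entirely when the caller did not set it.
        if payload["template_id"] is None:
            del payload["template_id"]
        return payload
```

For example, `GetPageTemplatesArgs(job_id="job_123", include_pages=True).to_payload()` yields a dictionary with the default limits and no template_id key; the job ID here is a made-up value.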
What You Get Back
- template_id: stable hash-derived ID (tpl_<16-hex>)
- label: synthesized from the dominant content type and route pattern
- page_count: number of pages matching this template signature
- route_pattern_id and route_pattern: the deepest matching route, if any
- dominant_content_type_id and its label: the most common content type on this route
- design_system_id and its label: the design system cluster the pages belong to
- ordered_shared_elements: ordered list of the shared elements (id, label, type) that compose the template
- representative_pages: sample pages, always returned (up to representative_pages_limit, default 50)
- pages: full page list, present when include_pages=true or template_id is set
- pages_truncated: per-template boolean indicating whether the page list was capped by pages_per_template_limit
- extraction_status: response-level summary with per-facet page coverage and a missing_inputs array (e.g. ['design_systems','content_types']) so callers can tell which facets have not yet been extracted
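A caller will usually want to check two things before trusting a response: whether any facets were missing, and whether any page list was capped. The sketch below shows one way to do that. It assumes the response is a dictionary with a top-level `templates` list, which the text does not confirm; the `summarize` helper and the sample data are hypothetical.

```python
def summarize(response: dict) -> dict:
    """Report missing facets and truncated templates for a response.

    Assumes templates live under a top-level "templates" key (not confirmed
    by the text); the other field names follow the list above.
    """
    status = response.get("extraction_status", {})
    missing = list(status.get("missing_inputs", []))
    truncated = [
        t["template_id"]
        for t in response.get("templates", [])
        if t.get("pages_truncated")
    ]
    return {
        "missing_inputs": missing,
        "truncated_templates": truncated,
        # Complete only if every facet was available and no page list was capped.
        "complete": not missing and not truncated,
    }


# Made-up sample response for illustration only.
sample = {
    "templates": [
        {"template_id": "tpl_aaaaaaaaaaaaaaaa", "page_count": 47, "pages_truncated": True},
        {"template_id": "tpl_bbbbbbbbbbbbbbbb", "page_count": 23, "pages_truncated": False},
    ],
    "extraction_status": {"missing_inputs": ["design_systems"]},
}
report = summarize(sample)
```

If missing_inputs is non-empty, template groupings may change once those facets are extracted, so a caller may prefer to re-query later rather than cache the template IDs.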
Example Use Case
Before scaffolding a clone, an agent calls get_page_templates to learn the source site has 7 distinct templates: Product Detail (47 pages), Blog Post (23), Category Listing (12), and four singletons. Each template returns its route pattern, design system, content type, and ordered shared elements (header → nav → product-card-grid → footer), so the agent can build one component skeleton per template instead of one per page.
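The saving in this example is easy to quantify. The short calculation below uses only the page counts quoted above and assumes each singleton template covers exactly one page.

```python
# Page counts per named template, as quoted in the example.
named_templates = {"Product Detail": 47, "Blog Post": 23, "Category Listing": 12}
singleton_templates = 4  # each assumed to cover exactly one page

total_templates = len(named_templates) + singleton_templates
total_pages = sum(named_templates.values()) + singleton_templates

# Skeletons needed with templates versus one skeleton per page.
print(f"{total_templates} skeletons instead of {total_pages}")
```

Under those assumptions the agent builds 7 component skeletons to cover 86 pages.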
