KnowQL

Working Draft — May 2026

Introduction

This is the specification for KnowQL, a declarative query language for agent knowledge access — SQL’s equivalent for the agentic era.

A conforming implementation of KnowQL must fulfill all normative requirements described in this specification (see Conformance). The KnowQL specification is provided under the Apache License 2.0 (see Copyright and Licensing).

KnowQL was created in 2026 as part of the Pinecone Nexus platform. This open specification is published to enable ecosystem adoption across agent frameworks, knowledge services, and tooling.

This specification is developed on GitHub at pinecone-io/knowql-spec. Contributions and discussion are welcome.

KnowQL has evolved and may continue to evolve in future editions of this specification. The latest working draft can be found at https://spec.knowql.org/draft.

1 Overview

KnowQL is a declarative query language for agent knowledge access. It provides a typed, composable set of primitives that agents use to retrieve, filter, compose, and ground information from knowledge contexts — without specifying how to fetch it.

For example, this KnowQL query asks whether a customer qualifies for a renewal discount, drawing from multiple knowledge contexts:

{
  "ask": "Does Acme Corp qualify for a renewal discount?",
  "scope": ["ctx_contracts", "ctx_usage", "ctx_pricing_policy"],
  "where": { "customer_id": "acme_corp_001" },
  "shape": {
    "qualifies": "Boolean!",
    "discount_pct": "Float",
    "applicable_rules": "[{ rule_id: ID!, reason: String }]"
  },
  "ground": { "per_field": true },
  "budget": { "max_tokens": 2000, "depth": "standard" }
}

Which produces a structured, grounded response:

{
  "data": {
    "qualifies": true,
    "discount_pct": 12.5,
    "applicable_rules": [
      {
        "rule_id": "R-42",
        "reason": "12-month contract renewal with >80% usage"
      }
    ]
  },
  "ground": {
    "qualifies": {
      "confidence": "high",
      "sources": ["ctx_contracts/acme_corp_001"]
    },
    "discount_pct": {
      "confidence": "medium",
      "sources": ["ctx_pricing_policy/tier-2"]
    }
  }
}

KnowQL is not a programming language capable of arbitrary computation. It is a language for expressing what an agent wants to know — not how to retrieve it. KnowQL does not mandate a particular retrieval strategy, embedding model, or storage backend. Instead, knowledge services map their capabilities to a uniform language, type system, and set of execution guarantees that KnowQL defines.

KnowQL has a number of design principles:

Agent-centric: KnowQL is driven by the requirements of AI agents and the engineers who build them. KnowQL starts with what an agent needs to know and builds the language and runtime necessary to deliver it reliably and efficiently.

Declarative: KnowQL queries declare what is needed, not how to retrieve it. Clause order does not change query semantics. The execution engine is free to choose its retrieval, synthesis, and scoring strategy.

Strongly typed: Every KnowQL service defines a schema of available contexts, fields, and shapes. Queries are validated against this schema before execution. Type errors are surfaced at query time, not response time.

Composable: The output of a KnowQL query is a valid input to another KnowQL query. Contexts compose. Results can be scoped into follow-up queries without additional round-trips.

Layered determinism: KnowQL draws a hard line between what is deterministic and what is probabilistic. Filter predicates, temporal constraints, precedence rules, and output shapes are enforced deterministically. Semantic matching and synthesis are probabilistic and are surfaced honestly as confidence values in the response. The contract is deterministic; the scoring is probabilistic and explicit.

Grounded: Provenance is a first-class feature of KnowQL. Every response can carry field-level citations and confidence scores. Agents and humans can audit where an answer came from and how certain the system is.

Introspectable: A KnowQL service’s schema — its contexts, shapes, links, and capabilities — is queryable at runtime using KnowQL itself. Agents can discover what is available before issuing queries.

Explainable: KnowQL provides an explain primitive that returns the execution plan for a query without running it. This enables debugging, evaluation pipelines, and cost estimation.

Transport-agnostic: KnowQL does not prescribe a serialization format or transport mechanism. The canonical representation is a typed JSON document.

Because of these principles, KnowQL is a productive environment for building agents that need reliable, auditable, and composable access to knowledge.

The following formal specification serves as a reference for implementers. It describes the query language and its grammar, the type system and introspection system, execution semantics, validation rules, and response format. The goal of this specification is to provide a foundation for an ecosystem of KnowQL tools, client libraries, agent frameworks, and knowledge service implementations.

2 Language

A KnowQL document is the canonical unit of communication between a caller and a KnowQL service. It is represented as a UTF-8 encoded JSON object. A document contains exactly one query at the top level.

2.1 Source Text

A KnowQL document is a valid JSON object conforming to this specification.

A clause is a named field within a KnowQL document, consisting of a clause name and a clause value. Clause names are lowercase ASCII strings. Unknown clauses MUST be rejected by a conforming implementation unless the clause name begins with x-, which is reserved for implementation-specific extensions.

2.1.1 Whitespace and Comments

JSON serialization rules apply for whitespace. Line-comment style (//) and block-comment style (/* */) are not part of the KnowQL document format, as these are not valid JSON. Implementation SDKs MAY provide a pre-processing layer that strips comments before parsing.

2.2 Document Structure

A KnowQL document has the following top-level structure:

Document :
  QueryDocument
  ExplainDocument
  IntrospectionDocument

QueryDocument :
  { QueryClauses }

ExplainDocument :
  { "explain": true, QueryClauses }

IntrospectionDocument :
  { "introspect": IntrospectionTarget }

A QueryDocument is the standard form for retrieving knowledge.

An ExplainDocument includes "explain": true alongside query clauses. The service MUST return an execution plan instead of executing the query.

An IntrospectionDocument queries the service’s schema.
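The three document forms can be told apart by their top-level keys alone. The following is a minimal sketch of that dispatch, not part of the normative text; the function name classify is hypothetical, and treating "explain": false as an ordinary query is an assumption, since the grammar above only defines "explain": true.

```python
def classify(document: dict) -> str:
    """Return which Document production a parsed JSON object matches."""
    if "introspect" in document:
        return "IntrospectionDocument"
    # Only the literal JSON value true selects the explain form (assumption:
    # any other value falls through to the ordinary query form).
    if document.get("explain") is True:
        return "ExplainDocument"
    return "QueryDocument"


kind = classify({"explain": True, "ask": "Does Acme qualify for a renewal discount?"})
```

A service would typically run this dispatch after JSON parsing and before clause validation, since the required clauses differ per form.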

2.3 Clauses

A primitive is a named clause with formal, defined semantics. Clauses are order-independent: a KnowQL service MUST produce equivalent results regardless of the order in which clauses appear in the document.
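One way to make order-independence concrete is to canonicalize documents before comparing or caching them. The sketch below is illustrative only; the helper name canonical is hypothetical, and sorting keys is one possible canonical form rather than anything this specification mandates.

```python
import json


def canonical(document: dict) -> str:
    """Serialize with sorted object keys so clause order cannot matter.

    Array order (for example inside scope) is left untouched.
    """
    return json.dumps(document, sort_keys=True, separators=(",", ":"))


a = json.loads('{"ask": "What is the ACV?", "scope": ["ctx_contracts"]}')
b = json.loads('{"scope": ["ctx_contracts"], "ask": "What is the ACV?"}')

same = canonical(a) == canonical(b)
```

Two documents that differ only in clause order produce the same canonical string, so a cache or test harness keyed on it treats them as the same query.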

2.3.1 Required Clauses

A valid QueryDocument MUST contain at least one of:

  • ask — a natural-language statement of intent
  • shape — a typed output schema

A document containing neither is invalid.

2.3.2 Clause Reference

| Clause  | Group       | Required   | Description                             |
|---------|-------------|------------|-----------------------------------------|
| ask     | Intent      | see §2.3.1 | Natural-language goal                   |
| shape   | Intent      | see §2.3.1 | Declarative output schema               |
| scope   | Intent      | optional   | Context / source binding                |
| where   | Filter      | optional   | Exact-match metadata predicates         |
| ground  | Provenance  | optional   | Field-level citation and confidence     |
| budget  | Control     | optional   | Token / latency / depth envelope        |
| explain | Meta        | optional   | Return execution plan without executing |
| as_of   | Filter      | optional   | Point-in-time temporal constraint       |
| since   | Filter      | optional   | Range-start temporal constraint         |
| window  | Filter      | optional   | Temporal window (start + end)           |
| link    | Composition | optional   | Cross-context join on shared entities   |
| resolve | Composition | optional   | Authority / precedence rules            |
| trace   | Provenance  | optional   | Introspectable reasoning handle         |
| await   | Control     | optional   | Sync / async / stream mode              |
| apply   | Control     | optional   | Operator overrides                      |

2.4 Values

KnowQL clause values follow standard JSON value types:

Value :
  StringValue
  NumberValue
  BooleanValue
  NullValue
  ListValue
  ObjectValue

2.4.1 StringValue

A JSON string. Used for ask, as_of, since, and string fields in where predicates.

2.4.2 NumberValue

A JSON number. Used for numeric fields in where predicates and budget limits.

2.4.3 BooleanValue

true or false. Used for explain and boolean flags.

2.4.4 NullValue

null. A null value in a shape field carries semantic meaning: the field was requested but the service could not confidently resolve it. See the Execution section for null-with-confidence semantics.

2.4.5 ListValue

A JSON array. Used for scope (list of context identifiers) and list-typed shape fields.

2.4.6 ObjectValue

A JSON object. Used for where, shape, ground, budget, link, resolve, window, and trace.

2.5 Identifiers

Context identifiers referenced in scope and link are opaque strings from the perspective of the KnowQL language. Their structure is defined by the service’s schema. By convention, service implementations use a namespace/name or ctx_name format, but this is not normative.

2.6 Fragments

A shape fragment is a reusable named shape template that can be referenced by name. Shape fragments allow callers to define common output schemas once and reuse them across queries.

ShapeFragment :
  { "fragment": FragmentName, "shape": ShapeValue }

FragmentName :
  String

Fragment definitions are registered with the service schema. A fragment reference within a shape value uses a $ref key:

{
  "ask": "Summarize the contract terms",
  "scope": ["ctx_contracts"],
  "shape": { "$ref": "ContractSummary" }
}

Fragment support is OPTIONAL. Services MUST document whether fragments are supported.

2.7 Schema Coordinates

A schema coordinate is a human-readable string that uniquely identifies a schema element within a KnowQL service’s schema. Schema coordinates follow the form:

SchemaCoordinate :
  ContextCoordinate
  FieldCoordinate

ContextCoordinate :
  ContextName

FieldCoordinate :
  ContextName . FieldName

Examples:

| Element          | Schema Coordinate                     |
|------------------|---------------------------------------|
| Context          | ctx_contracts                         |
| Field on context | ctx_contracts.customer_id             |
| Nested field     | ctx_pricing_policy.tiers.discount_pct |

Schema coordinates are used in error messages, ground source references, trace output, and tooling.
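Tooling that consumes schema coordinates usually needs to split them back into a context name and a field path. The following is a small sketch under two assumptions that the grammar above does not state: context names never contain a dot, and nested fields are written as further dot-separated segments (as in the third example). The names Coordinate and parse_coordinate are hypothetical.

```python
from typing import NamedTuple, Tuple


class Coordinate(NamedTuple):
    context: str
    field_path: Tuple[str, ...]  # empty for a ContextCoordinate


def parse_coordinate(text: str) -> Coordinate:
    """Split 'context.field.subfield' into its context and field path."""
    parts = text.split(".")
    if any(part == "" for part in parts):
        raise ValueError(f"malformed schema coordinate: {text!r}")
    return Coordinate(context=parts[0], field_path=tuple(parts[1:]))


nested = parse_coordinate("ctx_pricing_policy.tiers.discount_pct")
```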

3 Type System

KnowQL services define a schema that describes available contexts, their fields, the shapes they can produce, and the links between them. Queries are validated against this schema before execution.

3.1 Type System Descriptions

Documentation is a first-class feature of the KnowQL type system. Every named definition in a schema SHOULD carry a human-readable description. These descriptions are returned via introspection and remain consistent with the service’s capabilities.

Descriptions are provided as plain strings or as Markdown (CommonMark). Description strings appear as a "description" field on each schema object.

{
  "context": "ctx_contracts",
  "description": "Active and historical contracts for all customers.",
  "fields": {
    "customer_id": {
      "type": "ID!",
      "description": "Unique identifier for the customer."
    }
  }
}

3.2 Named Types

KnowQL uses a JSON Schema-aligned type notation for shape fields:

Type :
  NamedType
  ListType
  NonNullType

NamedType :
  String
  Int
  Float
  Boolean
  ID
  Date
  DateTime
  JSON

ListType :
  [ Type ]

NonNullType :
  Type !

3.2.1 Scalars

A scalar is a primitive leaf type representing a single value. KnowQL defines the following built-in scalar types:

String: A UTF-8 character sequence.

Int: A signed 32-bit integer.

Float: A signed double-precision floating-point value.

Boolean: true or false.

ID: A unique identifier. Serialized as a string. Semantically opaque to the query language.

Date: A calendar date in ISO 8601 format: YYYY-MM-DD.

DateTime: A point in time in ISO 8601 format with timezone offset: YYYY-MM-DDTHH:mm:ssZ.

JSON: An arbitrary JSON value. Used for open-ended structured output. Services SHOULD avoid JSON in favor of more specific types where possible.

Custom scalars MAY be defined by a service schema. Custom scalars MUST be documented and SHOULD link to a specification.

3.2.2 Objects

An object type defines a named set of fields, each with a type. Object types appear as values within shape.

{
  "shape": {
    "contract": {
      "id": "ID!",
      "start_date": "Date",
      "value_usd": "Float"
    }
  }
}

3.2.3 Lists

A list type wraps another type and indicates that the value will be an ordered array.

{
  "shape": {
    "applicable_rules": [{ "rule_id": "ID!", "reason": "String" }]
  }
}

3.2.4 Non-Null

A non-null type is formed by appending ! to a type name, indicating that the field must not be null in a valid response. If the service cannot resolve a non-null field, it MUST return an error for that field rather than a null value with low confidence.

{
  "shape": {
    "qualifies": "Boolean!",
    "discount_pct": "Float"
  }
}

In this example, qualifies must be present in the response. discount_pct may be null if the service cannot determine it with sufficient confidence.
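To illustrate how the Type grammar (NamedType, ListType, NonNullType) composes, here is a minimal recursive parser for type strings such as "Boolean!" or "[ID!]!". It is a sketch, not a reference implementation: TypeRef and parse_type are hypothetical names, and inline object types written inside a string (such as the applicable_rules shape in the Overview) are deliberately not handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TypeRef:
    name: Optional[str]        # scalar or custom type name; None for a list
    item: Optional["TypeRef"]  # element type for a list; None otherwise
    non_null: bool


def parse_type(text: str) -> TypeRef:
    """Parse a type expression built from names, [ ] wrappers and ! suffixes."""
    text = text.strip()
    non_null = text.endswith("!")
    if non_null:
        text = text[:-1].strip()
    if text.startswith("[") and text.endswith("]"):
        return TypeRef(None, parse_type(text[1:-1]), non_null)
    if not text.isidentifier():
        raise ValueError(f"not a type expression: {text!r}")
    return TypeRef(text, None, non_null)


parsed = parse_type("[ID!]!")
```

Here parsed is a non-null list whose elements are non-null ID values; the ! binds to the type immediately to its left.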

3.3 Contexts

A context is the fundamental unit of knowledge in KnowQL. It corresponds to a named, queryable source of information — analogous to a table or view in SQL, or a dataset in a vector database.

ContextDefinition :
  {
    "context": ContextName,
    "description": String?,
    "fields": { FieldName: FieldDefinition },
    "links": [ LinkDefinition ]?,
    "version": String?
  }

Each context MUST have:

  • A unique context name within the schema
  • A fields object defining the filterable and returnable fields

Each context MAY have:

  • description for documentation
  • links defining cross-context join targets
  • version for versioning (see Versioning)

3.4 Fields

A field definition declares the type and metadata of a queryable field within a context.

FieldDefinition :
  {
    "type": TypeExpression,
    "description": String?,
    "filterable": Boolean?,
    "returnable": Boolean?
  }

filterable — if true, this field can appear in where predicates. Defaults to true for scalar fields.

returnable — if true, this field can appear in shape. Defaults to true.

3.5 Shape Types

A shape defines the structure and types of the response object a caller expects. Shape definitions are validated against the schema at query time.

Shape types extend the base type system with two additional constructs:

TypedNull: When a field in the response is null because the service could not confidently resolve it (as opposed to being legitimately absent), the service SHOULD accompany the null with a confidence indicator in the ground response (see the Primitives and Response sections).

ConfidenceAnnotation: Every field in a shaped response MAY carry a confidence value (high, medium, low, derived, or none) in the ground section of the response. This is independent of the field’s type.

3.7 Schema

A KnowQL schema is the top-level container for all type definitions accessible to a service:

Schema :
  {
    "version": SchemaVersion,
    "name": String?,
    "description": String?,
    "contexts": [ ContextDefinition ],
    "fragments": [ FragmentDefinition ]?,
    "directives": [ DirectiveDefinition ]?
  }

The schema is returned by the introspection system (see the Introspection section).

3.8 Versioning

KnowQL schemas evolve additively. New contexts and fields MAY be added in minor versions. Existing contexts and fields MUST NOT be removed or have their types narrowed without a major version increment.

Additive changes (always safe):

  • Adding a new context
  • Adding a new field to an existing context
  • Adding a new link
  • Adding a new enum value

Breaking changes (require major version):

  • Removing or renaming a context
  • Removing or renaming a field
  • Changing a field’s type in a non-widening way
  • Changing a field from non-null to nullable or vice versa

Services SHOULD document their versioning policy. The RECOMMENDED versioning scheme is calendar-based (e.g., 2026-04) for major releases, following GraphQL’s precedent.
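The breaking-change rules lend themselves to an automated check between two schema versions. The sketch below is a deliberately conservative illustration: it assumes each schema has been reduced to a hypothetical mapping of context name to {field name: type string}, and it reports any type change as breaking because it makes no attempt to recognize widening changes.

```python
def breaking_changes(old: dict, new: dict) -> list:
    """List removals and type changes between two {context: {field: type}} maps."""
    problems = []
    for context, fields in old.items():
        if context not in new:
            problems.append(f"removed context {context}")
            continue
        for field, type_expr in fields.items():
            if field not in new[context]:
                problems.append(f"removed field {context}.{field}")
            elif new[context][field] != type_expr:
                # Conservative: widening changes are also flagged here.
                problems.append(f"changed type of {context}.{field}")
    return problems


old = {"ctx_contracts": {"customer_id": "ID!", "value_usd": "Float"}}
new = {
    "ctx_contracts": {"customer_id": "ID!", "value_usd": "Float!", "start_date": "Date"},
    "ctx_usage": {"customer_id": "ID!"},
}
report = breaking_changes(old, new)
```

The added context and the added field are not reported, matching the additive list above; only the nullability change on value_usd is flagged.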

4 Primitives

A KnowQL primitive is a named clause with defined, formal semantics. Each primitive belongs to a functional group. Primitives are composable: a query is a composition of primitives, and the engine reduces the composition to a single structured response.

This section defines the formal semantics of every KnowQL primitive.

4.1 Group A: Intent

Intent primitives declare what the caller wants to know. They are the semantic core of a KnowQL query.

4.1.1 `ask`

Required if `shape` is absent
ask : StringValue

ask is a natural-language statement of the caller’s goal. It is first-class: it is not buried in a request body, and it is not a prompt. The engine treats ask as the primary semantic signal for retrieval and synthesis.

Semantic retrieval is the process of identifying records relevant to a natural-language query using embedding-based or hybrid search, producing a ranked set of retrieved records with relevance scores.

Synthesis is the process of constructing a structured response from retrieved records, conforming to the declared shape and addressing the ask goal.

ask is probabilistic: the engine uses semantic matching to identify relevant content, and the quality of the answer depends on the quality of the indexed knowledge. The engine MUST reflect its confidence in the response’s ground section when ground is requested.

Formal semantics:

Given ask value q and resolved contexts from scope:

  1. The engine performs semantic retrieval against each context to find content relevant to q.
  2. The engine synthesizes a response that addresses q using the retrieved content.
  3. If shape is present, the response MUST conform to the declared shape.
  4. If ground is present with per_field: true, every response field MUST carry a confidence value.

Example:
{
  "ask": "What are the payment terms for Acme Corp's current contract?",
  "scope": ["ctx_contracts"]
}

4.1.2 `shape`

Required if `ask` is absent
shape : ShapeValue

shape declares the typed structure of the expected response. It is a declarative output schema: the engine MUST conform the response to this shape, or return typed nulls with confidence annotations for fields it cannot resolve.

shape is deterministic in its structural contract: the response will always have the declared shape. It is probabilistic in its values: the engine may not be able to fill every field.

Formal semantics:

Given shape value S:

  1. The engine MUST return a data object whose top-level keys match the keys declared in S.
  2. For each field f in S:
    • If the engine resolves f with any confidence, it returns the value.
    • If the engine cannot resolve f and f is not marked ! (non-null), it returns null and MUST annotate f in the ground section with confidence: "none".
    • If f is marked ! (non-null) and the engine cannot resolve it, the engine MUST return a field-level error.
  3. The engine MUST NOT return fields not declared in S in the data object.

Example:
{
  "ask": "Does Acme Corp qualify for a renewal discount?",
  "scope": ["ctx_contracts", "ctx_pricing_policy"],
  "shape": {
    "qualifies": "Boolean!",
    "discount_pct": "Float",
    "reason": "String"
  }
}
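The structural contract of shape can be sketched as a post-processing step over whatever values the engine managed to resolve. This is an illustration only: conform is a hypothetical helper, it handles flat shapes whose types are strings, and keeping a null placeholder in data for an unresolved non-null field (so that the keys still match S) is an assumption, since the error representation is defined elsewhere in the specification.

```python
def conform(shape: dict, resolved: dict):
    """Force resolved values into the declared shape.

    Returns (data, ground, errors) following rules 1 to 3 above.
    """
    data, ground, errors = {}, {}, []
    for field, type_expr in shape.items():
        value = resolved.get(field)
        if value is not None:
            data[field] = value
        elif isinstance(type_expr, str) and type_expr.endswith("!"):
            data[field] = None  # placeholder only; the field-level error is authoritative
            errors.append({"field": field, "message": "non-null field could not be resolved"})
        else:
            data[field] = None
            ground[field] = {"confidence": "none"}
    return data, ground, errors


shape = {"qualifies": "Boolean!", "discount_pct": "Float", "reason": "String"}
resolved = {"qualifies": True, "reason": "12-month renewal", "internal_score": 0.9}
data, ground, errors = conform(shape, resolved)
```

The undeclared internal_score value is dropped, and the unresolved nullable discount_pct comes back as null with confidence "none".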

4.1.3 `scope`

Optional
scope : [ ContextIdentifier ]

scope binds the query to one or more named knowledge contexts. It is semantically equivalent to FROM in SQL: it declares where to look.

If scope is omitted, the engine searches all contexts accessible to the caller. Services MAY restrict the default scope or require explicit scoping.

Formal semantics:

Given scope value [c1, c2, ..., cn]:

  1. Each ci MUST be a context identifier present in the schema accessible to the caller. An unrecognized context identifier is a query error.
  2. The engine restricts all retrieval and synthesis operations to the union of the declared contexts.
  3. where predicates are applied within the scoped contexts.
  4. link operations are restricted to links declared between scoped contexts, or explicitly named in the link clause.

Example:
{
  "ask": "What is the pricing for enterprise tier?",
  "scope": ["ctx_pricing_policy"]
}

---

4.2 Group B: Filter

Filter primitives narrow the result set deterministically: a filter either includes or excludes a record, with no probabilistic component.

4.2.1 `where`

Optional
where : { FieldName: PredicateValue }

where applies exact-match metadata predicates to narrow the knowledge contexts before retrieval. It is semantically equivalent to WHERE in SQL.

Only fields declared as filterable: true in the schema MAY appear in where clauses. Filtering on a non-filterable field is a validation error.

Predicate forms:
| Form                              | Meaning               |
|-----------------------------------|-----------------------|
| { "field": "value" }              | Exact match           |
| { "field": { "$eq": "value" } }   | Explicit equality     |
| { "field": { "$ne": "value" } }   | Not equal             |
| { "field": { "$in": ["a","b"] } } | Set membership        |
| { "field": { "$gt": 100 } }       | Greater than          |
| { "field": { "$gte": 100 } }      | Greater than or equal |
| { "field": { "$lt": 100 } }       | Less than             |
| { "field": { "$lte": 100 } }      | Less than or equal    |

Compound predicates use $and and $or:

{
  "where": {
    "$and": [
      { "customer_id": "acme_corp_001" },
      { "status": { "$in": ["active", "pending"] } }
    ]
  }
}

Formal semantics:

Given where predicate P:

  1. The engine evaluates P against the filterable metadata of each record in the scoped contexts.
  2. Records that do not satisfy P are excluded before retrieval or synthesis.
  3. where predicates MUST be evaluated before semantic retrieval.
  4. The result of where filtering is deterministic for any given dataset.
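The predicate forms and their deterministic evaluation can be sketched as a small recursive matcher over record metadata. This is not a reference implementation; matches and OPERATORS are hypothetical names, and treating a record that lacks the named field as failing every predicate on that field is an assumption the text above does not settle.

```python
OPERATORS = {
    "$eq": lambda actual, operand: actual == operand,
    "$ne": lambda actual, operand: actual != operand,
    "$in": lambda actual, operand: actual in operand,
    "$gt": lambda actual, operand: actual > operand,
    "$gte": lambda actual, operand: actual >= operand,
    "$lt": lambda actual, operand: actual < operand,
    "$lte": lambda actual, operand: actual <= operand,
}


def matches(record: dict, predicate: dict) -> bool:
    """Return True if the record's metadata satisfies the where predicate."""
    for key, expected in predicate.items():
        if key == "$and":
            ok = all(matches(record, p) for p in expected)
        elif key == "$or":
            ok = any(matches(record, p) for p in expected)
        elif key not in record:
            ok = False  # assumption: a missing field never matches
        elif isinstance(expected, dict):
            for op in expected:
                if op not in OPERATORS:
                    raise ValueError(f"unknown operator: {op}")
            ok = all(OPERATORS[op](record[key], operand) for op, operand in expected.items())
        else:
            ok = record[key] == expected  # bare value means exact match
        if not ok:
            return False
    return True


records = [
    {"customer_id": "acme_corp_001", "status": "active", "seats": 120},
    {"customer_id": "acme_corp_001", "status": "expired", "seats": 80},
    {"customer_id": "globex_002", "status": "active", "seats": 300},
]
predicate = {
    "$and": [
        {"customer_id": "acme_corp_001"},
        {"status": {"$in": ["active", "pending"]}},
    ]
}
kept = [r for r in records if matches(r, predicate)]
```

Because the matcher has no scoring or ranking step, running it twice over the same records always yields the same kept list, which is the determinism guarantee in step 4.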

4.2.2 `as_of`

Optional
as_of : DateTimeValue

as_of constrains the query to knowledge that was valid at the specified point in time. This is a first-class temporal constraint, not a metadata filter.

Formal semantics:

Given as_of value t:

  1. The engine resolves the state of each scoped context as it existed at time t.
  2. Records created after t are excluded.
  3. Records that were superseded or deleted after t are included as they were at t.
  4. The engine MUST document the temporal model it uses (event time vs. ingest time).

Example:
{
  "ask": "What were the contract terms for Acme Corp?",
  "scope": ["ctx_contracts"],
  "where": { "customer_id": "acme_corp_001" },
  "as_of": "2025-12-31T23:59:59Z"
}
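Reading "the state of each scoped context as it existed at time t" as an interval test, a record is visible when it came into effect at or before t and had not yet been superseded or deleted at t. The sketch below illustrates that reading; the valid_from and valid_to record fields and the helper names are hypothetical, and a real engine would apply whichever temporal model (event time or ingest time) it documents.

```python
from datetime import datetime


def parse_dt(text: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing Z for UTC."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def visible_as_of(records: list, as_of: str) -> list:
    """Keep the record versions that were in effect at the as_of instant."""
    t = parse_dt(as_of)
    kept = []
    for record in records:
        started = parse_dt(record["valid_from"])
        ended = record.get("valid_to")  # None means still current
        if started <= t and (ended is None or parse_dt(ended) > t):
            kept.append(record)
    return kept


versions = [
    {"id": "v1", "valid_from": "2025-01-01T00:00:00Z", "valid_to": "2025-06-01T00:00:00Z"},
    {"id": "v2", "valid_from": "2025-06-01T00:00:00Z", "valid_to": "2026-02-01T00:00:00Z"},
    {"id": "v3", "valid_from": "2026-02-01T00:00:00Z", "valid_to": None},
]
snapshot = visible_as_of(versions, "2025-12-31T23:59:59Z")
```

With the as_of value from the example query, only v2 is visible: v1 had already been superseded and v3 did not exist yet.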

4.2.3 `since`

Optional
since : DateTimeValue

since filters knowledge to records created or updated after the specified point in time.

4.2.4 `window`

Optional
window : { "from": DateTimeValue, "to": DateTimeValue }

window filters knowledge to records valid within the specified time range. from and to are both inclusive.
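The boundary behavior of since and window is easy to get wrong, so a short sketch may help. It takes "after" in the since definition to mean strictly after (an interpretation, not something the text states outright) and treats both window bounds as inclusive as specified; the helper names are hypothetical.

```python
from datetime import datetime


def parse_dt(text: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing Z for UTC."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def passes_since(updated_at: str, since: str) -> bool:
    """True if the record was created or updated strictly after since."""
    return parse_dt(updated_at) > parse_dt(since)


def passes_window(valid_at: str, window: dict) -> bool:
    """True if the timestamp lies within [from, to], both ends inclusive."""
    return parse_dt(window["from"]) <= parse_dt(valid_at) <= parse_dt(window["to"])


q1 = {"from": "2025-01-01T00:00:00Z", "to": "2025-03-31T23:59:59Z"}
on_lower_bound = passes_window("2025-01-01T00:00:00Z", q1)
```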

---

4.3 Group C: Composition

Composition primitives combine knowledge from multiple sources before returning. They execute entirely within the engine before the response is formed.

4.3.2 `resolve`

Optional
resolve : ResolutionPolicy

ResolutionPolicy :
  { "precedence": [ ContextName ] }
  | { "strategy": "latest" | "highest_confidence" | "union" }

resolve declares how the engine should handle conflicting information about the same entity from different scoped contexts. Without resolve, the engine uses its default conflict strategy (implementation-defined).

Formal semantics:

Given resolve: { "precedence": [c1, c2, c3] }:

  1. When the same field is present in multiple contexts with different values, the engine uses the value from the highest-precedence context (earliest in the list).
  2. The ground section MUST record which context contributed each conflicted field and the resolution decision.

Given resolve: { "strategy": "latest" }:

  1. The engine uses the most recently updated value across all contexts.

Given resolve: { "strategy": "highest_confidence" }:

  1. The engine uses the value with the highest confidence score.

Given resolve: { "strategy": "union" }:

  1. The engine returns all values as a list with per-value provenance.
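The four resolution policies can be compared side by side with a small sketch. Everything about the candidate representation is an assumption made for illustration: each conflicting value is a dict with hypothetical context, value, updated_at and confidence keys, timestamps are UTC strings in one format so they compare lexically, and confidence is ranked high > medium > low > none (derived is left out because this specification does not order it relative to the others).

```python
RANK = {"none": 0, "low": 1, "medium": 2, "high": 3}


def resolve(candidates: list, policy: dict):
    """Pick a winner (or all values, for union) among conflicting candidates."""
    if "precedence" in policy:
        order = {ctx: i for i, ctx in enumerate(policy["precedence"])}
        # Contexts not named in the precedence list rank after all named ones.
        return min(candidates, key=lambda c: order.get(c["context"], len(order)))
    strategy = policy["strategy"]
    if strategy == "latest":
        return max(candidates, key=lambda c: c["updated_at"])
    if strategy == "highest_confidence":
        return max(candidates, key=lambda c: RANK[c["confidence"]])
    if strategy == "union":
        return list(candidates)  # each entry keeps its own provenance
    raise ValueError(f"unknown strategy: {strategy}")


candidates = [
    {"context": "ctx_contracts", "value": 10.0,
     "updated_at": "2025-03-01T00:00:00Z", "confidence": "medium"},
    {"context": "ctx_pricing_policy", "value": 12.5,
     "updated_at": "2025-09-01T00:00:00Z", "confidence": "high"},
]
winner = resolve(candidates, {"precedence": ["ctx_contracts", "ctx_pricing_policy"]})
```

With this input, precedence selects the ctx_contracts value even though the ctx_pricing_policy value is both newer and more confident, which is why the ground section has to record the resolution decision.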

---

4.4 Group D: Provenance

Provenance is the citable origin of a response value — the source records that were retrieved, the confidence with which the engine resolved each field, and the reasoning that connected sources to the final answer.

Provenance primitives ground the answer in citable knowledge.

4.4.1 `ground`

Optional
ground : GroundOptions

GroundOptions :
  {
    "per_field": Boolean,
    "min_confidence": "high" | "medium" | "low"
  }

ground requests field-level citations and confidence scores in the response. When ground is present, the response MUST include a ground section alongside data (see the Response section).

A confidence level is an assertion by the engine about the quality of evidence supporting a response value. It is one of high, medium, low, derived, or none.

Confidence levels:
| Level   | Meaning                                                                              |
|---------|--------------------------------------------------------------------------------------|
| high    | The answer is directly supported by retrieved source material with strong relevance. |
| medium  | The answer is supported by retrieved material but requires inference.                |
| low     | The answer is weakly supported; significant inference was required.                  |
| derived | The value was computed from other grounded fields, not retrieved directly.           |
| none    | The engine could not resolve the field. Value is null.                               |

per_field: When true, every field in the shape receives an individual confidence annotation and source list. When false, a single query-level confidence is returned.

min_confidence: When specified, any field resolved below this threshold is returned as null with the actual confidence in ground. This allows callers to enforce a quality floor declaratively.

Formal semantics:

Given ground: { "per_field": true } and shape S:

  1. For each field f in S, the engine MUST include in the ground response:
    • confidence: one of high, medium, low, derived, none
    • sources: a list of schema coordinates identifying the source records
  2. If min_confidence is set to level L, fields resolved below L MUST be set to null in data and annotated with the actual confidence in ground.

Example:
{
  "ask": "What is Acme Corp's annual contract value?",
  "scope": ["ctx_contracts"],
  "where": { "customer_id": "acme_corp_001" },
  "shape": { "acv_usd": "Float!" },
  "ground": { "per_field": true, "min_confidence": "medium" }
}
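The min_confidence floor can be pictured as a pass over data once per-field confidence is known. The sketch below is illustrative and rests on two assumptions: levels are ranked none < low < medium < high, and derived always passes the floor, because this specification does not place it in that ordering. The name apply_floor and the term_months field are hypothetical.

```python
RANK = {"none": 0, "low": 1, "medium": 2, "high": 3}


def apply_floor(data: dict, ground: dict, min_confidence: str) -> dict:
    """Null out every field whose confidence falls below the requested floor."""
    floor = RANK[min_confidence]
    result = {}
    for field, value in data.items():
        level = ground[field]["confidence"]
        # Levels without a rank (derived) are treated as meeting the floor.
        result[field] = value if RANK.get(level, floor) >= floor else None
    return result


data = {"acv_usd": 120000.0, "term_months": 12}
ground = {
    "acv_usd": {"confidence": "low", "sources": ["ctx_contracts/acme_corp_001"]},
    "term_months": {"confidence": "high", "sources": ["ctx_contracts/acme_corp_001"]},
}
floored = apply_floor(data, ground, "medium")
```

The ground annotations are left untouched, so the caller still sees that acv_usd was resolved at low confidence even though its value was withheld.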

4.4.2 `trace`

Optional
trace : TraceOptions

TraceOptions :
  {
    "enabled": Boolean,
    "handle": Boolean
  }

trace requests an introspectable reasoning trace. Unlike ground, trace captures the engine’s internal reasoning steps, not just the final provenance.

When handle: true, the response includes a trace_id opaque string that can be used to retrieve the full trace via a separate introspection call. This avoids bloating inline responses.

When handle: false, the trace is returned inline in the trace section of the response.

Trace content is implementation-defined. Services MUST document the trace format they return.

---

4.5 Group E: Control

Control primitives declare the resource envelope and execution mode.

4.5.1 `budget`

Optional
budget : BudgetOptions

BudgetOptions :
  {
    "max_tokens": Int,
    "max_latency_ms": Int,
    "depth": "shallow" | "standard" | "deep"
  }

budget declares the caller’s resource constraints. The engine MUST respect these as hard limits where possible and SHOULD communicate when a limit was binding in the response.

max_tokens: maximum number of tokens the engine may consume in synthesis.

max_latency_ms: maximum wall-clock time in milliseconds. If the engine cannot complete within this budget, it MUST return whatever it has with an appropriate warning in the response metadata.

depth: a declarative hint for retrieval depth:

  • shallow: prioritize speed; retrieve top-K most relevant records only.
  • standard: default behavior; balance completeness and latency.
  • deep: prioritize completeness; retrieve exhaustively within other budget constraints.

Formal semantics:

  1. budget constraints apply to the entire query execution.
  2. If max_tokens is exceeded, the engine MUST truncate synthesis and return a partial: true flag in the response metadata.
  3. depth is a hint, not a guarantee. The engine MAY honor it differently depending on implementation.

4.5.2 `await`

Optional
await : "sync" | "async" | "stream"

await declares the execution mode:

sync (default): The response is returned synchronously when execution completes.

async: The engine accepts the query and returns immediately with a job_id. The caller polls or is notified when results are ready.

stream: The engine returns a stream of partial results as they become available. The format of the stream is transport-dependent.

4.5.3 `apply`

Optional
apply : { DirectiveName: DirectiveValue }

A directive is a named, operator-defined override that modifies the engine’s default behavior for the duration of a query.

apply allows callers to pass directives at query time.

Common use cases:

  • Override the retrieval model
  • Override the synthesis model
  • Enable or disable specific retrieval strategies
  • Pass tenant-specific configuration

The set of supported directives is defined by the service schema. Unsupported directives MUST produce a validation warning (not an error) unless the service explicitly marks directives as required.

---

4.6 Meta Primitive: explain

Meta Primitive
explain : true

An execution plan is a description of the steps the engine would take to execute a query, including retrieval strategy, synthesis approach, and estimated resource costs, returned without actually executing the query.

When explain: true is present in a document, the engine MUST return an execution plan instead of executing the query. The plan describes what the engine would do, including:

  • The contexts it would search
  • The retrieval strategy for each context
  • The synthesis approach
  • Estimated token and latency costs
  • Any validation warnings

The explain response is returned in the plan field of the response object (see the Response section).

explain is essential for:

  • Debugging query behavior before execution
  • Evaluation pipeline dry-runs
  • Cost estimation
  • Automated agent planning loops

Example:
{
  "explain": true,
  "ask": "Does Acme qualify for a renewal discount?",
  "scope": ["ctx_contracts", "ctx_pricing_policy"],
  "where": { "customer_id": "acme_corp_001" }
}

Returns:

{
  "plan": {
    "steps": [
      {
        "type": "filter",
        "context": "ctx_contracts",
        "predicate": { "customer_id": "acme_corp_001" },
        "estimated_records": 1
      },
      {
        "type": "semantic_retrieval",
        "context": "ctx_pricing_policy",
        "query": "renewal discount qualification criteria",
        "strategy": "dense_vector",
        "estimated_records": 4
      },
      {
        "type": "synthesis",
        "model": "default",
        "estimated_tokens": 850
      }
    ],
    "estimated_total_tokens": 850,
    "estimated_latency_ms": 320
  }
}

5 Validation

A KnowQL document MUST be validated against the service’s schema before execution. Validation ensures that a query is well-formed and semantically meaningful within the target schema without executing it.

Implementations SHOULD report all validation errors in a single pass rather than failing on the first error encountered.

5.1 Document Validation

5.1.1 Valid JSON

A KnowQL document MUST be a valid JSON object. Documents that fail JSON parsing MUST produce a parse error before any validation occurs.

5.1.2 Known Clauses

Every clause key in the document MUST be a defined KnowQL primitive or a valid extension clause (prefixed with x-). Unknown clause names MUST produce a validation error.

Counter Example:
{
  "ask": "What are the contract terms?",
  "filter": { "customer_id": "acme_corp_001" }
}

filter is not a valid KnowQL clause. This document is invalid. The correct clause is where.
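A check for this rule can be as small as a set lookup. In the sketch below, KNOWN_CLAUSES lists the fifteen clauses from the Clause Reference table, and unknown_clauses is a hypothetical helper name; the top-level introspect key of an IntrospectionDocument is not covered and would need separate handling.

```python
KNOWN_CLAUSES = {
    "ask", "shape", "scope", "where", "ground", "budget", "explain", "as_of",
    "since", "window", "link", "resolve", "trace", "await", "apply",
}


def unknown_clauses(document: dict) -> list:
    """Return clause names that are neither defined primitives nor x- extensions."""
    return sorted(
        name for name in document
        if name not in KNOWN_CLAUSES and not name.startswith("x-")
    )


bad = unknown_clauses({
    "ask": "What are the contract terms?",
    "filter": {"customer_id": "acme_corp_001"},
})
```

Returning every offending name, rather than stopping at the first, fits the recommendation to report all validation errors in a single pass.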

5.1.3 Minimum Viable Query

A valid QueryDocument MUST contain at least one of ask or shape. A document containing neither is invalid.

Counter Example:
{
  "scope": ["ctx_contracts"],
  "where": { "customer_id": "acme_corp_001" }
}

This document is invalid: it has neither ask nor shape.

5.1.4Type Correctness of Clause Values

Each clause MUST receive a value of the correct type. The expected types are defined in the Primitives section. Providing a value of the wrong type MUST produce a validation error.

Counter Example:
{
  "ask": 42
}

ask requires a StringValue. 42 is a number. This document is invalid.

Counter Example:
{
  "ask": "What is the ACV?",
  "budget": 2000
}

budget requires an ObjectValue. 2000 is a number. This document is invalid.
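
The document-level rules in 5.1.2 through 5.1.4 can be sketched as a single-pass check that collects every error rather than stopping at the first. This is a non-normative Python illustration. The clause list and the expected types are assumptions drawn from the clause names used in this specification; the Primitives section remains authoritative, and introspection documents are not handled here.

```python
# Clause names as they appear in this specification (assumed complete).
KNOWN_CLAUSES = {
    "ask", "shape", "scope", "where", "as_of", "since", "window", "link",
    "resolve", "ground", "trace", "budget", "await", "apply", "explain",
}

# Expected value types for a few clauses only (assumption, not exhaustive).
CLAUSE_TYPES = {
    "ask": (str, "StringValue"),
    "scope": (list, "ListValue"),
    "where": (dict, "ObjectValue"),
    "shape": (dict, "ObjectValue"),
    "ground": (dict, "ObjectValue"),
    "budget": (dict, "ObjectValue"),
}


def validate_document(doc):
    """Collect all document-level validation errors in one pass."""
    errors = []
    for key, value in doc.items():
        # 5.1.2: every key is a known clause or an x- extension clause.
        if key not in KNOWN_CLAUSES and not key.startswith("x-"):
            errors.append(f"Unknown clause '{key}'.")
        # 5.1.4: the clause value has the expected type.
        if key in CLAUSE_TYPES:
            py_type, name = CLAUSE_TYPES[key]
            if not isinstance(value, py_type):
                errors.append(f"Clause '{key}' must be of type {name}.")
    # 5.1.3: at least one of ask or shape is present.
    if "ask" not in doc and "shape" not in doc:
        errors.append("Query must contain at least one of 'ask' or 'shape'.")
    return errors
```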

5.2Scope Validation

5.2.1Context Existence

Every context identifier in scope MUST exist in the service schema accessible to the caller.

Counter Example:
{
  "ask": "What are the contract terms?",
  "scope": ["ctx_contracts", "ctx_does_not_exist"]
}

ctx_does_not_exist is not a defined context. This document is invalid.

5.2.2Caller Access

If the service implements access control, every context in scope MUST be accessible to the caller. Attempting to access a restricted context MUST produce an authorization error (not a validation error).

5.3Where Validation

5.3.1Filterable Fields

Every field referenced in a where predicate MUST be declared in the schema with filterable: true.

Counter Example:
{
  "ask": "Summarize contract terms",
  "scope": ["ctx_contracts"],
  "where": { "full_text_content": "annual" }
}

If full_text_content is declared as filterable: false, this document is invalid.

5.3.2Predicate Type Compatibility

The value in each predicate MUST be compatible with the declared type of the field.

Counter Example:
{
  "where": { "customer_id": { "$gt": "acme" } }
}

If customer_id is of type ID, this predicate is invalid: the $gt operator is not valid for non-numeric types.

5.3.3Known Operators

Predicate operators MUST be drawn from the defined set: $eq, $ne, $in, $gt, $gte, $lt, $lte, $and, $or. Unknown operators MUST produce a validation error.
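
The filterable-field and known-operator rules can be sketched together. This is a non-normative Python illustration. It assumes `$and` and `$or` take a list of sub-predicates and that field definitions look like those returned by introspection; `validate_where` is a hypothetical name. Predicate type compatibility (5.3.2) is not checked here.

```python
OPERATORS = {"$eq", "$ne", "$in", "$gt", "$gte", "$lt", "$lte", "$and", "$or"}


def validate_where(predicate, fields):
    """Return validation errors for a where predicate.

    `fields` maps field name -> {"type": ..., "filterable": ...}.
    Fields missing from `fields` are treated as not filterable.
    """
    errors = []
    for key, value in predicate.items():
        if key in ("$and", "$or"):
            # Assumed form: a list of nested predicates.
            for sub in value:
                errors.extend(validate_where(sub, fields))
        elif key.startswith("$"):
            errors.append(f"Operator '{key}' is unknown or not valid here.")
        elif not fields.get(key, {}).get("filterable", False):
            errors.append(f"Field '{key}' is not filterable.")
        elif isinstance(value, dict):
            for op in value:
                if op not in OPERATORS:
                    errors.append(f"Unknown operator '{op}'.")
    return errors
```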

5.4Shape Validation

5.4.1Shape Types

Every type expression in a shape value MUST be a valid KnowQL type or a custom scalar defined in the schema.

Counter Example:
{
  "ask": "What is the discount?",
  "shape": { "discount_pct": "Percentage" }
}

If Percentage is not a defined type in the schema, this document is invalid.

5.4.2No Empty Shape

A shape value MUST contain at least one field.

Counter Example:
{
  "ask": "What is the discount?",
  "shape": {}
}

An empty shape object is invalid.
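
Both shape rules can be sketched as follows. This is a non-normative Python illustration. The built-in scalar set is an assumption (the Type System section is authoritative), only string type expressions for scalars and lists are checked, and inline object types are passed through unchecked.

```python
# Assumed built-in scalars; consult the Type System section for the real set.
BUILTIN_SCALARS = {"String", "Int", "Float", "Boolean", "ID", "Date"}


def is_valid_type_expr(expr, custom_scalars=frozenset()):
    """Return True for a known scalar or list type, optionally non-null."""
    expr = expr.strip()
    if expr.endswith("!"):
        expr = expr[:-1].strip()
    if expr.startswith("{"):
        return True  # inline object types are not checked in this sketch
    if expr.startswith("[") and expr.endswith("]"):
        return is_valid_type_expr(expr[1:-1], custom_scalars)
    return expr in BUILTIN_SCALARS or expr in custom_scalars


def validate_shape(shape, custom_scalars=frozenset()):
    """Return validation errors for a shape value (5.4.1 and 5.4.2)."""
    if not shape:
        return ["Shape must contain at least one field."]
    return [
        f"Unknown type '{type_expr}' for field '{name}'."
        for name, type_expr in shape.items()
        if isinstance(type_expr, str)
        and not is_valid_type_expr(type_expr, custom_scalars)
    ]
```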

5.5Ground Validation

5.5.1Ground Requires Shape

A document containing "ground": { "per_field": true } MUST also contain shape, since per-field grounding needs declared fields to annotate.

Counter Example:
{
  "ask": "What is the discount?",
  "ground": { "per_field": true }
}

Without shape, per_field: true cannot be meaningfully applied. This document is invalid.

5.5.2Valid Confidence Threshold

The min_confidence value in ground MUST be one of "high", "medium", or "low".

5.6Budget Validation

5.6.1Positive Budget Values

max_tokens and max_latency_ms MUST be positive integers when present.

Counter Example:
{
  "ask": "What is the discount?",
  "budget": { "max_tokens": -100 }
}

max_tokens is negative. Zero and negative budget values are both invalid.

5.6.2Valid Depth Values

depth MUST be one of "shallow", "standard", or "deep".
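
The two budget rules can be sketched as follows. This is a non-normative Python illustration; `validate_budget` is a hypothetical name. Note that Python treats booleans as integers, so the sketch rejects them explicitly.

```python
DEPTHS = ("shallow", "standard", "deep")


def validate_budget(budget):
    """Return validation errors for a budget object (5.6.1 and 5.6.2)."""
    errors = []
    for key in ("max_tokens", "max_latency_ms"):
        if key in budget:
            value = budget[key]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                errors.append(f"'{key}' must be a positive integer.")
    if "depth" in budget and budget["depth"] not in DEPTHS:
        errors.append("'depth' must be one of 'shallow', 'standard', or 'deep'.")
    return errors
```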

5.8Temporal Validation

5.8.1Valid DateTime Format

as_of, since, and the from / to fields of window MUST be valid ISO 8601 datetime strings.

5.8.2Window Order

In a window clause, from MUST be before or equal to to.

Counter Example:
{
  "window": { "from": "2026-12-31", "to": "2026-01-01" }
}

from is after to. This document is invalid.

5.8.3Temporal Clause Exclusivity

as_of, since, and window MUST NOT all be present in the same document. At most two temporal clauses may coexist: since and as_of, interpreted as a time range from since up through as_of. Using all three is a validation error.
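
The temporal rules can be sketched as follows. This is a non-normative Python illustration. It relies on `datetime.fromisoformat`, which accepts only a subset of ISO 8601 (the exact subset depends on the Python version), and it checks only the explicitly stated exclusivity rule that all three clauses may not appear together. Comparing an offset-aware value with a naive one is not handled.

```python
from datetime import datetime


def validate_temporal(doc):
    """Return validation errors for as_of, since, and window."""
    errors = []

    def parse(label, value):
        try:
            return datetime.fromisoformat(value)
        except (TypeError, ValueError):
            errors.append(f"'{label}' is not a valid ISO 8601 datetime.")
            return None

    for key in ("as_of", "since"):
        if key in doc:
            parse(key, doc[key])
    if "window" in doc:
        start = parse("window.from", doc["window"].get("from"))
        end = parse("window.to", doc["window"].get("to"))
        # 5.8.2: from must be before or equal to to.
        if start is not None and end is not None and start > end:
            errors.append("'window.from' must not be after 'window.to'.")
    # 5.8.3: the three temporal clauses may not all be present.
    if all(key in doc for key in ("as_of", "since", "window")):
        errors.append("'as_of', 'since', and 'window' must not all be present.")
    return errors
```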

5.9Explain Validation

5.9.1Explain Value

explain MUST be the boolean true when present. Any other value is a validation error.

6Execution

This section describes the normative execution semantics of a KnowQL query. A conforming implementation MUST produce results observably equivalent to the algorithms described here, though it is free to use any internal strategy that produces the same results.

6.1Executing a Request

A KnowQL request carries a document, optional caller identity, and optional variable bindings. The service executes the request and produces a response.

ExecuteRequest(schema, document, caller, variables):
  1. Validate document against schema. If validation fails, return a request
     error (see Response section).
  2. If document contains "explain": true, return ExecutePlan(schema, document).
  3. If document contains "introspect", return ExecuteIntrospection(schema,
     document).
  4. Return ExecuteQuery(schema, document, caller, variables).
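
The dispatch order of ExecuteRequest can be sketched as follows. This is a non-normative Python illustration. The `handlers` dict is a hypothetical stand-in for the validation, explain, introspection, and query algorithms; only the routing order is taken from the algorithm above.

```python
def execute_request(schema, document, caller, variables, handlers):
    """Route a request in the order given by ExecuteRequest.

    `handlers` maps "validate", "plan", "introspect", and "query" to
    callables supplied by the implementation (hypothetical interface).
    """
    errors = handlers["validate"](schema, document)
    if errors:
        # A request error: the response carries errors and no data field.
        return {"errors": errors}
    if document.get("explain") is True:
        return handlers["plan"](schema, document)
    if "introspect" in document:
        return handlers["introspect"](schema, document)
    return handlers["query"](schema, document, caller, variables)
```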

6.2Executing a Query

ExecuteQuery(schema, document, caller, variables):
  1. Let scope = ResolveScope(schema, document, caller).
  2. Let filtered = ApplyFilters(scope, document).
  3. Let linked = ApplyLinks(filtered, document).
  4. Let resolved = ApplyResolution(linked, document).
  5. Let budget = ResolveBudget(document).
  6. Let retrieved = Retrieve(resolved, document, budget).
  7. Let data = Synthesize(retrieved, document, budget).
  8. Let grounded = ApplyGround(data, retrieved, document).
  9. Return FormatResponse(data, grounded, document).

6.2.1ResolveScope

ResolveScope(schema, document, caller):
  1. If "scope" is present, let contextIds = document.scope.
  2. If "scope" is absent, let contextIds = all contexts accessible to caller
     in schema.
  3. For each id in contextIds:
     a. Assert id exists in schema. (Guaranteed by validation.)
     b. Assert caller has access to id.
  4. Return the set of context definitions for contextIds.

6.2.2ApplyFilters

ApplyFilters(contexts, document):
  1. If "where" is present:
     a. Let predicate = document.where.
     b. For each context in contexts, evaluate predicate against each
        record's filterable metadata and exclude records that do not
        satisfy it.
  2. If "as_of", "since", or "window" is present, let contexts =
     ApplyTemporalFilter(contexts, document).
  3. Return the filtered contexts.

Temporal filtering:

ApplyTemporalFilter(contexts, document):
  1. If "as_of" is present, include only records valid at document.as_of.
  2. If "since" is present, include only records created/updated after
     document.since.
  3. If "window" is present, include only records valid within the window.
  4. Return the temporally filtered contexts.
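
Predicate evaluation against a record's metadata can be sketched as follows. This is a non-normative Python illustration. It assumes that a bare value means equality, that multiple keys in one predicate object are combined with AND, that `$and` and `$or` take a list of sub-predicates, and that a record lacking the referenced field never matches.

```python
import operator

COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda actual, allowed: actual in allowed,
}


def matches(record, predicate):
    """Return True if the record's metadata satisfies the predicate."""
    for key, condition in predicate.items():
        if key == "$and":
            if not all(matches(record, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(record, sub) for sub in condition):
                return False
        elif key not in record:
            return False  # assumed: a missing field never matches
        elif isinstance(condition, dict):
            for op, argument in condition.items():
                if not COMPARATORS[op](record[key], argument):
                    return False
        elif record[key] != condition:  # bare value means equality
            return False
    return True
```

Because the outcome depends only on the record metadata and the predicate, a filter written this way is deterministic for a given dataset.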

6.2.4ApplyResolution

ApplyResolution(contexts, document):
  1. If "resolve" is absent, use the implementation's default conflict
     strategy.
  2. For each field F that appears in multiple contexts with different values:
     a. If resolve.precedence is set, use the value from the highest-
        precedence context.
     b. If resolve.strategy is "latest", use the most recently updated value.
     c. If resolve.strategy is "highest_confidence", use the value that
        retrieval assigns the highest relevance score to.
     d. If resolve.strategy is "union", collect all values as a list.
  3. Record every resolution decision for the "ground" output.
  4. Return the resolved record set.
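
The resolution strategies can be sketched for a single field as follows. This is a non-normative Python illustration. The candidate keys ("context", "value", "updated_at", "score"), the representation of `precedence` as a list ordered from highest to lowest, and the fallback to "latest" when `resolve` is absent are all assumptions; the default strategy is implementation-defined.

```python
def resolve_field(candidates, resolve=None):
    """Pick a value for one field from conflicting candidates."""
    # Assumed default; the specification leaves this to the implementation.
    resolve = resolve or {"strategy": "latest"}
    precedence = resolve.get("precedence")
    if precedence:
        ranked = [c for c in candidates if c["context"] in precedence]
        if ranked:
            # Earliest position in the list means highest precedence.
            best = min(ranked, key=lambda c: precedence.index(c["context"]))
            return best["value"]
    strategy = resolve.get("strategy")
    if strategy == "latest":
        # ISO 8601 strings in one consistent format sort chronologically.
        return max(candidates, key=lambda c: c["updated_at"])["value"]
    if strategy == "highest_confidence":
        return max(candidates, key=lambda c: c["score"])["value"]
    if strategy == "union":
        return [c["value"] for c in candidates]
    raise ValueError(f"Unsupported resolve strategy: {strategy!r}")
```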

6.2.5Retrieve

Retrieve(contexts, document, budget):
  1. Let ask = document.ask (if present).
  2. Let shape = document.shape (if present).
  3. Derive a retrieval query from ask and/or the field names in shape.
  4. For each context in contexts:
     a. Perform semantic retrieval using the retrieval query.
     b. Respect budget.depth hint (shallow/standard/deep).
     c. Collect retrieved records with relevance scores.
  5. If the retrieved records approach or exceed budget.max_tokens, truncate
     them, keeping the highest-relevance records.
  6. Return the retrieved record set with relevance scores.

Retrieval strategy is implementation-defined. Conforming implementations SHOULD use hybrid retrieval (combining dense vector search with sparse/keyword search) for standard and deep depth levels.
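
The truncation step can be sketched as a greedy selection by relevance. This is a non-normative Python illustration. The per-record "score" and "tokens" keys are hypothetical, and the sketch spends the whole budget on retrieved records, whereas a real engine would presumably reserve part of it for synthesis.

```python
def truncate_to_budget(records, max_tokens):
    """Keep the highest-relevance records whose combined size fits max_tokens."""
    kept, used = [], 0
    for record in sorted(records, key=lambda r: r["score"], reverse=True):
        if used + record["tokens"] > max_tokens:
            continue  # would overflow; a smaller, lower-ranked record may still fit
        kept.append(record)
        used += record["tokens"]
    return kept
```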

6.2.6Synthesize

Synthesize(retrieved, document, budget):
  1. Let ask = document.ask (if present).
  2. Let shape = document.shape (if present).
  3. Using the retrieved records as context:
     a. If shape is present, produce a JSON object conforming to shape.
     b. If ask is present without shape, produce a free-form response.
     c. If both are present, produce a shaped response that addresses ask.
  4. If shape is present, for each field f in shape:
     a. Attempt to extract or infer the value of f from retrieved records.
     b. If f cannot be resolved and f is non-null (!), raise a field error.
     c. If f cannot be resolved and f is nullable, set f to null.
  5. Track per-field confidence and source records for ground output.
  6. Return the synthesized data object.
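
Steps 4b and 4c can be sketched as follows. This is a non-normative Python illustration that handles only top-level fields with string type expressions. `finalize_fields` and its `resolved` argument (the values synthesis managed to produce) are hypothetical names.

```python
def finalize_fields(shape, resolved):
    """Fill unresolved fields with null and report non-null violations."""
    data, errors = {}, []
    for name, type_expr in shape.items():
        if resolved.get(name) is not None:
            data[name] = resolved[name]
            continue
        data[name] = None
        # A trailing "!" marks the field as non-null.
        if isinstance(type_expr, str) and type_expr.endswith("!"):
            errors.append({
                "message": f"Field '{name}' declared non-null but could not be resolved.",
                "type": "FIELD_ERROR",
                "path": [name],
            })
    return data, errors
```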

6.2.7ApplyGround

ApplyGround(data, retrieved, document):
  1. If "ground" is absent, return null (no ground section in response).
  2. Let options = document.ground.
  3. If options.per_field is true:
     a. For each field f in data:
        i. Let confidence = the confidence level assigned during synthesis.
        ii. Let sources = the schema coordinates of records that contributed.
        iii. Let groundEntry = { confidence, sources }.
        iv. If options.min_confidence is set and confidence < min_confidence:
             - Set data[f] = null.
             - Set groundEntry.suppressed = true.
        v. Record groundEntry for f.
     b. Return a map of field -> groundEntry.
  4. If options.per_field is false:
     a. Compute an aggregate confidence for the entire response.
     b. Return { confidence, sources } for the whole response.

6.3Executing an Explain Request

ExecutePlan(schema, document):
  1. Validate document against schema.
  2. Build an execution plan as if executing the query, but do not perform
     retrieval or synthesis.
  3. Estimate costs (tokens, latency) based on schema statistics.
  4. Return a "plan" object (see Response section).

6.4Error Handling

6.4.1Request Errors

A request error is an error that occurs before or during validation, preventing execution entirely. Examples:

  • Invalid JSON
  • Unknown clause
  • Unrecognized context in scope
  • Validation rule violation

Request errors MUST be returned in the errors array with no data field.

6.4.2Field Errors

A field error is an error that occurs during synthesis for a single field, which does not necessarily prevent the rest of the response from being returned. Examples:

  • Non-null field could not be resolved
  • Budget exceeded during field synthesis

Field errors MUST be returned in the errors array alongside a partial data object. The errored field MUST be set to null in data.

6.4.3Partial Responses

A partial response is a response returned when a budget constraint prevented full execution, containing the fields that were synthesized and null for those that were not.

When budget.max_tokens or budget.max_latency_ms is exceeded:

  1. The engine MUST set partial: true in the response metadata.
  2. Fields that were synthesized MUST be present in data.
  3. Fields not yet synthesized MUST be null with a field error.

6.5Determinism Guarantees

The following aspects of execution are REQUIRED to be deterministic for any given dataset and query:

  • The set of records included or excluded by where predicates
  • The set of records included or excluded by temporal filters
  • The structure of the data object (keys and types match shape)
  • The resolution decisions made by resolve with deterministic strategies (precedence, latest)

The following aspects are explicitly probabilistic:

  • The relevance scores assigned during semantic retrieval
  • The values synthesized from retrieved records
  • The confidence levels assigned during synthesis
  • The resolve: highest_confidence strategy (based on retrieval scores)

Probabilistic aspects MUST be surfaced via the ground section. They MUST NOT be silently presented as certain.

7Introspection

Introspection is the capability of a KnowQL service to expose its own schema (its available contexts, fields, links, and shape fragments) for runtime discovery by agents and tooling.

A conforming KnowQL service MUST expose its schema for runtime discovery, so that agents and tooling can inspect it before issuing queries.

7.1Introspection Documents

An introspection request is a document with an introspect clause:

IntrospectionDocument :
  { "introspect": IntrospectionTarget }

IntrospectionTarget :
  "__schema"
  | "__context"  ":" ContextName
  | "__field"    ":" FieldCoordinate
  | "__trace"    ":" TraceId

7.1.1Schema Introspection

{ "introspect": "__schema" }

Returns the full schema descriptor. This is the primary discovery mechanism for agents and tooling.

Response:
{
  "data": {
    "__schema": {
      "version": "2026-04",
      "name": "Nexus Knowledge Service",
      "contexts": [
        {
          "name": "ctx_contracts",
          "description": "Active and historical contracts.",
          "fields": {
            "customer_id": { "type": "ID!", "filterable": true, "returnable": true },
            "start_date": { "type": "Date", "filterable": true, "returnable": true },
            "value_usd": { "type": "Float", "filterable": false, "returnable": true }
          },
          "links": [
            {
              "from": "customer_id",
              "to": "ctx_usage.customer_id",
              "description": "Join contracts to usage metrics"
            }
          ]
        }
      ],
      "fragments": [],
      "knowql_version": "April2026"
    }
  }
}

7.1.2Context Introspection

{ "introspect": "__context:ctx_contracts" }

Returns the definition for a single context. Useful for agents that have already identified the relevant context and want field-level detail.

Response:
{
  "data": {
    "__context": {
      "name": "ctx_contracts",
      "description": "Active and historical contracts for all customers.",
      "fields": {
        "customer_id": {
          "type": "ID!",
          "description": "Unique customer identifier.",
          "filterable": true,
          "returnable": true
        },
        "start_date": {
          "type": "Date",
          "description": "Contract start date.",
          "filterable": true,
          "returnable": true
        },
        "value_usd": {
          "type": "Float",
          "description": "Annual contract value in USD.",
          "filterable": false,
          "returnable": true
        }
      },
      "links": [
        {
          "from": "customer_id",
          "to": "ctx_usage.customer_id",
          "description": "Links to usage metrics by customer"
        }
      ],
      "version": "2026-04"
    }
  }
}

7.1.3Field Introspection

{ "introspect": "__field:ctx_contracts.value_usd" }

Returns the definition for a single field. Uses schema coordinate syntax.

Response:
{
  "data": {
    "__field": {
      "coordinate": "ctx_contracts.value_usd",
      "type": "Float",
      "description": "Annual contract value in USD.",
      "filterable": false,
      "returnable": true,
      "context": "ctx_contracts"
    }
  }
}

7.1.4Trace Introspection

{ "introspect": "__trace:tr_abc123" }

Retrieves a previously issued reasoning trace by its handle ID. Only available when the original query used trace: { "handle": true }.

7.2Introspection Schema

The introspection system is itself typed. The schema introspection response conforms to the following type definitions:

__Schema :
  {
    version: String!
    name: String
    description: String
    contexts: [__Context!]!
    fragments: [__Fragment!]!
    knowql_version: String!
  }

__Context :
  {
    name: String!
    description: String
    fields: { FieldName: __Field }
    links: [__Link!]
    version: String
  }

__Field :
  {
    type: String!
    description: String
    filterable: Boolean!
    returnable: Boolean!
  }

__Link :
  {
    from: String!
    to: String!
    description: String
  }

__Fragment :
  {
    name: String!
    description: String
    shape: JSON!
  }

7.3First-Class Documentation

All schema elements — contexts, fields, links, fragments — SHOULD provide a description field. Descriptions are returned via introspection and MUST remain consistent with the service’s actual capabilities.

Descriptions SHOULD be written in Markdown (CommonMark). Tools that display descriptions SHOULD render Markdown.

Introspection is not just for machines. Human operators building agents, writing evals, or debugging production queries benefit from a clear, browsable description of the available knowledge.

7.4Stable Ordering

Introspection responses SHOULD return schema elements in a stable, consistent order. The RECOMMENDED order is the source order in which elements were defined in the schema. Stable ordering improves readability, diffs, and deterministic tooling behavior.

7.5Conformance Requirement

A conforming KnowQL implementation MUST support __schema introspection. Support for __context, __field, and __trace introspection is RECOMMENDED but not required for Core conformance.

8Response

A KnowQL response is a JSON object returned by the service after executing a request. The response format is defined here for all response types: query responses, explain responses, introspection responses, and error responses.

8.1Response Format

Every KnowQL response is a JSON object with the following possible top-level fields:

Response :
  {
    "data":   DataObject?,
    "ground": GroundObject?,
    "trace":  TraceObject?,
    "plan":   PlanObject?,
    "errors": [ ErrorObject ]?,
    "meta":   MetaObject?
  }

A response MUST contain at least one of data, errors, or plan.

A response MUST NOT contain both data (with a non-null value) and a request error in errors at the same time.

A response MAY contain both data and field errors in errors simultaneously (partial response).
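
The three envelope rules above can be sketched as a checker. This is a non-normative Python illustration. Which error types count as request errors is an assumption made for this sketch; the specification distinguishes request errors from field errors but does not map each ErrorType to one of them.

```python
# Assumed classification of ErrorType values that prevent execution entirely.
REQUEST_ERROR_TYPES = {
    "REQUEST_ERROR", "VALIDATION_ERROR", "AUTHORIZATION_ERROR", "CONTEXT_NOT_FOUND",
}


def check_envelope(response):
    """Return a list of violated envelope rules (empty if none)."""
    problems = []
    if not any(key in response for key in ("data", "errors", "plan")):
        problems.append("Response must contain at least one of data, errors, or plan.")
    has_data = response.get("data") is not None
    has_request_error = any(
        error.get("type") in REQUEST_ERROR_TYPES
        for error in response.get("errors", [])
    )
    if has_data and has_request_error:
        problems.append("Response must not combine non-null data with a request error.")
    return problems
```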

8.1.1`data`

The data field contains the result of a successfully executed query. Its structure conforms to the shape declared in the query, or is free-form if no shape was declared.

When shape is present:

  • The data object MUST contain exactly the fields declared in shape.
  • Fields that could not be resolved MUST be null (subject to non-null rules).
  • Additional fields MUST NOT be present in data.

When shape is absent:

  • The data object’s structure is implementation-defined.
  • The implementation SHOULD attempt to address the ask query in a structured way.

8.1.2`ground`

The ground field is present when the query included "ground": { ... }.

GroundObject :
  {
    FieldName: FieldGroundEntry
  }
  |
  {
    "confidence": ConfidenceLevel,
    "sources": [ SchemaCoordinate ]
  }

FieldGroundEntry :
  {
    "confidence": ConfidenceLevel,
    "sources": [ SchemaCoordinate ],
    "suppressed": Boolean?,
    "resolution": ResolutionEntry?
  }

ConfidenceLevel :
  "high" | "medium" | "low" | "derived" | "none"

ResolutionEntry :
  {
    "strategy": String,
    "candidates": [ { "source": SchemaCoordinate, "value": JSON } ]
  }

Example:
{
  "data": {
    "qualifies": true,
    "discount_pct": 12.5,
    "reason": null
  },
  "ground": {
    "qualifies": {
      "confidence": "high",
      "sources": ["ctx_contracts/record_001"]
    },
    "discount_pct": {
      "confidence": "medium",
      "sources": ["ctx_pricing_policy/tier-2"]
    },
    "reason": {
      "confidence": "none",
      "sources": [],
      "suppressed": false
    }
  }
}

8.1.3`trace`

The trace field is present when the query included a trace clause. With "trace": { "handle": false }, it contains the trace content inline.

When "trace": { "handle": true } was used, the trace field contains a trace_id string instead of inline trace content:

{
  "data": { ... },
  "trace": { "trace_id": "tr_abc123def456" }
}

8.1.4`plan`

The plan field is present when the query included "explain": true. No data field is present in an explain response.

PlanObject :
  {
    "steps": [ PlanStep ],
    "estimated_total_tokens": Int,
    "estimated_latency_ms": Int,
    "warnings": [ String ]?
  }

PlanStep :
  {
    "type": StepType,
    "context": ContextName?,
    "predicate": JSON?,
    "query": String?,
    "strategy": String?,
    "estimated_records": Int?,
    "model": String?,
    "estimated_tokens": Int?
  }

StepType :
  "filter" | "semantic_retrieval" | "link" | "resolve" | "synthesis" | "ground"

8.1.5`errors`

The errors field is an array of error objects. It is present when one or more errors occurred during validation or execution.

ErrorObject :
  {
    "message": String!,
    "type": ErrorType!,
    "path": [ String | Int ]?,
    "locations": [ { "key": String } ]?
  }

ErrorType :
  "REQUEST_ERROR"
  | "VALIDATION_ERROR"
  | "FIELD_ERROR"
  | "BUDGET_EXCEEDED"
  | "AUTHORIZATION_ERROR"
  | "CONTEXT_NOT_FOUND"

message: A human-readable description of the error intended for developers. REQUIRED.

type: A machine-readable error classification. REQUIRED.

path: When the error is associated with a specific field in the response, path is a list of keys and/or indices that identifies the location of the field in data. OPTIONAL.

locations: When the error is associated with a specific clause in the request document, locations identifies which clause. OPTIONAL.

Example — request error:
{
  "errors": [
    {
      "message": "Context 'ctx_does_not_exist' is not defined in the schema.",
      "type": "CONTEXT_NOT_FOUND",
      "locations": [{ "key": "scope" }]
    }
  ]
}
Example — field error alongside partial data:
{
  "data": {
    "qualifies": true,
    "discount_pct": null
  },
  "errors": [
    {
      "message": "Field 'discount_pct' declared non-null but could not be resolved.",
      "type": "FIELD_ERROR",
      "path": ["discount_pct"]
    }
  ]
}

8.1.6`meta`

The meta field carries execution metadata that does not belong in data or ground:

MetaObject :
  {
    "partial": Boolean?,
    "tokens_used": Int?,
    "latency_ms": Int?,
    "knowql_version": String?,
    "request_id": String?
  }

partial: true when the response is incomplete due to a budget constraint. Callers MUST NOT treat a partial response as authoritative.

tokens_used: The number of tokens consumed during synthesis.

latency_ms: The wall-clock time in milliseconds for the full execution.

knowql_version: The version of the KnowQL spec this response conforms to.

request_id: An opaque identifier for the request, useful for logging and support.

8.2Query Response

A successful query response with both data and ground:

{
  "data": {
    "qualifies": true,
    "discount_pct": 12.5,
    "applicable_rules": [
      { "rule_id": "R-42", "reason": "12-month renewal with >80% usage" }
    ],
    "conflicts": []
  },
  "ground": {
    "qualifies": {
      "confidence": "high",
      "sources": ["ctx_contracts/record_acme_001"]
    },
    "discount_pct": {
      "confidence": "medium",
      "sources": ["ctx_pricing_policy/tier-2-rules"]
    },
    "applicable_rules": {
      "confidence": "medium",
      "sources": [
        "ctx_pricing_policy/rule_R42",
        "ctx_contracts/usage_summary_acme"
      ]
    },
    "conflicts": {
      "confidence": "high",
      "sources": []
    }
  },
  "meta": {
    "tokens_used": 1240,
    "latency_ms": 310,
    "knowql_version": "April2026",
    "request_id": "req_xk9a2b3c4d"
  }
}

8.3Error Response (Request Error)

When a request error prevents execution entirely, no data field is present:

{
  "errors": [
    {
      "message": "Query must contain at least one of 'ask' or 'shape'.",
      "type": "VALIDATION_ERROR",
      "locations": []
    }
  ]
}

8.4Explain Response

{
  "plan": {
    "steps": [
      {
        "type": "filter",
        "context": "ctx_contracts",
        "predicate": { "customer_id": "acme_corp_001" },
        "estimated_records": 1
      },
      {
        "type": "semantic_retrieval",
        "context": "ctx_pricing_policy",
        "query": "renewal discount qualification",
        "strategy": "dense_vector",
        "estimated_records": 4
      },
      {
        "type": "synthesis",
        "model": "default",
        "estimated_tokens": 850
      }
    ],
    "estimated_total_tokens": 850,
    "estimated_latency_ms": 320
  },
  "meta": {
    "knowql_version": "April2026",
    "request_id": "req_explain_zz001"
  }
}

8.5Introspection Response

Introspection responses follow the same envelope format with data containing the introspection result (see the Introspection section for examples).

8.6Serialization

KnowQL responses MUST be serialized as JSON. No other serialization format is defined in this specification. Transport-level encoding (e.g., gzip) is outside the scope of this specification.

Numeric values in responses MUST NOT be serialized as Infinity, -Infinity, or NaN, which are not valid JSON numbers and are rejected by many parsers. These values MUST be represented as null with an appropriate error or confidence annotation.
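
The numeric rule can be sketched as follows. This is a non-normative Python illustration. Python's `json.dumps` emits NaN and Infinity tokens by default, so the sketch replaces non-finite floats with null first and passes `allow_nan=False` as a safety net. Attaching the accompanying error or confidence annotation is left out.

```python
import json
import math


def sanitize(value):
    """Recursively replace non-finite floats with None."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: sanitize(item) for key, item in value.items()}
    if isinstance(value, list):
        return [sanitize(item) for item in value]
    return value


def serialize_response(response):
    """Serialize a response as strict JSON."""
    # allow_nan=False makes json.dumps raise ValueError rather than
    # emit NaN or Infinity if any non-finite value slipped through.
    return json.dumps(sanitize(response), allow_nan=False)
```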

AAppendix: Conformance

A conforming implementation of KnowQL must fulfill all normative requirements described in this specification. Conformance requirements are described in this document via both descriptive assertions and key words with clearly defined meanings.

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative portions of this document are to be interpreted as described in IETF RFC 2119. These key words may appear in lowercase and still retain their meaning unless explicitly declared as non-normative.

A conforming implementation of KnowQL may provide additional functionality, but must not do so where explicitly disallowed or where it would otherwise result in non-conformance.

A.1Conformance Levels

A.1.1Core Conformance

A KnowQL Core conforming implementation MUST:

  1. Accept and execute QueryDocument values containing any combination of the core primitives: ask, shape, scope, where, ground, budget.
  2. Support the explain meta primitive.
  3. Support __schema introspection.
  4. Validate documents against the service schema before execution and return validation errors in the standard error format.
  5. Return responses in the format defined in the Response section.
  6. Implement the layered determinism guarantee: where predicates, shape structure, and confidence suppression via min_confidence MUST be deterministic for any given dataset.
  7. Return confidence annotations in the ground section when ground is requested.

A.1.2Extended Conformance

A KnowQL Extended conforming implementation additionally MUST:

  1. Support __context and __field introspection.
  2. Support shape non-null enforcement (field errors for unresolvable ! fields).
  3. Support the meta response object including tokens_used and latency_ms.
  4. Support the extended primitives: link, resolve, as_of, since, window, trace, await, apply as defined in the Primitives section.

A.2Conforming Algorithms

Algorithms in this specification are normative with respect to their observable results. Implementations may use any equivalent strategy. An implementation must not produce results that differ from the specified algorithms in any observable way.

A.3Non-Normative Portions

All contents of this document are normative except portions explicitly declared as non-normative.

Examples in this document are non-normative, and are presented to aid understanding. Examples are either introduced explicitly in prose (e.g., “for example”) or are set apart in labeled blocks.

Notes in this document are non-normative, and are presented to clarify intent, draw attention to edge cases, and answer common implementation questions.

BAppendix: Notation Conventions

This specification uses a number of notation conventions to describe technical concepts including document structure, grammar, algorithms, and data types.

B.1Grammar Notation

This specification describes the structure of KnowQL documents using a simplified JSON grammar notation. Non-terminal production rules use the following form:

NonTerminal :
  Component1 Component2
  | Alternative

In this notation:

  • NonTerminal names the construct being defined.
  • Components separated by newlines are alternatives.
  • Components on the same line are sequences.
  • ? following a component means it is optional.
  • [ ] denotes a list (JSON array).
  • { } denotes an object (JSON object).
  • | separates alternatives in a compact list.
  • Quoted strings (e.g., "ask") represent literal JSON string values.

B.2Algorithm Notation

Algorithms in this specification use pseudo-code with the following conventions:

  • Let x = expression — binds a name to a value.
  • Assert: condition — asserts that a condition must be true; if false, the algorithm has a bug.
  • If condition: ... — conditional branch.
  • For each x in collection: ... — iteration.
  • Return value — terminates the algorithm with a result.
  • Raise error — terminates the algorithm with an error condition.

Algorithms are normative with respect to their observable results. Equivalent implementations are conforming.

B.3RFC 2119 Key Words

Normative key words (MUST, SHOULD, MAY, etc.) are defined in IETF RFC 2119 and carry that meaning throughout this specification, as described in the Conformance appendix.

B.4Confidence Ordering

Confidence levels form a total order for the purposes of min_confidence comparison:

none < low < medium < high

derived is a special case: it represents a computed value and is treated as equivalent to medium for threshold comparisons unless otherwise specified.

B.5Schema Coordinate Syntax

Schema coordinates use dotted-path notation:

ContextName           → identifies a context
ContextName.FieldName → identifies a field within a context

In source references within ground, schema coordinates MAY include a record identifier appended with /:

ctx_contracts/record_acme_001

This is an informational extension and implementations may use any opaque string format for record identifiers within schema coordinates.
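
Splitting a coordinate into its parts can be sketched as follows. This is a non-normative Python illustration; `parse_coordinate` is a hypothetical name, and the record identifier is treated as an opaque string.

```python
def parse_coordinate(coordinate):
    """Split a schema coordinate into (context, field, record_id).

    Parts that are absent are returned as None.
    """
    # Split off the optional record identifier first, so that dots inside
    # an opaque record identifier are not mistaken for a field separator.
    path, _, record_id = coordinate.partition("/")
    context, _, field = path.partition(".")
    return context, field or None, record_id or None
```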

B.6Type Expression Syntax

Type expressions in shape values follow a shorthand notation aligned with JSON Schema and GraphQL:

Expression             Meaning
"String"               Nullable string
"String!"              Non-null string
["String"]             Nullable list of nullable strings
["String!"]            Nullable list of non-null strings
["String!"]!           Non-null list of non-null strings
{ "field": "Type" }    Nested object

In shape values, the type expression is provided as the value for the field key:

{
  "shape": {
    "name": "String!",
    "score": "Float",
    "tags": ["String"]
  }
}

§Index

  1. confidence level
  2. context
  3. directive
  4. document
  5. execution plan
  6. field definition
  7. field error
  8. Introspection
  9. KnowQL schema
  10. link definition
  11. non-null type
  12. object type
  13. partial response
  14. primitive
  15. Provenance
  16. request
  17. request error
  18. response
  19. scalar
  20. schema coordinate
  21. Semantic retrieval
  22. shape
  23. shape fragment
  24. Synthesis
      4. 7.1.4Trace Introspection
    2. 7.2Introspection Schema
    3. 7.3First-Class Documentation
    4. 7.4Stable Ordering
    5. 7.5Conformance Requirement
  8. 8Response
    1. 8.1Response Format
      1. 8.1.1`data`
      2. 8.1.2`ground`
      3. 8.1.3`trace`
      4. 8.1.4`plan`
      5. 8.1.5`errors`
      6. 8.1.6`meta`
    2. 8.2Query Response
    3. 8.3Error Response (Request Error)
    4. 8.4Explain Response
    5. 8.5Introspection Response
    6. 8.6Serialization
  9. AAppendix: Conformance
    1. A.1Conformance Levels
      1. A.1.1Core Conformance
      2. A.1.2Extended Conformance
    2. A.2Conforming Algorithms
    3. A.3Non-Normative Portions
  10. BAppendix: Notation Conventions
    1. B.1B.1 Grammar Notation
    2. B.2B.2 Algorithm Notation
    3. B.3B.3 RFC 2119 Key Words
    4. B.4B.4 Confidence Ordering
    5. B.5B.5 Schema Coordinate Syntax
    6. B.6B.6 Type Expression Syntax
  11. CAppendix: Copyright and Licensing
  12. §Index