AI Implementation Guide

This guide covers the core functions, optional library features, behavior choices, and test-suite workflow for implementing CCL (Categorical Configuration Language).

CCL is a minimal configuration language based on key-value pairs with recursive structure. The core insight: if a value contains = characters, it can be parsed as nested CCL. This recursive fixed-point parsing is what creates hierarchy from flat text.

Required functions:

  • parse - Convert text to flat key-value entries
  • build_hierarchy - Convert entries to nested structure

Everything else is optional. Typed access, filtering, and formatting are library conveniences.

Resource | URL
Documentation | https://ccl.tylerbutler.com
Test Suite | https://github.com/CatConfLang/ccl-test-data
TypeScript Implementation | https://github.com/CatConfLang/ccl-typescript
Gleam Implementation | https://github.com/tylerbutler/ccl_gleam
OCaml Implementation | https://github.com/chshersh/ccl
Rust (ccl-rs) | https://github.com/hon-gyu/ccl-rs
Rust (serde_ccl) | https://github.com/LechintanTudor/serde_ccl
Rust (sickle) | https://github.com/tylerbutler/santa/tree/main/crates/sickle

parse converts raw CCL text into a flat list of key-value entries.

Signature:

parse(text: string) -> List[Entry]

Entry type:

Entry {
  key: string    // Configuration key (empty string for list items)
  value: string  // Raw value (may contain nested CCL syntax)
}

Algorithm:

  1. Find the first = character in the input
  2. Everything before = is the key (trimmed of all whitespace including newlines)
  3. Construct the value string (see precise rules below)
  4. Repeat for remaining input

Value construction: The value string is assembled as follows:

  1. First line: Everything after = on the same line, with leading whitespace trimmed.
  2. Continuation lines: Each subsequent line with indentation > N (the baseline) is appended verbatim, preserving its leading whitespace.
  3. Joining: Lines are joined with \n separators.
  4. Final trim: Trailing whitespace is trimmed from the complete result.

Empty first line with continuations: When the text after = is empty (or whitespace-only) and continuation lines follow, the trimmed first line is empty, so the value begins with \n followed by the first continuation line.

For example, database =\n host = localhost produces value "\n host = localhost" — the empty first line becomes "", joined with \n to the continuation " host = localhost".

Single-line values are the trimmed text after =: key = value → "value".
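The four assembly steps can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: assemble_value is a hypothetical helper name, and it assumes the caller has already split off the text after = and collected the continuation lines.

```python
from typing import List


def assemble_value(first_line_rest: str, continuation_lines: List[str]) -> str:
    """Build an entry's value from the text after '=' and its continuation lines."""
    # Step 1: trim only the leading whitespace of the first line.
    parts = [first_line_rest.lstrip()]
    # Step 2: continuation lines are appended verbatim (indentation kept).
    parts.extend(continuation_lines)
    # Steps 3 and 4: join with "\n", then trim trailing whitespace once.
    return "\n".join(parts).rstrip()
```

With an empty first line the joined result starts with "\n", which matches the database example above.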

Example:

name = Alice
database =
 host = localhost
 port = 5432

Parses to:

[
  Entry {key: "name", value: "Alice"},
  Entry {key: "database", value: "\n host = localhost\n port = 5432"}
]

Key rules:

  • Split on first = only: a = b = c → key: a, value: b = c
  • Trim all whitespace from keys (including newlines): " key " → "key", "key \n" → "key"
  • Empty key = value → list item (key is empty string)
  • Comment entry /= text → key is /, value is text

Value rules (summary — see “Value construction” above for full algorithm):

  • First line: trim leading whitespace after =: "key =   value" → value is "value"
  • Continuation lines: preserved verbatim, including their leading whitespace
  • Final line: trim trailing whitespace: "key = value   " → value is "value"
  • Lines joined with \n; internal newlines and indentation preserved

Indentation handling:

The baseline indentation (N) determines which lines are continuations vs new entries. How N is determined depends on the toplevel_indent_strip vs toplevel_indent_preserve behavior:

  • toplevel_indent_strip (OCaml reference): Top-level parsing uses N=0; nested parsing uses first content line’s indent
  • toplevel_indent_preserve (simpler): Always use first content line’s indent for all contexts

For each subsequent line, count its leading whitespace:

  • If indentation > N → continuation line (append to value)
  • If indentation ≤ N → new entry (stop parsing current value)

Note: With toplevel_indent_preserve, you only need one parsing algorithm. With toplevel_indent_strip, you need context detection to distinguish top-level from nested parsing. See Continuation Lines for details.

Whitespace counting: Both spaces and tabs count as indentation whitespace; CCL counts characters, not visual columns. See Behavior Reference — Tab Handling for the related choice about how leading tabs on continuation lines are normalized.

See Parsing Algorithm for complete details.
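The parsing rules in this section can be sketched as a small line-based parser in Python. This is a hedged sketch, not the reference implementation. The names Entry (as a NamedTuple), indent_of, and the optional baseline parameter are illustrative choices. Two behaviors are assumptions made for the sketch: blank lines inside a value are treated as continuations, and trailing text with no = is dropped. Tab normalization and CRLF handling are not covered.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    key: str
    value: str


def indent_of(line: str) -> int:
    """Count leading spaces and tabs as characters, not visual columns."""
    return len(line) - len(line.lstrip(" \t"))


def parse(text: str, baseline: Optional[int] = None) -> List[Entry]:
    lines = text.split("\n")
    if baseline is None:
        # toplevel_indent_preserve style: N is the first content line's indent.
        content = [line for line in lines if line.strip()]
        baseline = indent_of(content[0]) if content else 0

    entries: List[Entry] = []
    i = 0
    while i < len(lines):
        # Everything before the first '=' is the key, even across lines.
        key_parts = []
        while i < len(lines) and "=" not in lines[i]:
            key_parts.append(lines[i])
            i += 1
        if i == len(lines):
            break  # Sketch assumption: trailing text with no '=' is dropped.
        before, _, after = lines[i].partition("=")
        key_parts.append(before)
        key = "\n".join(key_parts).strip()
        i += 1

        # First line: leading whitespace trimmed. Continuations: verbatim.
        value_lines = [after.lstrip()]
        while i < len(lines) and (
            not lines[i].strip() or indent_of(lines[i]) > baseline
        ):
            value_lines.append(lines[i])
            i += 1
        # Join with "\n" and trim trailing whitespace once at the end.
        entries.append(Entry(key, "\n".join(value_lines).rstrip()))
    return entries
```

Calling parse(text) derives N from the first content line, which corresponds to toplevel_indent_preserve as described above. A caller that wants the toplevel_indent_strip variant could pass baseline=0 for top-level input and omit it for nested values.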


build_hierarchy converts flat entries into a nested object structure via recursive parsing.

Signature:

build_hierarchy(entries: List[Entry]) -> CCL

Return type:

build_hierarchy always returns a map (object/dict), even when all entries have empty keys. The observable structure of a CCL value is:

  • String — terminal value (no = in content, fixed point reached)
  • Map — nested object (from recursive parsing of a value containing =)
  • List — array of values (from multiple entries sharing the same key)

Different languages encode this differently. The pseudocode uses:

CCL = Map[string, CCLValue]
CCLValue = string | CCL | List[CCLValue]

The OCaml reference uses a uniform recursive type Fix of t Map.Make(String).t where every value is a nested map (strings are represented as single-key maps). What matters is the observable output, not the internal type encoding.

Algorithm:

function build_hierarchy(entries):
    result = {}
    for entry in entries:
        if entry.key == "":
            # Empty key = list item (accumulate under "" key)
            accumulate_list(result, "", entry.value)
        else if contains_ccl_syntax(entry.value):
            # Value has '=' → parse recursively
            nested_entries = parse(entry.value)
            result[entry.key] = build_hierarchy(nested_entries)
        else:
            # Terminal value (fixed point reached)
            result[entry.key] = entry.value
    return result

function contains_ccl_syntax(value):
    return "=" in value

List accumulation for empty keys: When sibling entries share the empty key "" inside a parent value, their values are collected into a list. For example, input users =\n = alice\n = bob produces {"users": ["alice", "bob"]} — the bare entries become a flat list of strings under the parent key. When the bare entries themselves contain nested CCL (e.g., each = is followed by an indented name = ... block), their values are recursively built into objects, producing a list of objects. See Bare List Hierarchy Representation for the canonical shape.

Fixed-point termination: Recursion stops when values contain no = characters. Plain strings like "localhost" or "5432" have no structure to parse.

Example:

Input entries:

[
  Entry {key: "database", value: "\n host = localhost\n port = 5432"},
  Entry {key: "users", value: "\n = alice\n = bob"}
]

After recursive parsing:

{
  "database": {
    "host": "localhost",
    "port": "5432"
  },
  "users": ["alice", "bob"]
}

Note: users is a flat list of strings because each bare entry (= alice, = bob) has a string value. If the bare entries contained nested CCL, users would be a list of objects instead. See Bare List Hierarchy Representation.

Handling special cases:

  • Empty keys: Multiple entries with empty key "" accumulate into a list stored in the map under key ""
  • Duplicate keys: Merge values or convert to list (implementation choice)
  • Nested values: Any value containing = is parsed recursively

See Parsing Algorithm for the complete algorithm with examples.
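The recursion can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the canonical algorithm. The parse function here is a compact stand-in that skips lines without = and derives N from the first content line. The helper build_value is a hypothetical name. Surfacing a nested map that holds only the "" key as its list is an assumption made to reproduce the {"users": ["alice", "bob"]} shape shown above; Bare List Hierarchy Representation remains the authority on the canonical shape. Duplicate non-empty keys simply overwrite, which is one of several possible choices.

```python
from typing import Dict, List, Tuple, Union

Entry = Tuple[str, str]  # (key, value)
Value = Union[str, Dict[str, "Value"], List["Value"]]


def parse(text: str) -> List[Entry]:
    """Compact stand-in parser: baseline N is the first content line's indent."""
    lines = text.split("\n")
    content = [ln for ln in lines if ln.strip()]
    n = len(content[0]) - len(content[0].lstrip(" \t")) if content else 0
    entries: List[Entry] = []
    i = 0
    while i < len(lines):
        if "=" not in lines[i]:
            i += 1  # stand-in simplification: skip lines without '='
            continue
        key, _, rest = lines[i].partition("=")
        value_lines = [rest.lstrip()]
        i += 1
        while i < len(lines) and (
            not lines[i].strip()
            or len(lines[i]) - len(lines[i].lstrip(" \t")) > n
        ):
            value_lines.append(lines[i])
            i += 1
        entries.append((key.strip(), "\n".join(value_lines).rstrip()))
    return entries


def build_value(raw: str) -> Value:
    if "=" not in raw:
        return raw  # fixed point: nothing left to parse
    nested = build_hierarchy(parse(raw))
    # Assumption: a nested map holding only the "" key is surfaced as its list.
    if set(nested) == {""}:
        return nested[""]
    return nested


def build_hierarchy(entries: List[Entry]) -> Dict[str, Value]:
    result: Dict[str, Value] = {}
    for key, raw in entries:
        value = build_value(raw)
        if key == "":
            # Empty key: accumulate list items under the "" key.
            result.setdefault("", []).append(value)
        else:
            result[key] = value  # duplicates: last entry wins in this sketch
    return result
```

Each recursive call works on the text after an =, which is strictly shorter than its input, so the recursion terminates once no = remains.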


The typed access functions (get_string, get_int, get_bool, get_float, get_list) provide convenient, type-safe value extraction. All are optional library features.

get_string(ccl: CCL, ...path: string[]) -> string | Error

Navigate to path and return string value. Error if path not found or value is not a string.

Example: get_string(config, "database", "host") navigates to config["database"]["host"].
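Path navigation can be sketched in Python as follows, assuming the hierarchy is a nested dict. AccessError and navigate are hypothetical names introduced for illustration, and raising an exception stands in for the Error return in the signature.

```python
from typing import Any


class AccessError(Exception):
    """Raised where the guide's signatures return Error."""


def navigate(ccl: dict, *path: str) -> Any:
    """Walk one path segment at a time through nested maps."""
    node: Any = ccl
    for depth, segment in enumerate(path):
        if not isinstance(node, dict) or segment not in node:
            raise AccessError("path not found: " + "/".join(path[: depth + 1]))
        node = node[segment]
    return node


def get_string(ccl: dict, *path: str) -> str:
    value = navigate(ccl, *path)
    if not isinstance(value, str):
        raise AccessError("value at " + "/".join(path) + " is not a string")
    return value
```

The other typed getters can reuse navigate and differ only in how they convert the string they find.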

get_int(ccl: CCL, ...path: string[]) -> int | Error

Navigate to path, parse value as integer. Error if not a valid integer.

get_bool(ccl: CCL, ...path: string[]) -> bool | Error

Navigate to path, parse value as boolean.

Behavior choice:

  • boolean_strict: Only "true" and "false"
  • boolean_lenient: Also accepts "yes"/"no", "1"/"0"

get_float(ccl: CCL, ...path: string[]) -> float | Error

Navigate to path, parse value as floating-point number.

get_list(ccl: CCL, ...path: string[]) -> List[string] | Error

Navigate to path, return list of values.

Behavior choice:

  • list_coercion_enabled: Single value returns [value]
  • list_coercion_disabled: Error if not actually a list

Path navigation: Pass each path segment as a separate argument: get_string(config, "database", "host")

See Library Features for implementation details.
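The conversions and behavior choices above can be sketched as small Python helpers that operate on a value after path navigation. The names to_int, to_bool, and to_list are hypothetical, as are the lenient and coercion_enabled flags that stand in for the behavior choices. Matching boolean words exactly in lowercase and accepting only ASCII digits for integers are assumptions of this sketch.

```python
import re
from typing import List, Union

STRICT_BOOLEANS = {"true": True, "false": False}
LENIENT_BOOLEANS = {**STRICT_BOOLEANS, "yes": True, "no": False, "1": True, "0": False}


def to_int(value: str) -> int:
    # Stricter than int(): no surrounding whitespace, no underscores.
    if not re.fullmatch(r"[+-]?[0-9]+", value):
        raise ValueError(f"not a valid integer: {value!r}")
    return int(value)


def to_bool(value: str, lenient: bool = False) -> bool:
    # lenient=False models boolean_strict, lenient=True models boolean_lenient.
    table = LENIENT_BOOLEANS if lenient else STRICT_BOOLEANS
    if value not in table:
        raise ValueError(f"not a valid boolean: {value!r}")
    return table[value]


def to_list(value: Union[str, list], coercion_enabled: bool = False) -> List:
    if isinstance(value, list):
        return value
    if coercion_enabled and isinstance(value, str):
        return [value]  # list_coercion_enabled: single value becomes [value]
    raise ValueError("value is not a list")
```

Keeping the behavior choice as an explicit flag makes it straightforward to run the same helper under either setting when filtering the test suite.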


filter(entries: List[Entry], predicate: fn(Entry) -> bool) -> List[Entry]

Filter entries based on predicate. Common use: remove comments (entries where key starts with /).

compose(entries1: List[Entry], entries2: List[Entry]) -> List[Entry]

Concatenate entry lists. Concatenation is a monoid operation: entry lists form a monoid under compose, with the empty list as the identity.

See Library Features for details on entry processing.
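Both operations are short in Python when entries are modeled as (key, value) tuples, which is an assumption of this sketch. The name filter_entries is used instead of filter only to avoid shadowing the Python builtin, and is_comment is a hypothetical helper.

```python
from typing import Callable, List, Tuple

Entry = Tuple[str, str]  # (key, value)


def filter_entries(entries: List[Entry], predicate: Callable[[Entry], bool]) -> List[Entry]:
    """Keep the entries for which predicate returns True, in order."""
    return [entry for entry in entries if predicate(entry)]


def compose(entries1: List[Entry], entries2: List[Entry]) -> List[Entry]:
    """Concatenate two entry lists without mutating either."""
    return entries1 + entries2


def is_comment(entry: Entry) -> bool:
    """Comment entries are those whose key starts with '/'."""
    return entry[0].startswith("/")
```

Because compose is plain concatenation, composing with an empty list returns the same entries and grouping does not matter, which is the monoid property described above.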


print(ccl: CCL) -> string

Render a CCL value back to text. Implementations that support structure-preserving printing must retain enough source structure to preserve comments, ordering, and formatting.

canonical_format(ccl: CCL) -> string

Convert CCL object to standardized text format. Semantic-preserving: Normalizes formatting but preserves meaning.

Key difference:

  • print preserves original structure (comments, ordering, formatting)
  • canonical_format produces normalized output from the parsed model

See Library Features: Formatting for details.
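As a rough illustration of producing text from the parsed model, the following Python sketch renders a nested structure with one entry per line. It is a hypothetical normalization, not the format defined in Library Features: Formatting. It assumes two-space indentation by default, insertion order, single-line string values, and lists written as bare = item entries. The helper _emit and the pad parameter are illustrative names.

```python
from typing import Dict, List, Union

Value = Union[str, Dict[str, "Value"], List["Value"]]


def _emit(key: str, value: Value, pad: str, level: int, out: List[str]) -> None:
    prefix = pad * level
    head = f"{key} =" if key else "="
    if isinstance(value, str):
        # Terminal value: "key = value", or just "key =" when empty.
        out.append(f"{prefix}{head} {value}" if value else prefix + head)
    elif isinstance(value, dict):
        # Nested map: header line, then children one level deeper.
        out.append(prefix + head)
        for child_key, child in value.items():
            _emit(child_key, child, pad, level + 1, out)
    else:
        # List: one bare "= item" entry per element.
        if key:
            out.append(prefix + head)
            level += 1
        for item in value:
            _emit("", item, pad, level, out)


def canonical_format(ccl: Dict[str, Value], pad: str = "  ") -> str:
    out: List[str] = []
    for key, value in ccl.items():
        _emit(key, value, pad, 0, out)
    return "\n".join(out)
```

Passing pad="\t" would give tab indentation, which loosely mirrors the indent_spaces and indent_tabs choice.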


CCL implementations make choices about edge cases. Declare your choices and the test suite will filter appropriately.

Behavior Group | Options | Description
Continuation Baseline | toplevel_indent_strip / toplevel_indent_preserve | Top-level N=0 (reference) or N=first key’s indent (simpler)
Line Endings | crlf_preserve_literal / crlf_normalize_to_lf | Keep \r chars or normalize to LF
Boolean Parsing | boolean_strict / boolean_lenient | Only true/false or also yes/no
Tab Handling | continuation_tab_to_space / continuation_tab_preserve | Leading tabs on continuation lines: normalize to space (OCaml reference) or preserve verbatim
Delimiter | delimiter_first_equals / delimiter_prefer_spaced | Split on the first = or prefer spaced = when present
Indentation | indent_spaces / indent_tabs | Output formatting style
List Coercion | list_coercion_enabled / list_coercion_disabled | Single value as one-item list
Array Ordering | array_order_insertion / array_order_lexicographic | Preserve order or sort

See Behavior Reference for detailed documentation of each behavior.
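One way to keep declared choices honest is to check them against the table before running tests. The Python sketch below does that. The group keys in BEHAVIOR_GROUPS and the check_choices function are hypothetical names; the option names are taken from the table above. Treating the two options in a group as mutually exclusive is an assumption of this sketch.

```python
from typing import Dict, List, Set, Tuple

BEHAVIOR_GROUPS: Dict[str, Tuple[str, str]] = {
    "continuation_baseline": ("toplevel_indent_strip", "toplevel_indent_preserve"),
    "line_endings": ("crlf_preserve_literal", "crlf_normalize_to_lf"),
    "boolean_parsing": ("boolean_strict", "boolean_lenient"),
    "tab_handling": ("continuation_tab_to_space", "continuation_tab_preserve"),
    "delimiter": ("delimiter_first_equals", "delimiter_prefer_spaced"),
    "indentation": ("indent_spaces", "indent_tabs"),
    "list_coercion": ("list_coercion_enabled", "list_coercion_disabled"),
    "array_ordering": ("array_order_insertion", "array_order_lexicographic"),
}


def check_choices(chosen: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the choices look consistent."""
    problems: List[str] = []
    known = {name for options in BEHAVIOR_GROUPS.values() for name in options}
    for name in sorted(chosen - known):
        problems.append(f"unknown behavior: {name}")
    for group, options in BEHAVIOR_GROUPS.items():
        picked = sorted(chosen.intersection(options))
        if len(picked) > 1:
            problems.append(f"conflict in {group}: {', '.join(picked)}")
    return problems
```

A check like this catches misspelled behavior names early, before they silently cause tests to be included or skipped.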


The official test suite at https://github.com/CatConfLang/ccl-test-data provides comprehensive validation:

  • Hundreds of assertions across a growing test suite
  • JSON format in generated_tests/ directory
  • Capability-based filtering by function, behavior, and variant

Each test specifies one or more inputs:

{
  "name": "basic_key_value_pairs_parse",
  "validation": "parse",
  "inputs": ["name = Alice\nage = 42"],
  "expected": {
    "count": 2,
    "entries": [
      {"key": "name", "value": "Alice"},
      {"key": "age", "value": "42"}
    ]
  },
  "functions": ["parse"],
  "features": [],
  "behaviors": []
}

Filter by your implementation’s capabilities:

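// `tests` is the array of test objects loaded from the suite's JSON files.
// Example values; replace them with what your implementation supports.
const implementedFunctions = ["parse", "build_hierarchy"];
const myBehaviors = ["toplevel_indent_strip", "boolean_strict"];

const runnable = tests.filter(test =>
  // Only run tests for functions you've implemented
  test.functions.every(fn => implementedFunctions.includes(fn)) &&
  // Skip tests with conflicting behaviors
  !test.conflicts?.behaviors?.some(b => myBehaviors.includes(b))
);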

Before building a test runner, check if one exists for your language:

  • Go: Built-in test runner in ccl-test-data repository
  • Other languages: You may need to build a test loader

See Test Suite Guide for complete filtering examples and test format documentation.
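For languages without an existing runner, a loader can start small. The Python sketch below selects runnable tests and checks a single parse test of the shape shown above. The function names is_runnable and run_parse_test are hypothetical. It assumes the parser returns (key, value) pairs, handles only tests with exactly one input, and reads the optional conflicts.behaviors field the same way the filter example does. The Test Suite Guide is the authority on the full format.

```python
from typing import Callable, List, Set, Tuple


def is_runnable(test: dict, implemented: Set[str], behaviors: Set[str]) -> bool:
    """Mirror the capability filter: required functions present, no conflicts."""
    if not set(test.get("functions", [])) <= implemented:
        return False
    conflicts = (test.get("conflicts") or {}).get("behaviors", [])
    return not any(behavior in behaviors for behavior in conflicts)


def run_parse_test(test: dict, parse: Callable[[str], List[Tuple[str, str]]]) -> bool:
    """Return True when parse output matches the expected count and entries."""
    if test.get("validation") != "parse" or len(test["inputs"]) != 1:
        raise ValueError("this sketch handles single-input parse tests only")
    actual = [{"key": key, "value": value} for key, value in parse(test["inputs"][0])]
    expected = test["expected"]
    return len(actual) == expected["count"] and actual == expected["entries"]
```

Passing the parser in as an argument keeps the runner independent of any one implementation, so the same loop can be reused as more functions come online.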



Entry is the fundamental unit produced by parsing:

Entry {
  key: string    // Configuration key (empty string for list items)
  value: string  // Raw value (may contain nested CCL syntax)
}

CCL is the hierarchical structure produced by build_hierarchy. The top-level return is always a map:

CCL = Map[string, CCLValue]
CCLValue = string | CCL | List[CCLValue]

Where values can be:

  • String - Terminal value (no = in content)
  • CCL - Nested object (parsed from value containing =)
  • List - Array of values (from multiple entries with the same key, including empty-key "" list items)
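In Python, these types could be written with the standard typing module as shown below. This is one possible encoding, not a required one. The NamedTuple form of Entry and the kind_of helper are illustrative choices introduced for this sketch.

```python
from typing import Dict, List, NamedTuple, Union


class Entry(NamedTuple):
    key: str    # Configuration key (empty string for list items)
    value: str  # Raw value (may contain nested CCL syntax)


# Recursive aliases use string forward references.
CCLValue = Union[str, "CCL", List["CCLValue"]]
CCL = Dict[str, CCLValue]


def kind_of(value: CCLValue) -> str:
    """Classify a value as one of the three observable shapes."""
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "map"
    if isinstance(value, list):
        return "list"
    raise TypeError(f"not a CCL value: {type(value).__name__}")
```

A classifier like kind_of is handy in typed accessors, which need to reject a map or list where a string is expected.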

Start with core functions and add features incrementally:

  1. parse - Basic key-value parsing
  2. build_hierarchy - Recursive object construction
  3. get_string - Simple path navigation
  4. get_int, get_bool, get_float - Type conversions
  5. get_list - List extraction
  6. filter, compose - Entry processing
  7. print, canonical_format - Output formatting

The test suite supports this progression: filter tests by their functions array to run only the tests relevant to each stage.


REQUIRED: parse, build_hierarchy
TYPED ACCESS: get_string, get_int, get_bool, get_float, get_list
PROCESSING: filter, compose
FORMATTING: print, canonical_format
TERMINOLOGY: Always use snake_case
ALGORITHM: Recursive fixed-point parsing
TEST SUITE: github.com/CatConfLang/ccl-test-data
DOCUMENTATION: ccl.tylerbutler.com