AI Implementation Guide
This guide covers the core functions, optional library features, behavior choices, and test-suite workflow for implementing CCL (Categorical Configuration Language).
Quick Start
CCL is a minimal configuration language based on key-value pairs with recursive structure. The core insight: if a value contains = characters, it can be parsed as nested CCL. This recursive fixed-point parsing is what creates hierarchy from flat text.
Required functions:
- parse - Convert text to flat key-value entries
- build_hierarchy - Convert entries to nested structure
Everything else is optional. Typed access, filtering, and formatting are library conveniences.
Resources
| Resource | URL |
|---|---|
| Documentation | https://ccl.tylerbutler.com |
| Test Suite | https://github.com/CatConfLang/ccl-test-data |
| TypeScript Implementation | https://github.com/CatConfLang/ccl-typescript |
| Gleam Implementation | https://github.com/tylerbutler/ccl_gleam |
| OCaml Implementation | https://github.com/chshersh/ccl |
| Rust (ccl-rs) | https://github.com/hon-gyu/ccl-rs |
| Rust (serde_ccl) | https://github.com/LechintanTudor/serde_ccl |
| Rust (sickle) | https://github.com/tylerbutler/santa/tree/main/crates/sickle |
Core Functions (Required)
parse
Converts raw CCL text into a flat list of key-value entries.
Signature:

    parse(text: string) -> List[Entry]

Entry type:

    Entry {
      key: string    // Configuration key (empty string for list items)
      value: string  // Raw value (may contain nested CCL syntax)
    }

Algorithm:
- Find the first = character in the input
- Everything before = is the key (trimmed of all whitespace including newlines)
- Construct the value string (see precise rules below)
- Repeat for remaining input
Value Construction
The value string is assembled as follows:
- First line: Everything after = on the same line, with leading whitespace trimmed.
- Continuation lines: Each subsequent line with indentation > N (the baseline) is appended verbatim, preserving its leading whitespace.
- Joining: Lines are joined with \n separators.
- Final trim: Trailing whitespace is trimmed from the complete result.
Empty first line with continuations: When the text after = is empty (or whitespace-only) and continuation lines follow, the trimmed first line is empty, so the value begins with \n followed by the first continuation line.
For example, database =\n host = localhost produces value "\n host = localhost" — the empty first line becomes "", joined with \n to the continuation " host = localhost".
Single-line values are the trimmed text after =: key = value → "value".
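As a rough illustration of the assembly rules above, the helper below builds a value from the text after = and a list of continuation lines that has already been collected. The name build_value and its parameters are hypothetical and not part of the CCL API; deciding which lines count as continuations is left to the caller.

```python
def build_value(after_equals: str, continuation_lines: list[str]) -> str:
    """Assemble a CCL value string (hypothetical helper, for illustration only)."""
    # First line: trim leading whitespace only.
    first_line = after_equals.lstrip()
    # Continuation lines are appended verbatim, joined with newline separators.
    joined = "\n".join([first_line, *continuation_lines])
    # Final trim: strip trailing whitespace from the complete result.
    return joined.rstrip()


# A single-line value and a value whose first line is empty.
single = build_value(" value  ", [])
nested = build_value("", ["  host = localhost", "  port = 5432"])
```

Because the empty first line still takes part in the join, the nested value starts with a newline, which is the case described above for an empty first line with continuations.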
Example:

    name = Alice
    database =
     host = localhost
     port = 5432

Parses to:

    [
      Entry {key: "name", value: "Alice"},
      Entry {key: "database", value: "\n host = localhost\n port = 5432"}
    ]

Key rules:
- Split on first = only: a = b = c → key: a, value: b = c
- Trim all whitespace from keys (including newlines): " key " → "key", "key \n" → "key"
- Empty key = value → list item (key is empty string)
- Comment entry /= text → key is /, value is text
Value rules (summary — see “Value construction” above for full algorithm):
- First line: trim leading whitespace after =: key = value → value is "value"
- Continuation lines: preserved verbatim, including their leading whitespace
- Final line: trim trailing whitespace: key = value → value is "value"
- Lines joined with \n; internal newlines and indentation preserved
Indentation handling:
The baseline indentation (N) determines which lines are continuations vs new entries. How N is determined depends on the toplevel_indent_strip vs toplevel_indent_preserve behavior:
- toplevel_indent_strip (OCaml reference): Top-level parsing uses N=0; nested parsing uses first content line’s indent
- toplevel_indent_preserve (simpler): Always use first content line’s indent for all contexts
For each subsequent line, count its leading whitespace:
- If indentation > N → continuation line (append to value)
- If indentation ≤ N → new entry (stop parsing current value)
Note: With toplevel_indent_preserve, you only need one parsing algorithm. With toplevel_indent_strip, you need context detection to distinguish top-level from nested parsing. See Continuation Lines for details.
Whitespace counting: Both spaces and tabs count as indentation whitespace; CCL counts characters, not visual columns. See Behavior Reference — Tab Handling for the related choice about how leading tabs on continuation lines are normalized.
See Parsing Algorithm for complete details.
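The rules in this section can be sketched as a small line-based parser. This is a minimal sketch, not the reference implementation. It assumes toplevel_indent_preserve (N is always the first content line's indent), skips blank lines between entries, and silently ignores trailing text that contains no =. The helper indent_of and the dataclass layout are choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    value: str


def indent_of(line: str) -> int:
    """Count leading spaces and tabs as characters, not visual columns."""
    return len(line) - len(line.lstrip(" \t"))


def parse(text: str) -> list[Entry]:
    lines = text.split("\n")
    entries: list[Entry] = []
    baseline = None  # N: assumed to be the first content line's indent
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip blank lines between entries
            continue
        if baseline is None:
            baseline = indent_of(lines[i])
        # Gather key text up to the first '=' (a key may span lines).
        key_parts = []
        while i < len(lines) and "=" not in lines[i]:
            key_parts.append(lines[i])
            i += 1
        if i == len(lines):
            break  # assumption: trailing text without '=' is ignored
        before, _, after = lines[i].partition("=")  # split on first '=' only
        key_parts.append(before)
        key = "\n".join(key_parts).strip()
        # First line of the value: leading whitespace trimmed.
        value_lines = [after.lstrip()]
        i += 1
        # Lines indented deeper than N are continuations, kept verbatim.
        while i < len(lines) and indent_of(lines[i]) > baseline:
            value_lines.append(lines[i])
            i += 1
        entries.append(Entry(key, "\n".join(value_lines).rstrip()))
    return entries


entries = parse("name = Alice\ndatabase =\n  host = localhost\n  port = 5432")
```

Supporting toplevel_indent_strip would mean passing N=0 for the top-level call and keeping the first-content-line rule for nested calls, which this sketch does not attempt.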
build_hierarchy
Converts flat entries into a nested object structure via recursive parsing.
Signature:

    build_hierarchy(entries: List[Entry]) -> CCL

Return type:
build_hierarchy always returns a map (object/dict), even when all entries have empty keys. The observable structure of a CCL value is:
- String - terminal value (no = in content, fixed point reached)
- Map - nested object (from recursive parsing of a value containing =)
- List - array of values (from multiple entries sharing the same key)
Different languages encode this differently. The pseudocode uses:

    CCL = Map[string, CCLValue]
    CCLValue = string | CCL | List[CCLValue]

The OCaml reference uses a uniform recursive type Fix of t Map.Make(String).t where every value is a nested map (strings are represented as single-key maps). What matters is the observable output, not the internal type encoding.
Algorithm:

    function build_hierarchy(entries):
        result = {}
        for entry in entries:
            if entry.key == "":
                # Empty key = list item (accumulate under "" key)
                accumulate_list(result, "", entry.value)
            else if contains_ccl_syntax(entry.value):
                # Value has '=' → parse recursively
                nested_entries = parse(entry.value)
                result[entry.key] = build_hierarchy(nested_entries)
            else:
                # Terminal value (fixed point reached)
                result[entry.key] = entry.value
        return result

    function contains_ccl_syntax(value):
        return "=" in value

List accumulation for empty keys: When sibling entries share the empty key "" inside a parent value, their values are collected into a list. For example, input users =\n = alice\n = bob produces {"users": ["alice", "bob"]}: the bare entries become a flat list of strings under the parent key. When the bare entries themselves contain nested CCL (e.g., each = is followed by an indented name = ... block), their values are recursively built into objects, producing a list of objects. See Bare List Hierarchy Representation for the canonical shape.
Fixed-point termination: Recursion stops when values contain no = characters. Plain strings like "localhost" or "5432" have no structure to parse.
Example:
Input entries:

    [
      Entry {key: "database", value: "\n host = localhost\n port = 5432"},
      Entry {key: "users", value: "\n = alice\n = bob"}
    ]

After recursive parsing:

    {
      "database": {
        "host": "localhost",
        "port": "5432"
      },
      "users": ["alice", "bob"]
    }

Note: users is a flat list of strings because each bare entry (= alice, = bob) has a string value. If the bare entries contained nested CCL, users would be a list of objects instead. See Bare List Hierarchy Representation.
Handling special cases:
- Empty keys: Multiple entries with empty key "" accumulate into a list stored in the map under key ""
- Duplicate keys: Merge values or convert to list (implementation choice)
- Nested values: Any value containing = is parsed recursively
See Parsing Algorithm for the complete algorithm with examples.
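A runnable sketch of the recursion follows. It is not the reference implementation: the compact parse stand-in (tuples, single-line keys only) is included only so the block runs on its own, resolve_value is a hypothetical helper, repeated non-empty keys simply overwrite, and surfacing a nested map that holds only the "" key as its list is an assumption made here to reproduce the {"users": ["alice", "bob"]} shape in the example.

```python
def indent_of(line):
    """Count leading spaces and tabs as characters."""
    return len(line) - len(line.lstrip(" \t"))


def parse(text):
    """Compact stand-in for parse: (key, value) pairs, single-line keys only."""
    lines = text.split("\n")
    entries, i, baseline = [], 0, None
    while i < len(lines):
        if "=" not in lines[i]:
            i += 1  # skip blank (and '='-less) lines in this sketch
            continue
        if baseline is None:
            baseline = indent_of(lines[i])  # N = first content line's indent
        key, _, rest = lines[i].partition("=")
        value_lines = [rest.lstrip()]
        i += 1
        while i < len(lines) and indent_of(lines[i]) > baseline:
            value_lines.append(lines[i])
            i += 1
        entries.append((key.strip(), "\n".join(value_lines).rstrip()))
    return entries


def resolve_value(value):
    """Return the string at the fixed point, otherwise recurse."""
    if "=" not in value:
        return value  # terminal value: nothing left to parse
    nested = build_hierarchy(parse(value))
    # ASSUMPTION: a nested map holding only the "" key is surfaced as its list.
    if set(nested) == {""}:
        return nested[""]
    return nested


def build_hierarchy(entries):
    result = {}
    for key, value in entries:
        node = resolve_value(value)
        if key == "":
            result.setdefault("", []).append(node)  # bare list item
        else:
            # Simplification: a repeated key overwrites the earlier value.
            result[key] = node
    return result


text = "database =\n  host = localhost\n  port = 5432\nusers =\n  = alice\n  = bob"
config = build_hierarchy(parse(text))
```

The top-level call still returns a map when every entry has an empty key; only nested values are unwrapped. A fuller implementation would replace the overwrite with one of the duplicate-key strategies listed above.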
Typed Access Functions (Optional)
These provide convenient, type-safe value extraction. All are optional library features.
get_string

    get_string(ccl: CCL, ...path: string[]) -> string | Error

Navigate to path and return string value. Error if path not found or value is not a string.
Example: get_string(config, "database", "host") navigates to config["database"]["host"].
get_int

    get_int(ccl: CCL, ...path: string[]) -> int | Error

Navigate to path, parse value as integer. Error if not a valid integer.
get_bool

    get_bool(ccl: CCL, ...path: string[]) -> bool | Error

Navigate to path, parse value as boolean.
Behavior choice:
- boolean_strict: Only "true" and "false"
- boolean_lenient: Also accepts "yes"/"no", "1"/"0"
get_float

    get_float(ccl: CCL, ...path: string[]) -> float | Error

Navigate to path, parse value as floating-point number.
get_list

    get_list(ccl: CCL, ...path: string[]) -> List[string] | Error

Navigate to path, return list of values.
Behavior choice:
- list_coercion_enabled: Single value returns [value]
- list_coercion_disabled: Error if not actually a list
Path navigation: Pass each path segment as a separate argument: get_string(config, "database", "host")
See Library Features for implementation details.
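One way the typed accessors might look over a plain dict is sketched below. Several details are assumptions rather than part of the guide: errors are raised as a hypothetical CCLAccessError exception instead of being returned, the two behavior choices are modelled as keyword flags (lenient, coerce), and get_int accepts only an optional minus sign followed by ASCII digits. get_float would follow the same pattern as get_int.

```python
import re


class CCLAccessError(Exception):
    """Stands in for the Error result in the signatures above (hypothetical)."""


def navigate(ccl, *path):
    """Walk one path segment at a time through nested maps."""
    node = ccl
    for segment in path:
        if not isinstance(node, dict) or segment not in node:
            raise CCLAccessError(f"path not found: {'.'.join(path)}")
        node = node[segment]
    return node


def get_string(ccl, *path):
    value = navigate(ccl, *path)
    if not isinstance(value, str):
        raise CCLAccessError("value is not a string")
    return value


def get_int(ccl, *path):
    raw = get_string(ccl, *path)
    if not re.fullmatch(r"-?[0-9]+", raw):  # assumed integer syntax
        raise CCLAccessError(f"not an integer: {raw!r}")
    return int(raw)


STRICT_BOOLS = {"true": True, "false": False}
LENIENT_BOOLS = {**STRICT_BOOLS, "yes": True, "no": False, "1": True, "0": False}


def get_bool(ccl, *path, lenient=False):
    """lenient=False models boolean_strict; lenient=True models boolean_lenient."""
    raw = get_string(ccl, *path)
    table = LENIENT_BOOLS if lenient else STRICT_BOOLS
    if raw not in table:
        raise CCLAccessError(f"not a boolean: {raw!r}")
    return table[raw]


def get_list(ccl, *path, coerce=False):
    """coerce=True models list_coercion_enabled."""
    value = navigate(ccl, *path)
    if isinstance(value, list):
        return value
    if coerce and isinstance(value, str):
        return [value]
    raise CCLAccessError("value is not a list")


config = {
    "database": {"host": "localhost", "port": "5432", "ssl": "yes"},
    "users": ["alice", "bob"],
}
port = get_int(config, "database", "port")
ssl = get_bool(config, "database", "ssl", lenient=True)
```

In a real library the behavior choices would more likely be fixed once per implementation than passed on every call; the flags are used here only to show both options side by side.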
Processing Functions (Optional)
filter

    filter(entries: List[Entry], predicate: fn(Entry) -> bool) -> List[Entry]

Filter entries based on predicate. Common use: remove comments (entries where key starts with /).
compose

    compose(entries1: List[Entry], entries2: List[Entry]) -> List[Entry]

Concatenate entry lists. This is a monoid operation - entries form a monoid under composition with empty list as identity.
See Library Features for details on entry processing.
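Both processing functions are small enough to sketch directly. In this example filter is named filter_entries to avoid shadowing Python's built-in, and is_comment is a hypothetical predicate for the comment-removal use case mentioned above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Entry:
    key: str
    value: str


def filter_entries(
    entries: list[Entry], predicate: Callable[[Entry], bool]
) -> list[Entry]:
    """Keep only the entries for which the predicate returns True."""
    return [entry for entry in entries if predicate(entry)]


def compose(entries1: list[Entry], entries2: list[Entry]) -> list[Entry]:
    """Concatenate two entry lists; the empty list is the identity."""
    return [*entries1, *entries2]


def is_comment(entry: Entry) -> bool:
    """Comment entries have keys starting with '/'."""
    return entry.key.startswith("/")


base = [Entry("/", "defaults"), Entry("host", "localhost")]
override = [Entry("host", "example.com")]
merged = compose(base, override)
without_comments = filter_entries(merged, lambda e: not is_comment(e))
```

Composition works on flat entries, so how the two host entries combine is decided later by the duplicate-key handling in build_hierarchy.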
Formatting Functions (Optional)
print

    print(ccl: CCL) -> string

Render a CCL value back to text. Implementations that support structure-preserving printing must retain enough source structure to preserve comments, ordering, and formatting.
canonical_format

    canonical_format(ccl: CCL) -> string

Convert CCL object to standardized text format. Semantic-preserving: Normalizes formatting but preserves meaning.
Key difference:
- print preserves original structure (comments, ordering, formatting)
- canonical_format produces normalized output from the parsed model
See Library Features: Formatting for details.
Implementation Behaviors
CCL implementations make choices about edge cases. Declare your choices and the test suite will filter appropriately.
| Behavior Group | Options | Description |
|---|---|---|
| Continuation Baseline | toplevel_indent_strip / toplevel_indent_preserve | Top-level N=0 (reference) or N=first key’s indent (simpler) |
| Line Endings | crlf_preserve_literal / crlf_normalize_to_lf | Keep \r chars or normalize to LF |
| Boolean Parsing | boolean_strict / boolean_lenient | Only true/false, or also yes/no and 1/0 |
| Tab Handling | continuation_tab_to_space / continuation_tab_preserve | Leading tabs on continuation lines: normalize to space (OCaml reference) or preserve verbatim |
| Delimiter | delimiter_first_equals / delimiter_prefer_spaced | Split on the first = or prefer spaced = when present |
| Indentation | indent_spaces / indent_tabs | Output formatting style |
| List Coercion | list_coercion_enabled / list_coercion_disabled | Single value as one-item list |
| Array Ordering | array_order_insertion / array_order_lexicographic | Preserve order or sort |
See Behavior Reference for detailed documentation of each behavior.
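A declared behavior set can be sanity-checked before it is used for test filtering. The sketch below assumes that the two options in each group are mutually exclusive and that an implementation may leave a group undeclared; the guide does not state either rule explicitly. The group keys and the function check_behaviors are hypothetical names; the option names come from the table above.

```python
BEHAVIOR_GROUPS = {
    "continuation_baseline": {"toplevel_indent_strip", "toplevel_indent_preserve"},
    "line_endings": {"crlf_preserve_literal", "crlf_normalize_to_lf"},
    "boolean_parsing": {"boolean_strict", "boolean_lenient"},
    "tab_handling": {"continuation_tab_to_space", "continuation_tab_preserve"},
    "delimiter": {"delimiter_first_equals", "delimiter_prefer_spaced"},
    "indentation": {"indent_spaces", "indent_tabs"},
    "list_coercion": {"list_coercion_enabled", "list_coercion_disabled"},
    "array_ordering": {"array_order_insertion", "array_order_lexicographic"},
}


def check_behaviors(chosen: set[str]) -> list[str]:
    """Report unknown names and groups with more than one option chosen."""
    problems = []
    known = set().union(*BEHAVIOR_GROUPS.values())
    for name in sorted(chosen - known):
        problems.append(f"unknown behavior: {name}")
    for group, options in BEHAVIOR_GROUPS.items():
        picked = sorted(chosen & options)
        if len(picked) > 1:  # assumed: options in a group are exclusive
            problems.append(f"{group}: conflicting choices {picked}")
    return problems


my_behaviors = {"toplevel_indent_preserve", "boolean_strict", "crlf_normalize_to_lf"}
problems = check_behaviors(my_behaviors)
```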
Testing Your Implementation
Test Suite
The official test suite at https://github.com/CatConfLang/ccl-test-data provides comprehensive validation:
- Hundreds of assertions across a growing test suite
- JSON format in generated_tests/ directory
- Capability-based filtering by function, behavior, and variant
Test Format
Each test specifies one or more inputs:

    {
      "name": "basic_key_value_pairs_parse",
      "validation": "parse",
      "inputs": ["name = Alice\nage = 42"],
      "expected": {
        "count": 2,
        "entries": [
          {"key": "name", "value": "Alice"},
          {"key": "age", "value": "42"}
        ]
      },
      "functions": ["parse"],
      "features": [],
      "behaviors": []
    }

Filtering Tests
Filter by your implementation’s capabilities:

    const runnable = tests.filter(test =>
      // Only run tests for functions you've implemented
      test.functions.every(fn => implementedFunctions.includes(fn)) &&
      // Skip tests with conflicting behaviors
      !test.conflicts?.behaviors?.some(b => myBehaviors.includes(b))
    );

Check for Existing Test Runners
Before building a test runner, check if one exists for your language:
- Go: Built-in test runner in ccl-test-data repository
- Other languages: You may need to build a test loader
See Test Suite Guide for complete filtering examples and test format documentation.
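For a language without an existing runner, the same capability filter can be written in a few lines. The sketch below mirrors the filter shown earlier, working on test objects that are already loaded into memory; how the JSON files under generated_tests/ are laid out is not assumed here. Only the first test object comes from this guide; the other two names are made up for illustration.

```python
def is_runnable(test: dict, implemented_functions: set[str], my_behaviors: set[str]) -> bool:
    """True when every required function exists and no declared behavior conflicts."""
    required = test.get("functions", [])
    # "conflicts" may be absent (or null) on tests without conflicting behaviors.
    conflicting = (test.get("conflicts") or {}).get("behaviors", [])
    return all(fn in implemented_functions for fn in required) and not any(
        behavior in my_behaviors for behavior in conflicting
    )


tests = [
    {
        "name": "basic_key_value_pairs_parse",
        "validation": "parse",
        "inputs": ["name = Alice\nage = 42"],
        "functions": ["parse"],
        "features": [],
        "behaviors": [],
    },
    # Hypothetical test: needs a function that is not implemented yet.
    {"name": "example_needs_hierarchy", "functions": ["parse", "build_hierarchy"]},
    # Hypothetical test: conflicts with one of the declared behaviors.
    {
        "name": "example_lenient_only",
        "functions": ["parse"],
        "conflicts": {"behaviors": ["boolean_strict"]},
    },
]

implemented = {"parse"}
behaviors = {"boolean_strict", "toplevel_indent_preserve"}
runnable = [t["name"] for t in tests if is_runnable(t, implemented, behaviors)]
```

Adding build_hierarchy to the implemented set is enough to bring the second test into the run, which is how the incremental progression through the suite is meant to work.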
Common Pitfalls
Data Types Summary
Entry
The fundamental unit from parsing:

    Entry {
      key: string    // Configuration key (empty string for list items)
      value: string  // Raw value (may contain nested CCL syntax)
    }

CCL Object
The hierarchical structure after build_hierarchy. The top-level return is always a map:

    CCL = Map[string, CCLValue]
    CCLValue = string | CCL | List[CCLValue]

Where values can be:
- String - Terminal value (no = in content)
- CCL - Nested object (parsed from value containing =)
- List - Array of values (from multiple entries with the same key, including empty-key "" list items)
Recommended Implementation Order
Section titled “Recommended Implementation Order”Start with core functions and add features incrementally:
- parse - Basic key-value parsing
- build_hierarchy - Recursive object construction
- get_string - Simple path navigation
- get_int, get_bool, get_float - Type conversions
- get_list - List extraction
- filter, compose - Entry processing
- print, canonical_format - Output formatting
The test suite supports this progression - filter tests by functions array to run only relevant tests at each stage.
Quick Reference

    REQUIRED:      parse, build_hierarchy
    TYPED ACCESS:  get_string, get_int, get_bool, get_float, get_list
    PROCESSING:    filter, compose
    FORMATTING:    print, canonical_format
    TERMINOLOGY:   Always use snake_case
    ALGORITHM:     Recursive fixed-point parsing
    TEST SUITE:    github.com/CatConfLang/ccl-test-data
    DOCUMENTATION: ccl.tylerbutler.com