Query DSL
Grammar, tokenisation, stopwords, alias expansion, FTS5 compilation.
Every query to memory search, memory recall, or memory_search runs through the query DSL. It’s a deliberately small grammar that normalises input, filters noise, and compiles to SQLite FTS5 MATCH syntax while preserving intent.
Pipeline
normalise (NFC, whitespace) → parse → stopword filter → alias expand → FTS5 compile
Grammar
query := clause ( WS clause )*
clause := term | phrase | not
term := WORD ( "*" )?
phrase := '"' WORD ( WS WORD )* '"'
not := "NOT" clause
- `WORD` is a maximal run of characters excluding whitespace and the reserved operators.
- Operators are uppercase only: `AND`, `OR`, `NOT`. Lowercase `and`, `or`, `not` are treated as ordinary terms.
- Juxtaposition is implicit `AND`: `hedgehog winter` is the same as `hedgehog AND winter`.
- A trailing `*` on a term is a prefix wildcard: `hedge*` matches `hedgehog`, `hedgerow`, etc.
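The grammar and rules above can be sketched as a small hand-written parser. This is an illustration only, not the reference implementation: `Clause` and `parseQuery` are hypothetical names, the `op` node is an assumption (the printed grammar has no production for `AND` / `OR`), and the input is assumed to be normalised already.

```typescript
// Hypothetical AST shape; the normative one lives in the spec.
type Clause =
  | { kind: "term"; word: string; prefix: boolean }
  | { kind: "phrase"; words: string[] }
  | { kind: "not"; clause: Clause }
  | { kind: "op"; op: "AND" | "OR" }; // assumption: not in the printed grammar

function parseQuery(input: string): Clause[] {
  // Split into double-quoted phrases and whitespace-delimited words.
  const rawTokens = input.match(/"[^"]*"|\S+/g) || [];
  const clauses: Clause[] = [];
  let pendingNots = 0; // NOT prefixes waiting for their clause

  for (const tok of rawTokens) {
    if (tok === "NOT") {
      pendingNots++;
      continue;
    }
    let clause: Clause;
    if (tok === "AND") {
      clause = { kind: "op", op: "AND" };
    } else if (tok === "OR") {
      clause = { kind: "op", op: "OR" };
    } else if (tok.length >= 2 && tok.charAt(0) === '"' && tok.charAt(tok.length - 1) === '"') {
      const words = tok.slice(1, -1).split(/\s+/).filter((w) => w.length > 0);
      clause = { kind: "phrase", words };
    } else if (tok.length > 1 && tok.charAt(tok.length - 1) === "*") {
      clause = { kind: "term", word: tok.slice(0, -1), prefix: true };
    } else {
      // Lowercase "and", "or", "not" land here as ordinary terms.
      clause = { kind: "term", word: tok, prefix: false };
    }
    // Wrap the clause once per preceding NOT.
    for (; pendingNots > 0; pendingNots--) {
      clause = { kind: "not", clause };
    }
    clauses.push(clause);
  }
  // A trailing NOT with no clause after it is silently dropped in this sketch.
  return clauses;
}
```

The returned array is read as an implicit `AND` of its clauses unless an `op` node says otherwise.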
Example queries
hedgehog winter # implicit AND
"gardens and parks" # phrase, bypasses stopwords
hedge* # prefix wildcard
NOT urban # negation
hedgehog OR shrew # explicit OR
Tokenisation
Tokens are normalised to NFC and folded to lowercase before parsing. FTS5 control characters (* except as a trailing wildcard, backticks, parentheses, and double quotes outside phrases) are stripped so that user input cannot smuggle operators into the compiled query.
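A minimal sketch of that clean-up for a single word outside a phrase follows. The helper name `sanitiseWord` is hypothetical, and one detail is an assumption: the uppercase operators are recognised before case folding so that they survive it.

```typescript
const OPERATORS = ["AND", "OR", "NOT"];

// Hypothetical helper: normalise one whitespace-delimited word that is
// not inside a quoted phrase.
function sanitiseWord(raw: string): string {
  const nfc = raw.normalize("NFC");
  // Assumption: operators are matched before folding to lowercase.
  if (OPERATORS.indexOf(nfc) !== -1) {
    return nfc;
  }
  // Remember whether the word ended with a prefix wildcard.
  const trailingStar = nfc.length > 1 && nfc.charAt(nfc.length - 1) === "*";
  // Strip the FTS5 control characters listed above, then fold case.
  const stripped = nfc.replace(/[*`()"]/g, "").toLowerCase();
  // Re-attach the wildcard only if something is left to prefix-match.
  return trailingStar && stripped.length > 0 ? stripped + "*" : stripped;
}
```

A word that consists only of control characters comes back as the empty string, which a caller would drop before parsing.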
Stopword filter
Stopwords are removed only from bare-word queries. A query that uses an explicit AND / OR / NOT, that contains a phrase, or that ends up empty after stripping, bypasses the filter. The English and Dutch stopword lists are the canonical JSONs at spec/fixtures/stopwords/en.json and nl.json, and every SDK loads the same file.
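The exemption rule can be sketched as follows. `applyStopwords` is a hypothetical name, tokens are modelled as plain strings with phrases kept as a single quoted token, and the stopword list is passed in by the caller because the layout of the canonical JSON files is not described here.

```typescript
// Hypothetical helper: apply the stopword filter with its bypass rules.
function applyStopwords(tokens: string[], stopwords: string[]): string[] {
  const hasOperator = tokens.some((t) => t === "AND" || t === "OR" || t === "NOT");
  const hasPhrase = tokens.some((t) => t.charAt(0) === '"');
  // Explicit operators or a phrase: the filter is bypassed entirely.
  if (hasOperator || hasPhrase) {
    return tokens;
  }
  const kept = tokens.filter((t) => stopwords.indexOf(t) === -1);
  // If stripping would empty the query, keep the original tokens.
  return kept.length > 0 ? kept : tokens;
}
```

With an illustrative list of `["the", "and", "in"]`, the query `the hedgehog in winter` keeps only `hedgehog` and `winter`, while `the and` is left untouched because filtering would leave nothing.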
Alias expansion
Before compiling to FTS5, the DSL expands caller-supplied aliases. The alias map takes each term to a list of expansion terms, typically { "ts": ["typescript"], "go": ["golang"] }. Alias expansion runs only on bare terms, never inside phrases, so "ts deep dive" stays unchanged.
The alias map is an in-memory ReadonlyMap<string, readonly string[]> only; there is no on-disk alias-table format in v1.0.
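A sketch of expansion over that map type is shown below. `QueryToken` and `expandAliases` are hypothetical names, and keeping the original term alongside its aliases is an assumption, since the text does not say whether aliases replace the term or supplement it.

```typescript
type AliasMap = ReadonlyMap<string, readonly string[]>;

// Hypothetical token shape: a phrase is carried as one token.
interface QueryToken {
  text: string;
  inPhrase: boolean;
}

// Returns, for each token, the list of alternatives to search for.
function expandAliases(tokens: QueryToken[], aliases: AliasMap): string[][] {
  return tokens.map((tok) => {
    // Phrases are never expanded.
    if (tok.inPhrase) {
      return [tok.text];
    }
    const extra = aliases.get(tok.text);
    // Assumption: the original term is kept and the aliases are appended.
    return extra ? [tok.text].concat(extra) : [tok.text];
  });
}

const exampleAliases: AliasMap = new Map<string, readonly string[]>([
  ["ts", ["typescript"]],
  ["go", ["golang"]],
]);
```

Here the bare term `ts` yields the alternatives `ts` and `typescript`, while the phrase `"ts deep dive"` passes through as a single unchanged entry.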
Temporal expansion
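Three English recognisers append hint tokens to the query: (around YYYY/MM/DD) for date phrases and (ordinal N) for ordering hints. Phrases like “2 weeks ago” and “last Thursday” are resolved relative to the anchor date the caller supplies, so they route to documents stamped near the resolved date: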
| Phrase family | Example | Appended token |
|---|---|---|
| Relative offset | 3 days ago | (around 2026/04/16) |
| Last weekday | last Thursday | (around 2026/04/16) |
| Ordering hint | the first meeting | (ordinal 1) |
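The two date recognisers in the table can be sketched as below. `temporalToken` and `formatAround` are hypothetical names, the regular expressions are illustrative rather than the real recognisers, and dates are handled in UTC as an assumption. Both table rows are consistent with an anchor of 2026/04/19 (a Sunday), which is an inference from the examples rather than something stated in the source. The ordinal recogniser is not covered.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const WEEKDAYS = ["sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday"];

function formatAround(d: Date): string {
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return "(around " + d.getUTCFullYear() + "/" + pad(d.getUTCMonth() + 1) + "/" + pad(d.getUTCDate()) + ")";
}

// Hypothetical recogniser: returns the token to append, or null if no
// temporal phrase is found.
function temporalToken(query: string, anchor: Date): string | null {
  const lower = query.toLowerCase();

  // Relative offset: "3 days ago", "2 weeks ago".
  const rel = /(\d+)\s+(day|week)s?\s+ago/.exec(lower);
  if (rel) {
    const days = parseInt(rel[1], 10) * (rel[2] === "week" ? 7 : 1);
    return formatAround(new Date(anchor.getTime() - days * DAY_MS));
  }

  // Last weekday: "last Thursday".
  const last = new RegExp("last\\s+(" + WEEKDAYS.join("|") + ")").exec(lower);
  if (last) {
    const target = WEEKDAYS.indexOf(last[1]);
    // Days back to the most recent strictly earlier occurrence (1 to 7).
    const back = ((anchor.getUTCDay() - target + 6) % 7) + 1;
    return formatAround(new Date(anchor.getTime() - back * DAY_MS));
  }

  return null;
}
```

With the anchor set to `new Date(Date.UTC(2026, 3, 19))`, both `3 days ago` and `last Thursday` resolve to `(around 2026/04/16)`, matching the table.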
FTS5 compilation
The AST compiles to SQLite FTS5 MATCH syntax term-by-term:
- `term` → `term`
- `term*` → `term*`
- `phrase` → `"phrase"`
- `AND` / `OR` → `AND` / `OR`
- `NOT` → `NOT`
If a query collapses to empty after normalisation and filtering, the pipeline short-circuits to an empty result without issuing FTS5.
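The term-by-term mapping and the empty-query short-circuit can be sketched as follows. `QueryNode` and `compileToFts5` are hypothetical names, the flat node list is a simplification of whatever AST the real parser produces, and returning `null` to signal "skip FTS5" is an assumption.

```typescript
// Hypothetical flat node shape, simplified for illustration.
type QueryNode =
  | { kind: "term"; word: string; prefix: boolean }
  | { kind: "phrase"; words: string[] }
  | { kind: "op"; op: "AND" | "OR" | "NOT" };

function compileNode(n: QueryNode): string {
  if (n.kind === "term") {
    return n.prefix ? n.word + "*" : n.word;
  }
  if (n.kind === "phrase") {
    return '"' + n.words.join(" ") + '"';
  }
  return n.op; // AND / OR / NOT pass through unchanged
}

// Returns the MATCH expression, or null when there is nothing to search
// for and the caller should return an empty result without querying.
function compileToFts5(nodes: QueryNode[]): string | null {
  if (nodes.length === 0) {
    return null;
  }
  return nodes.map(compileNode).join(" ");
}
```

Adjacent terms are joined with a space, which FTS5 itself reads as an implicit `AND`. Note that `NOT` is a binary operator in FTS5, so a query that starts with `NOT` would likely need extra handling that this sketch does not attempt.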
Why a DSL
A tiny grammar means every SDK implements the same parser and every caller gets predictable behaviour across TypeScript, Go, and Python. The alternative (pass the raw string to FTS5) fails badly on operator characters in ingested content and makes the stopword-exemption rule impossible to express.
Reference
See spec/QUERY-DSL.md for the normative grammar, stopword list, and FTS5 escaping rules. Canonical parser fixtures live at spec/fixtures/query-parser/cases.json.