Source guide/collections.simd
1# Collections 2 3Silo's standard collections come in two flavours: **containers** 4— concrete data structures like `Vec`, `HashMap`, `HashSet` — 5and the **traits** that describe how to traverse and transform 6them (`Foldable`, `Mappable`, `Filterable`, `Iterator`). The 7traits are where most of the interesting code lives: the same 8`.map`, `.filter`, `.fold` methods work on every container that 9implements them. 10 11This chapter covers the main containers, the traversal traits, 12and the typical pipelines you'll write with them. 13 14## Vec — length in the type 15 16`Vec` is the workhorse sequential container. Its length can 17appear in the type, which lets the compiler track size 18invariants through chained operations. 19 20There are three common ways to construct one: 21 22```silo 231 2 3 vec/3 # variadic call — explicit count 241 2 3 vec/3 .collect # same, via a quotation of pushes 25(Vec .default) # an empty (Vec a | 0) 26``` 27 28Any quotation that pushes a statically-known number of values 29can be fed through `.collect`: 30 31```silo 32[ 1 2 3 4 5 ] .collect # (Vec Int | 5) 33[ "a" "b" ] .collect # (Vec Str | 2) 34``` 35 36The length shows up in the type, and the compiler tracks it 37through operations that preserve or combine lengths: 38 39```silo 401 2 vec/2 3 4 vec/2 .append # (Vec Int | 4) 41``` 42 43`.first` and `.last` retrieve the endpoints, always returning 44`(Option elem)`: 45 46```silo 471 2 3 vec/3 .first # ⌊Some(1)⌉ 48(Vec .default) .first # ⌊None⌉ 49``` 50 51Each has an unwrapping `!` sibling whose signature includes 52`+Panic` — it panics on empty input. The interesting bit is that 53the `+Panic` gets **elided** when the compiler can prove the 54input is non-empty from its type alone: 55 56```silo 571 2 3 vec/3 .first! # ⌊1⌉ — type encodes length 3, +Panic elided 58some-vec .first! # needs +Panic; caller must discharge it 59``` 60 61On a `(Vec Int | 3)` the panic branch is unreachable, so the 62compiler drops `+Panic` from the call's effect row. On an 63existential-length input like `(Vec Int)`, emptiness is a 64runtime question and the caller inherits `+Panic` as normal. 65 66### Existential-length `Vec` 67 68When a `Vec`'s length comes from a runtime source — user input, 69an I/O result, another Vec of unknown size — the type 70`(Vec elem)` marks the length as existential: present at 71runtime, erased at the type level. Most library functions accept 72`(Vec elem)` so they compose with both length-indexed and 73existential-length Vecs. 74 75## HashMap and HashSet 76 77Hash containers follow the standard conventions. Both are 78mutable (via `.insert` etc.) and both derive their capabilities 79from the element or key type implementing `Eq` and `Hash`. 80 81```silo 82(HashMap .default) pop-> m 83"Alice" "name" m .insert pop-> m 8430 "age" m .insert pop-> m 85 86"name" m .lookup # ⌊Some("Alice")⌉ 87"email" m .lookup # ⌊None⌉ 88``` 89 90`.lookup` returns `(Option val)`. `.insert` returns a pair of 91(updated map, displaced value if any), so you typically 92re-bind the map after each mutation. 93 94Sets follow the same shape: 95 96```silo 97(HashSet .default) pop-> s 981 s .insert pop-> s 992 s .insert pop-> s 1001 s .insert pop-> s # still { 1 2 } — duplicates ignored 101``` 102 103For ordered variants (iteration order matters, range queries 104work, but hashing isn't required), reach for `BTreeMap` / 105`BTreeSet`. Same API, different complexity trade-offs. 106 107## Ranges 108 109Ranges are first-class, not syntactic sugar. They're ordinary 110values implementing [`Foldable`](#traversal-protocols): 111 112```silo 1130..10 [ "{}" format print ] .for-each 1140..100 0 [ + ] .fold # ⌊4950⌉ 115``` 116 117Half-open (`a..b`), from-only (`a..`), to-only (`..b`), and 118unbounded (`..`) all exist. An unbounded range isn't foldable — 119it has no end — but it's still useful in other contexts 120(slicing, pattern matching). 121 122## Traversal traits 123 124The heavy lifting happens in four traits: 125 126| Trait | Core methods | Semantics | 127|----------------|-------------------------------------------------|-------------------------| 128| `Foldable` | `.fold`, `.for-each`, `.any`, `.all`, `.count` | Eager whole-collection | 129| `Mappable` | `.map`, `.flat-map` | Shape-preserving xform | 130| `Filterable` | `.filter` | Keep some, drop others | 131| `Iterator` | `.next` | Lazy, element-at-a-time | 132 133`Vec`, `HashMap`, `HashSet`, `BTreeMap`, `BTreeSet`, `Range`, 134and `Str` all implement `Foldable` / `Mappable` / `Filterable`. 135`Iterator` is implemented by types that need lazy traversal — 136directory iterators, character iterators over strings, lookahead 137parsers — and is a sibling to `Foldable` rather than a 138replacement: you reach for `Iterator` when short-circuiting or 139statefulness matters, otherwise the `Foldable` surface is 140cleaner and more efficient. 141 142See [the traits chapter](./09-traits.simd) for how traits 143compose; see 144:gloss[`:impl Iterator`](./A1-glossary.simd#row-variable) in the 145standard library for the core pattern, and 146[the pattern matching chapter](./08-pattern-matching.simd) for 147how `(Option elem)` returns interact with `:let`/`:if-let`. 148 149## Functional pipelines 150 151The typical shape is: source `->` transforms `->` terminator. 152 153```silo 1541 2 3 4 5 6 7 8 9 10 vec/10 155 [ 2 * ] .map # ⌊[2 4 6 8 10 12 14 16 18 20]⌉ 156 [ 10 ">" ] .filter # ⌊[12 14 16 18 20]⌉ 157 0 [ + ] .fold # ⌊80⌉ 158``` 159 160`.map` preserves length and shape; `.filter` preserves shape but 161shortens; `.fold` terminates the pipeline into a scalar. The 162same pattern works with any `Foldable` / `Mappable` / 163`Filterable` source — replace `vec/10` with a range, a HashSet, 164or a string-as-codepoint-sequence and the middle doesn't 165change. 166 167Effects flow through naturally. A `.map` body with `+Console` 168just makes the whole pipeline `+Console`: 169 170```silo 171:fn chatty-double ( (Vec Int) -> (Vec Int) ) +Console 172 [ peek-> x x "doubling {}" format print x 2 * ] .map 173:end 174``` 175 176The `.map` method's signature is polymorphic over the 177quotation's effect row (see [quotations](./04-quotations.simd)), 178so it composes cleanly. 179 180## Building collections 181 182`.collect` is the universal "materialise this pipeline into a 183collection" terminator. It works because 184:gloss[output variadics](./A1-glossary.simd#output-variadic) let the 185quotation signal its element count to the runtime: 186 187```silo 188[ 1 2 3 ] .collect # (Vec Int | 3) 1890..100 [ 2 * ] .collect # (Vec Int | 100) 1900..10 [ peek-> n n 2 * ] .collect # (Vec (Pair Int Int) 10) 191``` 192 193The target type (`Vec`, `HashSet`, `HashMap`, ...) is inferred 194from context. When context doesn't determine the target, name 195it explicitly with qualified dispatch: 196 197```silo 198[ 1 2 3 ] (HashSet .collect) # (HashSet Int) 199``` 200 201## Unwrapping `!` siblings 202 203Several endpoint accessors come in two shapes: the 204`Option`-returning safe form and an `!`-suffixed unwrapping 205sibling. 206 207| Safe form | Returns | Unwrapping sibling | Signature returns | 208|------------|------------------|---------------------|-----------------------| 209| `.first` | `(Option elem)` | `.first!` | `elem`, with `+Panic` | 210| `.last` | `(Option elem)` | `.last!` | `elem`, with `+Panic` | 211| `.min` | `(Option elem)` | `.min!` | `elem`, with `+Panic` | 212| `.max` | `(Option elem)` | `.max!` | `elem`, with `+Panic` | 213| `.reduce` | `(Option elem)` | `.reduce!` | `elem`, with `+Panic` | 214 215The `!` variants declare `+Panic` because they panic on empty 216input. They don't *require* a non-empty type — they work on any 217`Vec` — but when the call site's type proves the input is 218non-empty, the compiler elides `+Panic` from the caller's effect 219row. That's how you get the "unwrap freely when it's safe, see 220the panic effect when it isn't" behaviour without two separate 221method names or manual proof. 222 223Use the safe form when you want to branch on emptiness 224explicitly; reach for the `!` sibling when the type already 225makes emptiness impossible — you'll get the bare value and no 226effect-row cost. 227 228## Key points 229 230- `Vec` tracks length in its type when it can; existential 231 length is `(Vec elem)`. Most library functions accept the 232 existential form. 233- `.collect` is the idiomatic way to build a `Vec` (or any 234 target container) from a quotation of pushes. Output 235 variadics let the compiler know the element count. 236- `HashMap` and `HashSet` use `.insert` / `.lookup` / 237 `.contains`. Ordered siblings are `BTreeMap` / `BTreeSet`. 238- Ranges are ordinary values implementing `Foldable`. Bounded 239 ranges are foldable; unbounded ranges aren't. 240- Four traversal traits: `Foldable`, `Mappable`, `Filterable`, 241 `Iterator`. The first three are the eager surface; `Iterator` 242 is lazy, element-at-a-time. 243- `.first` / `.last` / `.min` / `.max` / `.reduce` always 244 return `(Option elem)`. Their `!` siblings return the bare 245 element with `+Panic` declared; the `+Panic` is elided when 246 the input's type proves non-emptiness. 247 248Next: [strings and formatting](./13-strings.simd) — the 249string-specific API on top of the generic collection protocols.