Source guide/collections.simd

1# Collections
2
3Silo's standard collections come in two flavours: **containers**
4— concrete data structures like `Vec`, `HashMap`, `HashSet` —
5and the **traits** that describe how to traverse and transform
6them (`Foldable`, `Mappable`, `Filterable`, `Iterator`). The
7traits are where most of the interesting code lives: the same
8`.map`, `.filter`, `.fold` methods work on every container that
9implements them.
10
11This chapter covers the main containers, the traversal traits,
12and the typical pipelines you'll write with them.
13
14## Vec — length in the type
15
16`Vec` is the workhorse sequential container. Its length can
17appear in the type, which lets the compiler track size
18invariants through chained operations.
19
20There are three common ways to construct one:
21
22```silo
231 2 3 vec/3                    # variadic call — explicit count
241 2 3 vec/3 .collect           # same, via a quotation of pushes
25(Vec .default)                 # an empty (Vec a | 0)
26```
27
28Any quotation that pushes a statically-known number of values
29can be fed through `.collect`:
30
31```silo
32[ 1 2 3 4 5 ] .collect         # (Vec Int | 5)
33[ "a" "b" ] .collect           # (Vec Str | 2)
34```
35
36The length shows up in the type, and the compiler tracks it
37through operations that preserve or combine lengths:
38
39```silo
401 2 vec/2 3 4 vec/2 .append    # (Vec Int | 4)
41```
42
43`.first` and `.last` retrieve the endpoints, always returning
44`(Option elem)`:
45
46```silo
471 2 3 vec/3 .first             # ⌊Some(1)⌉
48(Vec .default) .first          # ⌊None⌉
49```
50
51Each has an unwrapping `!` sibling whose signature includes
52`+Panic` — it panics on empty input. The interesting bit is that
53the `+Panic` gets **elided** when the compiler can prove the
54input is non-empty from its type alone:
55
56```silo
571 2 3 vec/3 .first!            # ⌊1⌉ — type encodes length 3, +Panic elided
58some-vec .first!               # needs +Panic; caller must discharge it
59```
60
61On a `(Vec Int | 3)` the panic branch is unreachable, so the
62compiler drops `+Panic` from the call's effect row. On an
63existential-length input like `(Vec Int)`, emptiness is a
64runtime question and the caller inherits `+Panic` as normal.
65
66### Existential-length `Vec`
67
68When a `Vec`'s length comes from a runtime source — user input,
69an I/O result, another Vec of unknown size — the type
70`(Vec elem)` marks the length as existential: present at
71runtime, erased at the type level. Most library functions accept
72`(Vec elem)` so they compose with both length-indexed and
73existential-length Vecs.
74
75## HashMap and HashSet
76
77Hash containers follow the standard conventions. Both are
78mutable (via `.insert` etc.) and both derive their capabilities
79from the element or key type implementing `Eq` and `Hash`.
80
81```silo
82(HashMap .default) pop-> m
83"Alice" "name" m .insert pop-> m
8430 "age" m .insert pop-> m
85
86"name"  m .lookup              # ⌊Some("Alice")⌉
87"email" m .lookup              # ⌊None⌉
88```
89
90`.lookup` returns `(Option val)`. `.insert` returns a pair of
91(updated map, displaced value if any), so you typically
92re-bind the map after each mutation.
93
94Sets follow the same shape:
95
96```silo
97(HashSet .default) pop-> s
981 s .insert pop-> s
992 s .insert pop-> s
1001 s .insert pop-> s            # still { 1 2 } — duplicates ignored
101```
102
103For ordered variants (iteration order matters, range queries
104work, but hashing isn't required), reach for `BTreeMap` /
105`BTreeSet`. Same API, different complexity trade-offs.
106
107## Ranges
108
109Ranges are first-class, not syntactic sugar. They're ordinary
110values implementing [`Foldable`](#traversal-protocols):
111
112```silo
1130..10 [ "{}" format print ] .for-each
1140..100 0 [ + ] .fold           # ⌊4950⌉
115```
116
117Half-open (`a..b`), from-only (`a..`), to-only (`..b`), and
118unbounded (`..`) all exist. An unbounded range isn't foldable —
119it has no end — but it's still useful in other contexts
120(slicing, pattern matching).
121
122## Traversal traits
123
124The heavy lifting happens in four traits:
125
126| Trait          | Core methods                                    | Semantics               |
127|----------------|-------------------------------------------------|-------------------------|
128| `Foldable`     | `.fold`, `.for-each`, `.any`, `.all`, `.count`  | Eager whole-collection  |
129| `Mappable`     | `.map`, `.flat-map`                             | Shape-preserving xform  |
130| `Filterable`   | `.filter`                                       | Keep some, drop others  |
131| `Iterator`     | `.next`                                         | Lazy, element-at-a-time |
132
133`Vec`, `HashMap`, `HashSet`, `BTreeMap`, `BTreeSet`, `Range`,
134and `Str` all implement `Foldable` / `Mappable` / `Filterable`.
135`Iterator` is implemented by types that need lazy traversal —
136directory iterators, character iterators over strings, lookahead
137parsers — and is a sibling to `Foldable` rather than a
138replacement: you reach for `Iterator` when short-circuiting or
139statefulness matters, otherwise the `Foldable` surface is
140cleaner and more efficient.
141
142See [the traits chapter](./09-traits.simd) for how traits
143compose; see
144:gloss[`:impl Iterator`](./A1-glossary.simd#row-variable) in the
145standard library for the core pattern, and
146[the pattern matching chapter](./08-pattern-matching.simd) for
147how `(Option elem)` returns interact with `:let`/`:if-let`.
148
149## Functional pipelines
150
151The typical shape is: source `->` transforms `->` terminator.
152
153```silo
1541 2 3 4 5 6 7 8 9 10 vec/10
155  [ 2 * ] .map                 # ⌊[2 4 6 8 10 12 14 16 18 20]⌉
156  [ 10 ">" ] .filter             # ⌊[12 14 16 18 20]⌉
157  0 [ + ] .fold                # ⌊80⌉
158```
159
160`.map` preserves length and shape; `.filter` preserves shape but
161shortens; `.fold` terminates the pipeline into a scalar. The
162same pattern works with any `Foldable` / `Mappable` /
163`Filterable` source — replace `vec/10` with a range, a HashSet,
164or a string-as-codepoint-sequence and the middle doesn't
165change.
166
167Effects flow through naturally. A `.map` body with `+Console`
168just makes the whole pipeline `+Console`:
169
170```silo
171:fn chatty-double ( (Vec Int) -> (Vec Int) ) +Console
172  [ peek-> x x "doubling {}" format print x 2 * ] .map
173:end
174```
175
176The `.map` method's signature is polymorphic over the
177quotation's effect row (see [quotations](./04-quotations.simd)),
178so it composes cleanly.
179
180## Building collections
181
182`.collect` is the universal "materialise this pipeline into a
183collection" terminator. It works because
184:gloss[output variadics](./A1-glossary.simd#output-variadic) let the
185quotation signal its element count to the runtime:
186
187```silo
188[ 1 2 3 ] .collect                      # (Vec Int | 3)
1890..100 [ 2 * ] .collect                 # (Vec Int | 100)
1900..10 [ peek-> n n 2 * ] .collect       # (Vec (Pair Int Int) 10)
191```
192
193The target type (`Vec`, `HashSet`, `HashMap`, ...) is inferred
194from context. When context doesn't determine the target, name
195it explicitly with qualified dispatch:
196
197```silo
198[ 1 2 3 ] (HashSet .collect)            # (HashSet Int)
199```
200
201## Unwrapping `!` siblings
202
203Several endpoint accessors come in two shapes: the
204`Option`-returning safe form and an `!`-suffixed unwrapping
205sibling.
206
207| Safe form  | Returns          | Unwrapping sibling | Signature returns     |
208|------------|------------------|---------------------|-----------------------|
209| `.first`   | `(Option elem)`  | `.first!`           | `elem`, with `+Panic` |
210| `.last`    | `(Option elem)`  | `.last!`            | `elem`, with `+Panic` |
211| `.min`     | `(Option elem)`  | `.min!`             | `elem`, with `+Panic` |
212| `.max`     | `(Option elem)`  | `.max!`             | `elem`, with `+Panic` |
213| `.reduce`  | `(Option elem)`  | `.reduce!`          | `elem`, with `+Panic` |
214
215The `!` variants declare `+Panic` because they panic on empty
216input. They don't *require* a non-empty type — they work on any
217`Vec` — but when the call site's type proves the input is
218non-empty, the compiler elides `+Panic` from the caller's effect
219row. That's how you get the "unwrap freely when it's safe, see
220the panic effect when it isn't" behaviour without two separate
221method names or manual proof.
222
223Use the safe form when you want to branch on emptiness
224explicitly; reach for the `!` sibling when the type already
225makes emptiness impossible — you'll get the bare value and no
226effect-row cost.
227
228## Key points
229
230- `Vec` tracks length in its type when it can; existential
231  length is `(Vec elem)`. Most library functions accept the
232  existential form.
233- `.collect` is the idiomatic way to build a `Vec` (or any
234  target container) from a quotation of pushes. Output
235  variadics let the compiler know the element count.
236- `HashMap` and `HashSet` use `.insert` / `.lookup` /
237  `.contains`. Ordered siblings are `BTreeMap` / `BTreeSet`.
238- Ranges are ordinary values implementing `Foldable`. Bounded
239  ranges are foldable; unbounded ranges aren't.
240- Four traversal traits: `Foldable`, `Mappable`, `Filterable`,
241  `Iterator`. The first three are the eager surface; `Iterator`
242  is lazy, element-at-a-time.
243- `.first` / `.last` / `.min` / `.max` / `.reduce` always
244  return `(Option elem)`. Their `!` siblings return the bare
245  element with `+Panic` declared; the `+Panic` is elided when
246  the input's type proves non-emptiness.
247
248Next: [strings and formatting](./13-strings.simd) — the
249string-specific API on top of the generic collection protocols.