Integer-indexed types#1065
Draft
strub wants to merge 42 commits into
Draft
Conversation
Mechanical migration of every call site to the new
ty_params = {idxvars; tyvars} record and targs = {indices; types}
record introduced in b15e335. Repairs the broken gen_op/mk_op
bodies in ecDecl and the syntax error in ecInductive (c413f37 WIP).
No semantic change: indices are uniformly empty, tyvars carry the
existing behaviour. Phase-0 design choices and the rest of the
roadmap are documented in memory.md.
tindex equality and hashing now go through a canonical sum-of-monomials normalisation, so n+1 and 1+n (and (n+m)^2 and n^2+2nm+m^2) are recognised as equal. Coefficients are EcBigInt with the natural-number invariant; canonical_const refuses negative TIConst. ecUnify now compares indices (with the previous polarity bug fixed) and ecReduction.for_targs no longer skips them. Memory.md updated with the Phase-1 deliverables and what was deferred (no TIUnivar / UF participation yet — gated on Phase 3 needs).
tindex_subst is no longer a no-op. It consults fs_loc / sb_flocal (indices share the formula-locals namespace), reinterprets the bound formula as a polynomial via the new tindex_of_form recogniser, and panics if the binding is non-polynomial. targs_fv (and so ty_fv on Tconstr) now folds over indices, so the per-formula short-circuits in Fsubst correctly fire for types with TIVar occurrences. is_ty_subst_id now also checks fs_loc emptiness — a formula substitution that touches an int-typed local can affect any type whose Tconstr carries a TIVar of that local. Cost is a wider substitution walk; correctness comes first here. memory.md updated with deliverables, design choices (no eager re-canonicalisation, fs_eloc not consulted), and the remaining risk (audit f_bind_local callers for the polynomial invariant).
Tydecls, operators, predicates and axioms can now declare integer
index parameters; type-constructor applications can supply index
arguments.
type [n m] ('a, 'b) vec.
op f [n 'a] (xs : 'a vec<:n>) : 'a vec<:n+1>.
pred p [n 'a] : 'a vec<:n>.
axiom A [n 'a] : true.
Index binders use plain (no apostrophe) identifiers in `[...]`. Index
applications are framed by `<:...>` rather than `[...]` to avoid a
shift/reduce conflict with codepos brackets in module-update
syntax. Index expressions inside the framing are restricted to the
polynomial fragment (`+`, `*`, non-negative literals, identifiers).
Datatype/record indexed types and cloning of indexed declarations
are refused with clean errors — useful index-instantiation at op
call sites still requires the deferred TIUnivar / polynomial-with-
univars unification work, also flagged in memory.md.
A regression test lives at tests/indexed-types.ec.
Adds the TIUnivar machinery Phase 3 deferred. Indexed ops can now be
called: each idxvar of the op being applied is freshened to a TIUnivar
and unified against the call site via polynomial-normal-form equality.
op concat [n m 'a] (xs : 'a vec<:n>) (ys : 'a vec<:m>) : 'a vec<:n+m>.
op cons [n 'a] (x : 'a) (xs : 'a vec<:n>) : 'a vec<:n+1>.
op test [n m 'a] (x : 'a) (ys : 'a vec<:n>) (zs : 'a vec<:m>)
: 'a vec<:n+(1+m)> (* canonically equal to (n+1)+m *)
= concat (cons x ys) zs.
The typecheck of `test` works because `cons`'s `?u` unifies with `n`,
`concat`'s `?u_n` unifies with `(n+1)`, `?u_m` with `m`, and the
inferred return type `(n+1)+m` is canonically equal to the annotated
`n+(1+m)`.
MVP scope: handles "naked TIUnivar = arbitrary polynomial" (with
occurs check) and canonical equality after resolution. Refuses
genuine polynomial unification (e.g. `?u + 1 = n` would need
subtraction-inversion) with a clean IndexMismatch error.
Also fixes a lurking bug in `ty_subst` where the `Tconstr` case fell
through to `ty_map`, which preserves indices verbatim — silently
dropping op-application index substitution.
clone T as T2 with type [k] 'a vec = body The optional [k] mirrors the tydecl binder syntax. body may reference both k and the type binders; when vec<:e> appears in T, the substitution binds k to e in body. ty_override_def is widened to (idxvars, tyvars, body); subst gains sb_idxvar; subst_ty's Tconstr-with-tydef branch binds both source binder lists to the call-site indices/types before substituting through body. The previous CE_IndexedNotYetSupported is replaced with CE_IdxArgMism so the user gets a precise arity message. Three clone cases land in the regression: drop the index (= int), propagate (= 'a coll<:k>), and a polynomial of the binder (= 'a coll<:k+1>). 77 declarations now compile. Two gaps are flagged in memory.md but kept out of scope: reaching into a cloned theory's ops whose signature got touched (looks orthogonal to indexed types), and explicit index-instantiation syntax at op call sites.
The two assert (List.is_empty *.indices) panics in ecSmt.ml become
raise CanNotTranslate, and check / execute_task catch the exception
to skip the goal cleanly. The user now sees a warning
("SMT: skipped goal containing constructs not yet exported to Why3
(e.g. indexed types)") followed by "cannot prove goal", instead of
an anomaly crash.
Translating indexed types to Why3 stays out of scope per the
original Phase-0 punt.
Errors and transcripts now print 'a vec<:n> instead of 'a vec; pp_tindex handles variables, univars (?#N), constants, and the polynomial *- and +-forms with standard precedence. Two vestigial Phase-0 asserts become clean failures: - ecReduction: indexed-op heads in a user rewrite rule raise NotReducible instead of crashing (they just don't match). - ecMatching: Fop vs Fop in pattern matching uses tindex_equal plus a length check, failing cleanly rather than asserting. The two asserts in ecInductive's positivity check are intentional "should never happen" guards — Phase-3 Slice-A already refuses indexed binders on datatype/record, so they aren't reachable. Closes the roadmap in memory.md. Remaining gaps are documented (op call-site index instantiation syntax, SMT translation of indexed types, indexed datatypes) but out of scope.
E — extend index-binder support to abbreviations and notations. Both rules now accept mixed_tyvars_decl (the bracket binder list that splits into idxvars and tyvars). pabbrev / pnotation gain *_idx fields; ecHiNotations threads them via the new ~idxparams parameter on transtyvars. abbrev my_alias [n 'a] : 'a vec<:n+1> = ... . notation %"..."% [n 'a] (...) = ... . D — investigation of the Phase-4 "T2.make_vec unknown" report showed it was a misuse of the alias `=` operator instead of the inline `<-` operator (alias creates a new name; inline propagates the body). Both modes work correctly. While tracing, fresh_tparams was discovered to freshen tyvars but not idxvars — fix this so op_tparams alpha-renaming includes both.
op count [n 'a] : int. op test : int = count[:5]<:int>. New lexer token LBRACKETCOLON (matches `[:` glued). Grammar adds `f[:idx]`, `f[:idx]<:ty>`, alongside existing `f<:ty>`. Parsetree TVIunamed and ecUnify tvar_inst.TVIunamed are widened to (indices, types). Producers updated mechanically (PFrecord, PTHO_*, inductive constructors, scope's tycinstance loop, printer's op_symb resolution, transtvi). EcUnify.openidx now consumes user- supplied indices when given, falling back to fresh TIUnivars otherwise. select_op's filter validates either side independently when non-empty. Useful when an op's idxvar is unreachable by argument-type inference. Phase-3.5 inference still handles the common case where the index can be derived from an argument's type. The test file gets three new cases (size[:5] xs, count[:5]<:int>, inferred-only baseline). 91 declarations now compile.
Document the plan and effort estimates for the three documented-but- unscheduled gaps from the original A-F plan. Order: B (polynomial unification beyond naked TIUnivar) -> C (non-refining indexed datatypes/records) -> F (SMT translation via per-index monomorphize).
Generalise the index unifier so [?u + k = poly] is solved when ?u
appears with net coefficient ±1 and the residual stays non-negative.
Previously only the naked-univar special case [?u = poly] worked, so
e.g. [tail xs : 'a vec<:n>] (where tail expects [vec<:n+1>]) failed
to unify against a caller-supplied [vec<:5>].
The new tindex_solve_for_univar walks the signed difference of the
two canonical polynomials, accepting only equations where:
- exactly one TIUnivar has non-zero net coefficient,
- that coefficient is ±1,
- every monomial mixing univars with other variables (or with a
univar at degree > 1) cancels to zero on net,
- the resulting value of ?u has non-negative coefficient on every
remaining monomial and constant.
The MVP scope deliberately excludes multi-univar Diophantine and
cases like [?u + 1 = n] for free n (no symbolic guarantee n >= 1).
Lift the Phase-3 Slice-A refusal that blocked index binders on datatype/record declarations. trans_datatype and trans_record now take an optional ~idxparams; ecScope's tydecl path threads it through the existing ~idxparams plumbing of transtyvars. Constructor and projector signature construction in ecEnv builds the result type via tconstr ~indices ~tyargs so e.g. INil is registered as 'a vec<:n>, not 'a vec. The positivity checker drops its assert (List.is_empty args.indices); indices play no role in positivity since the recursion is on the type and indices carry no embedded type information. Match elaboration in trans_branch (ecTyping) and the matchfix twin in ecHiInductive previously broke for 0-field constructors of indexed datatypes: opentys would allocate fresh index univars that appeared in no unified type, leaving them dangling at closed-check time. Fix: prepend a hand-built result type to the opened list so the freshly allocated univars are anchored to a type that participates in the subsequent unification against the scrutinee. Index refinement on match is deliberately out of scope: a vec<:0> scrutinee still admits an ICons-shaped pattern at the type level. Matches OCaml/Haskell parametric ADT semantics. Test file grows to 139 declarations covering constructor application, plain match, matchfix, indexed records, and field projection. Non-indexed datatypes/records continue to work unchanged.
Constructors of an indexed datatype are universally quantified over the index just like over type variables: INil has type forall n 'a. 'a ivec<:n>. Per-constructor result indices (GADT style) require both per-ctor result-type syntax AND index refinement on match, the latter being a much bigger dependent-typing feature.
…phization Replace the two CanNotTranslate raise sites in ecSmt for indexed Tconstr / Fop with a monomorphisation path. New helper EcAst.tindex_to_int reduces a tindex to a closed integer when possible (no free vars, no leftover univars); the SMT pipeline uses it to key two new caches (te_ty_idx, te_op_idx) by "path<:i,j,...>". trans_pty_idx / trans_tydecl_idx substitute idxvars by TIConst via EcCoreSubst.f_subst_init ~idx and emit fresh Why3 sorts named <path>_<i>_<j>... For indexed datatypes / records, all per-index constructor and projector variants are populated as a side-effect. trans_op_idx checks op_kind first: constructors / projectors / record-makers force their carrying type's monomorphisation (which populates te_op_idx); plain indexed operators get a fresh abstract Why3 symbol via create_op_idx. Bodies of plain indexed ops are dropped in this MVP — sound (treats the op opaquely) but limits SMT's unfolding across index instances. Goals with free index variables (e.g. an axiom binding [n]) still hit CanNotTranslate, preserving the per-goal skip behaviour: the existing try/catch around init / make_task emits the warning and the lemma falls through to "no provers" without a crash. Verified via 4 new SMT-discharge lemmas (160 declarations total in tests/indexed-types.ec) and a smoke test confirming non-indexed SMT goals are unaffected.
Lock in that lemma headers accept index params via the existing [n 'a] mixed_tyvars_decl syntax (shared with op binders). Discovered no fix was needed — the original "lemma binders don't accept index params" finding was a syntax confusion: separate brackets ['a] [n] are not supported (only the combined ['a n] form), and <:n> is the type-application framing while [:n] is the op-call framing. Two new lemmas exercise: a parametric proof using [trivial], and a quantified form using [move => ; trivial]. SMT discharge of goals with bound (non-closed) indices remains correctly skipped.
The printers for type declarations, operators, predicates, abbreviations, axioms and added-ops only emitted [tparams.tyvars], silently dropping the index binders. So [type [n] word.] would print as [type word.], hiding the index parameter from the user. New helper [pp_paramsannot ppe fmt (idxvars, tyvars)] prints the combined-bracket form [n 'a] matching the input syntax. All five printers now consult both lists; the per-kind operator printers (pp_opdecl_op / _pr / _nt) take a [ty_params] record instead of a bare tyvars list, and the dispatch in pp_opdecl passes op.op_tparams. Verified against /tmp/print_idx.ec: [type [n] word.], [type [n m] 'a vec.], [op cons [n 'a] : ...], [pred ix_pr [n 'a]], and [axiom ix_ax [n 'a]] all print their index binders.
The op grammar accepts an optional bracket-before-name for the opacity tags (opaque, smt_opaque). When users write [op [n] "_.[_]" (w : word<:n>) : bool] hoping to bind an idxvar [n], the parser greedily consumed the [n] as tags and silently discarded it, leaving [n] unbound in the type signature. Add a [disambiguate_op_brackets] helper invoked from both operator rule alternatives. If every entry in the leading bracket is in the known-tag whitelist, treat as tags (existing behaviour preserved for [op [opaque] foo], [op [opaque smt_opaque] foo], etc.). Otherwise reinterpret the bracket as a pure idxvar binder; if both the leading bracket and an after-name binder are present, raise a clear parse error rather than guessing. This fixes the canonical infix-style indexed-op declaration: type [n] word. op [n] "_.[_]" (w : word<:n>) : bool. which now parses with [n] correctly bound as an idxvar. Verified: full regression (182 decls), the original report case, and theories/datatypes/FMap.ec (heavy [opaque] tag user) still compile unchanged.
Indices and type variables now use distinct bracket families:
- {n m} for index binders
- ['a 'b] for type-variable binders
Indices come first when both are present.
Why: the previous mixed bracket [n 'a] was overloaded with the
op-leading [opaque] tags bracket, requiring a content-based
disambiguation that confused users when an unrecognised tag was
silently rewritten as a binder. With braces vs. brackets, the parser
disambiguates lexically and there is no overlap with op tags.
Parser:
- idxvars_decl now matches LBRACE lident+ RBRACE.
- mixed_tyvars_decl and bucket_mixed are gone, replaced by a single
ix_ty_binder rule that takes idxvars_decl? then tyvars_decl? and
returns (idxvars, tyvars_opt).
- The Gap-fix disambiguate_op_brackets helper is removed; the op
rules now read tags from the leading [...] and the binder from
the after-name {...} [...] pair without any reinterpretation.
- All consumers (operator x2, pred x2, inductive, notation, abbrev,
lemma_decl) switched to ix_ty_binder.
Pretty-printer:
- pp_paramsannot emits {idx} for indices and ['a] for tyvars,
separated by a single space when both are non-empty.
- pp_typedecl uses curly braces for the leading idx binder.
Tests: tests/indexed-types.ec (159 declarations) migrated to the new
syntax. Round-trip print produces text that re-parses unchanged.
FMap.ec (heavy [opaque] tag user) still compiles.
When an axiom / lemma / op / pred / abbreviation / notation binds
an idxvar [n] via {n}, the same ident now also resolves as an
int-typed local in the body of that declaration. Previously [n]
was only reachable in tindex positions like vec<:n>; using it as
an integer term (e.g. mkseq f n) failed with "unknown variable n".
This realises the Phase-2 design choice that idxvars and
formula-locals share a namespace: an idxvar is exactly an integer
binding, and the indexer / formula machinery agree on the ident.
New helper EcTyping.bind_idx_locals env ue pulls the idxvars out
of the unienv's tparams and binds each as a (id, tint) local in
env. Called immediately after every transtyvars ~idxparams site:
ecScope.add_r (axiom/lemma), ecScope op processing,
ecHiPredicates.trans_preddecl_r, ecHiNotations.trans_notation_r,
and ecHiNotations.trans_abbrev_r.
Verified with the original report case and a new regression in
tests/indexed-types.ec exercising [size (id_bits[:n] v) = n + 0].
Two related fixes that together let `rewrite L` and `apply L` work on lemmas declared over indexed types. 1. PTGlobal carries indices. The proof-term head `PTGlobal of EcPath.path * (ty list)` becomes `PTGlobal of EcPath.path * (tindex list) * (ty list)`. The constructor `ptglobal` and the alias `paglobal` gain an optional `?idxs` argument (default `[]`, so non-indexed call sites are source-compatible). `EcEnv.Ax.instantiate` also gains `?idxs`, substituting the lemma's idxvars in the spec. Without this, the proof checker re-instantiated only tyvars and the residual idxvars in the body broke conversion against the goal. 2. Closing a unienv now substitutes index-univars too. New helper `EcUnify.UniEnv.close_subst : unienv -> f_subst` builds a complete `f_subst` carrying both `~tu` (type-univars, as before) and `~iu` (index-univars). Used at the axiom-saving and op-saving sites in ecScope. Previously `Tuni.subst (close ue)` left every `TIUnivar` in the saved AST untouched, so even after the unifier resolved `?u_n := n_lem` the operator-type signatures inside the axiom's body still carried `?u_n`. Two `bits w` nodes that printed identically had different fresh univars and failed `is_conv`. Supporting infra: - `EcUnify.UniEnv.openidx` is exposed in the .mli (was private). Both `pt_of_uglobal_r` and `process_named_pterm` in ecProofTerm now open the lemma's idxvars to fresh `TIUnivar`s alongside its tyvars and thread both maps through `f_subst_init` to substitute the spec. - `EcMatching.MEV.assubst` adds `~iu:(iu_assubst ue)` so concretize resolves index-univars in the proof term and its formula together. - `EcMatching` `Fop` matching uses `unify_idx` for index lists (exposed in `EcUnify`) instead of structural `tindex_equal`. Verified: the user's `bits_cat` rewrite, `exact (test_eq w)` apply, and a new regression in tests/indexed-types.ec all work; full regression (184 decls) passes; non-indexed lemmas/rewrites unchanged.
…p targs
Three intertwined fixes for the user's [bits_cat] case:
lemma catE {m n} (wm : word<:m>) (wn : word<:n>) (i : int) :
0 <= i < m + n
=> (wm ++ wn).[i] = if i < m then wm.[i] else wn.[i-m].
proof.
move=> rgi @/"_.[_]".
rewrite bits_cat.
1. Op application records call-site indices in Fop targs.
`EcUnify.openty_r` now returns `(subst, ixs, tvs)` (was
`(subst, tvs)`); `select_op` returns
`(path, idxs, tys) * top * subue * sbody` (was `(path, tys)`);
`EcTyping.OpSelect.opsel.\`Op` and `opmatch.\`Op` carry
`path * tindex list * ty list`. `form_of_opselect` builds
`f_op p ~indices:ixs ~tyargs:tys ty`. Without this the call-
site indices were lost on Fop nodes — every `Fop` carried
`targs.indices = []` regardless of how the op was applied.
2. Op unfolding substitutes both tyvars and idxvars in the body.
`EcEnv.Op.reduce` was substituting only `tparams.tyvars`; now
it builds an `f_subst` with both `~tv` and `~idx` maps so that
every nested `Fop` in the unfolded body has its `targs.indices`
and `f_ty` rewritten to use the call-site indices.
3. Matcher does not unify Fop indices on the head.
The polynomial-against-polynomial `bits` head match
(`?u_m + ?u_n` against `m + n`) is genuinely ambiguous when
considered in isolation — multiple multi-univar Diophantine
solutions. The Fop matcher now only unifies type arguments and
trusts the surrounding `Fapp` arg matching to constrain indices
via per-arg f_ty unification (matching `(++) wm wn` first sets
`?u_m := m, ?u_n := n` individually, then `bits`'s polynomial
head trivially matches by reduction).
Every consumer of the new triple form was updated: ecHiInductive,
ecPrinting, ecScope (3 sites), ecTyping (4 sites), ecUserMessages.
Verified: full regression (202 decls) + new `unfold_then_rewrite`
test exercising the bug pattern.
…apply
When a lemma or op binds [{n}] and uses [n] both as a tindex AND as
an int term in its body (e.g. [size_bits {n} (w : word<:n>) :
size (bits w) = n]), substitution must reach BOTH namespaces:
- the tindex side ([TIVar n_lem] in [bits]'s targs), and
- the formula-local side ([Flocal n_lem] on the RHS).
Without the second part, opening the lemma at index [m] leaves a
dangling [Flocal n_lem] in the rewrite RHS, the goal becomes
[size (bits wm) = Flocal n_lem] (printed misleadingly as [... = n]
since both n_lem and m_caller share the name "n"), and the proof
cannot close. Same issue for [Op.reduce] when an op's body uses an
idxvar as int.
Fixes:
1. New [EcCoreFol.f_of_tindex : tindex -> form] projects a tindex
into the int-formula world. [TIVar id -> Flocal id : int],
[TIConst k -> f_int k], [TIAdd/TIMul -> f_int_add / f_int_mul].
Asserts on residual [TIUnivar].
2. [EcEnv.Op.reduce] (op-unfolding) and [EcEnv.Ax.instantiate]
(lemma application) now also bind [n_lem -> f_of_tindex idx] in
[fs_loc] alongside the existing [fs_idx] binding.
3. [EcProofTerm.pt_env] gains a [pte_idx_link] field recording each
lemma's [(idxvar ident, fresh tindex univar uid)] pairs;
[concretize_env] uses it to bridge the two namespaces during
proof-term concretization (when [?u_pat] resolves to [TIVar
m_caller], the corresponding [Flocal n_lem] in the body gets
bound to [Flocal m_caller]).
Verified: the user's [size_bits] / [bits_cat] / [catE] proof works,
plus a new regression case in tests/indexed-types.ec covering the
"idxvar used as int term in lemma RHS" pattern (214 decls total).
The previous fix for [pt_of_uglobal_r] (the no-instantiation lemma opener) wasn't carried over to [process_named_pterm] (the explicit [lemma[:idx]] / [lemma<:ty>] opener). Result: [have := mkK[:m + n]] on a lemma whose body uses [n] as an int term left a dangling [Flocal n_lem], breaking the proof checker with InvalidGoalShape. [process_named_pterm] now mirrors [pt_of_uglobal_r]: - For idxvars whose [openidx] returned a concrete [tindex] (because the user supplied [[:idx]]), bind [Flocal n_lem -> f_of_tindex idx] in [fs_loc] directly. The substitution flows into the formula immediately. - For idxvars whose [openidx] returned a fresh [TIUnivar] (the no-instantiation case), record [(n_lem, ?u)] in [pte_idx_link] so [concretize_env] can synthesise the form binding once unification resolves the univar. Same mechanism as [pt_of_uglobal_r]. Verified with the user's [have := mkK[:m + n]] case, plus a new regression in tests/indexed-types.ec exercising the explicit-index [have :=] pattern (227 decls total).
The previous fix made the matcher's Fop case skip index unification entirely, since polynomial-against-polynomial unification with multiple univars (e.g. [bits[:?u_m + ?u_n]] vs [bits[:m + n]]) is genuinely ambiguous in isolation. But that broke the simpler single-univar case: [rewrite mkK] (where mkK has one bound idxvar) no longer constrains [?u_pat] from the [mk[:?u_pat]] head, so the univar stays unresolved and the matcher concludes "nothing to rewrite". Make the index unification best-effort: try to unify each pair, and if a particular pair fails (multi-univar case), silently continue and let arg matching constrain the residual univars later. The single-univar case (handled by Gap-B's naked-univar fast path) goes through normally. Type unification on Fop heads stays mandatory. Verified: the user's [rewrite mkK] case now works, alongside the earlier [bits_cat] / [catE] cases (240 decls).
…ivars [Ax.instantiate], [Op.reduce], and [process_named_pterm] all bind [Flocal n_lem -> f_of_tindex idx] alongside the [TIVar n_lem -> idx] tindex substitution. But the call site can supply an [idx] that still contains an unresolved [TIUnivar] — happens when the matcher invokes [Ax.instantiate] before the surrounding unification has pinned the univar (e.g. on a chain like [apply: inj_bits; rewrite bits_cat. rewrite bits_cat]). The asserting [f_of_tindex] then crashes with the Phase-2 assert. New [EcCoreFol.f_of_tindex_opt : tindex -> form option] returns [None] when [ti] still contains [TIUnivar]. The three substitution sites use it to silently skip the form-side binding in that case; the form-side then gets resolved later by [pte_idx_link] at [concretize_env] time, once the univar is pinned. The asserting variant [f_of_tindex] stays for callers that have proven all univars are resolved. Verified: the user's [catA] / [apply: inj_bits; rewrite bits_cat] proof works (255 decls in tests/indexed-types.ec).
[h_tvar] is a [ty_params] record carrying both [tyvars] and
[idxvars], but the goal/hyps printers only displayed the type
variables. Lemmas with index binders (e.g. [{m n}]) showed an empty
"Type variables: <none>" line and no clue that [m, n] were in scope
as int-typed indices.
[pp_goal1] and [pp_hyps] now register the idxvars in the printer
env (via [PPEnv.add_locals]) and emit an "Index variables: m, n"
line above the existing "Type variables:" line whenever there is
at least one idxvar. The line is omitted entirely when no idxvars
are bound, so non-indexed lemmas look unchanged.
Verified: full regression (255 decls) passes; the line appears
during interactive proofs of any lemma with [{...}] binders.
Idxvars are non-negative integers by Phase-2 design. Expose this in
proofs by wrapping the goal — at proof-start time, not at lemma-save
time — with one [0 <= n_i =>] implication per idxvar. The user
introduces them on demand via [move=> Hn_i].
The wrapping is pushed INSIDE the lemma's [pa_vars] forall (not at
the very top) so the existing auto-introduction of [pa_vars] still
fires. So [lemma foo {n} (xs : T<:n>) : P] still auto-intros [xs];
the [0 <= n =>] hypothesis appears next, available via [move=>].
The implications never leak into the saved [ax_spec]: only the
proof goal sees them. Lemma application by other lemmas does not
require discharging [0 <= n] — the indexed-type discipline
guarantees it.
Also extend [EcSmt.lenv_of_tparams_for_hyp] to register each idxvar
as a top-level int constant in [te_lc], so [smt()] can talk about
the bound idxvars (else [trans_app]'s [Flocal] case would hit
[oget None]).
Verified: full regression (271 decls) plus three new test cases —
[idx_ge0_simple] (no [pa_vars]), [idx_ge0_smt] (multi-idxvar,
discharged via [smt()]), and [idx_with_args] ([pa_vars] auto-intro
still works alongside the new implications).
The Gap-B MVP scope already excluded multi-univar Diophantine but memory.md framed it as a defer. It's actually a non-goal: equations like [?m + ?n = 5] have no canonical nat solution, so the unifier correctly refuses rather than guess. Document the workaround (explicit [:...] instantiation) and the only sensible future direction (constraint-accumulate-and-commit), gated on a motivating use case.
FIFO queue order caused the type unifier to fail spuriously on polynomial index equations whose univars got resolved later in the queue. Concrete case: unifying word<:n*m> -> word<:m> array<:n> (lhs) against word<:?m_pack> array<:?n_pack> -> word<:?n_pack * ?m_pack> (pack's opened type, from the user's [packK : cancel unpack pack] lemma). Decomposition pushed [IxUni(n*m, ?n_pack * ?m_pack)] BEFORE the per-component [IxUni(m, ?m_pack)] and [IxUni(n, ?n_pack)]. The mixed-monomial case with unresolved univars is ambiguous in isolation so unify_ix failed — aborting the whole unification. Fix: when unify_ix fails, add the problem to a [deferred] list instead of raising. After the main queue drains, retry the deferred problems; each successful assignment triggers another drain pass. Only fail if a full pass makes no progress. Verified: the user's [packK] lemma ([cancel unpack[:m, n] pack]) typechecks; full regression (292 decls) passes. Does NOT solve genuinely multi-univar cases like [?m + ?n = 5] — those are still refused (no canonical solution) per the design note in memory.md.
The previous default of auto-injecting [0 <= n =>] for every idxvar
in a lemma's proof goal was too intrusive — users who don't need
the hypothesis paid the cost of an extra [move=>] to skip it. Make
it opt-in instead: a trailing `+` on the idxvar name in the binder
marks it as "inject the [0 <= n] hypothesis", everything else stays
as-is.
Examples:
lemma foo {n} : ... (* no hypothesis *)
lemma bar {n+} : ... (* [0 <= n] available via [move=> Hn] *)
lemma baz {m+ n} : ... (* only [0 <= m] is injected *)
Parser: new [idxvar_item] rule matches [lident PLUS] or plain
[lident]; [idxvars_decl] returns [(psymbol * bool) list];
[ix_ty_binder] returns [(idxs, nonneg_subset, tyvars_opt)].
Parsetree: new [paxiom.pa_idxvars_nneg : psymbol list] records the
`+`-marked idxvars.
Scope: [start_lemma] / [start_lemma_with_proof] gain an optional
[?nneg_idxs] argument; [add_r] looks up the marked names against
the axiom's [ax_tparams.idxvars] and threads the list through. Only
those idxvars get a [0 <= n =>] implication injected inside the
[pa_vars] forall. Non-lemma binders (op / pred / type / etc.)
accept the `+` syntax but ignore it.
Regression updated to use `+` where the hypothesis was relied on.
Silently ignoring the [+] idxvar marker outside lemma / axiom
binders was misleading — the mark looked accepted but did nothing.
Raise a parse error instead so the user sees exactly where the
marker is meaningless.
New helper [reject_nonneg_marker] in the parser prologue, called
from every non-lemma consumer of [ix_ty_binder] (operator, pred,
inductive predicate, notation, abbreviation) plus the type-decl
sites ([tyd_name], [clone_override.type]). Each produces a clear
parse error mentioning both the idxvar name and the declaration
kind:
type {n+} word. (* parse error: the `+' marker on idxvar `n'
only applies to lemma / axiom binders, not to
type declarations *)
Verified with a fresh rejection test per declaration kind, plus
the full regression (304 decls).
[LowSubst.is_eq_for_subst] walks a formula's free variables and calls [LDecl.by_id] for each to check whether the local has a definition that needs unfolding. But idxvars live in [h_tvar.idxvars], not [h_local], so [by_id] would error out with [unknown identifier `n/...`, please report] — crashing the [subst] tactic and any [move=> ->] / [move=> ->>] that destructured a hypothesis mentioning a bound idxvar. Fix: the [add] helper now consults [h_tvar.idxvars] first and skips ([by_id] is only meaningful for proper locals / hypotheses). Idxvars are constants from the perspective of substitution — they have no body to walk through, so skipping them is sound. Verified with the user's [eqP] proof and a new regression case that uses [move=> ->] on a hypothesis [xs = ys] where both have type [int vec<:n>] (the idxvar [n] would previously crash the substitution).
Idxvars occupy a hybrid namespace by Phase-2 design: listed in [h_tvar.idxvars] like type params, but also semantically int-typed locals usable in formulas. Every pass that walks formulas / locals / hypotheses has to be taught about this. We've patched seven such sites individually (bind_idx_locals, Op.reduce, pte_idx_link, lenv_of_tparams_for_hyp, LowSubst.add, Fop matcher, deferred IxUni) — each fix small and correct, but the pattern keeps recurring. Document the pattern, the existing patch sites, and three architectural directions to consider if the bug-of-the-week becomes a problem: (1) unify idxvars with formula-locals at AST level, (2) add a [Findex of tindex] form constructor, (3) status quo with an audit checklist for new passes.
[LDecl.init] only registers tyvars; idxvars stayed in [h_tvar.idxvars] but the proof env's local context didn't list them as int [LD_var]s. So a tactic argument like [exists n] (where [n] is a bound idxvar used as the existential witness) failed with [unknown variable or constant: `n']. [start_lemma] now passes [idx_locals] — one [(id, LD_var (tint, None))] per idxvar — as [~locals] to [LDecl.init], so the proof env knows about them from the start. The user can now use idxvars as values anywhere in tactic arguments. This is the seventh patch site for the idxvar-namespace fragility documented in memory.md. Verified with the user's [wordW] proof ([exists (mkseq (fun i => w.[i]) n)]) and a new regression test exercising [exists n] as the witness.
The previous patch added idxvars to [h_local] via [LDecl.init ~locals], creating a duplicate binding for [n]: once as a tparam ([h_tvar.idxvars]) and once as a regular local ([h_local]). That's the wrong design — idxvars should stay solely as tparams. Fix: [LDecl.init] now registers each idxvar via [Var.bind_local id tint] directly in [le_env] (the env exposed via [toenv]) without adding them to [h_local]. Name lookups (e.g. for tactic arguments like [exists n]) resolve the idxvar through the env path; the canonical record stays single (in [h_tvar.idxvars] only). Same observable behaviour as the previous patch (327 decls pass) without the duplication.
Users can now write [op[:_, m]] to fill some index positions
explicitly while letting the typechecker infer the rest. Each [_]
allocates a fresh [TIUnivar] in the unienv; the surrounding
context (call-site argument types, polynomial-equation deferred
retry, etc.) pins the univar via the existing unification path.
Parser: new [PIhole] variant in [pindex_r], matched by
[UNDERSCORE] in [pindex_atom].
Typer: [transtindex] for [PIhole] calls
[EcUnify.UniEnv.idx_fresh] (now exposed in the .mli).
Examples:
op concat {m n} ['a] : 'a vec<:m> -> 'a vec<:n> -> 'a vec<:m+n>.
concat[:_, 5] xs ys (* infer first, [5] explicit *)
concat[:m, _] xs ys (* [m] explicit, infer second *)
concat[:_, _] xs ys (* equivalent to [concat xs ys] *)
Useful when the type-directed inference is ambiguous on one
position but pinned on another by an annotation. Verified with
three new test cases (345 decls total).
When opening a lemma whose body references an idxvar `n` as both an
index (TIVar) and an int-typed Flocal (Phase-2 hybrid namespace), the
matcher used to compare the lemma's `Flocal n_lem` against the
caller's `Flocal m_caller` and fail (distinct idents). Tactics like
`rewrite usesN` and `rewrite packE` therefore reported "nothing to
rewrite" whenever the axiom referenced `n` as an int term.
Fix in two parts:
1. [pt_of_uglobal_r] / [process_named_pterm]: for each idxvar that is
referenced in the body's [f_fv], substitute [Flocal id_ax] to a
fresh form-evar registered in [pte_ev] and link it to the tindex
univar via [pte_idx_link]. Skipping idxvars not in [f_fv] avoids
leaving orphan evars that block [can_concretize].
2. [can_concretize] now calls [propagate_idx_link] first to keep the
two namespaces in sync regardless of which side the matcher pinned:
- form-evar bound to a form projecting into a tindex => resolve
the linked tindex univar via [unify_idx];
- tindex univar resolved to a [TIVar] / [TIConst] => set the
linked form-evar to the matching int formula.
Resolves the divergence between the indexed-types work (integer-indexed
type parameters: targs = {indices; types}, op/tydecl tparams as
{idxvars; tyvars} records, index unification) and origin/main's new
features (exceptions, subtypes, tyd_clinline inline-clone, PPAny match
patterns, classify_application-based op-selection diagnostics, hoare
hsi_m/hsi_inv split).
Conflict resolutions (12 files):
- ecTypes/ecUnify/ecTyping/ecTheoryReplay/ecScope/ecSection/ecHiGoal/
ecPrinting/ecParser/ecThCloning/ecPhlPrRw: combine HEAD's record-shaped
targs/ty_params + index threading with origin/main's new features.
- select_op result now carries call-site indices ((path, ixs, tvs), ...);
threaded through ecTyping constructor/record-projection paths.
- ecTheoryReplay: keep origin/main's fix-989 (ctor-remap HACK lifted out of
the Inline-only branch) atop the record-shaped params.
- ecPrinting try_pp_notations: use origin/main's result-typed op form
(f_op p ~tyargs:tv rty) rather than the function-typed variant, so infix
operators print correctly again.
- ecFol.proj_distr_ty: keep origin/main's lenient "unary constructor"
match so distr projections in outline/encryption theories don't assert.
- xop_override/preoperator `Direct payloads use EcIdent.t list (origin/main
semantics) now that EcDecl.ty_params is a record.
- op-application-errors.ec: one expect-fail case updated to the richer
diagnostic HEAD's unifier emits (shows the inferred `'a = int`).
Validated with -no-eco: dune build clean; stdlib 125/125; unit 86/86;
exception+circuit 21/21; tests/ 65/65; branch's indexed-types.ec passes.
Picks up origin/main's `defer failed-application classification (perf)` (e1a1e84) and auto-unfold of some Logic operators (88096fb). Only conflict was src/ecUnify.ml select_op_outcomes: origin/main split op-selection into a unify-based `select` (OK via mk_ok) plus a lazy `KO (lazy (classify ...))` for deferred error classification. Adopted that structure and threaded the indexed-types call-site indices through it (mk_ok now carries ixs into OK ((path, ixs, tvs), ...); classify uses op_tparams.tyvars). Validated with -no-eco: build clean; stdlib 125/125; tests/ 65/65; expect / op-application-errors / indexed-types pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Adds integer-indexed types to EasyCrypt: type constructors parameterised by both type variables and natural-number indices, with a small index language and decidable index equality. This lets you express size-carrying types like
'a vec<:n>and track index arithmetic through operators, lemmas, and cloning.