Struct OptimizeOptions

Defined in File options.hpp

Struct Documentation

struct OptimizeOptions

Options that control behavior of sequant::optimize.

Public Members

ObjectiveFunction objective_function = ObjectiveFunction::DenseFLOPs: Objective function to minimize.

ReorderSum reorder = ReorderSum::Reorder: Whether to reorder summands so terms with shared intermediates appear closer to each other.

CSEOptions CSE = {}: Common-subexpression-elimination options. All disabled by default; enabling can reduce op counts at the cost of additional optimization time.

index_to_extent_t idx_to_extent = {}: Caller-supplied Index-to-extent provider. If empty, defaults to IndexSpace::approximate_size().

std::function<double(Index const&, std::size_t)> inner_pow = {}: Optional k-aware inner-composite extent for CSV/PNO tensor-of-tensor indices, used by the peak objectives’ footprint accounting. For a group of k composites sharing a proto-index set, inner_pow(composite, k) should return the k-th power mean of the per-pair domain size d, that is ((Sum_pairs d^k) / nocc^N)^(1/k). The product of this extent over the k composites in the group, times the outer nocc^N, then equals the true block-sparse volume Sum_pairs d^k. If empty, composites fall back to idx_to_extent (the k=1 case), which grossly under-sizes multi-composite tensors.
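The power-mean relation above can be checked numerically. The following is a minimal sketch, not SeQuant code: power_mean_extent and group_volume are hypothetical helpers, and the per-pair domain sizes are passed in as a plain vector whose length plays the role of nocc^N.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SeQuant): k-th power mean of the per-pair
// domain sizes d, i.e. (sum_pairs d^k / n_pairs)^(1/k), with n_pairs = nocc^N.
double power_mean_extent(const std::vector<double>& domain_sizes,
                         std::size_t k) {
  assert(k > 0 && !domain_sizes.empty());
  double sum = 0.0;
  for (double d : domain_sizes) sum += std::pow(d, static_cast<double>(k));
  const double n_pairs = static_cast<double>(domain_sizes.size());
  return std::pow(sum / n_pairs, 1.0 / static_cast<double>(k));
}

// Volume estimate for a group of k composites that all use the same
// per-composite extent: n_pairs * extent^k.
double group_volume(double extent, std::size_t k, std::size_t n_pairs) {
  return static_cast<double>(n_pairs) *
         std::pow(extent, static_cast<double>(k));
}
```

For two pairs with domain sizes 1 and 3 and k = 2, the true block-sparse volume is 1 + 9 = 10. The k = 2 power mean is sqrt(5), and group_volume recovers 2 * 5 = 10. Using the k = 1 mean (2) for both composites gives 2 * 2^2 = 8, an under-estimate.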

BatchPolicy batch_policy = {}

Batchability policy: bundles the three per-index and per-leaf predicates that govern batched evaluation. All three fields default to empty (no batchable indices, no volatile leaves). The three sub-fields are:

is_batchable_index: marks an Index as living in a batchable space (e.g. DF/RI auxiliary indices). This is the same predicate as the eval cache’s accept_aux.
batch_target_size: per-index slice size (an upper bound). A sliced batchable index ix contributes min(extent, batch_target_size(ix)), a conservative over-estimate of the realized tile-floored batch. Only consulted by DensePeakSizeBatched.
is_volatile_leaf: marks a leaf tensor as volatile (its value changes between replays). If empty, no tensor is volatile, so cost weighting is disabled and volatile_weight is ignored. CC callers pass label==volatile_label, the same classification the runtime eval cache uses, so the optimizer’s cost model and the cache agree.
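The min(extent, batch_target_size(ix)) rule can be illustrated with a small footprint calculation. This is a sketch under stated assumptions: IndexInfo and batched_footprint are hypothetical names, and the real optimizer obtains extents and batchability through the BatchPolicy callables rather than a plain struct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for what the policy reports about one index.
struct IndexInfo {
  std::size_t extent;        // full extent of the index
  bool batchable;            // what is_batchable_index would return
  std::size_t batch_target;  // what batch_target_size would return
};

// Element count of a tensor when every batchable index is sliced:
// a batchable index contributes min(extent, batch_target), any other
// index contributes its full extent.
double batched_footprint(const std::vector<IndexInfo>& indices) {
  double elements = 1.0;
  for (const IndexInfo& ix : indices) {
    assert(ix.extent > 0);
    const std::size_t contribution =
        ix.batchable ? std::min(ix.extent, ix.batch_target) : ix.extent;
    elements *= static_cast<double>(contribution);
  }
  return elements;
}
```

A tensor with two non-batchable indices of extent 100 and one batchable index of extent 1000 sliced to 64 is counted as 100 * 100 * 64 = 640000 elements instead of 10000000. A batchable index whose extent (32) is already below the target (64) contributes its full extent.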

double volatile_weight = 1.0: Real-valued weight on the cost of each volatile contraction (one that is re-evaluated on every replay of the network); persistent (volatile-independent) contractions are counted once. Conceptually, this is the expected number of replays. Default 1.0 (no change). Only consulted when batch_policy.is_volatile_leaf is non-empty and objective_function == ObjectiveFunction::DenseFLOPs.
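A minimal sketch of the weighting described above, assuming each contraction has already been classified as volatile or persistent. The Contraction struct and weighted_cost function are hypothetical; how SeQuant propagates volatility from leaves to contractions is not shown here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record for one binary contraction in a candidate tree.
struct Contraction {
  double flops;      // FLOP count of this contraction
  bool is_volatile;  // true if it must be redone on every replay
};

// Total cost: persistent contractions count once, volatile contractions
// are scaled by volatile_weight (the expected number of replays).
double weighted_cost(const std::vector<Contraction>& tree,
                     double volatile_weight) {
  assert(volatile_weight >= 0.0);
  double total = 0.0;
  for (const Contraction& c : tree)
    total += c.is_volatile ? volatile_weight * c.flops : c.flops;
  return total;
}
```

Consider tree A (100 persistent FLOPs, 10 volatile) and tree B (20 persistent, 50 volatile). With volatile_weight = 1.0 the costs are 110 and 70, so B is preferred. With volatile_weight = 20.0 they are 300 and 1020, so A is preferred because it keeps the expensive work in the persistent part.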

double peak_flops_tolerance = 0.10: Relative peak tolerance for the peak objectives’ final selection: among the Pareto frontier of (peak, flops) trade-offs, pick the fewest-flops schedule whose peak is within a factor of (1 + peak_flops_tolerance) of the minimum peak. 0 = strict peak-min (flop tie-break only on exact peak ties). The default 0.10 trades up to a 10% peak increase for a flop reduction that is often much larger, e.g. forming a persistent 4-PNO integral instead of recomputing a particle-ladder. Only consulted by DensePeakSize / DensePeakSizeBatched.
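The selection rule can be sketched as follows. This is an illustration only: Schedule and select_schedule are hypothetical names, and the tie-break on equal flops (prefer the lower peak) is an assumption not stated in the documentation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical (peak, flops) point on the Pareto frontier.
struct Schedule {
  double peak;   // peak memory footprint
  double flops;  // total FLOP count
};

// Pick the fewest-flops schedule whose peak is at most
// (1 + tolerance) * min_peak. With tolerance == 0 only exact
// minimum-peak schedules qualify, so flops act purely as a tie-break.
Schedule select_schedule(const std::vector<Schedule>& frontier,
                         double tolerance) {
  assert(!frontier.empty() && tolerance >= 0.0);
  double min_peak = frontier.front().peak;
  for (const Schedule& s : frontier) min_peak = std::min(min_peak, s.peak);

  const Schedule* best = nullptr;
  for (const Schedule& s : frontier) {
    if (s.peak > (1.0 + tolerance) * min_peak) continue;  // outside the band
    if (best == nullptr || s.flops < best->flops ||
        (s.flops == best->flops && s.peak < best->peak))
      best = &s;
  }
  return *best;  // the minimum-peak schedule always qualifies
}
```

For the frontier {(100, 1000), (108, 400), (150, 100)}, a tolerance of 0.10 admits peaks up to 110 and selects (108, 400), while a tolerance of 0 selects (100, 1000).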

double footprint_weight = 0.0

Per-intermediate memory-footprint penalty added to the single-term optimization cost. For every binary contraction, the storage footprint of its RESULT intermediate (the product of the extents of the result’s indices, i.e. its element count) is multiplied by this weight and added to the contraction cost. Unlike the FLOPs cost, this penalty is NOT scaled by volatile_weight (peak footprint is a one-time materialization cost, not a per-replay one). 0 (default) disables the penalty, recovering the pure FLOPs/Size behavior.

Rationale: the FLOPs cost is blind to the storage size of the intermediates it materializes. It will therefore happily pick an order (and thus expose a common subexpression as a shareable subtree) that carries a free large-space index, e.g. a half-transformed DF integral that still carries a free projected-AO (PAO) index, because forming it once is FLOPs-cheap. Such an intermediate can be enormous. A nonzero footprint_weight biases single-term optimization toward orders that defer or avoid materializing such large intermediates (e.g. transforming both large legs before exposing a shared subtree), trading a controlled amount of extra FLOPs for a lower peak footprint. Only consulted when objective_function == ObjectiveFunction::DenseFLOPs. The units are FLOPs per element, so a useful magnitude is on the order of the contracted-index extent that the offending intermediate would otherwise leave free.
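The effect of the penalty on the choice between two contraction orders can be sketched as below. Step and penalized_cost are hypothetical names, and the numbers are made up for illustration; the sketch only mirrors the stated rule that each contraction adds footprint_weight times the element count of its result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record for one binary contraction in a candidate order.
struct Step {
  double flops;            // FLOP count of the contraction
  double result_elements;  // product of the extents of the result's indices
};

// Single-term cost with the per-intermediate footprint penalty:
// sum over contractions of flops + footprint_weight * result_elements.
// footprint_weight == 0 recovers the pure FLOPs cost.
double penalized_cost(const std::vector<Step>& order,
                      double footprint_weight) {
  assert(footprint_weight >= 0.0);
  double total = 0.0;
  for (const Step& s : order)
    total += s.flops + footprint_weight * s.result_elements;
  return total;
}
```

Suppose order A costs 1000 FLOPs but materializes a 100000-element intermediate, while order B costs 5000 FLOPs with a 100-element intermediate. With footprint_weight = 0 A wins (1000 vs 5000). With footprint_weight = 0.1 A costs about 11000 and B about 5010, so B wins.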

RooflineParams roofline = {}: Roofline parameters for the peak objectives’ secondary (tie-break) cost; see RooflineParams. With machine_balance == 0 (the default), the tie-break is pure-flop (no behavior change). Consulted only by DensePeakSize / DensePeakSizeBatched.

bool prune_outer_products = true: Prune disconnected (outer-product) subsets from the single-term contraction DP: a subset whose induced subgraph is disconnected under the “share a contractible (non-target) index” relation is an outer product the optimal tree never forms, so skipping it prunes search space without changing the result. true (default) prunes; false searches the full (unpruned) space and is exposed mainly for validation. (The environment variable SEQUANT_DISABLE_OUTER_PRODUCT_PRUNING force-disables pruning regardless of this flag.)
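The connectivity test behind this pruning can be sketched with bitmasks. This is an assumed representation, not SeQuant's: tensors in a subset are bits of a 32-bit mask, and each tensor carries a bitmask of its contractible (non-target) indices. The function name is_connected is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// tensor_indices[i] is a bitmask of the contractible (non-target) indices
// carried by tensor i; `subset` is a bitmask of tensors. Returns true if the
// tensors in `subset` form a single connected component under the
// "share a contractible index" relation.
bool is_connected(const std::vector<std::uint32_t>& tensor_indices,
                  std::uint32_t subset) {
  assert(tensor_indices.size() <= 32);
  if (subset == 0) return false;
  std::uint32_t reached = subset & (~subset + 1u);  // lowest set bit as seed
  std::uint32_t reached_indices = 0;
  bool grew = true;
  while (grew) {
    grew = false;
    // Collect every index carried by a tensor reached so far.
    for (std::size_t i = 0; i < tensor_indices.size(); ++i)
      if (reached & (1u << i)) reached_indices |= tensor_indices[i];
    // Absorb subset members that share an index with the reached set.
    for (std::size_t i = 0; i < tensor_indices.size(); ++i) {
      const std::uint32_t bit = 1u << i;
      if ((subset & bit) && !(reached & bit) &&
          (tensor_indices[i] & reached_indices)) {
        reached |= bit;
        grew = true;
      }
    }
  }
  return reached == subset;
}
```

For a chain of three tensors carrying indices {a}, {a, b}, and {b}, the subsets {0, 1} and {0, 1, 2} are connected, while {0, 2} shares no index and is the kind of outer-product subset a pruning DP would skip.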