ALMA Componentes receives Purchase Orders where the Pr.tarifa,
Dto, and IMPORTE columns are systematically empty — the
customer sends only quantities and expects ALMA to price the order from its
master tariff. We formalize the pricing step as a lookup and aggregation problem over
a two-table cross-reference graph and prove that the output is deterministic, auditable,
and invariant to input-naming drift.
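Let a Purchase Order be a set L = {ℓ1, …, ℓn} of n line items
where each ℓi = (pi, ri, qi, ui, di) with
pi the customer's own article code,
ri the supplier-side reference (Su Referencia), optional,
qi the ordered quantity,
ui the unit price (Pr.tarifa), optional, with ui = ⊥ when the column is empty,
di the discount fraction (Dto).
Stage 4.5 must return a pricing assignment π : L → ℝ≥0 × ℝ≥0 mapping each line to (unit_price, line_total). When ui ≠ ⊥, Stage 4.5 honors the extracted price; when ui = ⊥, it must derive one.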
Let T = TGPAO ∪ TAXALYS be the disjoint union of the two
tariff tables. TGPAO keys the 155 internal ERP codes (T01A1XX1, …),
TAXALYS keys the 158 customer-facing refs (07MA1P250, 11A1, …).
Each node carries a canonical price ρ(t) ∈ ℝ>0.
An alias function A : Σ* → Σ* ∪ {⊥} maps
customer-visible names to tariff keys (e.g. A(AUDAX1UCR6ZK00) = 07MA1P250).
A is a partial function; currently |dom(A)| = 14 for the AUDAX family.
Given a candidate string s, define Λ : Σ* → T ∪ {⊥} as the first-match cascade:
Λ(s) = A(s)   if A(s) ≠ ⊥                                        (alias)
       s      if s ∈ T                                           (exact)
       t      if ∃ t ∈ T : norm(t) = norm(s)                     (normalized)
       t      if ∃ t ∈ T : norm(s).startswith(norm(t)) ∧ k ≥ 6   (prefix)
       ⊥      otherwise
where norm(x) = x.strip("-_ ").upper() and k = min(|norm(s)|, |norm(t)|). The clauses are tried in order and the first match wins, so exactly one of the five cases is selected for any input, even when several conditions hold at once.
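The cascade can be sketched in Python as follows. This is a minimal illustration, not the production code: the function name `resolve`, the clause labels it returns, and the choice to scan T in sorted order (to break ties between several matching keys) are assumptions. The price attached to `11A1` is a placeholder, since the text does not state it.

```python
from typing import Optional

MIN_PREFIX_LEN = 6  # the k >= 6 guard of the prefix clause


def norm(x: str) -> str:
    """Strip leading/trailing '-', '_' and spaces, then uppercase."""
    return x.strip("-_ ").upper()


def resolve(s: str, aliases: dict[str, str],
            tariff: dict[str, float]) -> tuple[Optional[str], str]:
    """Return (tariff_key, clause) for the first clause that matches s."""
    if s in aliases:                      # alias clause: A(s) is defined
        return aliases[s], "alias"
    if s in tariff:                       # exact clause: s is itself a key
        return s, "exact"
    ns = norm(s)
    keys = sorted(tariff)                 # fixed scan order (assumption)
    for t in keys:                        # normalized clause
        if norm(t) == ns:
            return t, "normalized"
    for t in keys:                        # prefix clause with length guard
        nt = norm(t)
        if min(len(ns), len(nt)) >= MIN_PREFIX_LEN and ns.startswith(nt):
            return t, "prefix"
    return None, "none"                   # no clause matched


ALIASES = {"AUDAX1UCR6ZK00": "07MA1P250"}
TARIFF = {
    "07MA1P250": 6.08,
    "11A1": 1.00,  # placeholder price, not taken from the tariff
}

for candidate in ["AUDAX1UCR6ZK00", "07MA1P250", " 07ma1p250-",
                  "07MA1P250-XL", "11A1-BLUE", "CI450968"]:
    print(repr(candidate), resolve(candidate, ALIASES, TARIFF))
```

Note that `11A1-BLUE` does not resolve: the key `11A1` is a prefix of it, but k = 4 falls below the length guard, which is what keeps short refs from capturing unrelated strings.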
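For a line ℓi = (pi, ri, qi, ui, di), set
ti = Λ(pi) if Λ(pi) ≠ ⊥, otherwise Λ(ri).
Then the pricing assignment is
π(ℓi) = (ρ(ti), qi · ρ(ti) · (1 − di))   if ui = ⊥ and ti ≠ ⊥
        (ui, qi · ui · (1 − di))          if ui ≠ ⊥
A line with ui = ⊥ and ti = ⊥ stays unpriced (a ⊥-line). When ui ≠ ⊥ and ti ≠ ⊥, if |ui − ρ(ti)| / ρ(ti) ≥ τ we tag the line with a divergence flag. The default threshold is τ = 0.10.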
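The pricing step itself can be sketched as below. The sketch assumes that the line total is quantity × unit price × (1 − discount), as in the Centroalum worked example, and that totals are rounded to two decimals. The names `PricedLine` and `price_line` are hypothetical, and the 7.00 EUR customer price in the usage lines is made up to exercise the divergence flag.

```python
from dataclasses import dataclass
from typing import Optional

TAU = 0.10  # default divergence threshold from the text


@dataclass
class PricedLine:
    unit_price: Optional[float]
    line_total: Optional[float]
    divergent: bool = False


def price_line(q: float, u: Optional[float], d: float,
               tariff_price: Optional[float], tau: float = TAU) -> PricedLine:
    """Price one line; tariff_price is rho(t_i), or None when t_i is unresolved."""
    if u is not None:
        # The PO states a price: honor it, and flag it when it deviates
        # from the tariff by at least tau (relative to the tariff price).
        divergent = (tariff_price is not None
                     and abs(u - tariff_price) / tariff_price >= tau)
        return PricedLine(u, round(q * u * (1 - d), 2), divergent)
    if tariff_price is not None:
        # No price on the PO: derive it from the tariff.
        return PricedLine(tariff_price, round(q * tariff_price * (1 - d), 2))
    # Neither a stated price nor a tariff match: the line stays unpriced.
    return PricedLine(None, None)


derived = price_line(q=60, u=None, d=0.0, tariff_price=6.08)
flagged = price_line(q=60, u=7.00, d=0.0, tariff_price=6.08)  # hypothetical u
print(derived)
print(flagged)
```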
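Determinism. Λ is deterministic: each clause of the cascade either fires or not, and earlier clauses dominate. Provided that the normalized and prefix clauses scan T in a fixed order, so that ties between several matching keys always resolve the same way, two queries with the same input return the same tariff row.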
Auditability. Each priced line carries the tuple
(tariff_ref, tariff_price, tariff_source), so
Isabel can reconstruct ρ(ti) from the original XLSX at any time.
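As an illustration of how such a tuple supports an audit, the sketch below compares a stored trail with the price currently in the tariff. It is an assumption-laden example: `TariffTrail`, `verify_trail`, and the tolerance of half a cent are hypothetical, and a plain dict stands in for rows that would be loaded from the XLSX.

```python
from typing import NamedTuple


class TariffTrail(NamedTuple):
    tariff_ref: str      # key of the matched tariff row
    tariff_price: float  # canonical price recorded at pricing time
    tariff_source: str   # provenance label, e.g. "axalys:Audax ALU UCR"


def verify_trail(trail: TariffTrail, tariff_rows: dict[str, float],
                 tol: float = 0.005) -> bool:
    """True when the recorded price still matches the tariff row."""
    current = tariff_rows.get(trail.tariff_ref)
    return current is not None and abs(current - trail.tariff_price) <= tol


TARIFF_ROWS = {"07MA1P250": 6.08}  # stand-in for rows read from the XLSX
trail = TariffTrail("07MA1P250", 6.08, "axalys:Audax ALU UCR")
ok = verify_trail(trail, TARIFF_ROWS)
print(ok)
```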
Naming invariance. Adding an alias entry A(s) = t reclassifies every historical ⊥-line whose p or r equals s, without re-extracting the PDF. This is immediate because A is read at import, not baked into the model.
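The effect can be illustrated with a small sketch. For brevity the `lookup` helper below covers only the alias and exact clauses, and the `history` list is a hypothetical store of already-extracted lines; both are assumptions for illustration.

```python
from typing import Optional

TARIFF = {"07MA1P250": 6.08}


def lookup(s: str, aliases: dict[str, str],
           tariff: dict[str, float]) -> Optional[str]:
    """Alias and exact clauses only (simplified for this sketch)."""
    key = aliases.get(s, s)
    return key if key in tariff else None


def unresolved(lines: list[dict], aliases: dict[str, str],
               tariff: dict[str, float]) -> list[dict]:
    """Lines for which neither p nor r resolves to a tariff key."""
    return [ln for ln in lines
            if lookup(ln["p"], aliases, tariff) is None
            and lookup(ln["r"], aliases, tariff) is None]


# Already-extracted line; the PDF is never touched again.
history = [{"p": "CI450968", "r": "AUDAX1UCR6ZK00", "q": 60}]

aliases: dict[str, str] = {}
before = unresolved(history, aliases, TARIFF)

aliases["AUDAX1UCR6ZK00"] = "07MA1P250"  # add one alias entry
after = unresolved(history, aliases, TARIFF)

print(len(before), len(after))
```

Because the alias table is consulted at lookup time, the same stored line moves from unresolved to resolved as soon as the entry exists.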
Per-line lookup is O(1) for the alias and exact clauses (dict lookup) and O(|T| · k) in the worst case, when the normalized and prefix clauses have to scan T. For |T| = 313 and k ≤ 20 this is < 50 μs per line on the production EC2 instance. Full-PO overhead is dominated by the VLM stages 0–2, so pricing enrichment is essentially free.
Centroalum PO PC26-001203, line 3:
p = "CI450968" # Centroalum internal
r = "AUDAX1UCR6ZK00" # ALMA/AXALYS ref
q = 60, u = ⊥, d = 0
Λ(p) = ⊥ (not in T)
Λ(r) = A("AUDAX1UCR6ZK00")
= "07MA1P250" (alias hit)
ρ("07MA1P250") = 6.08 EUR (axalys:Audax ALU UCR)
π = (6.08, 60 × 6.08 × 1.00) = (6.08, 364.80)
This is exactly the number shown in /ui today.
articles.json's synonyms column closes this.customer_id.