
Cohesive ARI Formalization

ARI solves a structured prediction problem over a bipartite graph between source and target field paths. The objective is to infer a globally consistent mapping via maximum a posteriori (MAP) inference in a constrained graphical model.

  • Problem setup
  • MAP objective
  • Energy decomposition
  • Feature representation
  • Unary scoring
  • Pairwise / structured scoring
  • Constrained optimization
  • One-to-one constraints
  • Type / ontology constraints
  • Structural constraints
  • Solution
  • Training objective
  • References

Problem setup

Let the sets of source and target field paths be:

S = \{s_i\} \qquad T = \{t_j\}

A mapping is a binary relation R:

R \subseteq S \times T

Define indicator variables:

x_{s,t} \in \{0,1\}, \quad (s,t) \in S \times T

where x_{s,t} = 1 means that source field s is mapped to target field t.

Then, equivalently, R can be written as:

R = \{(s,t) \mid x_{s,t} = 1\}
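As a concrete sketch of this setup (the field paths are hypothetical), the indicator variables can be held in a boolean matrix and the relation R recovered from its nonzero entries:

```python
import numpy as np

# Hypothetical source and target field paths.
S = ["user.name", "user.email", "user.age"]
T = ["fullName", "emailAddress", "ageYears", "createdAt"]

# x[i, j] = 1 means source S[i] is mapped to target T[j].
x = np.zeros((len(S), len(T)), dtype=int)
x[0, 0] = 1  # user.name  -> fullName
x[1, 1] = 1  # user.email -> emailAddress

# Recover the relation R = {(s, t) | x_{s,t} = 1}.
R = [(S[i], T[j]) for i, j in zip(*np.nonzero(x))]
```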

MAP objective

ARI defines a Gibbs distribution over mappings:

P(R) \propto \exp\left(-E(R)\right)

and seeks the MAP solution:

R^* = \arg\min_R E(R)

Energy decomposition

The energy decomposes into unary and pairwise terms:

E(x) = \sum_{(s,t)\in S\times T} \psi_u(s,t)\, x_{s,t} + \sum_{(s,t)\neq(s',t')} \psi_p\big((s,t),(s',t')\big)\, x_{s,t}\, x_{s',t'}

where:

  • \psi_u encodes local compatibility
  • \psi_p encodes structural consistency

This corresponds to a pairwise Markov random field over candidate matches.
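A minimal numeric sketch of this energy, with invented potentials: the unary terms are random scores, and the pairwise term is a toy structural-consistency penalty that charges 1 for each ordered pair of crossing matches (mirroring the ordered double sum above).

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_s, n_t = 3, 3

# Invented unary potentials psi_u(s, t): lower = more compatible.
psi_u = rng.normal(size=(n_s, n_t))

def psi_p(m1, m2):
    """Toy pairwise potential: penalize crossing matches."""
    (s1, t1), (s2, t2) = m1, m2
    return 1.0 if (s1 - s2) * (t1 - t2) < 0 else 0.0

def energy(matches):
    """E(x) restricted to the active matches (x_{s,t} = 1)."""
    e = sum(psi_u[s, t] for s, t in matches)
    e += sum(psi_p(m1, m2) for m1, m2 in itertools.permutations(matches, 2))
    return e
```

Crossing matches such as [(0, 1), (1, 0)] pick up a pairwise penalty of 2 (once per ordered pair) on top of their unary terms, while order-preserving matches do not.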

Feature representation

Each candidate pair is mapped to a feature vector:

\phi : S \times T \to \mathbb{R}^d

denoted \mathbf{x}_{s,t} (bold, to distinguish the feature vector from the indicator x_{s,t}) for a pair (s,t):

\mathbf{x}_{s,t} = \phi(s,t)

Typical features include:

  • Lexical similarity
  • Structural context
  • Ontology compatibility
  • Embedding similarity: \phi_{\text{emb}}(s,t) = \langle f(s), g(t) \rangle
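A sketch of a two-dimensional feature map combining a lexical and an embedding feature. The trigram-Jaccard similarity and the precomputed vectors standing in for the encoders f and g are illustrative choices, not the production feature set:

```python
import numpy as np

def phi(s_vec, t_vec, s_name, t_name):
    """Feature vector for a candidate pair (illustrative subset)."""
    # Lexical similarity: Jaccard overlap of character trigrams.
    grams = lambda w: {w[i:i + 3] for i in range(len(w) - 2)}
    gs, gt = grams(s_name.lower()), grams(t_name.lower())
    lex = len(gs & gt) / max(len(gs | gt), 1)
    # Embedding similarity phi_emb = <f(s), g(t)>, here a plain dot
    # product of precomputed vectors standing in for f and g.
    emb = float(np.dot(s_vec, t_vec))
    return np.array([lex, emb])

v = phi(np.array([1.0, 0.0]), np.array([0.5, 0.5]), "email", "emailAddress")
```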

Unary scoring

Unary potentials (scores) are parameterized as:

\psi_u(s,t) = -f_\theta(\mathbf{x}_{s,t})

Examples:

  • Linear: f_\theta = w^\top \mathbf{x}
  • Tree/MLP models (GBDT, neural scoring)

Candidate pruning retains, for each source s, the k highest-scoring targets from its initial candidate set C(s):

C_k(s) = \operatorname*{Top\text{-}k}_{t \in C(s)} f_\theta(\mathbf{x}_{s,t})
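The pruning step can be sketched with a linear scorer f_theta = w^T x; the weights and per-target feature vectors below are invented for illustration:

```python
import numpy as np

def top_k_candidates(w, feats, candidates, k=2):
    """Keep the k targets with the highest linear scores w^T x."""
    scores = {t: float(w @ feats[t]) for t in candidates}
    return sorted(candidates, key=lambda t: -scores[t])[:k]

# Invented feature vectors for the targets in C(s).
feats = {
    "fullName":     np.array([0.9, 0.1]),
    "emailAddress": np.array([0.2, 0.8]),
    "createdAt":    np.array([0.1, 0.1]),
}
w = np.array([1.0, 0.5])
C_k = top_k_candidates(w, feats, list(feats), k=2)
```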

Pairwise / structured scoring

Pairwise potentials capture dependencies:

\psi_p((s,t),(s',t')) = -p_\theta((s,t),(s',t'))

Examples:

  • Cross-encoder: p_\theta = h_\theta(s,t,s',t')
  • Structured models (CRF / GNN): p_\theta = \psi_\theta(\mathcal{G}, (s,t), (s',t'))

These enforce:

  • Structural alignment
  • Co-occurrence patterns
  • Ontological consistency
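As one toy instance of a structural-alignment score, a pairwise term can reward a pair of matches in which one match's fields are the parents of the other's in their respective path hierarchies. The `parent` helper and the bonus value are hypothetical:

```python
def parent(path):
    """Parent of a dotted field path, or None at the root."""
    return path.rsplit(".", 1)[0] if "." in path else None

def p_theta(s, t, s2, t2, bonus=1.0):
    """Toy pairwise score: reward hierarchical co-alignment, where
    (s, t) are the parents of (s2, t2) on both sides."""
    if parent(s2) == s and parent(t2) == t:
        return bonus
    return 0.0

aligned = p_theta("user", "account", "user.name", "account.fullName")
```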

Constrained optimization

The MAP problem can be written as an integer quadratic program.

Using \psi_u(s,t) = -f_\theta(\mathbf{x}_{s,t}) and \psi_p((s,t),(s',t')) = -p_\theta((s,t),(s',t')), the MAP objective

R^* = \arg\min_R E(R)

becomes (equivalently)

\max_{x\in\{0,1\}^{|S|\times|T|}} \; \sum_{(s,t)} f_\theta(\mathbf{x}_{s,t})\, x_{s,t} + \sum_{(s,t)\neq(s',t')} p_\theta\big((s,t),(s',t')\big)\, x_{s,t}\, x_{s',t'}

subject to:

One-to-one constraints

\sum_{t} x_{s,t} \le 1 \quad \forall s \qquad \sum_{s} x_{s,t} \le 1 \quad \forall t
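On an assignment matrix, these constraints are just row and column sums bounded by 1, which is cheap to verify:

```python
import numpy as np

def is_one_to_one(x):
    """Each source maps to at most one target and vice versa:
    all row sums and all column sums are <= 1."""
    return bool((x.sum(axis=1) <= 1).all() and (x.sum(axis=0) <= 1).all())

ok  = np.array([[1, 0], [0, 1]])
bad = np.array([[1, 1], [0, 0]])  # one source mapped to two targets
```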

Type / ontology constraints

x_{s,t} = 0 \quad \text{if } \text{incompatible}(s,t)

Structural constraints

  • Mutual exclusion, for designated conflicting pairs: x_{s,t} + x_{s',t'} \le 1
  • Hierarchical consistency: x_{s,t} \le x_{\text{parent}(s),\, \text{parent}(t)}

This yields an ILP / quadratic optimization problem.
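For intuition, here is a brute-force exact solver for tiny instances: it enumerates every one-to-one partial mapping and scores it with the maximization objective above. This is exponential and stands in for the ILP/QP solver that would be used at realistic sizes; the unary scores below are invented.

```python
import itertools
import numpy as np

def map_inference(f, p, n_s, n_t):
    """Exact MAP by enumerating all one-to-one partial mappings.
    f[s, t] is the unary score; p is the pairwise score, counted
    once per unordered pair of active matches."""
    best, best_score = [], 0.0  # the empty mapping scores 0
    for k in range(1, min(n_s, n_t) + 1):
        for srcs in itertools.combinations(range(n_s), k):
            for tgts in itertools.permutations(range(n_t), k):
                m = list(zip(srcs, tgts))
                score = sum(f[s, t] for s, t in m)
                score += sum(p(a, b) for a, b in itertools.combinations(m, 2))
                if score > best_score:
                    best, best_score = m, score
    return best, best_score

f = np.array([[2.0, 0.1], [0.1, 1.0]])
no_pair = lambda a, b: 0.0  # drop pairwise terms for this toy run
R_star, score = map_inference(f, no_pair, 2, 2)
```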

Solution

The optimal mapping is:

R^* = \{(s,t) \in C_k \mid x_{s,t} = 1\}

Training objective

The models are trained on a weighted mixture of heterogeneous datasets (pretraining data, gold labels, user feedback, and negative examples):

D = \alpha D_{\text{pre}} + \beta D_{\text{gold}} + \gamma D_{\text{feedback}} + \delta D_{\text{neg}}

Optimize:

\mathcal{L} = \lambda_1 \mathcal{L}_{\text{contrastive}} + \lambda_2 \mathcal{L}_{\text{hard-neg}} + \lambda_3 \mathcal{L}_{\text{ranking}}
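One plausible instantiation of this combined loss, as a NumPy sketch: an InfoNCE-style contrastive term, plus hinge terms for the hard-negative and ranking components. The specific losses, margins, temperature, and weights are assumptions for illustration, not the production objective:

```python
import numpy as np

def info_nce(pos, negs, tau=0.1):
    """Contrastive loss: -log softmax probability of the positive score."""
    logits = np.concatenate(([pos], negs)) / tau
    logits -= logits.max()  # numerical stability
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))

def hinge(pos, neg, margin=0.5):
    """Ranking / hard-negative loss: positive must beat negative by margin."""
    return float(max(0.0, margin - (pos - neg)))

pos, negs = 0.9, np.array([0.2, 0.1])
hard_neg = 0.7  # a mined near-miss negative
loss = (1.0 * info_nce(pos, negs)
        + 1.0 * hinge(pos, hard_neg)
        + 1.0 * hinge(pos, float(negs.max())))
```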

References

  • Cohesive ARI Architecture
  • Cohesive Systems
