Architecture¶
IRx is organized as a small compiler pipeline with a deliberate boundary between semantic meaning and backend-specific lowering. The goal is to keep the codebase easy to extend without letting semantic rules slowly drift into code generation.
Design Goals¶
The current architecture is shaped by a few practical goals:
- Keep parsing, semantic analysis, and code generation as distinct phases.
- Make semantic analysis the authority for meaning and program validity.
- Keep backend packages focused on emission, not interpretation.
- Preserve method-based multiple dispatch for visitor-driven lowering.
- Use package structure to communicate architecture instead of large utility modules or generic helpers/ folders.
Pipeline Overview¶
IRx currently follows this high-level flow:
ASTx parser output -> semantic analysis -> resolved semantic sidecars -> backend code generation
The parser produces raw ASTx nodes. Those nodes are still close to surface
syntax and may not yet have enough information for direct lowering. The
semantic-analysis phase walks that tree, resolves symbols and types, validates
program rules, and attaches a structured node.semantic sidecar to the nodes
that backend code needs.
By the time a backend starts lowering, it should not need to infer meaning from raw syntax or re-run language validation from scratch.
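The flow above can be sketched in a few lines. This is a toy model and not IRx's actual classes: `Node`, `SemanticInfo`, and `emit` are hypothetical names, and the `analyze` function here is a simplified stand-in. Only the idea of a `semantic` sidecar attached to the node comes from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SemanticInfo:
    """Hypothetical sidecar: facts resolved during analysis."""
    resolved_type: str


@dataclass
class Node:
    """Hypothetical raw AST node, still close to surface syntax."""
    kind: str                                   # "int", "float", or "add"
    children: List["Node"] = field(default_factory=list)
    semantic: Optional[SemanticInfo] = None     # filled in by analyze()


def analyze(node: Node) -> Node:
    """Resolve types bottom-up and attach a sidecar to every node."""
    for child in node.children:
        analyze(child)
    if node.kind in ("int", "float"):
        resolved = node.kind
    else:
        # "add": promote to float if either operand is float
        kinds = {child.semantic.resolved_type for child in node.children}
        resolved = "float" if "float" in kinds else "int"
    node.semantic = SemanticInfo(resolved_type=resolved)
    return node  # same root object, now annotated


def emit(node: Node) -> str:
    """Backend stand-in: reads the sidecar and never re-derives the type."""
    assert node.semantic is not None, "run analyze() before emit()"
    if node.kind == "add":
        op = "fadd" if node.semantic.resolved_type == "float" else "add"
        return f"{op}({', '.join(emit(child) for child in node.children)})"
    return f"const_{node.semantic.resolved_type}"


tree = Node("add", [Node("int"), Node("float")])
root = analyze(tree)
lowered = emit(root)
```

The promotion decision lives entirely in `analyze`; `emit` only picks an instruction name from the already-resolved type.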
Semantic Analysis¶
The semantic-analysis package lives in src/irx/analysis/ and is intentionally
independent from LLVM or llvmlite.
It is responsible for:
- symbol resolution
- lexical scope tracking
- mutability and assignment validation
- function and return validation
- loop-control legality such as break and continue
- expression typing and promotion policy
- operator normalization
- semantic flag normalization such as unsigned and fast-math intent
- diagnostics collection and semantic error reporting
The public entry points are:
- irx.analysis.analyze(node)
- irx.analysis.analyze_module(module)
These entry points return the same AST root after attaching semantic sidecars.
If semantic validation fails, analysis raises SemanticError before codegen
begins.
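As a rough illustration of "collect diagnostics, then raise before codegen", the sketch below checks loop-control legality on a toy tree. `Stmt`, `check_loop_control`, and the shape of `SemanticError` (a list of messages) are assumptions made for the example; only the entry-point behavior, returning the same root or raising `SemanticError`, is taken from the text.

```python
from dataclasses import dataclass, field
from typing import List


class SemanticError(Exception):
    """Raised once, carrying every diagnostic collected during analysis."""

    def __init__(self, diagnostics: List[str]) -> None:
        super().__init__("; ".join(diagnostics))
        self.diagnostics = diagnostics


@dataclass
class Stmt:
    """Hypothetical statement node: 'block', 'while', 'break', or 'continue'."""
    kind: str
    body: List["Stmt"] = field(default_factory=list)


def check_loop_control(stmt: Stmt, in_loop: bool, diagnostics: List[str]) -> None:
    """Record an error for each break/continue that is not inside a loop."""
    if stmt.kind in ("break", "continue") and not in_loop:
        diagnostics.append(f"'{stmt.kind}' outside of a loop")
    for child in stmt.body:
        check_loop_control(child, in_loop or stmt.kind == "while", diagnostics)


def analyze(root: Stmt) -> Stmt:
    """Return the same root if it is valid; otherwise raise SemanticError."""
    diagnostics: List[str] = []
    check_loop_control(root, in_loop=False, diagnostics=diagnostics)
    if diagnostics:
        raise SemanticError(diagnostics)
    return root


valid = Stmt("block", [Stmt("while", [Stmt("break")])])
invalid = Stmt("block", [Stmt("break"), Stmt("continue")])

checked = analyze(valid)          # passes: break sits inside the while body
try:
    analyze(invalid)
    errors: List[str] = []
except SemanticError as exc:
    errors = exc.diagnostics      # both problems are reported together
```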
Why sidecars instead of a separate HIR?¶
For the current size of IRx, attaching explicit semantic sidecars to AST nodes is the lightest approach that still creates a clean boundary. It gives codegen resolved information without introducing a second full tree structure before it is needed.
If the language grows to the point where a true HIR becomes useful, the current phase split still leaves room for that evolution.
Shared Visitor Foundation¶
IRx also has a shared visitor layer in src/irx/base/visitors/.
It currently provides:
- BaseVisitorProtocol: the minimal typing contract shared by visitor-style classes
- BaseVisitor: a concrete Plum-dispatch scaffold with explicit NotImplementedError defaults for the current ASTx node surface
This keeps typing and runtime behavior separate:
- protocols define what visitor-like objects must expose
- the concrete base class defines what happens for unsupported nodes
In practice:
- SemanticAnalyzer inherits BaseVisitor
- BuilderVisitor inherits BaseVisitor
- backend-specific protocols such as llvmliteir.VisitorProtocol extend BaseVisitorProtocol
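A minimal sketch of that split, using only the standard library. `functools.singledispatchmethod` stands in for Plum's `@dispatch` here, and the node classes (`Node`, `IntLiteral`, `WhileStmt`) and the `visit` bodies are hypothetical; the real IRx classes cover far more of the ASTx surface.

```python
from functools import singledispatchmethod
from typing import Any, Protocol, runtime_checkable


class Node:
    """Hypothetical AST base class."""


class IntLiteral(Node):
    pass


class WhileStmt(Node):
    pass


@runtime_checkable
class BaseVisitorProtocol(Protocol):
    """Typing contract: what a visitor-like object must expose."""

    def visit(self, node: Node) -> Any: ...


class BaseVisitor:
    """Runtime scaffold: what happens when a node has no overload."""

    def visit(self, node: Node) -> Any:
        raise NotImplementedError(
            f"{type(self).__name__} does not support {type(node).__name__}"
        )


class SemanticAnalyzer(BaseVisitor):
    """Toy analyzer that only knows how to type integer literals."""

    @singledispatchmethod
    def visit(self, node: Node) -> Any:
        return super().visit(node)  # no overload: use the shared default

    @visit.register
    def _(self, node: IntLiteral) -> str:
        return "int32"


analyzer = SemanticAnalyzer()
conforms = isinstance(analyzer, BaseVisitorProtocol)
literal_type = analyzer.visit(IntLiteral())

try:
    analyzer.visit(WhileStmt())
    message = ""
except NotImplementedError as exc:
    message = str(exc)
```

The protocol never appears in the inheritance chain; it only describes the shape, while the failure behavior comes from the concrete base class.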
Backend Architecture¶
Each backend should live in its own package under src/irx/builders/. The
package path identifies the backend, while the classes inside the package use
short generic names.
For example, src/irx/builders/llvmliteir/ exposes:
- Builder
- Visitor
- VisitorProtocol
- optional VisitorCore as a module-private implementation class
This naming convention matters for future backends. A contributor adding a new backend should not need to invent unique class prefixes when the package path already provides the context.
llvmliteir Package Layout¶
The LLVM backend is split into first-class modules instead of one monolithic builder:
- ../src/irx/base/visitors/: shared visitor protocol and runtime scaffold
- facade.py: public backend entry points
- core.py: shared mutable lowering state and backend lifecycle
- protocols.py: typing contract used by mixins and runtime features
- types.py, casting.py, vector.py, strings.py, runtime.py: shared IR infrastructure
- visitors/: concern-grouped visit(...) overloads
Foundational modules stay at the package root because they are architectural components, not incidental helpers.
Why visit(...) Remains the Public Lowering Boundary¶
The codegen layer continues to use method-based Plum multiple dispatch:
visit(self, node: ...)
This remains the only public dispatch boundary for backend lowering. IRx does
not use a free-function dispatch registry or a second public API like
lower(...) or build_node(...).
That choice keeps backend code readable and local:
- AST-family-specific lowering remains attached to the visitor class.
- Mixins can group overloads by concern without changing the public surface.
- Shared lowering state stays on the visitor instance instead of moving into a registry-driven design.
Core Class and Protocol¶
VisitorProtocol and VisitorCore serve different purposes:
- VisitorProtocol defines the stable interface that mixins and runtime feature declarations depend on for typing, building on BaseVisitorProtocol.
- VisitorCore is the concrete implementation center that owns mutable state, module setup, helper methods, and backend lifecycle.
VisitorCore is still internal to the backend package. IRx uses
from public import private for module-level internal helpers and internal
implementation classes when a clear non-underscored name reads better than an
underscore-prefixed export. That keeps internal names readable without making
them part of the intended public surface.
The protocol is not a replacement for the core class. It exists so backend subsystems can depend on a narrow contract instead of the full concrete type.
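To illustrate the narrow-contract idea, the sketch below types a runtime-feature helper against a small protocol rather than the concrete core. The member names (`declare_external`, `declared`, `module_name`) and the `require_puts` helper are invented for the example and are not IRx's actual interface.

```python
from typing import List, Protocol


class VisitorProtocol(Protocol):
    """Narrow contract: only what runtime features need (hypothetical)."""

    def declare_external(self, name: str) -> None: ...


class VisitorCore:
    """Concrete center: owns mutable state and lifecycle (hypothetical)."""

    def __init__(self, module_name: str) -> None:
        self.module_name = module_name
        self.declared: List[str] = []      # instance-local lowering state

    def declare_external(self, name: str) -> None:
        if name not in self.declared:      # declare each symbol only once
            self.declared.append(name)


def require_puts(visitor: VisitorProtocol) -> None:
    """Runtime feature: depends on the protocol, not on VisitorCore."""
    visitor.declare_external("puts")


core = VisitorCore("demo")
require_puts(core)
require_puts(core)                         # second call is a no-op
```

Because `require_puts` only names the protocol, the core can grow or reorganize its internals without touching the feature code.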
Visitor Mixins¶
The final backend visitor is composed from concern-specific mixins plus the shared core. Each mixin should contain:
- @dispatch def visit(self, node: ...) overloads for one concern
- a small number of private helpers local to that concern
Examples of concern boundaries include:
- literals
- variables
- unary and binary operators
- control flow
- functions
- runtime or domain-specific lowering
This keeps dispatch organization aligned with language structure while still sharing one lowering state object.
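One way to picture this composition with the standard library alone is shown below. Each mixin's `visit` handles its own node family with an `isinstance` check and otherwise defers along the MRO with `super()`. That check is a stand-in for Plum's annotation-based overloads, not how IRx dispatches, and every class name other than `VisitorCore` and `Visitor` is hypothetical.

```python
from typing import List


class Node:
    """Hypothetical AST base class."""


class IntLiteral(Node):
    def __init__(self, value: int) -> None:
        self.value = value


class ReturnStmt(Node):
    def __init__(self, value: Node) -> None:
        self.value = value


class Unknown(Node):
    pass


class VisitorCore:
    """Shared core: owns the mutable lowering state and the default."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def visit(self, node: Node) -> None:
        raise NotImplementedError(f"no lowering for {type(node).__name__}")


class LiteralMixin:
    """Literals concern."""

    def visit(self, node: Node) -> None:
        if isinstance(node, IntLiteral):
            self.lines.append(f"push {node.value}")
        else:
            super().visit(node)        # not ours: defer to the next class


class ControlFlowMixin:
    """Control-flow concern."""

    def visit(self, node: Node) -> None:
        if isinstance(node, ReturnStmt):
            self.visit(node.value)     # re-enter through the one public boundary
            self.lines.append("ret")
        else:
            super().visit(node)


class Visitor(LiteralMixin, ControlFlowMixin, VisitorCore):
    """Final backend visitor: mixins first, shared core last."""


visitor = Visitor()
visitor.visit(ReturnStmt(IntLiteral(7)))

try:
    visitor.visit(Unknown())
    failure = ""
except NotImplementedError as exc:
    failure = str(exc)
```

Both mixins append to the same `lines` list on the instance, and callers only ever see `visit`.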
Contributor Guidelines¶
When extending IRx, these rules help preserve the architecture:
- Put semantic meaning and validation in analysis/, not in a backend.
- Let codegen consume normalized semantic information instead of re-deriving it.
- Keep shared visitor dispatch defaults in src/irx/base/visitors/ so semantic and backend visitors fail consistently for unsupported ASTx nodes.
- Add new backend-wide infrastructure at the package root, not under helpers/.
- Keep mutable lowering state instance-local.
- Prefer explicit code over clever abstractions.
- Use the package name, not class prefixes, to identify the backend.
When To Add A New Backend¶
If IRx gains another backend, it should follow the same broad shape:
- a package under src/irx/builders/
- a public Builder
- a public Visitor
- a VisitorProtocol if mixins or runtime hooks need typed access
- an optional module-private VisitorCore for shared state and infrastructure
That keeps backend packages consistent for both contributors and users of the library.