Semantic Contract¶
IRx exposes a small but explicit semantic boundary between host parsing and
backend lowering. That boundary is defined in code by
irx.analysis.get_semantic_contract() and is enforced by the public analysis
entrypoints in irx.analysis.api.
Stable Semantic Phases¶
IRx currently treats these phases as stable:
- module_graph_expansion: analyze_modules(...) asks the host ImportResolver for every reachable ParsedModule, records import edges, and produces a stable dependency order in CompilationSession.load_order.
- top_level_predeclaration: analyze_modules(...) registers top-level functions and structs for every reachable module before body validation.
- top_level_import_resolution: analyze_modules(...) resolves module-top-level imports into module-visible bindings and rejects unsupported import forms.
- semantic_validation: analyze(...), analyze_module(...), and analyze_modules(...) attach semantic sidecars, normalize resolved meaning, and raise SemanticError if diagnostics exist.
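The module_graph_expansion phase can be pictured as a depth-first walk that emits each module only after its imports. The following is a minimal sketch, not IRx's implementation: compute_load_order and the resolve_imports callback are hypothetical names, and raising on an import cycle is an assumption, since cycle handling is not described here.

```python
from typing import Callable, Dict, Iterable, List


def compute_load_order(
    root: str, resolve_imports: Callable[[str], Iterable[str]]
) -> List[str]:
    """Return every reachable module, dependencies before dependents."""
    order: List[str] = []
    state: Dict[str, str] = {}  # module name -> "visiting" or "done"

    def visit(name: str) -> None:
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            # Assumption: a cycle is an error in this sketch.
            raise ValueError(f"import cycle through {name!r}")
        state[name] = "visiting"
        # Walking imports in resolver order keeps the result deterministic.
        for dependency in resolve_imports(name):
            visit(dependency)
        state[name] = "done"
        order.append(name)

    visit(root)
    return order


# Hypothetical import graph standing in for the host resolver.
graph = {"app": ["util", "core"], "util": ["core"], "core": []}
load_order = compute_load_order("app", lambda module: graph[module])
```

With this graph the walk yields core, then util, then app, so a module is always listed after everything it imports.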
Metadata Required Before Codegen¶
Before lowering starts, IRx guarantees that analyzed nodes may carry
node.semantic: SemanticInfo with these stable fields:
- resolved_type
- resolved_symbol
- resolved_function
- resolved_callable
- resolved_call
- resolved_return
- resolved_struct
- resolved_class
- resolved_module
- resolved_imports
- resolved_operator
- resolved_assignment
- resolved_field_access
- resolved_class_field_access
- resolved_base_class_field_access
- resolved_static_class_field_access
- resolved_method_call
- resolved_class_construction
- semantic_flags
- extras
For multi-module compilation, IRx also guarantees the following
CompilationSession state before lowering:
- root
- modules
- graph
- load_order
- visible_bindings
Lowering should consume this semantic metadata instead of re-deriving meaning from raw syntax.
When lowering or build layers discover that this contract has been violated, they now raise structured diagnostics instead of flattening the failure to a plain Python exception string. In other words:
- semantic failures continue to aggregate in SemanticError
- lowering failures surface as LoweringError
- native runtime-artifact compilation failures surface as NativeCompileError
- final executable link failures surface as LinkingError
- runtime feature activation and symbol-resolution failures surface as RuntimeFeatureError
Each of those exception types carries one Diagnostic record with stable code,
phase, and best-effort source attribution when IRx can recover it.
Diagnostic Contract¶
IRx now uses one shared diagnostics model across semantic analysis, lowering, native artifact compilation, final linking, and runtime feature resolution.
Every diagnostic may include:
- phase
- message
- logical code such as S010 or K001
- module attribution when known
- best-effort source location derived from node.loc
- note and hint lines
- wrapped cause information
- related secondary locations
Semantic analysis still aggregates multiple diagnostics in DiagnosticBag and
raises SemanticError only after the semantic pass completes. Later phases
raise one structured diagnostic exception immediately because they do not have a
bagging pass today.
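The aggregate-then-raise behavior of the semantic pass can be sketched as follows. The class names reuse those mentioned above, but their fields and methods here (add, raise_if_errors) are hypothetical stand-ins for illustration, not IRx's actual definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Diagnostic:
    # Stand-in record; the real Diagnostic carries more fields.
    phase: str
    code: str
    message: str


class SemanticError(Exception):
    """Stand-in exception that carries every collected diagnostic."""

    def __init__(self, diagnostics: List[Diagnostic]) -> None:
        super().__init__(f"{len(diagnostics)} semantic diagnostic(s)")
        self.diagnostics = diagnostics


@dataclass
class DiagnosticBag:
    items: List[Diagnostic] = field(default_factory=list)

    def add(self, code: str, message: str) -> None:
        # Record the problem and keep analyzing instead of stopping.
        self.items.append(Diagnostic("semantic", code, message))

    def raise_if_errors(self) -> None:
        # Called once, after the whole semantic pass has run.
        if self.items:
            raise SemanticError(list(self.items))


bag = DiagnosticBag()
bag.add("S002", "Identifier already declared: value")
bag.add("S010", "argument 1 of call to 'puts' expects UTF8String but got Int32")
try:
    bag.raise_if_errors()
except SemanticError as exc:
    collected = [diagnostic.code for diagnostic in exc.diagnostics]
```

Later phases would skip the bag and raise a single-diagnostic exception at the point of failure.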
Source Locations¶
IRx centralizes source extraction through shared helpers:
- get_node_source_location(node) safely reads node.loc without assuming every AST node carries a full span
- SourceLocation stores line and column today and already has optional end fields for future span-aware parsers
- format_source_location(...) renders module and line/column consistently for semantic and non-semantic diagnostics
IRx does not invent fake spans. If an AST node does not carry location data, the diagnostic still formats cleanly without a location prefix.
Diagnostic Codes And Prefixes¶
Diagnostic codes are split into a stable logical identifier and a configurable display prefix.
- IRx stores logical identifiers such as S001, S010, L001, F001, R001, C001, and K001
- IRx renders them through one shared DiagnosticCodeFormatter
- the default display prefix is IRX-
Current high-level families:
- Sxxx: semantic analysis
- Fxxx: public FFI contract
- Lxxx: lowering and codegen
- Rxxx: runtime feature activation and symbol resolution
- Cxxx: native runtime-artifact compilation
- Kxxx: final executable linking
Downstream compilers can override the prefix without forking IRx formatting logic:
from irx.diagnostics import set_diagnostic_code_prefix
set_diagnostic_code_prefix("ARX-")
After that override, the same logical code renders as ARX-S010, ARX-L001,
ARX-R001, and so on.
Formatting¶
IRx keeps the first diagnostic line compact:
module_a:12:8: error[IRX-S010]: argument 1 of call to 'puts' expects UTF8String but got Int32
When extra context exists, the formatter appends indented follow-up lines:
module_a:4:2: error[IRX-S002]: Identifier already declared: value
note: duplicate declarations in one scope are not allowed
related: module_a:1:1: previous declaration is here
Non-semantic failures also keep their phase visible:
error[IRX-K001] (link): link failed while producing 'demo'
note: command: clang /tmp/irx_module.o -o /tmp/demo
note: stderr: undefined reference to `sqrt`
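The line shapes shown above can be reproduced with a small formatter. This is a sketch under stated assumptions: Diag and format_diag are hypothetical names, the two-space indent on follow-up lines is a guess, and the phase is shown only when the caller sets it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Diag:
    # Hypothetical record covering only the fields this sketch renders.
    code: str
    message: str
    module: Optional[str] = None
    line: Optional[int] = None
    column: Optional[int] = None
    phase: Optional[str] = None  # set for non-semantic failures
    notes: List[str] = field(default_factory=list)


def format_diag(diag: Diag, prefix: str = "IRX-") -> str:
    # No location prefix is invented when the node carried no location.
    location = ""
    if None not in (diag.module, diag.line, diag.column):
        location = f"{diag.module}:{diag.line}:{diag.column}: "
    phase = f" ({diag.phase})" if diag.phase else ""
    head = f"{location}error[{prefix}{diag.code}]{phase}: {diag.message}"
    # Follow-up lines are indented under the compact first line.
    return "\n".join([head] + [f"  note: {note}" for note in diag.notes])


semantic_line = format_diag(
    Diag(
        code="S010",
        message="argument 1 of call to 'puts' expects UTF8String but got Int32",
        module="module_a",
        line=12,
        column=8,
    )
)
link_line = format_diag(
    Diag(code="K001", message="link failed while producing 'demo'", phase="link"),
    prefix="ARX-",
)
```

Passing the prefix as a parameter mirrors the idea that the logical code (S010, K001) stays fixed while the display prefix is swappable.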
Function Signature And Calling Contract¶
Callable semantics are part of IRx's stable semantic boundary.
- every callable is normalized into one canonical semantic signature before lowering
- the canonical signature includes callable identity, ordered parameters, return type, calling convention class, variadic flag, extern/native status, and lowered symbol name
- extern signatures additionally record required runtime features and validated public FFI classification metadata
- parameter order is stable and exactly matches declaration order
- duplicate parameter names are rejected semantically
- unresolved parameter or return types are rejected semantically
- conflicting declarations are rejected semantically
Calling conventions are classified semantically even when current LLVM emission is shared:
- irx_default for IRx-defined functions
- c for explicit extern/native declarations
Current declaration metadata is intentionally narrow. When present on
FunctionPrototype, IRx consumes:
- is_extern
- calling_convention
- is_variadic
- symbol_name
- runtime_feature
- runtime_features
Class Inheritance Contract¶
Class semantics are also part of IRx's stable semantic boundary.
- every analyzed class records a deterministic C3 linearization in SemanticClass.mro
- multiple inheritance is allowed when C3 can produce a consistent order; if it cannot, semantic analysis raises a diagnostic before lowering
- method lookup follows MRO order for inherited candidates, but class methods now normalize into exact-signature overload groups in SemanticClass.method_groups and SemanticClass.method_resolution
- SemanticClass.member_table remains the single-name lookup surface for attributes and method names with exactly one visible overload; overloaded method families live only in the overload-group metadata
- same-name inherited attributes from distinct ancestors are rejected as ambiguous unless they collapse to one logical shared ancestor in the MRO
- same-call-signature inherited methods from sibling bases are rejected unless one candidate dominates unambiguously through inheritance or the subclass supplies an exact-signature override
- diamond inheritance is allowed semantically; SemanticClass.shared_ancestors records ancestors reached through more than one direct-base lineage so later layout/lowering phases can reuse that metadata instead of re-deriving it
- private members do not participate in inherited lookup; non-private inherited members are normalized before lowering in SemanticClass.member_table and SemanticClass.method_groups
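For readers unfamiliar with C3, the linearization rule can be sketched in a few lines. This is the textbook algorithm over a plain name-to-bases mapping, not IRx's code: c3_linearize is a hypothetical helper, and TypeError stands in for the semantic diagnostic IRx would raise.

```python
from typing import Dict, List


def c3_linearize(name: str, bases: Dict[str, List[str]]) -> List[str]:
    """Return the C3 method resolution order for one class."""
    direct = bases[name]
    # Merge the linearization of each base plus the direct-base list itself.
    sequences = [c3_linearize(base, bases) for base in direct] + [list(direct)]
    result = [name]
    while any(sequences):
        for sequence in sequences:
            if not sequence:
                continue
            head = sequence[0]
            # A head is acceptable only if no sequence still needs it later.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError(f"inconsistent C3 linearization for {name!r}")
        result.append(head)
        sequences = [[c for c in seq if c != head] for seq in sequences]
    return result


diamond = {
    "Base": [],
    "Left": ["Base"],
    "Right": ["Base"],
    "Diamond": ["Left", "Right"],
}
mro = c3_linearize("Diamond", diamond)
```

In the diamond case the shared ancestor appears once, after both intermediate classes, which matches the "one logical shared ancestor" wording above.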
Class Layout Contract¶
IRx now records a deterministic low-level object layout for every analyzed class before lowering.
- class values lower as pointers to identified object structs, not by-value composites
- every class object reserves two hidden header slots first: one type-descriptor pointer slot and one dispatch-table pointer slot
- the type-descriptor header slot is reserved in semantic layout metadata now, but codegen does not populate a descriptor global yet; that slot is kept stable for later construction/runtime work
- instance storage is flattened in one canonical ancestor-first order with shared ancestors stored once per logical base class
- SemanticClass.layout.instance_fields records stable storage indices for all declared instance attributes, including inherited storage that is not visible for lookup
- declared static attributes lower to internal module globals named in SemanticClass.layout.static_fields
- SemanticClass.layout.visible_field_slots and SemanticClass.layout.visible_static_storage let later lowering phases reuse semantic member resolution instead of recomputing layout lookups from syntax
Class Method Contract¶
IRx class methods now lower as explicit functions with analysis-owned dispatch metadata rather than as implicit runtime behavior.
- every analyzed class method records a normalized source signature plus a lowered callable form in SemanticClassMember.lowered_function
- instance methods gain one hidden leading self parameter whose type is the declaring class pointer representation
- static methods keep their declared parameter list and do not receive an implicit receiver
- class methods support exact-signature multidispatch by method name and explicit argument types; overload selection does not rank implicit numeric or class-hierarchy conversions when more than one overload is visible
- non-private instance methods receive hierarchy-family-local dispatch slots in SemanticClassMember.dispatch_slot; valid exact-signature overrides reuse the inherited slot, and unrelated class families do not affect those slot numbers
- SemanticClass.layout.dispatch_entries and SemanticClass.layout.visible_method_slots provide the lowering-facing view of one class dispatch table after MRO resolution, keyed by exact signature
- MethodCall analysis records one ResolvedMethodCall with the chosen member, overload key, candidate set, validated argument conversions, dispatch mode, receiver class, and dispatch slot when indirect dispatch is required
- implicit Derived -> Base upcasts participate in assignment, call, and return compatibility, which makes overridden instance methods usable through base-typed receivers
- full multiple-inheritance ancestor field views are still deferred; the current upcast guarantee is specifically for class pointer compatibility and method dispatch
- conversion-ranked overload selection is intentionally deferred; callers that want a non-exact overload must spell the conversion explicitly with Cast
- StaticMethodCall analysis always resolves to direct calls because static methods never consume a hidden receiver and do not participate in instance dispatch
- lowering emits one internal dispatch table global per class when at least one visible instance method has a dispatch slot, and instance call sites load the callee through that table instead of re-resolving semantics from syntax
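The slot-reuse rule for overrides can be illustrated with a deliberately simplified model. The sketch below assumes single inheritance and dense slot numbering, and assign_dispatch_slots is a hypothetical helper; IRx's real slot assignment across multiple inheritance is not described in enough detail here to reproduce.

```python
from typing import Dict, List, Tuple

# (method name, exact parameter types) identifies one overload.
MethodKey = Tuple[str, Tuple[str, ...]]


def assign_dispatch_slots(
    inherited: Dict[MethodKey, int], declared: List[MethodKey]
) -> Dict[MethodKey, int]:
    """Extend a base class slot table with one subclass's methods."""
    slots = dict(inherited)
    for key in declared:
        if key not in slots:
            # A new exact signature takes the next free slot.
            slots[key] = len(slots)
        # Otherwise this is an exact-signature override and the
        # inherited slot number is reused unchanged.
    return slots


base_slots = assign_dispatch_slots(
    {}, [("area", ()), ("scale", ("Float64",))]
)
derived_slots = assign_dispatch_slots(
    base_slots, [("area", ()), ("scale", ("Int32",))]
)
```

Here the override of area keeps slot 0, while scale(Int32) is a distinct overload and receives a fresh slot, so a call through a base-typed receiver still finds the overriding area at the index the base expects.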
Class Member Access Contract¶
IRx now distinguishes instance and class-qualified member access forms explicitly instead of inferring them from generic field syntax.
- FieldAccess remains the low-level read form for obj.attr on structs and class instances
- BaseFieldAccess is the low-level read/write form for one explicit base-qualified instance attribute view on a class receiver
- StaticFieldAccess is the low-level read/write form for ClassName.static_attr
- MethodCall remains the low-level call form for obj.method(...) and uses direct or indirect dispatch from analyzed method metadata
- BaseMethodCall is the low-level call form for one explicit base-qualified instance method invocation and lowers as a direct call to the selected base implementation
- StaticMethodCall remains the low-level call form for ClassName.static_method(...) and always lowers as a direct call
- base-qualified member access is legal only when the named base appears in the receiver class MRO; analysis rejects unrelated class names before lowering
- static field reads resolve through SemanticClass.layout.visible_static_storage first and only fall back to qualified-name storage metadata for the selected inherited member when needed
- instance field reads resolve through SemanticClass.layout.visible_field_slots and the canonical flattened storage layout recorded during class analysis
- explicit base-qualified field reads resolve through the concrete receiver layout using the selected base member's qualified storage slot
- lowering consumes the resolved storage and dispatch metadata attached during semantic analysis and does not re-run class member lookup from syntax
- direct writes through FieldAccess, BaseFieldAccess, and StaticFieldAccess reuse the same analyzed layout or storage metadata as their read paths
- constant class members reject both assignment and unary mutation during semantic analysis before lowering
- static field initialization remains limited to literal/default construction in this phase; phase 8 adds mutability and write enforcement without changing initialization order
- implicit ancestor field views remain deferred; IRx now supports only explicit base-qualified ancestor access in this phase
Class Access Control Contract¶
IRx enforces class visibility during semantic analysis instead of deferring it to lowering or runtime behavior.
- public members are accessible from any context that can already resolve the containing class value
- private members are accessible only while analyzing methods of their declaring class
- protected members are accessible only within the declaring class and its subclasses; IRx does not add same-module or friend-style access
- private members still stay out of inherited lookup tables, but access sites diagnose hidden base members as inaccessible rather than pretending they do not exist
- explicit BaseFieldAccess and BaseMethodCall follow the same visibility rules as ordinary member access; qualifying a base does not bypass access control
- when the declaring class accesses one of its private members through a derived-typed receiver, analysis resolves the originating base member and lowering reuses the existing class-pointer upcast path
Class Construction Contract¶
IRx now exposes one low-level default-construction path for classes without introducing high-level constructor syntax yet.
- ClassConstruct("Name") allocates one heap object and returns the analyzed class pointer type for Name
- construction initializes object headers first, then instance fields in the same canonical flattened storage order recorded in SemanticClass.layout.instance_fields
- the reserved type-descriptor header slot is initialized to null for now
- the dispatch-table header slot is initialized to the class dispatch global when one exists, otherwise null
- instance fields use their declaration initializer when present; otherwise they receive the same zero/null default used for ordinary local declarations
- instance constant fields must have declaration initializers in the current model because dedicated constructors are not implemented yet
- static class fields initialize once at module load from literal declaration initializers or their zero/null default when no initializer is present
- non-literal static field initializers are rejected during semantic analysis in this phase so codegen never has to invent runtime initialization order
Class ABI And Interop Contract¶
IRx now makes the internal class ABI explicit without treating it as a stable foreign object ABI.
- source-level ClassType parameters and return types inside IRx-defined functions lower through the same pointer ABI as method receivers; they are not copied by value
- class method bodies lower to internal LLVM symbols named with mangle_class_method_name(module_key, class_name, method_name, overload_key) so the module, declaring class, method name, and exact overload signature all participate in the symbol name
- class static attributes lower to internal globals named with mangle_class_static_name(module_key, class_name, member_name)
- dispatch tables and reserved descriptor metadata remain internal globals and should be treated as opaque implementation details rather than a public ABI
- IRx does not promise a stable foreign ABI for general classes in this phase; explicit extern declarations reject ClassType directly and through pointer pointees
- ABI-oriented interop should continue to use plain structs, plain extern functions, typed pointers, and opaque handles at foreign boundaries
Class Diagnostics Contract¶
IRx now treats class errors as stable semantic diagnostics rather than backend failures.
- unknown base classes diagnose before MRO resolution completes
- duplicate direct bases, self-inheritance, inheritance cycles, and inconsistent C3 linearizations diagnose on the class definition
- duplicate member names, duplicate exact method signatures, return-type-only overloads, and mixed static/instance method families diagnose during member normalization
- inherited attribute ambiguity, inherited member conflicts, visibility reductions on override, and static/instance status changes across inheritance diagnose before lowering
- inaccessible private/protected member use, instance-vs-static access misuse, and unrelated explicit base qualification diagnose at semantic access sites
- constant class members reject assignment and unary mutation during semantic analysis before codegen
- invalid constant or static initialization rules diagnose while building the canonical class initialization plan
Public FFI Contract¶
IRx now treats explicit extern/native declarations as one public FFI layer instead of an incidental backend escape hatch.
What Qualifies As A Public FFI Callable¶
- only explicit extern declarations participate in the public FFI contract
- extern declarations must not define an IRx body
- extern declarations default to calling convention c
- source-level function names default to symbol_name == name
- symbol_name may override the linked/native symbol while keeping a different IRx-visible wrapper name
- runtime_feature or runtime_features may declare explicit native dependency packaging for that extern
- semantic analysis records the public/source name, linked symbol name, calling convention, variadic flag, extern flag, required runtime features, and public FFI admissibility metadata before lowering
Public FFI Type Policy¶
IRx intentionally keeps the public FFI type surface narrow in this phase.
Accepted in extern signatures:
- scalar integers
- scalar floats
- Boolean
- NoneType only as a return type (void)
- String / UTF8String / UTF8Char only as pointer-based extern values
- PointerType(T) when T is itself FFI-admissible
- PointerType() as an opaque pointer
- OpaqueHandleType("name")
- BufferOwnerType as a named opaque handle
- ABI-compatible structs
- nested ABI-compatible structs by value
- the canonical BufferViewType descriptor, which is a stable plain ABI struct
Rejected in extern signatures:
- unresolved or unsized types
- ClassType(...) values and pointers to class pointees
- non-ABI-stable internal-only composite forms
- temporal and other IRx-only types without an explicit public FFI ABI contract
- pointers to unsupported pointee types
- arbitrary variadic IRx-defined callables
- function pointers and callbacks in this phase
ABI-Compatible Structs For FFI¶
The public FFI layer accepts a validated subset of IRx structs:
- fields must resolve semantically before lowering
- declaration order is the ABI field order
- empty structs are rejected
- direct or mutual by-value recursive layouts are rejected
- every field must itself be FFI-admissible
- nested structs are allowed when every nested field remains ABI-admissible
- by-value and by-pointer passing both use the same validated layout assumptions
- lowering emits the same plain LLVM struct layout that semantic validation approved; no hidden headers or runtime payloads are introduced
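The admissibility rules above amount to a recursive walk over field types. The sketch below models structs as plain dictionaries; check_ffi_struct, the SCALARS subset, and the message wording are all hypothetical and only illustrate the shape of the check.

```python
from typing import Dict, List, Tuple

# Each struct maps to its (field name, type name) pairs in declaration order.
Fields = List[Tuple[str, str]]

# Illustrative subset of FFI-admissible scalar type names.
SCALARS = {"Int32", "Int64", "Float64", "Boolean"}


def check_ffi_struct(
    name: str, structs: Dict[str, Fields], _stack: Tuple[str, ...] = ()
) -> List[str]:
    """Return a list of problems; an empty list means admissible."""
    if name in _stack:
        # Reached the same struct again through by-value fields.
        return [f"struct '{name}' is recursive by value"]
    fields = structs[name]
    if not fields:
        return [f"struct '{name}' is empty"]
    problems: List[str] = []
    for field_name, type_name in fields:
        if type_name in SCALARS:
            continue
        if type_name in structs:
            # Nested structs are fine when every nested field is fine.
            problems += check_ffi_struct(type_name, structs, _stack + (name,))
        else:
            problems.append(
                f"struct '{name}' field '{field_name}' "
                f"uses unsupported FFI type '{type_name}'"
            )
    return problems


structs = {
    "Point": [("x", "Float64"), ("y", "Float64")],
    "Descriptor": [("point", "Point"), ("ready", "Boolean")],
    "Node": [("next", "Node")],
    "Empty": [],
    "Stamp": [("at", "DateTime")],
}
descriptor_problems = check_ffi_struct("Descriptor", structs)
node_problems = check_ffi_struct("Node", structs)
```

Pointer fields are left out of this model; a by-pointer self-reference would not count as by-value recursion.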
Pointers And Opaque Handles¶
- PointerType(T) represents a typed native pointer
- PointerType() represents an opaque pointer with no visible pointee layout
- OpaqueHandleType("name") represents a first-class named native handle whose layout is intentionally hidden
- opaque handles may be passed, returned, stored, and compared when comparison is otherwise semantically supported
- opaque handles do not support field access or indexing
- nullability is not modeled statically yet; null is currently a runtime-level concern rather than a typed IRx value
Symbol Resolution And Runtime Features¶
- an extern with no runtime features emits only an LLVM external declaration; the final system toolchain/linker is expected to resolve the symbol
- an extern with runtime_feature/runtime_features still emits one semantic extern declaration, but it also activates the named runtime feature set for that compilation unit
- runtime features remain the only place where IRx packages native C sources, objects, static libraries, or linker flags
- duplicate extern declarations with incompatible ABI or runtime-feature meaning are rejected semantically
- duplicate source-level declarations or duplicate symbol_name aliases must be compatible in calling convention, variadic status, symbol name, parameter types, return type, and required runtime features
Intentionally Unsupported For Now¶
- dynamic loading or plugin discovery
- callbacks and public function-pointer interop
- broad platform-specific ABI tuning beyond the current LLVM/data-layout model
- arbitrary variadic IRx-defined functions
- auto-coercion between incompatible pointers, structs, or opaque handles
Minimal examples:
# Plain C extern: no runtime feature, so the system linker resolves the symbol.
puts = astx.FunctionPrototype(
    "puts",
    args=astx.Arguments(astx.Argument("message", astx.UTF8String())),
    return_type=astx.Int32(),
)
puts.is_extern = True
puts.calling_convention = "c"
puts.symbol_name = "puts"

# Extern that also activates the libm runtime feature.
sqrt = astx.FunctionPrototype(
    "sqrt",
    args=astx.Arguments(astx.Argument("value", astx.Float64())),
    return_type=astx.Float64(),
)
sqrt.is_extern = True
sqrt.calling_convention = "c"
sqrt.symbol_name = "sqrt"
sqrt.runtime_feature = "libm"

# Extern returning a named opaque handle.
open_handle = astx.FunctionPrototype(
    "open_handle",
    args=astx.Arguments(),
    return_type=astx.OpaqueHandleType("demo_handle"),
)
open_handle.is_extern = True
open_handle.calling_convention = "c"
open_handle.symbol_name = "open_handle"

# ABI-compatible struct definition used by the extern below.
astx.StructDefStmt(
    name="Point",
    attributes=[
        astx.VariableDeclaration(name="x", type_=astx.Float64()),
        astx.VariableDeclaration(name="y", type_=astx.Float64()),
    ],
)

# Extern taking the struct by value.
take_point = astx.FunctionPrototype(
    "take_point",
    args=astx.Arguments(astx.Argument("point", astx.StructType("Point"))),
    return_type=astx.Int32(),
)
take_point.is_extern = True
take_point.calling_convention = "c"
take_point.symbol_name = "take_point"
Call And Return Validation¶
Function calls are validated through one semantic path before lowering:
- callee resolution must produce a callable symbol with a canonical signature
- fixed-arity calls must match the declared parameter count exactly
- variadic calls are limited to explicit extern/native declarations
- fixed prefix arguments use the canonical implicit-cast policy
- successful call analysis records resolved callable metadata, resolved argument types, result type, and any inserted implicit conversions
- lowering must consume that metadata instead of repairing malformed calls
Returns are also validated semantically before lowering:
- return expr is valid only in non-void functions
- bare return is valid only in void functions
- implicit return conversion follows the same canonical cast policy used for assignments and call arguments
- non-void functions must not fall through
- structured control flow is analyzed conservatively; missing returns on any reachable path are rejected
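A conservative "every path returns" check can be sketched over a tiny tuple-based statement model. The model and always_returns are hypothetical, and treating loops as never guaranteeing a return is an assumption about what "conservatively" means here.

```python
from typing import List, Tuple

# Statement shapes used by this sketch:
#   ("return",)                       a return statement
#   ("expr",)                         any non-control statement
#   ("if", then_body, else_body)      both bodies are statement lists
#   ("while", body)                   body is a statement list
Stmt = Tuple


def always_returns(body: List[Stmt]) -> bool:
    """True when control cannot fall off the end of the body."""
    for stmt in body:
        kind = stmt[0]
        if kind == "return":
            return True
        if kind == "if" and always_returns(stmt[1]) and always_returns(stmt[2]):
            # Both branches return, so nothing after the if is reachable.
            return True
        # Loops are skipped: the body may run zero times, so a return
        # inside one never proves that the function returns.
    return False


both_branches = [("if", [("return",)], [("return",)])]
missing_else = [("if", [("return",)], [])]
loop_only = [("while", [("return",)])]
```

A non-void function whose body fails this check would be rejected for falling through.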
Representative examples of the current semantic style:
- cannot assign Float64 to 'count' of type Int32
- argument 2 of call to 'sqrt' expects Float64 but got Int32
- if condition must be Boolean, got Int32
- extern 'take_point' is not FFI-safe: parameter 'point' field 'x' uses unsupported FFI type 'DateTime'
Void and non-void usage is explicit:
- void calls may be used as statements
- void calls may not be used as expression values
- non-void calls may be used as expressions or discarded as statements
main Contract¶
main is part of the stable semantic contract rather than a backend caveat:
- main must be Int32 main()
- main must not be variadic
- main must not be extern
- main must return deterministically along every path
IRx no longer accepts loose void main behavior or non-deterministic
fallthrough.
Scalar Numeric Foundation¶
Binary scalar numerics use one canonical promotion table:
| Operand mix | Promoted operand type |
|---|---|
| float + float | wider float |
| float + integer | float widened to cover the integer width floor (16, 32, or 64 bits), capped at Float64 |
| signed + signed | wider signed integer |
| unsigned + unsigned | wider unsigned integer |
| signed + unsigned | wider signed integer when the signed operand is strictly wider; otherwise the wider unsigned integer |
Comparison operators (<, >, <=, >=, ==, !=) promote their operands
with the same table and always return Boolean semantically and i1 in LLVM
IR.
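The promotion table can be read as a small pure function. This is a sketch, not IRx's implementation: Scalar and promote are hypothetical names, and the float + integer row is interpreted as "at least the float's own width, and at least the integer's width clamped to the 16 to 64 range".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scalar:
    kind: str  # "float", "signed", or "unsigned"
    bits: int


def promote(a: Scalar, b: Scalar) -> Scalar:
    """Return the common operand type for one binary operation."""
    if a.kind == "float" and b.kind == "float":
        return Scalar("float", max(a.bits, b.bits))
    if a.kind == "float" or b.kind == "float":
        flt, integer = (a, b) if a.kind == "float" else (b, a)
        # Integer width floor of 16, 32, or 64 bits, capped at Float64.
        floor = min(max(integer.bits, 16), 64)
        return Scalar("float", max(flt.bits, floor))
    if a.kind == b.kind:
        return Scalar(a.kind, max(a.bits, b.bits))
    signed, unsigned = (a, b) if a.kind == "signed" else (b, a)
    if signed.bits > unsigned.bits:
        # The signed side is strictly wider and can hold every value.
        return signed
    return Scalar("unsigned", max(signed.bits, unsigned.bits))


mixed_float = promote(Scalar("float", 32), Scalar("signed", 64))
mixed_sign = promote(Scalar("signed", 32), Scalar("unsigned", 32))
```

Under this reading, Float32 with Int64 promotes to Float64, and Int32 with UInt32 promotes to UInt32 because the signed operand is not strictly wider.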
Boolean And Comparison Contract¶
Boolean behavior is part of the stable semantic boundary:
- comparisons always return Boolean
- if, while, and for-count conditions must be Boolean
- &&, ||, and ! require Boolean operands
- implicit truthiness is forbidden for integers, floats, pointers, and other non-boolean values
Lowering should branch directly on the analyzed Boolean i1 value for control
flow instead of inventing zero-comparison truthiness rules during codegen.
Loop Contract¶
Loop semantics are now part of the stable lowering contract for structured control flow:
- break exits the nearest enclosing loop
- continue targets the canonical re-entry block for the active loop form
- WhileStmt re-enters through its condition block
- ForCountLoopStmt evaluates initializer once, condition before each iteration, update after each fallthrough or continue, and stores the update result as the next loop-variable value
- ForRangeLoopStmt initializes its induction variable once before the first condition check, observes end and step before iteration begins, runs body mutations before the step block, and does not expose the loop variable after the loop
- loop misuse such as break or continue outside a loop is rejected semantically before lowering and still surfaces as a structured lowering diagnostic in direct backend use
Struct Contract¶
Structs are IRx's stable composite storage and ABI foundation.
- struct names are stable semantic symbols
- field order is exactly declaration order
- field names must be unique within a struct
- field types must resolve semantically before lowering
- field layout must not be implicitly reordered by semantics or lowering
- field access must resolve semantically before codegen and lower by stable field index
- nested structs by value are allowed when every referenced struct is fully defined
- direct by-value recursive structs are forbidden
- mutual by-value recursive structs are forbidden
- structs can be passed and returned by value within IRx-defined functions
- the public FFI layer accepts only the ABI-compatible subset described above
- emitted LLVM struct types are plain data with no hidden headers, metadata, tags, or runtime object payloads
For now, empty structs are rejected explicitly instead of relying on backend-specific behavior.
Buffer/View Model¶
IRx defines a canonical buffer owner plus buffer view substrate for low-level memory/container interop. This is not a user-facing scientific array API, and it does not define broadcasting, slicing syntax, reductions, or tensor algebra.
The canonical view descriptor is a plain stable struct conceptually equivalent to:
- data: ptr
- owner: ptr | null
- dtype: opaque handle or stable token
- ndim: i32
- shape: ptr<i64>
- strides: ptr<i64>
- offset_bytes: i64
- flags: i32
Stable built-in primitive dtype tokens are also available when a producer does not need an out-of-band dtype handle:
- 1: bool
- 2: int8
- 3: int16
- 4: int32
- 5: int64
- 6: uint8
- 7: uint16
- 8: uint32
- 9: uint64
- 10: float32
- 11: float64
Semantic rules:
- ownership is explicit as borrowed, owned, or external-owner
- exactly one ownership flag must be present
- borrowed views do not free memory and use a null owner handle
- owned and external-owner views use non-null opaque owner handles
- descriptor copies are shallow metadata copies
- deep copy is explicit and never implicit
- retain and release go through runtime/native helpers
- statically known borrowed views are rejected for retain/release helpers
- mutability is attached to the view, not only the allocation
- readonly and writable views are mutually exclusive
- writes through statically readonly views are rejected semantically
- raw byte writes require an 8-bit integer value and are not typed element stores
- shape and strides describe logical indexing, not ownership
- offset support is part of the descriptor model
- null data with statically nonzero extent is rejected
- IRX_BUFFER_FLAG_VALIDITY_BITMAP may advertise producer-side validity metadata, but generic buffer operations remain null-agnostic
Lowering uses irx_buffer_view as a named plain struct with stable field order:
%"irx_buffer_view" = type {i8*, i8*, i8*, i32, i64*, i64*, i64, i32}
Runtime/native lifetime operations are feature-gated behind the buffer runtime
feature. Plain descriptors do not pull native helper symbols into a module
unless a helper is used.
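On the host side, the lowered descriptor can be mirrored with a ctypes structure that keeps the same field order. This is a sketch for illustration: IrxBufferView is a hypothetical Python name, the two flag bit values are invented placeholders because the real constants are not listed here, strides are assumed to be in bytes, and the dtype slot is left null because its token encoding is not specified in this section.

```python
import ctypes


class IrxBufferView(ctypes.Structure):
    """Field order mirrors the lowered irx_buffer_view struct."""

    _fields_ = [
        ("data", ctypes.c_void_p),                    # i8*
        ("owner", ctypes.c_void_p),                   # i8*, null when borrowed
        ("dtype", ctypes.c_void_p),                   # i8*
        ("ndim", ctypes.c_int32),                     # i32
        ("shape", ctypes.POINTER(ctypes.c_int64)),    # i64*
        ("strides", ctypes.POINTER(ctypes.c_int64)),  # i64*
        ("offset_bytes", ctypes.c_int64),             # i64
        ("flags", ctypes.c_int32),                    # i32
    ]


# Built-in primitive dtype tokens from the list above.
DTYPE_TOKENS = {
    1: "bool", 2: "int8", 3: "int16", 4: "int32", 5: "int64", 6: "uint8",
    7: "uint16", 8: "uint32", 9: "uint64", 10: "float32", 11: "float64",
}

# Hypothetical bit values; the real flag constants are not given here.
HYPOTHETICAL_FLAG_BORROWED = 1 << 0
HYPOTHETICAL_FLAG_READONLY = 1 << 1

storage = (ctypes.c_double * 12)()        # backing memory for a 3x4 block
shape = (ctypes.c_int64 * 2)(3, 4)
strides = (ctypes.c_int64 * 2)(32, 8)     # assumed byte strides, row-major

view = IrxBufferView(
    data=ctypes.cast(storage, ctypes.c_void_p),
    owner=None,                           # borrowed views use a null owner
    dtype=None,                           # encoding not specified; left null
    ndim=2,
    shape=ctypes.cast(shape, ctypes.POINTER(ctypes.c_int64)),
    strides=ctypes.cast(strides, ctypes.POINTER(ctypes.c_int64)),
    offset_bytes=0,
    flags=HYPOTHETICAL_FLAG_BORROWED | HYPOTHETICAL_FLAG_READONLY,
)
```

Copying such a structure copies only the metadata, which matches the rule that descriptor copies are shallow.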
Arrow Runtime Interop Contract¶
IRx exposes Arrow as one optional runtime feature and FFI-owned ABI surface. It is not a first-class language container model.
Stable scope in this phase:
- supported plain primitive Arrow storage types: bool, int8, int16, int32, int64, uint8, uint16, uint32, uint64, float32, and float64
- opaque schema, array builder, and array handles under irx_arrow_*
- Arrow C Data import/export as the external interchange boundary
- explicit Arrow-to-irx_buffer_view projection for supported fixed-width numeric arrays
Import/export rules:
- irx_arrow_array_import_copy(...) copies external Arrow C Data into a new runtime-owned array handle
- irx_arrow_array_import_move(...) adopts external Arrow C Data into a new runtime-owned array handle and leaves the source structs moved-from on success
- irx_arrow_array_export(...) copies a runtime-owned array handle into an independent Arrow C Data pair that the caller releases separately
- schema handles use the same copy-oriented pattern through irx_arrow_schema_import_copy(...) and irx_arrow_schema_export(...)
Nullability rules:
- Arrow nullability is modeled on Arrow handles, not as generic BufferViewType element semantics
- irx_arrow_array_is_nullable(...), irx_arrow_array_null_count(...), and irx_arrow_array_has_validity_bitmap(...) are the stable Arrow-side inspection surface
- irx_arrow_array_validity_bitmap(...) exposes the physical validity bitmap pointer plus bit offset and length
- generic buffer indexing, stores, and raw writes remain null-agnostic
Arrow-to-buffer-view bridge rules:
- only fixed-width, byte-addressable primitive arrays are buffer-view compatible in this phase
- the bridge is always readonly and borrowed
- bridged views use a null owner handle; the caller must keep the Arrow array handle alive explicitly
- bridged views populate dtype, shape, strides, and offset for one 1-D columnar value buffer
- when a validity bitmap exists, the returned view sets IRX_BUFFER_FLAG_VALIDITY_BITMAP
- bool arrays are supported as Arrow handles but are not buffer-view compatible because their values are bit-packed
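Because Arrow validity bitmaps and bool values are bit-packed, reading one element means indexing a bit rather than a byte. The sketch below shows that lookup using Arrow's least-significant-bit numbering; bit_is_set is a hypothetical helper and works on a Python bytes object rather than on a pointer returned by the runtime.

```python
def bit_is_set(bitmap: bytes, bit_offset: int, index: int) -> bool:
    """Return True when the bit for a logical element is 1."""
    position = bit_offset + index
    # Arrow numbers bits from the least significant bit of each byte.
    return ((bitmap[position // 8] >> (position % 8)) & 1) == 1


# One byte 0b00001101: bits 0, 2, and 3 are set.
bitmap = bytes([0b00001101])
validity = [bit_is_set(bitmap, 0, i) for i in range(4)]
shifted = [bit_is_set(bitmap, 2, i) for i in range(3)]
```

The bit offset matters for sliced arrays, which is why the inspection surface reports it alongside the bitmap pointer and length. It also shows why such data cannot be addressed through byte strides in a buffer view.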
Intentionally out of scope here:
- ArrowArrayStream, RecordBatch, and Table runtime handles
- dataframe/query semantics
- compute kernels
- nested, dictionary, temporal, decimal, and other non-primitive Arrow layouts
- implicit null-aware scalar semantics on generic buffer views
Example scalar wrapper:
astx.StructDefStmt(
name="ScalarBox",
attributes=[
astx.VariableDeclaration(name="value", type_=astx.Int32()),
],
)
Example nested record:
astx.StructDefStmt(
name="Descriptor",
attributes=[
astx.VariableDeclaration(name="point", type_=astx.StructType("Point")),
astx.VariableDeclaration(name="ready", type_=astx.Boolean()),
],
)
Canonical Cast Policy¶
Implicit promotions in variable initializers, assignments, call arguments, and returns are intentionally narrower than explicit casts:
- same-type assignment is always allowed
- signed integers may widen to wider signed integers
- unsigned integers may widen to wider unsigned integers
- unsigned integers may widen to strictly wider signed integers
- integers may promote to floats when the target float width meets the same 16/32/64 floor used by the numeric-promotion table
- floats may widen to wider floats
- implicit sign-changing integer casts to unsigned targets are rejected
- implicit narrowing casts are rejected
- implicit float-to-integer and numeric-to-boolean casts are rejected
Explicit Cast(...) expressions allow the full scalar conversions:
- numeric-to-numeric casts
- boolean-to-numeric casts using 0 and 1
- numeric-to-boolean casts using != 0 or != 0.0
- string-to-string casts
- numeric/boolean-to-string casts through runtime formatting
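The implicit side of the policy, listed at the start of this section, can be condensed into one predicate. This is a sketch under assumptions: Scalar and can_implicitly_cast are hypothetical names, and the integer-to-float rule reuses the reading of the width floor as the integer width clamped to the 16 to 64 range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scalar:
    kind: str  # "float", "signed", "unsigned", or "boolean"
    bits: int


def can_implicitly_cast(src: Scalar, dst: Scalar) -> bool:
    """True when src may convert to dst without an explicit Cast."""
    if src == dst:
        return True
    integer_kinds = ("signed", "unsigned")
    if src.kind == dst.kind and src.kind in integer_kinds + ("float",):
        # Widening within one family is allowed; narrowing is not.
        return dst.bits > src.bits
    if src.kind == "unsigned" and dst.kind == "signed":
        # Only a strictly wider signed type holds every unsigned value.
        return dst.bits > src.bits
    if src.kind in integer_kinds and dst.kind == "float":
        return dst.bits >= min(max(src.bits, 16), 64)
    # Signed-to-unsigned, float-to-integer, and numeric-to-boolean
    # conversions all need an explicit cast.
    return False


int32 = Scalar("signed", 32)
int64 = Scalar("signed", 64)
uint32 = Scalar("unsigned", 32)
float32 = Scalar("float", 32)
float64 = Scalar("float", 64)

widening_ok = can_implicitly_cast(int32, int64)
narrowing_ok = can_implicitly_cast(int64, int32)
```

Anything this predicate rejects but the explicit list above allows is the space where a caller must write Cast(...).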
Error Boundaries¶
- Semantic errors: invalid programs, unsupported semantic input, and import contract violations are reported as diagnostics and surfaced as SemanticError from the public analysis entrypoints.
- Lowering errors: once semantic analysis succeeds, failures during LLVM IR emission belong to backend lowering, not to semantic validation.
- Linking/runtime errors: native artifact compilation, linker execution, and runtime integration failures happen after lowering and are outside the semantic contract.
What Arx May Hand to IRx¶
- Arx owns parsing. IRx accepts ASTx nodes and host-owned ParsedModule values; it does not parse source files or perform package discovery.
- Single-root lowering may use analyze(...) or analyze_module(...) when no cross-module import graph is required.
- Cross-module lowering must use analyze_modules(root, resolver) with a host-supplied ImportResolver.
- Imports are currently part of the stable contract only at module top level.
- Wildcard imports and import expressions are not part of the current stable lowering contract and are rejected semantically.