Testing Strategy
Decision
Spec-driven, multi-layer testing. Unit tests for internals, oracle comparison for behavioral equivalence, integration tests for language semantics, PUC-Rio official test suite as the compatibility target.
Current: 1325 tests (609 unit, 431 integration, 277 oracle, 5 proptest, 3 lua51).
With dynmod feature: 1333 tests (611 unit, 6 dynmod, 431 integration,
277 oracle, 5 proptest, 3 lua51). All 5 layers are active.
Test Layers
Layer 1: Unit Tests
In-module #[cfg(test)] blocks testing internal components in
isolation. Every implementation chunk includes unit tests.
Lexer tests (src/compiler/lexer.rs):
- Token type recognition for all keywords and symbols
- Number literal parsing (decimal, hex, float, exponent)
- String literal parsing (escapes, long brackets)
- Comment handling (single-line, long comments)
- Error cases (unterminated strings, invalid escapes)
- Source position tracking
Parser tests (src/compiler/parser.rs):
- Expression parsing with operator precedence
- Statement parsing for each statement type
- Block and scope handling
- Error recovery and error messages
- AST structure verification
Compiler tests (src/compiler/codegen.rs):
- Instruction emission for each AST node type
- Register allocation
- Constant pool management
- Upvalue resolution
- Jump backpatching
VM tests (src/vm/):
- Instruction dispatch correctness
- Stack manipulation
- GC arena allocation and collection
- Table operations (get, set, resize)
- String interning
- Closure creation and upvalue management
Layer 2: Oracle Comparison Tests
Oracle comparison tests run the same Lua code in both rilua and PUC-Rio Lua 5.1.1, comparing output to verify behavioral equivalence. This catches divergences that unit tests and integration tests might miss.
Reference Binaries
- lua (interpreter):
./lua-5.1.1/src/lua - luac (compiler/lister):
./lua-5.1.1/src/luac
Both are built from the official PUC-Rio Lua 5.1.1 tarball. See
AGENTS.md for download, verification, and build instructions.
The lua binary path is configured via the LUA_REFERENCE_BIN
environment variable (defaults to ./lua-5.1.1/src/lua). Tests
that require the reference binary skip gracefully if it is not
available.
Oracle Test Framework
Test helpers in tests/helpers/:
// tests/helpers/oracle.rs
/// Run code in PUC-Rio Lua 5.1.1 and return (stdout, stderr, exit_code).
fn run_reference(code: &str) -> (String, String, i32);
/// Run code in rilua and return (stdout, stderr).
fn run_rilua(code: &str) -> (String, String);
/// Assert rilua produces the expected stdout for the given code.
fn assert_output(code: &str, expected: &str);
/// Run in both interpreters, assert stdout matches.
fn assert_matches_reference(code: &str);
Usage in tests:
#[test]
fn arithmetic_matches_reference() {
assert_matches_reference("print(1 + 2)");
assert_matches_reference("print(2 ^ 10)");
assert_matches_reference("print(10 % 3)");
assert_matches_reference("print(-7 % 3)"); // floor modulo
}
Bytecode Comparison
After the compiler is implemented, bytecode comparison tests compile
Lua snippets with both rilua and luac -l, then compare instruction
output. This verifies the compiler produces correct bytecode before
the VM exists.
/// Compile code with rilua and return a formatted instruction listing.
fn compile_rilua(code: &str) -> String;
/// Compile code with PUC-Rio luac -l and return the listing.
fn compile_reference(code: &str) -> String;
/// Assert both compilers produce equivalent bytecode.
fn assert_bytecode_matches(code: &str);
Bytecode comparison checks instruction opcodes and operands. It does not compare constant pool ordering or debug info formatting, since these may differ between implementations without affecting semantics.
When Each Test Category Activates
| Category | Activates after | Mechanism |
|---|---|---|
| Unit tests | Phase 0 (skeleton) | cargo test --lib |
| Bytecode comparison | Phase 2 (compiler) | Compare rilua compiler output with luac -l |
| Oracle comparison | Phase 3 + print | Compare rilua output with lua -e |
Integration .lua tests | Phase 3 + assert | cargo test --test integration |
| PUC-Rio test suite | Phase 5a (base lib) | cargo test --test lua51 |
Layer 3: Integration Tests
Lua scripts in tests/ that exercise language features through the
full pipeline. Each test uses assert() to validate behavior.
Organization mirrors the Lua 5.1 Reference Manual. Language tests cover Chapter 2 (“The Language”), standard library tests cover Chapter 5 (“Standard Libraries”).
Language tests (Chapter 2):
| File | Section | Description |
|---|---|---|
lexical.lua | 2.1 | Lexical conventions (keywords, names, strings, numbers, comments) |
types.lua | 2.2 | Values and types, coercion (2.2.1) |
variables.lua | 2.3 | Global, local, and table field variables |
statements.lua | 2.4 | Chunks, blocks, assignment, control structures, for loops, local declarations |
expressions.lua | 2.5 | Arithmetic, relational, logical operators, concatenation, length, precedence, table constructors, function calls, function definitions |
visibility.lua | 2.6 | Lexical scoping, upvalues, closures |
errors.lua | 2.7 | error(), pcall, xpcall, error objects, stack traces |
metatables.lua | 2.8 | Metamethods for arithmetic, comparison, indexing, call, concatenation, length |
environments.lua | 2.9 | Function environments, setfenv, getfenv |
gc.lua | 2.10 | Garbage collection, finalizers (2.10.1), weak tables (2.10.2) |
coroutines.lua | 2.11 | create, resume, yield, wrap, status, error propagation |
Standard library tests (Chapter 5):
| File | Section | Description |
|---|---|---|
stdlib-base.lua | 5.1 | Base library (assert, type, tonumber, tostring, select, unpack, etc.) |
stdlib-package.lua | 5.3 | Package/module library (require, module, loaders, etc.) |
stdlib-string.lua | 5.4 | String library (find, format, gmatch, gsub, etc.) |
stdlib-table.lua | 5.5 | Table library (concat, insert, remove, sort, maxn) |
stdlib-math.lua | 5.6 | Math library (abs, floor, ceil, random, sin, cos, etc.) |
stdlib-io.lua | 5.7 | I/O library (open, read, write, lines, etc.) |
stdlib-os.lua | 5.8 | OS library (clock, date, time, execute, etc.) |
stdlib-debug.lua | 5.9 | Debug library (getinfo, getlocal, sethook, traceback, etc.) |
Test infrastructure files: tests/helpers/mod.rs (shared utilities),
tests/helpers/oracle.rs (PUC-Rio comparison), tests/integration.rs
(runner), tests/lua51.rs (PUC-Rio suite runner).
Layer 4: PUC-Rio Official Test Suite
The PUC-Rio Lua 5.1.1 test suite (./lua-5.1-tests/) is the
compatibility target. These are verbatim test files from the
official Lua test tarball. See AGENTS.md for download instructions.
Official Running Modes
Per lua.org/tests/, the test suite has three running modes:
- Portable mode:
lua -e"_U=true" all.lua– skips system-dependent tests and memory-intensive operations. - Full mode:
lua all.lua– tests every corner of the language. Requires compiled C libraries inlibs/subdirectory. - Internal mode: Recompile Lua with
ltests.c/ltests.hto enable the T (testC) library for internal VM tests.
What all.lua Does
The all.lua runner is not a simple file list. It modifies the
runtime environment before running each test:
- Sets GC parameters:
collectgarbage("setstepmul", 180)andsetpause(190). - Redefines
dofileto round-trip every test file throughstring.dump+loadstring, implicitly testing binary chunk serialization and deserialization. - Wraps
big.luaincoroutine.wrap(the file yields values). calls.luaexpects adeepvariable set bymain.lua.- Sets a
debug.sethookcall/return hook during cleanup (line 118).
How rilua Runs These Tests
rilua does not run all.lua directly. Instead, each test file
is executed individually using two approaches:
Important: Tests must be run from the lua-5.1-tests/ directory.
Several tests depend on relative paths: attrib.lua creates files
in libs/, math.lua and verybig.lua require checktable.lua
via LUA_PATH, and file tests reference paths relative to the test
directory. Running from the project root will cause false failures.
Individual file execution (primary):
# Run from the test directory (required)
cd lua-5.1-tests
mkdir -p libs
# Run a single test
RILUA_TEST_LIB=1 LUA_PATH="?;./?.lua" ../target/release/rilua <test>.lua
# Run all tests
for f in *.lua; do
[ "$f" = "all.lua" ] && continue
echo -n "$(basename $f .lua): "
timeout 30 env RILUA_TEST_LIB=1 LUA_PATH="?;./?.lua" \
../target/release/rilua "$f" >/dev/null 2>&1 && echo "PASS" || echo "FAIL"
done
Comparison script (scripts/compare.sh):
# Compare all test files between PUC-Rio and rilua
scripts/compare.sh ./lua-5.1.1/src/lua ./target/release/rilua
The comparison script runs each .lua file individually (except
all.lua) with a 10-second timeout, reporting PASS/FAIL/TIMEOUT
for both interpreters. This differs from all.lua in that:
- No
string.dump/loadstringround-trip (tests run directly). - No GC parameter tuning.
- No inter-test state (
big.luaruns standalone, not in a coroutine;calls.luaruns withoutdeepfrommain.lua). - Each test gets a fresh interpreter state.
rilua also passes all.lua directly (see
What all.lua Does for its additional
requirements).
T Module
rilua implements PUC-Rio’s internal test library (T global),
the Rust equivalent of ltests.c. Activate it with the
RILUA_TEST_LIB=1 environment variable. 25 functions are
registered:
| Function | Description |
|---|---|
T.querytab | Returns (array size, hash size) for a table |
T.hash | Returns hash-part index for a key in a table |
T.int2fb | Converts integer to float-byte encoding |
T.log2 | Returns floor(log2(x)) |
T.listcode | Returns list of opcodes for a function |
T.setyhook | Sets yield-on-hook for a coroutine thread |
T.resume | Resumes a coroutine (no arguments) |
T.d2s | Converts f64 to 8-byte native-endian string |
T.s2d | Converts 8-byte native-endian string to f64 |
T.testC | C API mini-interpreter (28 commands) |
T.newuserdata | Create userdata with given byte size |
T.udataval | Return unique integer ID for userdata |
T.pushuserdata | Find/create userdata by its ID |
T.ref | Store object in registry, return integer key |
T.unref | Remove registry entry |
T.getref | Get value from registry by key |
T.upvalue | Get/set upvalue n of closure f |
T.checkmemory | No-op stub (GC consistency check) |
T.gsub | String substitution |
T.doonnewstack | Run code in a new coroutine |
T.newstate | Create independent Lua state |
T.closestate | Close a state created by newstate |
T.doremote | Execute code string in remote state |
T.loadlib | Load standard libraries into remote state |
T.totalmem | Get/set memory limit (OOM simulation) |
Four tests (api.lua, checktable.lua, closure.lua, code.lua)
use T extensively. When T is nil, guarded sections (if T then ... end) are skipped. All four pass with RILUA_TEST_LIB=1.
Test Files
| Test File | Area |
|---|---|
all.lua | Test runner (chains all tests with dump/undump) |
api.lua | C API interactions (requires T.testC) |
attrib.lua | require/package system, assignments, operators |
big.lua | String overflow, large line counts, table constructs |
calls.lua | Function calls and returns |
checktable.lua | Table invariant checker (utility functions only) |
closure.lua | Closures, upvalues, and coroutines |
code.lua | Code generation, optimizations (uses T.listcode) |
constructs.lua | Syntax, operator priority, language constructs |
db.lua | Debug library |
errors.lua | Error handling |
events.lua | Metatables and metamethods |
files.lua | I/O library |
gc.lua | Garbage collection |
literals.lua | Scanner/lexer and literal parsing |
locals.lua | Local variables |
main.lua | Standalone interpreter (lua.c) options |
math.lua | Math library |
nextvar.lua | Tables, next(), size operator, for loops |
pm.lua | Pattern matching |
sort.lua | table.sort |
strings.lua | String library |
vararg.lua | Vararg functions |
verybig.lua | Very large programs |
Current Status
All 23 files pass: api, attrib, big, calls, checktable, closure, code, constructs, db, errors, events, files, gc, literals, locals, main, math, nextvar, pm, sort, strings, vararg, verybig.
The all.lua runner also passes (see
What all.lua Does for its additional
requirements beyond individual file execution).
Compatibility flags: The PUC-Rio test suite was written with
default compat options enabled (e.g., LUA_COMPAT_VARARG enables
the arg table in vararg functions). WoW’s Lua disables some of
these. Tests that depend on compat options may need conditional
handling.
Layer 5: Behavioral Equivalence Tests
Tests that specifically verify behavioral equivalence with PUC-Rio Lua 5.1.1. These test edge cases where implementations commonly diverge:
- Numeric formatting (
tostring(0.1),string.format("%g", 0.1)) - Error message wording (programs may match on error strings)
- GC behavior (
collectgarbagereturn values) - Weak table clearing timing
- Finalizer execution order
- String-to-number coercion edge cases
- Integer overflow behavior (all numbers are f64)
- Modulo with negative operands (floor division)
- Concatenation type coercion
These tests use the oracle comparison framework to run each case in both rilua and PUC-Rio, comparing exact output. They are the last line of defense before the PUC-Rio test suite.
Test Workflow
Test-Driven Development
New features are implemented test-first where possible:
- Write a Lua test script that exercises the feature.
- Run the test against PUC-Rio Lua 5.1.1 to verify expected behavior.
- Implement the feature in rilua.
- Run the test and fix until it passes.
- Run the oracle comparison to verify matching output.
- Run the full test suite to check for regressions.
This ensures every feature is validated against the reference implementation.
Quality Gate
Every commit must pass:
cargo fmt -- --check && \
cargo clippy --all-targets && \
cargo test && \
cargo doc --no-deps
This ensures:
- Consistent formatting
- No lint warnings
- All tests pass (unit, integration, oracle, PUC-Rio suite)
- Documentation builds without errors
Coverage Tracking
Test coverage is measured by:
- Feature coverage – which Lua 5.1.1 features are implemented and tested (tracked in CHANGELOG.md).
- PUC-Rio test suite progress – 23 of 23 official test files passing (tracked in CI).
- Oracle comparison count – number of Lua snippets verified against PUC-Rio output.
- Code coverage –
cargo-tarpaulinorllvm-covfor line coverage metrics (informational, not a gate).