| 1 | +# CCC - C Compiler Collection |
| 2 | + |
| 3 | +A C compiler written in Rust, targeting x86-64, AArch64, and RISC-V 64. |
| 4 | + |
| 5 | +## Status |
| 6 | + |
| 7 | +**Initial scaffold complete.** The basic compilation pipeline is functional: |
| 8 | +- Lexer, preprocessor (strip-only), parser, semantic analysis (stub) |
| 9 | +- IR lowering (AST -> alloca-based IR) |
| 10 | +- Code generation for x86-64, AArch64, and RISC-V 64 |
| 11 | +- Assembly and linking via system tools (gcc/gas) |
| 12 | + |
| 13 | +### Test Results (1% sample) |
| 14 | +- x86-64: ~13% passing |
| 15 | +- AArch64: ~8% passing |
| 16 | +- RISC-V 64: ~15% passing |
| 17 | + |
| 18 | +### What Works |
| 19 | +- `int main() { return N; }` for any integer N |
| 20 | +- `printf()` with string literal arguments (via libc linking) |
| 21 | +- Basic arithmetic (`+`, `-`, `*`, `/`, `%`) |
| 22 | +- Local variable declarations and assignments |
| 23 | +- `if`/`else`, `while`, `for`, `do-while` control flow |
| 24 | +- Function calls with up to 6 arguments on x86-64 and up to 8 on AArch64 and RISC-V 64 |
| 25 | +- Comparison operators |
| 26 | + |
| 27 | +### What's Not Yet Implemented |
| 28 | +- Preprocessor (macros, includes, conditionals) |
| 29 | +- Type checking (sema is a stub) |
| 30 | +- Structs, unions, enums (parsed but not lowered) |
| 31 | +- Arrays and pointers (parsed but codegen incomplete) |
| 32 | +- Switch statements (stub) |
| 33 | +- Floating point |
| 34 | +- Global variables |
| 35 | +- Formatted output (`printf` with `%d` and other conversion specifiers) |
| 36 | +- Native assembler/linker (currently uses gcc) |
| 37 | +- Optimization passes |
| 38 | + |
| 39 | +## Building |
| 40 | + |
| 41 | +```bash |
| 42 | +cargo build |
| 43 | +# Produces: target/debug/ccc, ccc-x86, ccc-arm, ccc-riscv |
| 44 | +``` |
| 45 | + |
| 46 | +## Usage |
| 47 | + |
| 48 | +```bash |
| 49 | +# Compile C to x86-64 executable |
| 50 | +target/debug/ccc -o output input.c |
| 51 | + |
| 52 | +# Compile for AArch64 |
| 53 | +target/debug/ccc-arm -o output input.c |
| 54 | + |
| 55 | +# Compile for RISC-V 64 |
| 56 | +target/debug/ccc-riscv -o output input.c |
| 57 | +``` |
| 58 | + |
| 59 | +## Architecture |
| 60 | + |
| 61 | +``` |
| 62 | +src/ |
| 63 | + frontend/ |
| 64 | + preprocessor/ Strip preprocessor directives (TODO: full expansion) |
| 65 | + lexer/ Tokenize C source with source locations |
| 66 | + parser/ Recursive descent parser, produces AST |
| 67 | + sema/ Semantic analysis (TODO: type checking) |
| 68 | + |
| 69 | + ir/ |
| 70 | + ir.rs IR definition (instructions, basic blocks, values) |
| 71 | + lowering/ AST -> alloca-based IR |
| 72 | + mem2reg/ TODO: promote allocas to SSA |
| 73 | + |
| 74 | + passes/ TODO: optimization passes (constant fold, DCE, etc.) |
| 75 | + |
| 76 | + backend/ |
| 77 | + x86/ |
| 78 | + codegen/ IR -> x86-64 assembly (stack-based allocation) |
| 79 | + assembler/ Assembly -> object file (via gcc -c) |
| 80 | + linker/ Object files -> executable (via gcc) |
| 81 | + arm/ |
| 82 | + codegen/ IR -> AArch64 assembly |
| 83 | + assembler/ via aarch64-linux-gnu-gcc |
| 84 | + linker/ via aarch64-linux-gnu-gcc |
| 85 | + riscv/ |
| 86 | + codegen/ IR -> RISC-V 64 assembly |
| 87 | + assembler/ via riscv64-linux-gnu-gcc |
| 88 | + linker/ via riscv64-linux-gnu-gcc |
| 89 | + |
| 90 | + common/ |
| 91 | + types.rs CType, IrType |
| 92 | + symbol_table.rs Scoped name resolution |
| 93 | + source.rs Span, SourceLocation, SourceManager |
| 94 | + error.rs Diagnostic with span |
| 95 | + |
| 96 | + driver/ CLI argument parsing, pipeline orchestration |
| 97 | +``` |
| 98 | + |
| 99 | +## Running Tests |
| 100 | + |
| 101 | +```bash |
| 102 | +python3 /verify/verify_compiler.py --compiler target/debug/ccc-x86 --ratio 100 |
| 103 | +python3 /verify/verify_compiler.py --compiler target/debug/ccc-arm --ratio 100 |
| 104 | +python3 /verify/verify_compiler.py --compiler target/debug/ccc-riscv --ratio 100 |
| 105 | +``` |