The C# 8086 Compiler is a custom-built compiler designed to process a simple programming language through all major compilation stages — from lexical analysis to code generation. It serves as both a functional compiler and an educational project to explore compiler theory and system-level programming.
Built entirely in C#, it features real-time syntax feedback and a complete backend that translates validated code into functional x86 assembly, demonstrating a full compilation pipeline in action.
Project Overview
The compiler’s workflow is orchestrated through the main interface (Compilador.cs),
which performs lexical and syntactic analysis in real time as the user types.
Full compilation is triggered manually from the menu and covers every stage in order: lexical, syntactic, and semantic analysis, followed by intermediate code generation and final translation to assembly.
The Asm.cs module produces structured x86 assembly with distinct .Data and .Code sections,
including variable declarations, arithmetic operations, control structures, and MS-DOS interrupts for input/output.
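To make the section layout concrete, here is a minimal sketch of how a generator might assemble a listing with separate data and code sections. It is written in Python only to keep the illustration short (the project itself is C#), and it is not the actual logic of Asm.cs. The helper name emit_program, the .model/.stack lines, the main label, and the use of word-sized (dw) variables are all assumptions. The only MS-DOS service used is the standard terminate call (int 21h with AH=4Ch).

```python
def emit_program(variables, body):
    """Return an 8086 assembly listing with separate data and code sections.

    variables: dict mapping a variable name to its initial 16-bit value.
    body: list of instruction strings already produced by earlier stages.
    Both parameters are hypothetical; they stand in for whatever the real
    generator receives from the semantic and intermediate-code stages.
    """
    lines = [".model small", ".stack 100h", ".data"]
    # One word-sized declaration per variable in the data section.
    for name, value in variables.items():
        lines.append(f"    {name} dw {value}")
    # Code section: point DS at the data segment before touching variables.
    lines += [".code", "main:", "    mov ax, @data", "    mov ds, ax"]
    lines += [f"    {instr}" for instr in body]
    # Return control to MS-DOS (int 21h, function 4Ch).
    lines += ["    mov ah, 4Ch", "    int 21h", "end main"]
    return "\n".join(lines)
```

For example, calling emit_program({"a": 2, "b": 3, "x": 0}, ["mov ax, a", "add ax, b", "mov x, ax"]) yields a listing that declares three variables and computes x = a + b before exiting.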
Technical Stack
- C#: The implementation language for every stage of the compiler, from real-time syntax validation to the backend that generates x86 assembly.
- x86 Assembly: The compiler's output language, targeting the Intel 8086 microprocessor. Arithmetic operations, I/O, and control flow are translated to mov, add, cmp, and conditional jump instructions.
- Visual Studio: Integrated development environment (IDE) for development and debugging of the compiler.
- Emu8086: An emulator for the Intel 8086 microprocessor, used for visualizing the assembly code generated by the compiler.
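As an illustration of how control flow can map onto cmp and conditional jumps, the sketch below lowers a simple "if" with one comparison. It uses the common pattern of jumping past the guarded block when the condition is false. This is a Python sketch under assumptions, not the project's C# code: the function name emit_if, the label name, and the choice of signed jumps are all hypothetical.

```python
# Jump taken when the condition is FALSE, so execution skips the guarded block.
# These are the signed-comparison jumps of the 8086 instruction set.
INVERSE_JUMP = {
    "<": "jge", "<=": "jg",
    ">": "jle", ">=": "jl",
    "==": "jne", "!=": "je",
}

def emit_if(left, op, right, body, end_label):
    """Return instructions for: if (left op right) { body }."""
    code = [
        f"mov ax, {left}",                    # load the left operand
        f"cmp ax, {right}",                   # set flags from left - right
        f"{INVERSE_JUMP[op]} {end_label}",    # skip the body if the test fails
    ]
    code += body
    code.append(f"{end_label}:")
    return code
```

Calling emit_if("a", "<", "b", ["mov ax, a", "add ax, 1", "mov a, ax"], "end_if_1") produces a cmp followed by jge end_if_1, so the increment runs only when a is less than b.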
Key Features
- Lexical Analyzer (clsLexico.cs): Implements a deterministic finite automaton (DFA) using a transition table to tokenize the source code and identify reserved words, variables, operators, and strings.
- Syntactic Analyzer (clsSintactico.cs): A predictive parser using a parsing table and stack-based logic to validate the structure and nesting of language constructs.
- Semantic Analyzer (Semantico.cs): Builds and verifies a symbol table, checks declarations, detects type mismatches, and validates operations to prevent logic errors.
- Intermediate Code Generation (Postfijo.cs): Converts infix expressions to postfix (RPN) format using the Shunting-yard algorithm for easier processing.
- Assembly Code Translation (Asm.cs): Generates final x86 assembly with arithmetic, I/O, and control flow translated to mov, add, cmp, and conditional jump instructions.
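The infix-to-postfix step is the most self-contained of these stages, so it is a good one to sketch. The following is a minimal Shunting-yard implementation in Python (the project itself is C#, and this is not the code in Postfijo.cs). The operator set, the precedence values, and the function name to_postfix are assumptions. All four operators are treated as left-associative.

```python
# Assumed operator set; higher number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def to_postfix(tokens):
    """Convert a list of infix tokens to postfix (RPN) order."""
    output, stack = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators of higher or equal precedence (left-associative).
            while (stack and stack[-1] in PRECEDENCE
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            # Unwind to the matching opening parenthesis.
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced parentheses")
            stack.pop()  # discard the "("
        else:
            output.append(tok)  # operand: variable name or literal
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced parentheses")
        output.append(op)
    return output
```

With these assumptions, to_postfix("a + b * c".split()) returns ["a", "b", "c", "*", "+"]. Postfix order is convenient for code generation because a translator can process it left to right with a single operand stack.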
Key Learnings
- Deep understanding of compiler architecture and the interaction between lexical, syntactic, and semantic analysis stages.
- Implementation of finite automata and predictive parsing techniques for language validation.
- Experience translating high-level constructs into low-level assembly code and managing memory sections manually.
- Exploration of asynchronous updates and real-time syntax validation within a C# GUI environment.
- Insight into how code structure, data types, and control flow are represented at the hardware level.
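The transition-table approach to finite automata mentioned above can be illustrated with a very small tokenizer. This is a Python sketch under assumptions, not the table used by clsLexico.cs. The states, character classes, reserved-word set, and token names are all hypothetical, and string literals are left out to keep it short.

```python
def char_class(ch):
    """Map a character to a column of the transition table."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

# (state, character class) -> next state. A missing entry means "stop here".
TRANSITIONS = {
    ("START", "letter"): "IDENT",
    ("START", "digit"): "NUMBER",
    ("IDENT", "letter"): "IDENT",
    ("IDENT", "digit"): "IDENT",
    ("NUMBER", "digit"): "NUMBER",
}
ACCEPTING = {"IDENT": "identifier", "NUMBER": "number"}
RESERVED = {"if", "while"}  # hypothetical reserved words

def tokenize(source):
    """Split source text into (kind, lexeme) pairs using the table above."""
    tokens, i = [], 0
    while i < len(source):
        if source[i].isspace():
            i += 1
            continue
        state, start = "START", i
        # Follow transitions for as long as the table allows (longest match).
        while i < len(source):
            nxt = TRANSITIONS.get((state, char_class(source[i])))
            if nxt is None:
                break
            state, i = nxt, i + 1
        if state in ACCEPTING:
            lexeme = source[start:i]
            kind = "reserved" if lexeme in RESERVED else ACCEPTING[state]
            tokens.append((kind, lexeme))
        else:
            # No transition out of START: emit a one-character operator.
            tokens.append(("operator", source[i]))
            i += 1
    return tokens
```

Under these assumptions, tokenize("if x1 < 42") returns [("reserved", "if"), ("identifier", "x1"), ("operator", "<"), ("number", "42")]. Keeping the automaton in a table means new token types can be added by extending the data rather than the scanning loop.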
Repository
This project is a worked example of a compiler that translates high-level language constructs into low-level assembly code, showing how the lexical, syntactic, and semantic analysis stages fit together in a complete pipeline.