NFA: — states DFA: — states
(a|b)*abb a*b+ (a|b)+ ab?c (0|1)*00 a(bc)*d (a|b|c)*
Show ε-moves
Physics sim
Show labels
Curved edges
StateTypeTransitions
No automaton built yet
Theory Reference
REGEX · ε-NFA · NFA · DFA · FORMAL LANGUAGE THEORY
Core Concepts
🔤
Regular Expressions (Regex)
A regular expression is a compact notation for describing sets of strings (formal languages). Every regex denotes exactly one regular language.

Core operations:
a·b — Concatenation: string "ab"
a|b — Union: "a" or "b"
a* — Kleene star: zero or more "a"s

From these three primitives every regular language is expressible. Extensions like a+ (= aa*) and a? (= a|ε) are syntactic sugar.
🤖
Finite Automata (FA)
A finite automaton is a mathematical model of computation with a finite set of states. It reads an input string and either accepts or rejects it.

Formally: M = (Q, Σ, δ, q₀, F)
Q — finite set of states
Σ — input alphabet
δ — transition function
q₀ — start state
F ⊆ Q — accept states
〰️
ε-NFA (Non-deterministic)
An ε-NFA allows:
1. Multiple transitions on the same symbol
2. ε-transitions (spontaneous moves without reading input)

The transition function becomes δ: Q × (Σ ∪ {ε}) → 2^Q (maps to a set of states). This makes Thompson's Construction elegant — each regex operator maps to a simple NFA fragment with ε-transitions gluing them together.
DFA (Deterministic)
A DFA has exactly one transition per state per symbol — no ε-moves, no ambiguity. The transition function is δ: Q × Σ → Q.

DFAs are easier to implement and run in O(n) time in the input length. Despite being more restricted, DFAs and NFAs accept exactly the same class of languages — the regular languages (Kleene's theorem).
🔗
ε-Closure
The ε-closure of a state q is the set of all states reachable from q by following zero or more ε-transitions — without consuming any input.

Formally: ε-CLOSURE(q) = {q} ∪ {p | q →ε* p}

ε-closure is the core operation used in both NFA simulation and subset construction. It "expands" a set of active states to include all spontaneously reachable states.
⚖️
Kleene's Theorem
A language L is regular if and only if it is accepted by some finite automaton.

The three directions of proof:
1. Regex → NFA: Thompson's Construction
2. NFA → DFA: Subset (Powerset) Construction
3. DFA → Regex: State elimination / Arden's lemma

This tool demonstrates directions 1 and 2.
Thompson's Construction Algorithm
INPUT: A regular expression in postfix form
OUTPUT: An ε-NFA with exactly one start state and one accept state
KEY PROPERTY: Each NFA fragment has exactly one accept state with no outgoing transitions
1
Base case — symbol a: Create two states s, a. Add transition s →ᵃ a. Fragment = (s, a).
2
Union — r|s: New start q, new accept f. Add ε-edges q→r.start, q→s.start, r.accept→f, s.accept→f.
3
Concatenation — r·s: Merge r.accept into s.start (or add ε-edge). Fragment = (r.start, s.accept).
4
Kleene Star — r*: New start q, new accept f. Add ε-edges q→r.start, q→f, r.accept→r.start (loop), r.accept→f.
5
Result: The NFA for the full expression is the fragment left on the stack, with guaranteed unique start and accept states.
Subset (Powerset) Construction
Converts ε-NFA → DFA. Each DFA state corresponds to a set of NFA states. A DFA state is accepting if it contains any NFA accepting state. Worst case: 2^n DFA states for n NFA states.
1
Start: DFA start state = ε-CLOSURE({q₀})
2
For each unmarked DFA state S and each symbol a ∈ Σ: Compute MOVE(S,a) = ε-CLOSURE(⋃_{q∈S} δ(q,a))
3
Add transition: S →ᵃ MOVE(S,a). If MOVE(S,a) is new, add it as an unmarked DFA state.
4
Mark S as done. Repeat until no unmarked states remain.
5
Accept states: Any DFA state S where S ∩ F ≠ ∅ (F = NFA accept states).
Regex Operator Precedence & Associativity
OperatorSymbolPrecedenceAssociativityDescription
Kleene Star*Highest (3)LeftZero or more repetitions
Plus+3LeftOne or more (= aa*)
Optional?3LeftZero or one (= a|ε)
Concatenation· (implicit)2LeftSequence of two expressions
Union|Lowest (1)LeftEither left or right
Grouping( )N/AN/AOverrides precedence
Regular Language Properties
Closure Properties
Regular languages are closed under:
Union: L₁ ∪ L₂
Intersection: L₁ ∩ L₂
Complement: Σ* \ L
Concatenation: L₁·L₂
Kleene star: L*
Reversal: L^R
Homomorphism: h(L)
Pumping Lemma
For any regular language L, there exists a pumping length p such that any string w ∈ L with |w| ≥ p can be split w = xyz where:
|xy| ≤ p
|y| ≥ 1
∀i ≥ 0: xy^i z ∈ L

Used to prove a language is not regular by contradiction.
Myhill-Nerode Theorem
A language L is regular iff the number of Myhill-Nerode equivalence classes is finite.

Two strings x, y are equivalent (x ≡_L y) if for all z: xz ∈ L ↔ yz ∈ L.

The minimum DFA for L has exactly one state per equivalence class — this is the theoretical foundation for DFA minimization.
Chomsky Hierarchy
Regular languages sit at the bottom of the Chomsky hierarchy:

Type 3: Regular (FA, regex)
Type 2: Context-Free (PDA, CFG)
Type 1: Context-Sensitive (LBA)
Type 0: Recursively Enumerable (TM)

Each class strictly contains the class below it.
Enter a regular expression and click Convert
to visualize the automaton with physics simulation
Ready — enter a regex to begin |
States: — Transitions: — Σ: —