PyPongAI

The Problem

Conventional reinforcement learning typically relies on hand-crafted reward functions and large amounts of training data. The challenge was to demonstrate that evolution alone could discover complex behaviors without explicit rules, and to show the progression from random to skilled play.

My Approach

→ NEAT (NeuroEvolution of Augmenting Topologies) for evolving neural networks
→ RNNs (Recurrent Neural Networks) for temporal memory and trajectory prediction
→ ELO-based competitive training for relative fitness measurement
→ Novelty search to maintain population diversity
→ Dual architecture: headless simulator for training, visual Pong for verification

Key Highlights

Generation 0: 0% win rate, random movement
Generation 50: 98% win rate, perfect predictive tracking
500x faster-than-real-time training in headless mode
ELO + Novelty system prevents convergence to a single strategy

How It Works

PyPongAI is a neuroevolution research platform demonstrating that evolution can discover how to play Pong without hand-coded rules. Watch as a neural network evolves from random button-pressing to perfect ball-tracking through competitive selection and behavioral diversity.

The Concept

Starting from random networks and no hand-coded rules, the population progresses from Generation 0 (0% win rate, random movement) to Generation 50 (98% win rate, perfect predictive tracking).

Three Key Insights

1. The Gap Between Random & Trained

Generation 0: 0% win rate, random paddle movement
Generation 50: 98% win rate, perfect predictive tracking
Comparison View: Shows the two generations playing side by side

2. Dual Architecture: Speed vs Interpretability

Headless Simulator: Optimized for 500x faster-than-real-time training
Visual Pong: Confirms that trained agents actually work, through high-fidelity verification and matches against human players
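
To illustrate why a headless simulator can run far faster than real time, here is a minimal sketch of a fixed-timestep physics loop with no rendering, clock tick, or sleep. The field size, the timestep, and the names Ball, step, and run_headless are assumptions made for this example, not the project's actual code.

```python
from dataclasses import dataclass

# Hypothetical field size and timestep; the real simulator's values may differ.
WIDTH, HEIGHT = 800.0, 600.0
DT = 1.0 / 60.0  # one simulated frame at 60 FPS


@dataclass
class Ball:
    x: float
    y: float
    vx: float
    vy: float


def step(ball: Ball, dt: float = DT) -> None:
    """Advance the ball one fixed timestep, reflecting off the top and bottom walls."""
    ball.x += ball.vx * dt
    ball.y += ball.vy * dt
    if ball.y < 0.0:
        ball.y = -ball.y
        ball.vy = -ball.vy
    elif ball.y > HEIGHT:
        ball.y = 2.0 * HEIGHT - ball.y
        ball.vy = -ball.vy


def run_headless(ball: Ball, frames: int) -> Ball:
    """Simulate frames back to back; nothing here waits for a display or a clock."""
    for _ in range(frames):
        step(ball)
    return ball
```

Because the loop never waits for a display refresh, throughput is limited only by the CPU. A visual front end could call the same step function once per rendered frame, which keeps the physics identical in both modes.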

3. Novelty Search: Evolution’s Creativity Engine

Pure ELO: All agents often converge to a single “safe” strategy
ELO + Novelty: Population maintains diversity, enabling agents to discover non-obvious, superior solutions by rewarding unique behaviors
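
A common way to score novelty is the mean distance from an agent's behavior descriptor to its k nearest neighbors in the population. The sketch below assumes that formulation; the descriptor contents, the value of k, the blending weight, and all function names are hypothetical and not taken from the project.

```python
import math


def novelty(behavior, others, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest neighbours."""
    dists = sorted(math.dist(behavior, other) for other in others)
    nearest = dists[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0


def population_novelty(behaviors, k=3):
    """Novelty of every agent measured against the rest of the population."""
    return [
        novelty(b, behaviors[:i] + behaviors[i + 1:], k)
        for i, b in enumerate(behaviors)
    ]


def combined_fitness(elo, novelty_score, weight=50.0):
    """Hypothetical blend: the ELO rating plus a weighted novelty bonus."""
    return elo + weight * novelty_score


# Example descriptors, e.g. (mean paddle position, mean hit offset); purely illustrative.
behaviors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
scores = population_novelty(behaviors, k=2)
```

In this example the last agent behaves unlike the other three, so it receives the largest novelty score and the largest bonus under combined_fitness.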

Technical Highlights

NEAT (NeuroEvolution of Augmenting Topologies): Evolves both the weights and the topology of neural networks, starting from minimal complexity and complexifying only as needed to achieve higher fitness.
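
As an illustration of how a topology can grow, the sketch below implements the add-node mutation as described in the original NEAT algorithm: an existing connection is disabled and replaced by two new ones through a new node. The Connection and Genome classes are simplified stand-ins written for this example; they are not the neat-python API or the project's own genome representation.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Connection:
    src: int
    dst: int
    weight: float
    enabled: bool = True
    innovation: int = 0


@dataclass
class Genome:
    nodes: list
    connections: list


_innovation = count(1)  # global counter that tags each new structural change


def add_node(genome: Genome, conn_index: int) -> int:
    """Split one connection by inserting a new node in the middle of it."""
    old = genome.connections[conn_index]
    old.enabled = False  # the old gene stays in the genome but is switched off
    new_node = max(genome.nodes) + 1
    genome.nodes.append(new_node)
    # Incoming link gets weight 1.0 and the outgoing link keeps the old weight,
    # so the mutation initially changes the network's behaviour as little as possible.
    genome.connections.append(
        Connection(old.src, new_node, 1.0, True, next(_innovation)))
    genome.connections.append(
        Connection(new_node, old.dst, old.weight, True, next(_innovation)))
    return new_node
```

Starting from a genome with a single input-to-output connection, one call to add_node yields a three-node network with one disabled and two enabled connections.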

RNNs for Temporal Memory: Recurrent connections allow agents to maintain a “memory” of ball velocity and previous states, which is critical for trajectory prediction.
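
To see why recurrence matters, assume for illustration that the network only receives the ball's current vertical position. A single frame then carries no velocity information, but a network that remembers the previous frame can recover it. The hand-wired example below uses fixed weights chosen for clarity; an evolved network would not look like this, and the class name is hypothetical.

```python
class HandWiredRecurrentNet:
    """A linear network with one hidden unit that remembers the last input."""

    def __init__(self):
        self.hidden = 0.0  # state carried from the previous activation

    def activate(self, ball_y: float) -> float:
        # Output = current position minus the remembered position,
        # i.e. the per-frame vertical velocity.
        out = 1.0 * ball_y + (-1.0) * self.hidden
        # Update the hidden state for the next frame.
        self.hidden = 1.0 * ball_y
        return out


net = HandWiredRecurrentNet()
net.activate(100.0)           # warm-up frame: no previous position stored yet
v_down = net.activate(104.0)  # ball moved +4 since the last frame
v_up = net.activate(101.0)    # ball moved -3 since the last frame
```

A purely feed-forward network given the same input has no stored state, so it could not distinguish a ball at y = 104 moving up from one moving down.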

ELO-Based Competitive Training: Agents are ranked using a standard ELO system, ensuring that fitness isn’t just a static score but a reflection of the agent’s performance relative to the evolving population.
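
For reference, the standard ELO update after a single match can be sketched as follows. The K-factor of 32 and the starting rating of 1200 are common defaults assumed here, not values confirmed by the project, and the function names are illustrative.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard logistic ELO model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return both new ratings; score_a is 1.0 for a win, 0.5 draw, 0.0 loss."""
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the rating pool stays constant.
    return rating_a + delta, rating_b - delta


# Two evenly rated agents: the winner gains half the K-factor.
winner, loser = update_elo(1200.0, 1200.0, score_a=1.0)
```

Because the expected score depends on the opponent's rating, beating a strong agent is worth more than beating a weak one, which is what makes the rating a relative measure of fitness.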

League System & Gamification: Models are automatically categorized into ELO-based tiers (Bronze, Silver, Gold, Platinum), providing a clear progression path for the neuroevolution process.
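
Tier assignment of this kind reduces to a threshold lookup. In the sketch below the tier names come from the description above, but the rating cut-offs are invented for illustration, since the actual boundaries are not stated.

```python
import bisect

# Hypothetical cut-offs: a rating at or above a floor enters the next tier.
TIER_FLOORS = [1100, 1300, 1500]
TIER_NAMES = ["Bronze", "Silver", "Gold", "Platinum"]


def tier_for(elo: float) -> str:
    """Map an ELO rating to its league tier."""
    return TIER_NAMES[bisect.bisect_right(TIER_FLOORS, elo)]
```

Keeping the floors in one sorted list means the tiers can be rebalanced by editing a single constant.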

Quick Start

Installation:

git clone https://github.com/rfd62794/PyPongAI.git
cd PyPongAI
pip install pygame neat-python numpy

Training (Research Mode):

python train.py --mode research --generations 50 --visual

Playing Against a Trained AI:

python play.py --model data/models/best_genome.pkl

Comparing Gen 0 vs Gen 50:

Run training to generate recordings
Launch the app: python main.py
Navigate to Analytics → Compare

Research Value

This platform demonstrates:

Neuroevolution Fundamentals: Implementation of NEAT in a dynamic environment
Advanced RL Techniques: RNNs, Novelty Search, and Curriculum Learning
Production-Grade System Design: High-performance dual-architecture simulator

Lessons Learned

Novelty search is crucial. Pure ELO optimization leads to convergence on a single “safe” strategy. Adding novelty rewards unique behaviors, enabling the population to discover non-obvious, superior solutions.

Dual architecture pays dividends. The headless simulator enables 500x faster-than-real-time training, while the visual Pong provides verification and human-comparable benchmarks.

GitHub: https://github.com/rfd62794/PyPongAI

Built with Python, NEAT, and Pygame. Neuroevolution research platform demonstrating emergent behavior.