PyPongAI

The Research Question

Can evolution discover how to play Pong without hand-coded rules?

This platform demonstrates that it can: agents progress from Generation 0 (0% win rate, random movement) to Generation 50 (98% win rate, near-perfect predictive tracking).

The Arc

Generation 0:

0% win rate
Random paddle movement
No trajectory prediction
Hits the ball only by chance

Generation 10:

15% win rate
Basic ball tracking
Simple reaction to ball position
Still misses frequently

Generation 25:

65% win rate
Predictive tracking emerges
Anticipates ball bounces
Strategic positioning

Generation 50:

98% win rate
Near-perfect predictive tracking
Anticipates player patterns
Near-optimal play
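
For reference, the behavior that "predictive tracking" approximates can be written analytically. The sketch below is not part of the evolved agents, which receive no hand-coded rules. It is a hand-written baseline that assumes a point-sized ball, constant velocity, and perfectly elastic top and bottom walls; the function name and parameters are illustrative and do not come from the PyPongAI code.

```python
def predict_intercept_y(ball_x, ball_y, vel_x, vel_y, paddle_x, field_height):
    """Return the y-coordinate at which the ball reaches paddle_x.

    Returns None when the ball is not travelling toward the paddle.
    """
    if vel_x == 0:
        return None
    time_to_paddle = (paddle_x - ball_x) / vel_x
    if time_to_paddle < 0:
        return None  # ball is moving away from the paddle

    # Where the ball would be if the walls did not exist.
    raw_y = ball_y + vel_y * time_to_paddle

    # Bouncing between y=0 and y=field_height is equivalent to "folding"
    # the unbounded path, which repeats with period 2 * field_height.
    period = 2 * field_height
    folded = raw_y % period
    return folded if folded <= field_height else period - folded


# Ball at (0, 50) moving right and upward in a field 100 units tall.
print(predict_intercept_y(0, 50, 10, 12, paddle_x=100, field_height=100))
```

An agent that only reacts to the current ball position chases `ball_y`; an agent that predicts moves toward the folded intercept instead.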

Three Key Insights

1. The Gap Between Random & Trained

The visual difference between Generation 0 and Generation 50 is dramatic. Random flailing becomes precise, predictive movement. The platform shows this progression side-by-side.

2. Dual Architecture: Speed vs Interpretability

Headless Simulator: Optimized for 500x faster-than-real-time training
Visual Pong: Verifies that trained agents actually work, through high-fidelity verification runs and matches against human players

The headless mode enables rapid iteration. The visual mode provides verification and human-comparable benchmarks.
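
A minimal sketch of the headless idea, assuming the physics lives in a step function with no drawing and no frame-rate cap. `PongState`, `step`, `run_headless`, and the 800x600 field are hypothetical names and values chosen for illustration, not taken from the PyPongAI source, and the side walls stand in for paddles to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class PongState:
    ball_x: float = 400.0
    ball_y: float = 300.0
    vel_x: float = 5.0
    vel_y: float = 3.0
    width: float = 800.0
    height: float = 600.0


def step(state: PongState) -> None:
    """Advance the physics by one tick. No rendering, no sleeping."""
    state.ball_x += state.vel_x
    state.ball_y += state.vel_y

    # Top and bottom walls: reverse vertical velocity and clamp to the wall.
    if state.ball_y < 0 or state.ball_y > state.height:
        state.vel_y = -state.vel_y
        state.ball_y = max(0.0, min(state.height, state.ball_y))

    # Side walls stand in for paddles in this sketch.
    if state.ball_x < 0 or state.ball_x > state.width:
        state.vel_x = -state.vel_x
        state.ball_x = max(0.0, min(state.width, state.ball_x))


def run_headless(ticks: int) -> PongState:
    """Run the simulation as fast as the CPU allows."""
    state = PongState()
    for _ in range(ticks):
        step(state)
    return state


final = run_headless(10_000)
```

Under this design a visual front end would call the same `step` once per frame and then draw, so both modes share one set of physics and differ only in whether frames are rendered and rate-limited.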

3. Novelty Search: Evolution’s Creativity Engine

Pure ELO: All agents often converge to a single “safe” strategy
ELO + Novelty: Population maintains diversity, enabling agents to discover non-obvious, superior solutions by rewarding unique behaviors

Novelty search is crucial. Under rating pressure alone, the population collapses onto one strategy and alternatives that might eventually outperform it are never explored. Rewarding behavioral novelty keeps those alternatives alive long enough to be refined.
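
This README does not state how novelty is measured or how it is combined with ELO. A common formulation, sketched here as an assumption, scores an agent by the mean distance from its behavior descriptor to its k nearest neighbors in an archive of past behaviors, then adds a weighted novelty bonus to the rating. The descriptor contents, `k`, and `novelty_weight` are all illustrative.

```python
import math


def novelty(behavior, archive, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest archive entries."""
    if not archive:
        return 0.0
    distances = sorted(math.dist(behavior, other) for other in archive)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


def combined_fitness(elo, novelty_score, novelty_weight=10.0):
    """Rating plus a weighted bonus for behaving unlike previous agents."""
    return elo + novelty_weight * novelty_score


# Hypothetical 2-D behavior descriptors, e.g. (mean paddle y, mean rally length).
archive = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
score = novelty((0.0, 0.0), archive, k=2)
print(score, combined_fitness(1200, score))
```

With a weight of zero this reduces to pure ELO selection, which makes the weight a convenient knob for comparing the two regimes.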

Technical Highlights

NEAT (NeuroEvolution of Augmenting Topologies): Evolves both the weights and the topology of neural networks, starting from minimal complexity and complexifying only as needed to achieve higher fitness.
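
To illustrate what "complexifying" means, here is a sketch of NEAT's add-node mutation: an existing connection is disabled and replaced by two connections through a new hidden node. The incoming connection gets weight 1.0 and the outgoing one inherits the old weight, so the network's behavior changes as little as possible at first. This is a simplified stand-alone illustration with hypothetical class and function names, not neat-python's internal code.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Connection:
    src: int
    dst: int
    weight: float
    enabled: bool = True
    innovation: int = 0


def add_node(connections, index, new_node_id, innovations):
    """Split connections[index] by inserting a new hidden node."""
    old = connections[index]
    old.enabled = False  # the old gene stays in the genome, disabled
    connections.append(
        Connection(old.src, new_node_id, 1.0, True, next(innovations))
    )
    connections.append(
        Connection(new_node_id, old.dst, old.weight, True, next(innovations))
    )
    return connections


innovations = count(1)
genome = [Connection(src=0, dst=2, weight=0.7, innovation=next(innovations))]
add_node(genome, 0, new_node_id=3, innovations=innovations)
for gene in genome:
    print(gene)
```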

RNNs for Temporal Memory: Recurrent connections let agents maintain a “memory” of ball velocity and previous states, which is critical for trajectory prediction.
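
Why memory matters can be seen in a toy example. Assuming a network is fed only the current ball position, velocity cannot be read from a single frame, but it can be recovered once some state persists between activations. The class below is a hand-written stand-in for a recurrent node, not an evolved network.

```python
class RecurrentVelocityUnit:
    """Toy recurrent unit whose hidden state is the previous input."""

    def __init__(self):
        self.prev_y = None  # hidden state carried between activations

    def activate(self, ball_y):
        # On the first frame there is no history, so report zero velocity.
        velocity = 0.0 if self.prev_y is None else ball_y - self.prev_y
        self.prev_y = ball_y
        return velocity


unit = RecurrentVelocityUnit()
velocities = [unit.activate(y) for y in (100.0, 104.0, 108.0, 105.0)]
print(velocities)
```

In neat-python, recurrent networks are built with `neat.nn.RecurrentNetwork`; which inputs PyPongAI feeds its networks is not stated in this README.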

ELO-Based Competitive Training: Agents are ranked using a standard ELO system, ensuring that fitness isn’t just a static score but a reflection of the agent’s performance relative to the evolving population.
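
The standard rating update can be sketched as follows. The K-factor of 32 is a conventional default chosen for the example; the value PyPongAI uses is not stated here.

```python
def expected_score(rating_a, rating_b):
    """Probability-like expected score of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a, rating_b, score_a, k=32):
    """Return updated (rating_a, rating_b).

    score_a is 1.0 for a win by A, 0.0 for a loss, 0.5 for a draw.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated agents; A wins.
print(update_elo(1200, 1200, 1.0))
```

Because the gain depends on the opponent's rating, beating a weak agent is worth little and beating a strong one is worth a lot, which is what ties fitness to the current population rather than to a fixed benchmark.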

League System & Gamification: Models are automatically categorized into ELO-based tiers (Bronze, Silver, Gold, Platinum), providing a clear progression path for the neuroevolution process.
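
Tier assignment of this kind reduces to a lookup against sorted rating cutoffs. The cutoffs below are hypothetical placeholders; only the four tier names come from this README.

```python
from bisect import bisect_right

# Hypothetical ratings at which Silver, Gold, and Platinum begin.
TIER_CUTOFFS = [1100, 1300, 1500]
TIER_NAMES = ["Bronze", "Silver", "Gold", "Platinum"]


def tier_for(elo):
    """Map a rating to its tier name; ratings below the first cutoff are Bronze."""
    return TIER_NAMES[bisect_right(TIER_CUTOFFS, elo)]


for rating in (1000, 1100, 1499, 1500):
    print(rating, tier_for(rating))
```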

Research Value

This platform demonstrates:

Neuroevolution Fundamentals: Implementation of NEAT in a dynamic environment
Advanced RL Techniques: RNNs, Novelty Search, and Curriculum Learning
Production-Grade System Design: High-performance dual-architecture simulator

Audience

This is a research demo for ML researchers, not a game for players. The value is in watching the evolution process, not in playing against the final AI.

Quick Start

Installation:

git clone https://github.com/rfd62794/PyPongAI.git
cd PyPongAI
pip install pygame neat-python numpy

Training (Research Mode):

python train.py --mode research --generations 50 --visual

Playing Against a Trained AI:

python play.py --model data/models/best_genome.pkl

Comparing Gen 0 vs Gen 50:

Run training to generate recordings
Launch the app: python main.py
Navigate to Analytics → Compare

Lessons Learned

Novelty search is crucial. Pure ELO optimization leads to convergence on a single “safe” strategy. Adding novelty rewards unique behaviors, enabling the population to discover non-obvious, superior solutions.

Dual architecture pays dividends. The headless simulator enables 500x faster training, while the visual Pong provides verification and human-comparable benchmarks.

GitHub: PyPongAI (https://github.com/rfd62794/PyPongAI)

Built with Python, NEAT, and Pygame. Neuroevolution research platform demonstrating emergent behavior.