PyPongAI
The Research Question
Can evolution discover how to play Pong without hand-coded rules?
This platform demonstrates that it can: agents progress from Generation 0 (0% win rate, random movement) to Generation 50 (98% win rate, near-perfect predictive tracking).
The Arc
Generation 0:
- 0% win rate
- Random paddle movement
- No trajectory prediction
- Returns the ball only by chance
Generation 10:
- 15% win rate
- Basic ball tracking
- Simple reaction to ball position
- Still misses frequently
Generation 25:
- 65% win rate
- Predictive tracking emerges
- Anticipates ball bounces
- Strategic positioning
Generation 50:
- 98% win rate
- Near-perfect predictive tracking
- Anticipates player patterns
- Near-optimal play
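For reference, the "predictive tracking" that emerges around Generation 25 can be compared against what a hand-coded rule would compute. The sketch below is not code from the project, and the evolved networks are never given this formula. It only illustrates the target behavior. The function name and all parameters are hypothetical, and it assumes a point-sized ball, constant velocity, and perfectly reflecting top and bottom walls.

```python
def predict_intercept_y(ball_x, ball_y, ball_vx, ball_vy, paddle_x, field_height):
    """Predict the ball's y position when it reaches paddle_x.

    Wall bounces are handled by "unfolding" the trajectory: let the ball
    travel in a straight line as if the walls were absent, then fold the
    result back into the range [0, field_height].
    """
    if ball_vx == 0:
        raise ValueError("ball has no horizontal velocity")
    time_to_paddle = (paddle_x - ball_x) / ball_vx
    if time_to_paddle < 0:
        raise ValueError("ball is moving away from the paddle")

    # Straight-line position, ignoring the walls.
    unfolded_y = ball_y + ball_vy * time_to_paddle

    # A full up-and-down bounce cycle covers twice the field height.
    period = 2 * field_height
    folded = unfolded_y % period
    return folded if folded <= field_height else period - folded


# Ball starts mid-field and bounces off both walls before arriving.
print(predict_intercept_y(0, 50, 10, 20, 100, 100))
```

A Generation 10 agent that only reacts to the current ball position is effectively chasing `ball_y`, while a predictive agent behaves as if it were moving toward the folded intercept.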
Three Key Insights
1. The Gap Between Random & Trained
The visual difference between Generation 0 and Generation 50 is dramatic. Random flailing becomes precise, predictive movement. The platform shows this progression side-by-side.
2. Dual Architecture: Speed vs Interpretability
- Headless Simulator: Optimized for 500x faster-than-real-time training
- Visual Pong: Proof it actually works via high-fidelity verification and human matches
The headless mode enables rapid iteration. The visual mode provides verification and human-comparable benchmarks.
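A minimal sketch of what a headless match loop might look like: the physics advances one tick at a time with no rendering and no frame-rate cap, so a match finishes as fast as the CPU allows. This is an illustration, not the project's simulator. The field size, paddle size, speeds, and every name here (`State`, `step`, `run_headless`, `track_ball`, `stay_still`) are assumptions.

```python
from dataclasses import dataclass

WIDTH, HEIGHT = 400.0, 300.0  # hypothetical field size
PADDLE_HALF = 30.0            # hypothetical half-height of a paddle
PADDLE_SPEED = 4.0            # hypothetical paddle speed per tick


@dataclass
class State:
    ball_x: float = WIDTH / 2
    ball_y: float = HEIGHT / 2
    ball_vx: float = 3.0
    ball_vy: float = 2.0
    left_y: float = HEIGHT / 2
    right_y: float = HEIGHT / 2


def clamp(value, low, high):
    return max(low, min(high, value))


def step(s, left_action, right_action):
    """Advance one tick. Actions are -1, 0, or +1.

    Returns "left" or "right" when that side scores, otherwise None.
    """
    s.left_y = clamp(s.left_y + left_action * PADDLE_SPEED, PADDLE_HALF, HEIGHT - PADDLE_HALF)
    s.right_y = clamp(s.right_y + right_action * PADDLE_SPEED, PADDLE_HALF, HEIGHT - PADDLE_HALF)
    s.ball_x += s.ball_vx
    s.ball_y += s.ball_vy

    # Bounce off the top and bottom walls.
    if s.ball_y <= 0 or s.ball_y >= HEIGHT:
        s.ball_vy = -s.ball_vy
        s.ball_y = clamp(s.ball_y, 0.0, HEIGHT)

    # At either edge: bounce if the paddle covers the ball, else concede.
    if s.ball_x <= 0:
        if abs(s.ball_y - s.left_y) > PADDLE_HALF:
            return "right"
        s.ball_vx, s.ball_x = abs(s.ball_vx), 0.0
    elif s.ball_x >= WIDTH:
        if abs(s.ball_y - s.right_y) > PADDLE_HALF:
            return "left"
        s.ball_vx, s.ball_x = -abs(s.ball_vx), WIDTH
    return None


def track_ball(s, paddle_y):
    """Reactive policy: move toward the ball's current y position."""
    if s.ball_y > paddle_y + 1:
        return 1
    if s.ball_y < paddle_y - 1:
        return -1
    return 0


def stay_still(s, paddle_y):
    return 0


def run_headless(left_policy, right_policy, max_steps=10_000):
    """Play one match with no rendering. Returns (winner, ticks played)."""
    s = State()
    for tick in range(max_steps):
        winner = step(s, left_policy(s, s.left_y), right_policy(s, s.right_y))
        if winner is not None:
            return winner, tick + 1
    return None, max_steps


winner, ticks = run_headless(stay_still, track_ball)
print(winner, ticks)
```

In a design like this, the visual mode would call the same `step` function and add drawing plus a frame delay on top, which keeps the two modes behaviorally identical.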
3. Novelty Search: Evolution’s Creativity Engine
- Pure ELO: All agents often converge to a single “safe” strategy
- ELO + Novelty: Population maintains diversity, enabling agents to discover non-obvious, superior solutions by rewarding unique behaviors
Novelty search is crucial. Optimizing for ELO alone pushes the whole population toward one strategy and stops exploration. Rewarding behaviors that differ from those already seen keeps alternative strategies alive long enough for better ones to be found.
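One common way to score novelty is the mean distance from an agent's behavior descriptor to its nearest neighbors in an archive of past behaviors. The sketch below shows that approach under assumptions: the source does not say which behavior descriptor, distance metric, or blending rule the platform uses, so the names `novelty_score` and `blended_fitness`, the value of `k`, and the `weight` are all hypothetical.

```python
import math


def novelty_score(behavior, archive, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest archive entries.

    `behavior` is a tuple of numbers summarizing how an agent played,
    for example (mean paddle y, mean rally length). The choice of
    descriptor is an assumption made for illustration.
    """
    if not archive:
        return 0.0
    distances = sorted(math.dist(behavior, other) for other in archive)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


def blended_fitness(elo, novelty, weight=10.0):
    """Add a novelty bonus to the ELO rating (hypothetical blending rule)."""
    return elo + weight * novelty


archive = [(3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
score = novelty_score((0.0, 0.0), archive, k=2)
print(score, blended_fitness(1200, score))
```

An agent whose behavior sits far from everything in the archive receives a bonus even when its rating is mediocre, which is the mechanism that keeps a population from collapsing onto one strategy.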
Technical Highlights
NEAT (NeuroEvolution of Augmenting Topologies)
Evolves both the weights and the topology of neural networks, starting from minimal complexity and complexifying only as needed to achieve higher fitness.
RNNs for Temporal Memory
Inclusion of Recurrent Neural Networks allows agents to maintain a “memory” of ball velocity and previous states, which is critical for trajectory prediction.
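To see why recurrence matters, consider a network that only receives the ball's position each frame. A single frame says nothing about direction, but one recurrent link that carries the previous position forward is enough to recover velocity. The sketch below hand-sets the weights to make this visible. It is an illustration only: in the platform such connections would be evolved rather than written by hand, and the function name and weights are hypothetical.

```python
def run_recurrent(positions, w_in=1.0, w_rec=-1.0):
    """Linear recurrent unit: output = w_in * x_t + w_rec * x_(t-1).

    With w_in = 1 and w_rec = -1 the output is the frame-to-frame
    change in position, which is the ball's velocity per frame.
    The hidden value starts at 0, so the first output is just the
    first position and carries no velocity information yet.
    """
    hidden = 0.0
    outputs = []
    for x in positions:
        outputs.append(w_in * x + w_rec * hidden)
        hidden = x  # the recurrent link stores this frame for the next step
    return outputs


print(run_recurrent([10.0, 12.0, 14.0, 13.0]))
```

In neat-python, recurrent connections are permitted when `feed_forward = False` is set in the genome configuration, and such genomes are evaluated with `neat.nn.RecurrentNetwork`. Whether this project is configured that way is an assumption.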
ELO-Based Competitive Training
Agents are ranked using a standard ELO system, ensuring that fitness isn’t just a static score but a reflection of the agent’s performance relative to the evolving population.
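The standard ELO update can be sketched in a few lines. The formula is the conventional one. The K-factor of 32 and the function names are assumptions, since the source does not state which values the platform uses.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard ELO model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32):
    """Return the new (rating_a, rating_b) after one match.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k (the K-factor) controls how far a single match moves a rating.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # Whatever A gains, B loses, so total rating is conserved.
    return rating_a + delta, rating_b - delta


print(update_elo(1200, 1200, 1.0))
```

Because the expected score depends on the opponent's rating, beating a strong agent is worth more than beating a weak one, which is what makes the fitness signal track the evolving population rather than a fixed benchmark.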
League System & Gamification
Models are automatically categorized into ELO-based tiers (Bronze, Silver, Gold, Platinum), providing a clear progression path for the neuroevolution process.
Research Value
This platform demonstrates:
- Neuroevolution Fundamentals: Implementation of NEAT in a dynamic environment
- Advanced RL Techniques: RNNs, Novelty Search, and Curriculum Learning
- Production-Grade System Design: High-performance dual-architecture simulator
Audience
This is a research demo for ML researchers, not a game for players. The value is in watching the evolution process, not in playing against the final AI.
Quick Start
Installation:
git clone https://github.com/rfd62794/PyPongAI.git
cd PyPongAI
pip install pygame neat-python numpy
Training (Research Mode):
python train.py --mode research --generations 50 --visual
Playing Against a Trained AI:
python play.py --model data/models/best_genome.pkl
Comparing Gen 0 vs Gen 50:
- Run training to generate recordings
- Launch the app:
python main.py
- Navigate to Analytics → Compare
Lessons Learned
Novelty search is crucial. Pure ELO optimization leads to convergence on a single “safe” strategy. Adding novelty rewards unique behaviors, enabling the population to discover non-obvious, superior solutions.
Dual architecture pays dividends. The headless simulator enables 500x faster training, while the visual Pong provides verification and human-comparable benchmarks.
GitHub: https://github.com/rfd62794/PyPongAI
Built with Python, NEAT, and Pygame. Neuroevolution research platform demonstrating emergent behavior.