The Problem
Deploying machine learning models traditionally requires complex infrastructure:
- Docker containers for consistent environments
- Kubernetes for orchestration and scaling
- Cloud services for hosting and model management
- API gateways for routing and authentication
- Monitoring systems for health checks and performance
The Solution
Mozo is a universal model server that eliminates deployment complexity.
Key Features
35+ Pre-Configured Models
Object detection, OCR, depth estimation, vision-language models, and more. All variants tested and ready to use.
Zero Deployment Overhead
No Docker, Kubernetes, or cloud services needed. Install with pip, start with one command.
Automatic Memory Management
Models load on first access and automatically unload when inactive. Efficient memory usage without manual intervention.
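This load-on-first-use, unload-when-idle pattern can be sketched in a few lines. The cache below is purely illustrative (the class name, TTL default, and eviction strategy are assumptions, not Mozo's internals):

```python
import time

class LazyModelCache:
    """Illustrative sketch of load-on-first-access with idle eviction."""

    def __init__(self, loader, ttl_seconds: float = 300.0):
        self._loader = loader   # callable: model name -> loaded model object
        self._ttl = ttl_seconds
        self._models = {}       # name -> (model, last_access_time)

    def get(self, name):
        self._evict_idle()
        if name not in self._models:
            # First access: load the model now, not at startup.
            self._models[name] = (self._loader(name), time.monotonic())
        model, _ = self._models[name]
        self._models[name] = (model, time.monotonic())  # refresh access time
        return model

    def _evict_idle(self):
        # Drop any model that has sat unused longer than the TTL.
        now = time.monotonic()
        for name in list(self._models):
            _, last = self._models[name]
            if now - last > self._ttl:
                del self._models[name]
```

The practical effect is that startup stays fast and memory is only spent on models actually in use.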
Unified Output Format
All detection models return PixelFlow Detections format. Write once, work with any model family.
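The benefit of a single output schema is that downstream code never branches on the model family. A minimal sketch of the idea (the `Detection` fields below are illustrative only, not PixelFlow's actual `Detections` API):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    # Illustrative fields -- see PixelFlow's documentation for the real schema.
    label: str
    confidence: float
    bbox: Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def keep_confident(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Filter detections by confidence.

    Because every model family returns the same shape, this function works
    unchanged whether the detections came from Detectron2, Florence-2, etc.
    """
    return [d for d in detections if d.confidence >= threshold]
```

Post-processing, visualization, and evaluation code written once against the shared format applies to every detection model Mozo serves.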
Available Model Families
Mozo includes pre-configured variants for:
- Detectron2 (27 variants) - Object detection, instance segmentation, keypoint detection
- EasyOCR (5 variants) - Text recognition with support for 80+ languages
- PaddleOCR (5 variants) - PP-OCRv5 for high-accuracy OCR
- PPStructure (4 variants) - Document analysis with layout and table recognition
- Florence-2 (8 task variants) - Multi-task vision model
- Depth Anything (3 variants) - Monocular depth estimation
- Qwen2.5-VL (1 variant) - Vision-language understanding and VQA
- Qwen3-VL (1 variant) - VLM with chain-of-thought reasoning
- BLIP VQA (2 variants) - Visual question answering
- Stability Inpainting (1 variant) - Image inpainting with Stable Diffusion 2
- Datamarkin (dynamic variants) - Cloud-based custom model inference
Two Ways to Use Mozo
REST API (Recommended)
Start the server and make HTTP requests from any language.
Python SDK (Advanced)
Embed Mozo directly in Python applications.
Next Steps
Quickstart
Make your first prediction in under 5 minutes
Choosing a Model
Decision guide for selecting the right model
Model Families
Explore all available models and variants
Why Mozo?
- For quick prototyping: Test whether a model works for your use case in minutes, not days.
- For production deployments: Mozo handles model lifecycle, memory management, and concurrent access automatically.
- For multi-model applications: Switch between models or use multiple models simultaneously without infrastructure changes.
- For team collaboration: A consistent model serving API means everyone uses the same interface, regardless of the underlying framework.

Mozo is designed for development and moderate production workloads. For high-throughput production deployments that require GPU scaling, consider dedicated model serving solutions such as TorchServe or Triton.