Get Mozo running and make your first model prediction in 3 simple steps.

Prerequisites

  • Python 3.9-3.12
  • pip package manager
  • An image file to test with (any common format: JPG, PNG, etc.)

Step 1: Install Mozo

Install via pip

pip install mozo
Verify installation:
mozo version

Step 2: Start the Server

Start Mozo server

mozo start
The server starts on http://localhost:8000 with all 35+ models available on demand.
You should see:
INFO:     Uvicorn running on http://0.0.0.0:8000
INFO:     Application startup complete
Models load on first access (lazy loading), not at startup. The server starts in seconds regardless of how many models are available.
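To make the lazy-loading behavior concrete, here is a minimal sketch of the general pattern. It is not Mozo's actual implementation: the LazyModelCache class and the fake_loader function are hypothetical names used only for illustration. Registering a model stores a loader function, and the loader runs only on the first request for that model.

```python
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]


class LazyModelCache:
    """Illustrative lazy-loading cache (hypothetical, not Mozo's code)."""

    def __init__(self) -> None:
        self._loaders: Dict[Key, Callable[[], object]] = {}
        self._loaded: Dict[Key, object] = {}

    def register(self, family: str, variant: str,
                 loader: Callable[[], object]) -> None:
        # Registration is cheap: only the loader function is stored.
        self._loaders[(family, variant)] = loader

    def get(self, family: str, variant: str) -> object:
        key = (family, variant)
        if key not in self._loaded:
            # First access: run the (potentially slow) loader exactly once.
            self._loaded[key] = self._loaders[key]()
        return self._loaded[key]

    def loaded_keys(self) -> List[Key]:
        return sorted(self._loaded)


load_calls: List[str] = []


def fake_loader() -> dict:
    # Stand-in for an expensive model load.
    load_calls.append("loaded")
    return {"name": "stand-in model"}


cache = LazyModelCache()
cache.register("detectron2", "mask_rcnn_R_50_FPN_3x", fake_loader)
calls_before_access = len(load_calls)  # nothing has been loaded yet

first = cache.get("detectron2", "mask_rcnn_R_50_FPN_3x")
second = cache.get("detectron2", "mask_rcnn_R_50_FPN_3x")
print(calls_before_access, len(load_calls), first is second)
```

Startup cost stays constant because registration never touches model weights, which is why the number of available models does not affect how quickly the server starts.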

Step 3: Make Your First Prediction

Send an image to a model endpoint with curl:
curl -X POST "http://localhost:8000/predict/detectron2/mask_rcnn_R_50_FPN_3x" \
  -F "file=@your_image.jpg"
Response:
[
  {
    "bbox": [120.5, 95.2, 315.8, 405.9],
    "class_name": "person",
    "class_id": 0,
    "confidence": 0.95,
    "mask": [[...]]
  }
]
Success! You’ve made your first prediction. The model detected objects in your image and returned bounding boxes, class names, confidences, and segmentation masks.
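As a rough sketch of how a client might consume this response, the snippet below parses the sample JSON and summarizes each detection. It rests on two assumptions that the page does not confirm: that bbox is ordered as [x_min, y_min, x_max, y_max], and that mask can be treated as a nested list (it is replaced by an empty list here because the sample elides it). The summarize function is a hypothetical helper, not part of Mozo.

```python
import json
from typing import Any, Dict, List

# Sample response from above, with the elided mask replaced by an empty list.
SAMPLE_RESPONSE = """
[
  {
    "bbox": [120.5, 95.2, 315.8, 405.9],
    "class_name": "person",
    "class_id": 0,
    "confidence": 0.95,
    "mask": []
  }
]
"""


def summarize(detections: List[Dict[str, Any]],
              min_confidence: float = 0.5) -> List[Dict[str, Any]]:
    """Keep confident detections and report each box's width and height."""
    rows = []
    for det in detections:
        if det["confidence"] < min_confidence:
            continue
        # Assumption: bbox is [x_min, y_min, x_max, y_max] in pixels.
        x_min, y_min, x_max, y_max = det["bbox"]
        rows.append({
            "class_name": det["class_name"],
            "confidence": det["confidence"],
            "width": round(x_max - x_min, 1),
            "height": round(y_max - y_min, 1),
        })
    return rows


summary = summarize(json.loads(SAMPLE_RESPONSE))
for row in summary:
    print(f"{row['class_name']}: {row['confidence']:.2f} "
          f"({row['width']} x {row['height']})")
```

If the box format turns out to be [x, y, width, height] instead, only the unpacking line needs to change.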

Alternative: Python SDK (No Server)

You can also use Mozo directly in Python without starting the HTTP server. This is ideal for embedding in applications or when you only need Python access.
from mozo import ModelManager
import cv2

# Initialize manager
manager = ModelManager()

# Load model
model = manager.get_model('detectron2', 'mask_rcnn_R_50_FPN_3x')

# Load image and predict
image = cv2.imread('your_image.jpg')
detections = model.predict(image)

# Access results
print(f"Found {len(detections)} objects")
for det in detections.detections:
    print(f"  - {det.class_name}: {det.confidence:.2f}")
REST API vs Python SDK:
  • REST API (mozo start): Use when you need HTTP access, multi-language support, or want all models instantly available
  • Python SDK (ModelManager): Use when embedding in Python apps, avoiding HTTP overhead, or needing direct memory access

What Just Happened?

  1. Mozo received your request and identified which model to use (detectron2/mask_rcnn_R_50_FPN_3x)
  2. The model loaded (takes a few seconds on first access, then cached)
  3. Inference ran on your image
  4. Results returned in a unified JSON format

Try Other Models

Now that the server is running, try different models:

Object Detection (Faster R-CNN)

curl -X POST "http://localhost:8000/predict/detectron2/faster_rcnn_R_50_FPN_3x" \
  -F "file=@your_image.jpg"

Text Recognition (EasyOCR)

curl -X POST "http://localhost:8000/predict/easyocr/english-light" \
  -F "file=@your_image.jpg"

Depth Estimation

curl -X POST "http://localhost:8000/predict/depth_anything/small" \
  -F "file=@your_image.jpg"

Visual Question Answering

curl -X POST "http://localhost:8000/predict/qwen2.5_vl/7b-instruct" \
  -F "file=@your_image.jpg" \
  -F "prompt=What objects are visible in this image?"
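The same kind of request can be assembled from Python. The sketch below uses only the standard library to build a multipart/form-data body with a file field and a prompt field, mirroring the curl command above. The build_multipart and build_vqa_request helpers are hypothetical names, and the snippet only constructs the request without sending it. Whether the server inspects the file part's Content-Type is not stated on this page, so the MIME type guess is an assumption.

```python
import mimetypes
import urllib.request
import uuid
from typing import Dict, Tuple


def build_multipart(fields: Dict[str, str], file_field: str, filename: str,
                    file_bytes: bytes) -> Tuple[bytes, str]:
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    # Guess the MIME type from the filename, falling back to raw bytes.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    chunks = []
    for name, value in fields.items():
        chunks.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    chunks.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\n'
         f"Content-Type: {mime}\r\n\r\n").encode("utf-8")
    )
    chunks.append(file_bytes)
    chunks.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_vqa_request(url: str, filename: str, image_bytes: bytes,
                      prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with a file and a prompt."""
    body, content_type = build_multipart(
        {"prompt": prompt}, "file", filename, image_bytes)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST")


request = build_vqa_request(
    "http://localhost:8000/predict/qwen2.5_vl/7b-instruct",
    "your_image.jpg",
    b"placeholder bytes standing in for real image data",
    "What objects are visible in this image?",
)
print(request.get_method(), request.full_url, len(request.data))
```

With the server running and real image bytes, passing the request to urllib.request.urlopen and reading the response body should return the model's answer.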

Explore Available Models

List all model families:
curl http://localhost:8000/models
List variants for a specific family:
curl http://localhost:8000/models/detectron2/variants
Get detailed information about a model:
curl http://localhost:8000/models/detectron2/mask_rcnn_R_50_FPN_3x/info
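The endpoints shown on this page follow a regular path pattern, so a script can compose them from a family and variant name. The helper names below are hypothetical, and the base URL assumes the default host and port. fetch_json is defined but not called, since it needs a running server.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # default address from this page


def models_url(base: str = BASE_URL) -> str:
    return f"{base}/models"


def variants_url(family: str, base: str = BASE_URL) -> str:
    return f"{base}/models/{quote(family, safe='')}/variants"


def info_url(family: str, variant: str, base: str = BASE_URL) -> str:
    return (f"{base}/models/{quote(family, safe='')}/"
            f"{quote(variant, safe='')}/info")


def predict_url(family: str, variant: str, base: str = BASE_URL) -> str:
    return (f"{base}/predict/{quote(family, safe='')}/"
            f"{quote(variant, safe='')}")


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the JSON body (requires a running server)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


print(models_url())
print(variants_url("detectron2"))
print(info_url("detectron2", "mask_rcnn_R_50_FPN_3x"))
print(predict_url("depth_anything", "small"))
```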

Server Options

Customize server behavior:
# Custom port
mozo start --port 8080

# Custom host
mozo start --host 127.0.0.1

# Production mode with multiple workers
mozo start --workers 4

Troubleshooting

Slow first request: this is expected behavior. Models load on first access, which can take several seconds depending on model size. Subsequent requests to the same model are fast because the model is cached in memory.
Out of memory: some models (like Qwen2.5-VL 7B) require significant RAM (16GB+). Either:
  • Use a smaller variant
  • Manually unload other models first
  • Enable aggressive cleanup: curl -X POST "http://localhost:8000/models/cleanup?inactive_seconds=60"
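One way to picture what an inactive_seconds threshold means is the sketch below, which records when each model was last used and evicts those idle for at least the threshold. It illustrates the general idea only. How Mozo tracks usage internally is not described on this page, and the InactivityTracker class is a hypothetical name.

```python
import time
from typing import Dict, List, Optional


class InactivityTracker:
    """Illustrative last-used tracker (hypothetical, not Mozo's code)."""

    def __init__(self) -> None:
        self._last_used: Dict[str, float] = {}

    def touch(self, model_id: str, now: Optional[float] = None) -> None:
        # Record the time of the most recent request for this model.
        self._last_used[model_id] = time.monotonic() if now is None else now

    def cleanup(self, inactive_seconds: float,
                now: Optional[float] = None) -> List[str]:
        # Evict every model idle for at least inactive_seconds.
        current = time.monotonic() if now is None else now
        stale = [model_id for model_id, used in self._last_used.items()
                 if current - used >= inactive_seconds]
        for model_id in stale:
            del self._last_used[model_id]
        return sorted(stale)

    def active(self) -> List[str]:
        return sorted(self._last_used)


tracker = InactivityTracker()
# Explicit timestamps keep the example deterministic.
tracker.touch("qwen2.5_vl/7b-instruct", now=0.0)
tracker.touch("easyocr/english-light", now=50.0)

evicted = tracker.cleanup(inactive_seconds=60, now=70.0)
print("evicted:", evicted)
print("still loaded:", tracker.active())
```

In this run the first model has been idle for 70 seconds and is evicted, while the second has been idle for only 20 seconds and stays loaded.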
Detectron2 installation errors: Detectron2 requires platform-specific installation. See the Detectron2 installation guide.
MPS errors: some models have limited MPS support. If you encounter errors, the server automatically falls back to CPU (set via PYTORCH_ENABLE_MPS_FALLBACK=1).
For more troubleshooting tips, see the Troubleshooting Guide.