## Prerequisites
- Python 3.9-3.12
- pip package manager
- An image file to test with (any common format: JPG, PNG, etc.)
## Step 1: Install Mozo

Install via pip:
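Assuming the package is published on PyPI under the name `mozo` (an assumption; check the project's install docs), installation is a single command:

```shell
# Install Mozo from PyPI (package name assumed)
pip install mozo
```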
Verify installation:
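A quick check might look like the following; the `--version` flag is an assumption, though most CLIs support it:

```shell
# Print the installed version to confirm the CLI is on your PATH
mozo --version
```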
## Step 2: Start the Server

Start the Mozo server:
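Starting the server is one command (`mozo start` is the command referenced later on this page):

```shell
# Launch the Mozo HTTP server on the default port (8000)
mozo start
```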
You should see the server running at http://localhost:8000, with all 35+ models ready to use.
Models load on first access (lazy loading), not at startup. The server starts in seconds regardless of how many models are available.
## Step 3: Make Your First Prediction

Choose your preferred method:

- cURL
- Python
- JavaScript
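With cURL, a request might look like the sketch below. The model id (`detectron2/mask_rcnn_R_50_FPN_3x`) is the one this quickstart uses, but the endpoint path and the form field name are assumptions; check the REST API Reference for the exact route:

```shell
# Hypothetical prediction request: the /predict/... path and the
# "image" form field are assumptions, not the documented route
curl -X POST "http://localhost:8000/predict/detectron2/mask_rcnn_R_50_FPN_3x" \
  -F "image=@path/to/image.jpg"
```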
Success! You’ve made your first prediction. The model detected objects in your image and returned bounding boxes, class names, confidence scores, and segmentation masks.
## Alternative: Python SDK (No Server)

You can also use Mozo directly in Python without starting the HTTP server. This is ideal for embedding in applications or when you only need Python access.

- Object Detection
- Text Recognition
- Depth Estimation
- Visual Q&A
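A sketch of direct SDK use: `ModelManager` is the class named below, but the import path and the method names (`get_model`, `predict`) are assumptions, so treat this as the shape of the API rather than its exact surface:

```python
from mozo import ModelManager  # import path is an assumption

# Create a manager; models load lazily on first use, as with the server
manager = ModelManager()

# Object detection with the same model family used in Step 3
model = manager.get_model("detectron2/mask_rcnn_R_50_FPN_3x")  # method name assumed
result = model.predict("path/to/image.jpg")                    # method name assumed
print(result)
```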
**REST API vs Python SDK:**

- **REST API** (`mozo start`): Use when you need HTTP access, multi-language support, or want all models instantly available
- **Python SDK** (`ModelManager`): Use when embedding in Python apps, avoiding HTTP overhead, or needing direct memory access
## What Just Happened?
- Mozo received your request and identified which model to use (`detectron2/mask_rcnn_R_50_FPN_3x`)
- The model loaded (takes a few seconds on first access, then cached)
- Inference ran on your image
- Results returned in a unified JSON format
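Because every model answers in one unified JSON shape, client code stays model-agnostic. A sketch of consuming such a response in Python; the field names here are assumptions for illustration, not the documented schema:

```python
import json

# A hypothetical unified response; field names are assumptions
raw = """
{
  "model": "detectron2/mask_rcnn_R_50_FPN_3x",
  "predictions": [
    {"class_name": "dog", "confidence": 0.97, "bbox": [12.0, 34.0, 200.0, 180.0]}
  ]
}
"""

response = json.loads(raw)
for pred in response["predictions"]:
    # Each prediction carries its class, score, and box in one place
    print(pred["class_name"], pred["confidence"], pred["bbox"])
```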
## Try Other Models

Now that the server is running, try different models:

- Object Detection (Faster R-CNN)
- Text Recognition (EasyOCR)
- Depth Estimation
- Visual Question Answering
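As a sketch, these might follow the same request pattern as Step 3. Every path, model identifier, and field name below is an assumption for illustration; consult the REST API Reference and Model Families docs for the real ones:

```shell
# Object detection (Faster R-CNN); identifier assumed
curl -X POST "http://localhost:8000/predict/detectron2/faster_rcnn_R_50_FPN_3x" \
  -F "image=@path/to/image.jpg"

# Text recognition (EasyOCR); identifier assumed
curl -X POST "http://localhost:8000/predict/easyocr/en" \
  -F "image=@path/to/image.jpg"

# Visual question answering (Qwen2.5-VL); identifier and "question" field assumed
curl -X POST "http://localhost:8000/predict/qwen/qwen2.5-vl-7b" \
  -F "image=@path/to/image.jpg" \
  -F "question=What is in this image?"
```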
## Explore Available Models

List all model families to see everything the server can run.

## Server Options

Customize server behavior (host, port, and similar settings) when starting it.

## Next Steps
- **Choosing a Model**: Learn which model to use for different tasks
- **REST API Reference**: Explore all API endpoints and options
- **Model Families**: See detailed documentation for each model family
## Troubleshooting
**Model loading is slow on first request**
This is expected behavior. Models load on first access and can take several seconds depending on model size. Subsequent requests to the same model are fast (cached in memory).
**Out of memory error**
Some models (like Qwen2.5-VL 7B) require significant RAM (16GB+). Either:
- Use a smaller variant
- Manually unload other models first
- Enable aggressive cleanup:

  ```shell
  curl -X POST "http://localhost:8000/models/cleanup?inactive_seconds=60"
  ```
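Manually unloading a single model might look like the following; only the cleanup endpoint above comes from this page, and this DELETE route is an assumption, so verify it in the REST API Reference:

```shell
# Hypothetical unload call; route is an assumption
curl -X DELETE "http://localhost:8000/models/detectron2/mask_rcnn_R_50_FPN_3x"
```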
**`ImportError: detectron2 not installed`**
Detectron2 requires platform-specific installation. See Detectron2 installation guide.
**MPS/GPU issues on macOS**
Some models have limited MPS support. If you encounter errors, the server automatically falls back to CPU (set via `PYTORCH_ENABLE_MPS_FALLBACK=1`).