v3.0.1 Stable Release

We Give Robots Vision

The open-source vision framework for edge devices. It runs DeepStream pipelines, YOLO detection, and world models at up to 60 FPS on NVIDIA Jetson, Intel NPU, and Hailo hardware.


openeyes-engine - bash

[INFO] Initializing OpenEyes Vision Engine v2.6.0

[SYSTEM] Hardware detected: Jetson Orin Nano (8GB)

[SYSTEM] CUDA Available: True | TensorRT: True

[INFO] Loading YOLO TensorRT engine... DONE

[INFO] Initializing MediaPipe FaceMesh... DONE

[INFO] Initializing MediaPipe Hands... DONE

[INFO] Starting DeepStream pipeline via appsink...

FPS: 60 | Obj: 3 | Face: 1 | Hand: thumbs_up | Pose: 1

FPS: 60 | Obj: 4 | Face: 1 | Hand: open_palm | Pose: 1

FPS: 59 | Obj: 4 | Face: 1 | Hand: open_palm | Pose: 1

Core Architecture

DeepStream Pipeline

Hardware-accelerated processing via GStreamer/DeepStream. Runs TensorRT YOLO engines directly on GPU, passing frames through appsink for low-copy Python processing.
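
As a rough illustration of the appsink hand-off described above, the sketch below assembles a gst-launch-style pipeline description in Python. The element names (v4l2src, nvvideoconvert, nvstreammux, nvinfer, appsink) are standard GStreamer/DeepStream elements, but the camera device, the resolution, and the config file name yolo_trt.txt are assumptions made for illustration, not values taken from OpenEyes.

```python
def build_pipeline(device="/dev/video0", width=1280, height=720,
                   infer_config="yolo_trt.txt"):
    """Return a pipeline description string for Gst.parse_launch().

    All default values are hypothetical placeholders.
    """
    stages = [
        f"v4l2src device={device}",
        f"video/x-raw,width={width},height={height}",
        "videoconvert",
        # Move the frame into GPU (NVMM) memory for the DeepStream elements.
        "nvvideoconvert",
        "video/x-raw(memory:NVMM),format=NV12",
        # nvstreammux batches frames; a single camera links to its sink_0 pad.
        f"m.sink_0 nvstreammux name=m batch-size=1 width={width} height={height}",
        # nvinfer runs the TensorRT engine named in its config file.
        f"nvinfer config-file-path={infer_config}",
        # Convert back to system memory in a layout Python code can read.
        "nvvideoconvert",
        "video/x-raw,format=BGRx",
        # Keep only the newest frame so a slow consumer never builds a backlog.
        "appsink name=sink emit-signals=true max-buffers=1 drop=true",
    ]
    return " ! ".join(stages)


print(build_pipeline())
```

With PyGObject installed, a string like this could be passed to Gst.parse_launch(), with a handler connected to the appsink's new-sample signal to pull each frame. Setting max-buffers=1 with drop=true trades completeness for latency: stale frames are discarded rather than queued.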

Multi-Modal Inference

Simultaneous FaceMesh, hand gesture recognition, and body pose estimation running alongside primary object detection.
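
One way to picture models running alongside primary detection is a small thread pool that fans a frame out to the secondary models while detection runs on the calling thread. This is a minimal sketch, not the OpenEyes implementation: the four model functions are hypothetical stubs that return fixed values, and the status line simply mirrors the format of the terminal output shown earlier.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for the real models; each takes a frame.
def detect_objects(frame):
    return ["person", "cup", "chair"]


def detect_faces(frame):
    return ["face_0"]


def classify_gesture(frame):
    return "thumbs_up"


def estimate_poses(frame):
    return ["pose_0"]


SECONDARY = {"face": detect_faces, "hand": classify_gesture, "pose": estimate_poses}


def process_frame(frame, pool):
    # Submit the secondary models first so they run while detection executes here.
    futures = {name: pool.submit(fn, frame) for name, fn in SECONDARY.items()}
    result = {"obj": detect_objects(frame)}
    for name, future in futures.items():
        result[name] = future.result()
    return result


def status_line(fps, r):
    return (f"FPS: {fps} | Obj: {len(r['obj'])} | Face: {len(r['face'])} "
            f"| Hand: {r['hand']} | Pose: {len(r['pose'])}")


with ThreadPoolExecutor(max_workers=3) as pool:
    line = status_line(60, process_frame(frame=None, pool=pool))
print(line)
```

Threads only help here if the underlying inference calls release Python's global interpreter lock, which native runtimes commonly do. Otherwise separate processes would be the safer assumption.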

ROS2 Native

Publishes telemetry on dedicated topics such as detections, depth, and pose, using a multithreaded executor.

World Models

Predictive planning with LeWM- and V-JEPA-style world models for temporal awareness, safety checks, and forward simulation.
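
To make forward simulation and safety checks concrete, the sketch below rolls candidate action sequences through a predictive model and rejects any whose predicted states come too close to an obstacle. The model here is a toy 1-D constant-acceleration step substituted for a learned world model such as LeWM. The function names, the obstacle position, and the safety margin are all hypothetical.

```python
def predict(state, action, dt=0.005):
    """Toy stand-in for a learned world model: one 1-D physics step.

    state is (position_m, velocity_mps); action is acceleration in m/s^2.
    dt=0.005 s is an assumed step size (a 200 Hz update rate).
    """
    pos, vel = state
    vel = vel + action * dt
    pos = pos + vel * dt
    return (pos, vel)


def rollout(state, actions):
    """Forward-simulate a sequence of actions and return every predicted state."""
    states = []
    for action in actions:
        state = predict(state, action)
        states.append(state)
    return states


def is_safe(states, obstacle_pos, margin=0.1):
    """Safety check: every predicted position must stay `margin` short of the obstacle."""
    return all(pos < obstacle_pos - margin for pos, _ in states)


def plan(state, candidates, obstacle_pos):
    """Pick the safe candidate that travels farthest; return None if none is safe."""
    best = None
    for actions in candidates:
        states = rollout(state, actions)
        if is_safe(states, obstacle_pos):
            if best is None or states[-1][0] > best[1]:
                best = (actions, states[-1][0])
    return best[0] if best else None


# One-second horizon (200 steps) starting at 0 m, moving at 1 m/s toward an obstacle at 1 m.
candidates = {"accelerate": [2.0] * 200, "cruise": [0.0] * 200, "brake": [-1.0] * 200}
chosen = plan((0.0, 1.0), list(candidates.values()), obstacle_pos=1.0)
label = next(name for name, seq in candidates.items() if seq == chosen)
print(label)
```

In this setup only the braking sequence keeps every predicted position clear of the obstacle, so it is the one selected. A real planner would score candidates with a task objective rather than distance travelled.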

Performance Benchmarks

Tested on NVIDIA Jetson Orin Nano (8GB) in MAXN mode.

Configuration	Models Active	Frame Rate	Latency
Detection Only (INT8)	YOLO TensorRT	60 FPS	16 ms
Minimal Pipeline	Detection + Depth + Tracking	35-40 FPS	28 ms
Full Pipeline	Detection + Face + Gesture + Pose	25-30 FPS	38 ms
World Model Planning	LeWM (Inference)	200 Hz	5 ms
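
The frame-rate and latency columns are roughly reciprocal, which the short calculation below makes explicit. It assumes the latency figure is the end-to-end time per frame of a strictly serial pipeline. The table does not say how latency was measured, so treat this as a sanity check rather than a description of the benchmark method.

```python
def frame_budget_ms(rate_hz):
    """Time available per frame (or per planning step) at a given rate, in ms."""
    return 1000.0 / rate_hz


def sustained_rate_hz(latency_ms):
    """Highest rate a strictly serial stage with this latency could sustain."""
    return 1000.0 / latency_ms


# Latency values copied from the table above.
rows = [("Detection Only", 16), ("Minimal Pipeline", 28),
        ("Full Pipeline", 38), ("World Model Planning", 5)]

for label, latency in rows:
    print(f"{label}: {latency} ms -> {sustained_rate_hz(latency):.1f} Hz")
```

Under that assumption, 28 ms corresponds to about 35.7 FPS and 38 ms to about 26.3 FPS, both inside the quoted ranges, while a 60 FPS target leaves a budget of about 16.7 ms per frame.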