The ESP32-S3 AI Camera combines a wide-angle IR camera, microphone and speaker with edge AI processing and Wi-Fi connectivity. It is ideal for smart monitoring, security, and voice-enabled IoT projects.
The ESP32-S3 AI Camera is an intelligent camera module built around the ESP32-S3 chip, designed for efficient video processing, edge AI tasks and voice interaction. It features a wide-angle infrared camera, along with an onboard microphone and speaker, making it well suited to applications such as electronic peepholes, baby monitors and licence plate recognition.
With built-in AI processing and Wi-Fi connectivity, the module supports edge-based image recognition as well as interaction with online AI models. It integrates easily into IoT systems, making it a practical choice for projects ranging from security and surveillance to smart devices and AI assistants.
Please note: on the V1.1 board, the Gravity interface provides a 3.3V output, so do not connect it to an input power supply.
Features
Edge AI processing using the ESP32-S3 neural network engine
Supports on-device image recognition with platforms including Edge Impulse, YOLOv5 and OpenCV
Enables object detection and image classification without relying on the cloud
Supports voice-controlled commands through ChatGPT integration
Combines local AI processing with cloud-based model interaction for flexible IoT deployments
Includes extensive documentation and example code, covering:
Camera setup, video streaming and audio recording
Image recognition and object classification
OpenCV contour detection
OpenAI integration for voice and image recognition
Custom AI model training with Edge Impulse
Integrated microphone and amplifier for:
Voice recognition (ASR)
Interactive dialogue using ChatGPT
Voice-controlled automation and hands-free device management
160° wide-angle infrared camera with infrared illumination
Built-in light sensor for automatic adaptation to lighting conditions
Suitable for 24/7 monitoring, including low-light and evening use
Ideal for baby monitoring, security surveillance and smart home systems
Wireless connectivity via Wi-Fi and Bluetooth LE 5
Supports remote video access and system control from mobile or connected devices
Suitable for real-time monitoring, smart automation and remote surveillance applications
Various AI capabilities
Edge image recognition (based on Edge Impulse)
Online image recognition (OpenCV, YOLO)
Online large models for voice and image (ChatGPT)
Equipped with a wide-angle night vision camera and infrared illumination for all-day use
Onboard microphone and amplifier for voice interaction
Offers a variety of AI models, with tutorial support for quick learning
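The automatic adaptation to lighting conditions mentioned in the features could be approached in several ways. The sketch below shows one common option, a hysteresis switch that turns the infrared illumination on in low light and off again once it is bright. It is plain Python logic rather than firmware for the module: the class name `IrIlluminationController` and both lux thresholds are hypothetical values chosen for illustration, not figures from the product documentation.

```python
from dataclasses import dataclass


@dataclass
class IrIlluminationController:
    """Decides whether the IR illumination should be on, using hysteresis."""

    # Assumed thresholds in lux; tune them for the real sensor and enclosure.
    on_below_lux: float = 10.0
    off_above_lux: float = 30.0
    ir_on: bool = False

    def update(self, lux: float) -> bool:
        """Feed one ambient light reading and return the new IR state."""
        if lux < self.on_below_lux:
            self.ir_on = True
        elif lux > self.off_above_lux:
            self.ir_on = False
        # Readings between the two thresholds keep the previous state,
        # which avoids flicker when the light level hovers near one value.
        return self.ir_on


controller = IrIlluminationController()
readings = [120.0, 25.0, 8.0, 20.0, 35.0]  # made-up sample readings
states = [controller.update(lux) for lux in readings]
print(states)  # [False, False, True, True, False]
```

Using two thresholds instead of one means a reading that drifts around a single cut-off point does not toggle the illumination repeatedly.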
Specifications
Operating Voltage: 3.3V
Type-C Input Voltage: 5V DC
VIN Input Voltage: 3.7-15V DC
Operating Temperature: -10 to 60°C
Module Size: 42 × 42 mm
OV3660: 160° wide-angle infrared camera
IR: Infrared illumination (IO47)
MIC: I2S PDM microphone
LED: Onboard LED (IO3)
ALS: LTR-308 ambient light sensor
ESP32-S3: ESP32-S3R8 chip
SD: SD card slot
Flash: 16MB Flash
VIN: 3.7-15V DC input
HM6245: Power chip
Type-C: USB Type-C interface for power and code uploading
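When planning a project around the board, it can help to keep the pin assignments and power limits from the list above in one place. The following is a minimal sketch in plain Python, not vendor code: the constant names and the helper `vin_supply_ok` are hypothetical, while the numeric values are taken from the specifications above.

```python
# Values copied from the specification list; the names are illustrative only.
IR_LED_PIN = 47             # IR: infrared illumination (IO47)
STATUS_LED_PIN = 3          # LED: onboard LED (IO3)
VIN_RANGE_V = (3.7, 15.0)   # VIN input voltage range, DC
USB_C_INPUT_V = 5.0         # Type-C input voltage, DC


def vin_supply_ok(voltage: float) -> bool:
    """Return True if a DC supply voltage is within the stated VIN range."""
    low, high = VIN_RANGE_V
    return low <= voltage <= high


# Check a few candidate supply voltages against the VIN range.
candidates = [3.3, 5.0, 12.0, 24.0]
results = {volts: vin_supply_ok(volts) for volts in candidates}
print(results)  # {3.3: False, 5.0: True, 12.0: True, 24.0: False}
```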