Displayless Smart Glasses: Design & Applications
- Displayless Smart Glasses are wearable devices that forgo visual displays in favor of alternative modalities such as haptics, audio, and biopotential sensing for unobtrusive interaction.
- They integrate advanced biosignal acquisition and on-device ML inference (e.g., on the GAP9 SoC) to achieve high accuracy in eye-gesture recognition, depth-based interaction, and contextual computing.
- Emerging designs emphasize long-wear comfort, energy efficiency, and robust multimodal feedback to support continuous health monitoring, accessible navigation, and privacy-centric applications.
Displayless Smart Glasses are wearable systems that dispense with integrated visual displays, instead leveraging alternative modalities—such as haptics, audio, edge inference, and biopotential sensing—to enable interaction, feedback, and multimodal computing in a familiar eyeglass form factor. Unlike AR/VR headsets and conventional smart glasses, displayless designs prioritize low-power operation, unobtrusiveness, privacy, and long-wear comfort, often targeting continuous health monitoring, gesture input, environmental sensing, and accessible computing.
1. Core Architectures and Modalities
Displayless smart glasses span a spectrum of sensing and interaction modalities, with typical architectures including:
- Biopotential Acquisition Platforms: Systems such as GAPSES integrate fully dry electrodes in the frame (SoftPulse®, Ag/AgCl elastomer; ETI ≈ 2 MΩ at DC on a 25 mm² EOG pad), channel signals into an ultra-low-noise analog front-end (input impedance >100 GΩ ∥ 5 pF, digitized at 24 bits/1 kS/s via ADS1298), and process all data on-board with a parallel RISC-V PULP SoC (GAP9, up to 15.6 GOPS DSP/32.2 GMAC/s NNet, 16×21×14 mm³ module) (Frey et al., 2024); the impedance budget these figures imply is cross-checked in the sketch below.
- EOG/Eye-Tracking Wearables: ElectraSight demonstrates fully onboard, hybrid contact/contactless EOG using five differential electrode channels (both dry-elastomer contacts and contactless ENIG copper foils), processed in real time by a 4-bit quantized tinyML CNN (79 kB, 301 μs/inference) running on GAP9, with inference accuracies of 81–92% across 10- and 6-class eye-gesture taxonomies (Schärer et al., 2024).
- Depth-Aware EOG Glasses: VergeIO places Ag/AgCl electrodes at critical anatomical sites (temples, nose bridge, mastoid) to capture small bioelectrical signals from vergence, classifies depth-based eye gestures (4–6 gestures at 80–98% accuracy) with <11 mW total budget, and employs motion-artifact gating for robust always-on operation (Zhang et al., 2 Jul 2025).
- Haptic-Feedback Navigation Aids: LLM-Glasses combine an ESP32-CAM for vision, YOLO-World for object detection, and GPT-4o for reasoning, relaying directional cues through temple-mounted haptic actuators (five-bar linkages); the system achieves 81.3% recognition across 13 haptic patterns and robust waypoint-following in navigation studies (Tokmurziyev et al., 4 Mar 2025).
- Multimodal (Gaze+Voice) Referencing: Gazeify Then Voiceify employs eye-tracking and head-mounted camera input, fusing spatiotemporal gaze clustering with segmentation (EfficientSAM), VLM-powered object descriptions (GPT-4o-mini), and error correction via conversational free-form voice, achieving correct initial gaze selection in 53% of trials and effective voice-driven correction in 58% of error cases (Zhang et al., 27 Jan 2026).
This device class typically eschews integrated visual feedback to maximize comfort (frames <50 g), minimize perceptible electronics, and extend battery life (e.g., >70 h continuous operation at ≈8.85 mW (Schärer et al., 2024); ≥8 days at <11 mW (Zhang et al., 2 Jul 2025)).
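The electrode and front-end figures quoted above imply a simple series voltage divider: the electrode–tissue impedance (ETI) in series with the amplifier's input impedance determines how much of the skin potential actually reaches the AFE. The back-of-the-envelope check below uses only the GAPSES numbers cited above; the 50 Hz evaluation frequency is an illustrative assumption.

```python
import math

# Figures quoted above for GAPSES (Frey et al., 2024); the 50 Hz test
# frequency is an illustrative assumption, not a value from the paper.
Z_ETI = 2e6    # electrode-tissue impedance, ohms (~2 MOhm at DC)
R_IN = 100e9   # AFE input resistance, ohms (>100 GOhm)
C_IN = 5e-12   # AFE input capacitance, farads (5 pF)
F = 50.0       # evaluation frequency, Hz (upper EOG/EEG band)

# Input impedance magnitude: R_IN in parallel with the 5 pF capacitance.
# |Y| = sqrt((1/R)^2 + (wC)^2), |Z_in| = 1/|Y|.
w = 2 * math.pi * F
z_in = 1.0 / math.hypot(1.0 / R_IN, w * C_IN)  # ~637 MOhm at 50 Hz

# Series divider: fraction of the skin potential seen by the AFE.
gain = z_in / (z_in + Z_ETI)
print(f"|Z_in| at {F:.0f} Hz: {z_in / 1e6:.0f} MOhm")
print(f"signal retained:  {gain * 100:.2f}% (loss {(1 - gain) * 100:.2f}%)")
```

At DC the divider is essentially lossless (100 GΩ against 2 MΩ); even at 50 Hz, where the 5 pF input capacitance dominates, under 0.5% of the signal is dropped across the dry electrode—which is why such high-impedance front-ends tolerate gel-free contacts.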
2. Signal Acquisition, Processing, and Edge Inference
Key workflows in displayless glasses involve end-to-end biosignal and sensor data handling:
- Dry Electrode Interfaces: Custom soft electrodes (e.g., GAPSES, SoftPulse®) are injection-molded in conductive elastomer and engineered for optimal contact area and pressure (e.g., 45° prong arrays at 1–2 kPa for EEG), balancing contact impedance against wearer discomfort (Frey et al., 2024). Contactless EOG (ElectraSight) extends longevity and wearability with ENIG copper electrodes embedded in the frame (Schärer et al., 2024).
- AFE and Digitization: High-impedance, low-noise op-amps and ADCs (e.g., AD8603 buffers with 68 kΩ input protection; ST1VAFE3BX with programmable 235 MΩ–2.4 GΩ input impedance) preserve signal integrity for sub-microvolt biosignals (Schärer et al., 2024, Frey et al., 2024).
- On-Device ML Inference: Specialized edge platforms (GAP9, nRF5340) execute quantized CNNs (e.g., 79 kB, 151k params, 4-bit, 301 μs inference (Schärer et al., 2024); MI-BMInet for EEG, EPIDENET for EOG in GAPSES (Frey et al., 2024)) or random forests (VergeIO, 20-dim feature vector) for rapid decoding without data offload; a minimal end-to-end sketch follows this list.
- Privacy and Reliability: All inference and filtering are performed on-board, removing the need for streaming sensitive data and mitigating Wi-Fi/BLE dropout and interception risk (Frey et al., 2024, Schärer et al., 2024). Feedback delivery leverages secondary devices (e.g., bone-conduction headsets), haptics, or audio prompts.
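As flagged in the inference item above, the edge pipeline reduces to three stages: band-limit the raw electrode stream, collapse windows into compact feature vectors, and decode with a small classifier. The sketch below is a minimal stand-in under stated assumptions—the 0.1–30 Hz passband, the chosen statistics, and the window handling are illustrative, not the cited papers' exact models (which are the quantized CNNs and random forest referenced above).

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt
from sklearn.ensemble import RandomForestClassifier

FS = 1000  # Hz, matching the 1 kS/s digitization rate quoted above

def preprocess(eog: np.ndarray) -> np.ndarray:
    """Band-limit raw EOG (channels x samples); 0.1-30 Hz is a typical
    EOG passband, assumed here rather than taken from the papers."""
    sos = butter(4, [0.1, 30.0], btype="bandpass", fs=FS, output="sos")
    return sosfiltfilt(sos, eog, axis=-1)

def features(window: np.ndarray) -> np.ndarray:
    """Collapse a filtered window into per-channel amplitude/velocity
    statistics. With 4 channels this yields a 20-dim vector, the size
    VergeIO quotes, though its actual feature set may differ."""
    vel = np.diff(window, axis=-1)
    return np.concatenate([
        window.mean(axis=-1), window.std(axis=-1),
        np.ptp(window, axis=-1),
        vel.mean(axis=-1), np.abs(vel).max(axis=-1),
    ])

def fit_decoder(windows, labels):
    """Train a lightweight gesture decoder from labeled raw windows."""
    X = np.stack([features(preprocess(w)) for w in windows])
    clf = RandomForestClassifier(n_estimators=50, max_depth=8)
    return clf.fit(X, labels)
```

On-device, the trained forest would be exported to fixed-point C; at these feature sizes a single decode fits comfortably within the millisecond-scale latency budgets quoted above.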
3. Interaction Mechanisms and Feedback Channels
With no visual overlay, displayless glasses rely on alternative interface paradigms:
- Haptic Actuation: LLM-Glasses employ temple-mounted micro-servos (five-bar linkage) to deliver distinct patterns—tapping and sliding at controlled speeds—encoding cues for navigation or event notification. Mean recognition across 13 haptic stimuli reached 81.3%, with pattern execution latency ≈1.25 s (Tokmurziyev et al., 4 Mar 2025). This approach offers spatial cueing mapped to navigation intent (e.g., slide-left for “turn left”); a hypothetical intent-to-pattern codebook is sketched after this list.
- Audio and Voice: Multimodal systems (Gazeify Then Voiceify) relay object references and system prompts via synthesized voice, with user corrections accepted through open-ended speech and parsed by an LLM. Audio remains the primary avenue for semantic and corrective feedback, but incurs latency (initial description t_voice ≈ 3.6 s, mask update t_update ≈ 5.25 s) and imposes cognitive load due to verbosity (Zhang et al., 27 Jan 2026).
- Gesture and Biopotential Command: EOG-based platforms implement low-latency, high-accuracy gesture schemes. GAPSES provides 11-class eye-movement (EOG) and 8-channel EEG interfaces spanning a broad command space—eye gestures, blinks, and rapid saccades—processed at microjoule-scale energy (Frey et al., 2024, Schärer et al., 2024). VergeIO introduces depth-based interaction via eye vergence, enabling hands-free, depth-selective commands and lens autofocus actuation (Zhang et al., 2 Jul 2025).
- Gaze Tracking: Eye-tracking via EOG or hybrid sensors allows reference selection for interface actions (e.g., object segmentation in gaze-voice pipelines). Disambiguation of selection is achieved through voice interaction and VLM parsing to handle inherent gaze noise (Zhang et al., 27 Jan 2026).
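The haptic channel described in the first item of this list behaves like a small codebook: each navigation intent maps to an actuator pattern with a motion type, side, and timing. The sketch below is a hypothetical encoding—the pattern names, repeat counts, and durations are assumptions, not LLM-Glasses' actual firmware interface; only the ≈1.25 s execution latency comes from the figures above.

```python
from dataclasses import dataclass
from enum import Enum

class Motion(Enum):
    TAP = "tap"      # discrete contact events on the temple
    SLIDE = "slide"  # continuous stroke along the temple

@dataclass(frozen=True)
class HapticPattern:
    motion: Motion
    side: str          # "left" | "right" | "both"
    repeats: int
    duration_s: float  # total pattern time

# Hypothetical codebook; ~1.25 s matches the mean pattern-execution
# latency quoted above, the rest is illustrative.
CODEBOOK = {
    "turn_left":  HapticPattern(Motion.SLIDE, "left",  1, 1.25),
    "turn_right": HapticPattern(Motion.SLIDE, "right", 1, 1.25),
    "stop":       HapticPattern(Motion.TAP,   "both",  3, 1.25),
    "proceed":    HapticPattern(Motion.TAP,   "right", 1, 0.50),
}

def render(intent: str):
    """Expand an intent into timed (motion, side, seconds) commands
    that a servo driver could execute sequentially."""
    p = CODEBOOK[intent]
    step = p.duration_s / p.repeats
    return [(p.motion.value, p.side, step)] * p.repeats
```

A small codebook with temporally distinct patterns is what makes ~81% recognition by feel alone plausible: users must discriminate the stimuli without a visual legend.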
4. System Performance and Application Domains
Performance metrics for displayless smart glasses are guided by energy efficiency, accuracy, and user experience:
- Low Power Budgets and Battery Longevity (cross-checked arithmetically in the sketch after this list):
- GAPSES achieves >12 h continuous operation at 16.28 mW with a 75 mAh cell (Frey et al., 2024).
- ElectraSight supports >3 days continuous eye-tracking at <9 mW with a 175 mAh cell (Schärer et al., 2024).
- VergeIO demonstrates continuous sensing for >8 days on a 570 mAh Li-Po (11 mW total) (Zhang et al., 2 Jul 2025).
- Inference Accuracy and Latency:
- EOG gesture classification: 96.78% (11 classes, ITR up to 161.43 bit/min) (Frey et al., 2024).
- Hybrid EOG/tinyML: 81% (10 classes), 92% (6 classes), median detection latency ≈40 ms (Schärer et al., 2024).
- Depth-aware vergence: up to 98.3% for a four-gesture set, with zero-calibration generalization (Zhang et al., 2 Jul 2025).
- Haptic recognition: 81.3% average across 13 patterns (Tokmurziyev et al., 4 Mar 2025).
- Gaze-based selection: 53% first-pass accuracy, with 58% of errors subsequently corrected via voice (Zhang et al., 27 Jan 2026).
- Use Case Spectrum:
- Neurometric/biometric authentication (EEG/EOG) (Frey et al., 2024).
- Eye gesture-based menu navigation, hands-free control (Schärer et al., 2024, Zhang et al., 2 Jul 2025).
- Varifocal lens actuation, device selection, AR input (Zhang et al., 2 Jul 2025).
- Navigation assistance for visually impaired users via LLM-driven haptic feedback (Tokmurziyev et al., 4 Mar 2025).
- Physical object referencing (gaze + voice) for contextually aware computing (Zhang et al., 27 Jan 2026).
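As noted above, the quoted battery lifetimes can be cross-checked with the standard energy budget runtime ≈ capacity × cell voltage / average power. The sketch below assumes a nominal 3.7 V Li-Po cell and lossless power conversion—both simplifying assumptions; the papers report measured figures.

```python
# Ideal-case runtime cross-check: hours = mAh * V / mW.
# Nominal 3.7 V Li-Po voltage and lossless conversion are assumed.
V_CELL = 3.7

def runtime_h(capacity_mah: float, power_mw: float) -> float:
    return capacity_mah * V_CELL / power_mw

for name, mah, mw in [("GAPSES", 75, 16.28),       # >12 h reported
                      ("ElectraSight", 175, 8.85), # >70 h reported
                      ("VergeIO", 570, 11.0)]:     # >=8 days reported
    h = runtime_h(mah, mw)
    print(f"{name:12s} ~{h:5.0f} h (~{h / 24:.1f} days)")
```

The ideal-case results (~17 h, ~73 h, ~192 h) bracket the reported figures, suggesting the quoted power budgets and cell capacities are mutually consistent.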
5. Technical and Design Trade-offs
Displayless systems present unique trade-offs:
| Design Aspect | Advantage | Limitation |
|---|---|---|
| No integrated display | Lower weight, less power, extended comfort | Feedback limited to audio/haptics; visual UIs unavailable |
| Edge ML processing | Enhanced privacy, resilience to connectivity | Increased on-board compute complexity, thermal constraints |
| Dry soft electrodes | Long-wear comfort, no gels/adhesives | Potentially higher impedance, SNR variability |
| Haptic/audio feedback | Suitable for accessibility, privacy | Ambiguities, cognitive load, response latency |
Forgoing visual overlays preserves the social acceptability and ergonomics of conventional eyewear, but information must be encoded in less expressive channels: audio can be verbose or ambiguous, and haptic feedback is bandwidth-limited. These constraints drive much of the system design toward energy and signal optimization, input disambiguation (artifact rejection, preamble gestures; a minimal gating sketch follows), and hybrid multimodal interaction (Frey et al., 2024, Schärer et al., 2024, Zhang et al., 2 Jul 2025, Tokmurziyev et al., 4 Mar 2025, Zhang et al., 27 Jan 2026).
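One concrete disambiguation strategy named above is preamble gating: the command decoder stays dormant until a deliberate, low-false-positive gesture arms it for a short window. The sketch below is a minimal state machine under assumptions—the double-blink preamble and 2 s window are illustrative choices, not parameters from the cited systems.

```python
import time

class PreambleGate:
    """Accept a command gesture only inside a window opened by a
    deliberate preamble gesture; everything else is ignored.
    Preamble name and window length are illustrative assumptions."""

    def __init__(self, preamble: str = "double_blink", window_s: float = 2.0):
        self.preamble = preamble
        self.window_s = window_s
        self.armed_until = 0.0

    def on_gesture(self, gesture: str, now: float | None = None) -> str | None:
        now = time.monotonic() if now is None else now
        if gesture == self.preamble:
            self.armed_until = now + self.window_s  # open command window
            return None
        if now < self.armed_until:
            self.armed_until = 0.0                  # one command per arm
            return gesture                          # accepted command
        return None                                 # rejected: not armed
```

The cost of this robustness is interaction overhead—every command pays the preamble's latency—which is exactly the expressiveness/reliability trade-off tabulated above.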
6. Emerging Applications and Future Directions
Rapid advances in on-device ML, dry biopotential electrodes, and ultra-low-power microelectronics are catalyzing new applications:
- Health Monitoring and Biometric Systems: Continuous EOG/EEG monitoring for authentication, stress, cognition, and sleep metrics (Frey et al., 2024, Schärer et al., 2024).
- Depth-Aware and Multimodal Interactions: Gesture sets leveraging vergence, blink patterns, and hybrid gaze+voice channels enable interaction scenarios beyond those possible with display-centric AR (Zhang et al., 2 Jul 2025, Zhang et al., 27 Jan 2026).
- Accessible Navigation: Tactile encoding of directional cues for visually impaired users demonstrates high utility and acceptability in user studies (Tokmurziyev et al., 4 Mar 2025).
- Robust Always-On Operation: Sub-10 mW platforms, dry electrodes, and artifact rejection enable devices to be worn continually, supporting passive and active sensing paradigms.
This suggests that future work will increasingly emphasize scalable on-chip inference (e.g., SoC-integrated ML pipelines), multimodal sensor fusion (EOG with IMU/microphone), adaptive feedback strategies, and standardized privacy-preserving frameworks. Displayless architectures are poised to underpin less obtrusive, more socially acceptable, and privacy-centric wearable computing.
7. Research Challenges and Open Problems
Ongoing work targets technical challenges at the intersection of sensing, inference, and interaction:
- Artifact and Motion Rejection: Even sophisticated preamble gating and artifact classification pipelines do not fully eliminate false detections during complex real-world activities; integration with IMUs and more advanced noise models remains necessary (Zhang et al., 2 Jul 2025). A simple IMU-veto pattern is sketched after this list.
- Feedback Bandwidth and Usability: Both haptic and audio feedback channels limit the expressiveness and speed of user interaction; user evaluation indicates that latency and information overload are persistent issues (Tokmurziyev et al., 4 Mar 2025, Zhang et al., 27 Jan 2026).
- Detection/Segmentation for Multimodal Inputs: Gaze-based object selection remains error-prone, sensitive to gaze calibration and environmental clutter; voice disambiguation mitigates this but adds latency (Zhang et al., 27 Jan 2026).
- Scalability and Miniaturization: Embedding all analog, digital, and actuator subsystems into miniaturized, socially acceptable eyewear is constrained by battery technology, thermal dissipation, and regulatory standards for biopotential measurement (Frey et al., 2024, Schärer et al., 2024).
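For the motion-rejection gap noted in the first item of this list, one common mitigation is an IMU veto: suppress biopotential detections whenever short-term accelerometer variance indicates gross head or body motion. The sketch below is a minimal illustration—window length and variance threshold are assumed tuning parameters, not values from the cited work.

```python
from collections import deque
import numpy as np

class ImuVeto:
    """Veto EOG gesture detections during gross motion, estimated from
    short-term accelerometer variance. The 50-sample window and 0.05 g^2
    threshold are illustrative assumptions to be tuned per device."""

    def __init__(self, window: int = 50, var_thresh: float = 0.05):
        self.buf = deque(maxlen=window)  # recent |accel| magnitudes, g
        self.var_thresh = var_thresh

    def update(self, accel_xyz) -> None:
        self.buf.append(float(np.linalg.norm(accel_xyz)))

    def moving(self) -> bool:
        full = len(self.buf) == self.buf.maxlen
        return full and float(np.var(list(self.buf))) > self.var_thresh

    def gate(self, detection):
        """Pass a classifier detection through, or drop it under motion."""
        return None if self.moving() else detection
```

A veto of this kind trades recall for precision: gestures performed while walking are lost, which is why the cited work points toward richer noise models rather than gating alone.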
A plausible implication is the increasing convergence of displayless and display-enabled glasses platforms, as advances in embedded AI, battery chemistry, and multimodal interface design enable runtime switching between user-preferred feedback modes and expanded sensing capabilities. Ongoing research will define the operational and user-experience boundaries of truly ubiquitous, invisible computing via displayless smart glasses.