ALPhA-Vision: A Real-Time Always-on Vision Processor with 787µs Face Detection Latency in <5mW

ALPhA-Vision is an always-on low-power subsystem for DNN-inference-based vision tasks in edge SoCs. Flexible and programmable, the subsystem supports CNN and ViT inference and employs hardware/software co-design to enable fully end-to-end execution with no external memory accesses. Fine-grained power management mitigates leakage, allowing the subsystem to perform face detection with 787 µs latency and 99.3% detection accuracy while drawing 4.6 mW average power at 60 fps.

GalaxyDiT: Efficient Video Generation with Guidance Alignment and Adaptive Proxy in Diffusion Transformers

Diffusion models have revolutionized video generation, becoming essential tools in creative content generation and physical simulation. Transformer-based architectures (DiTs) and classifier-free guidance (CFG) are two cornerstones of this success, enabling strong prompt adherence and realistic video quality. Despite their versatility and superior performance, these models require intensive computation. Each video generation requires dozens of iterative steps, and CFG doubles the required compute. This inefficiency hinders broader adoption in downstream applications.
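
To make the cost claim concrete, here is a minimal sketch of a standard classifier-free guidance sampling loop. It is not GalaxyDiT's method. The `CountingDenoiser` class, the `sample` function, the step size, and the guidance scale are all hypothetical stand-ins chosen for illustration. The only point is that each step with CFG calls the model twice, once with the prompt and once without.

```python
from typing import List, Optional

Vector = List[float]


def cfg_combine(cond: Vector, uncond: Vector, scale: float) -> Vector:
    """Standard CFG blend: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


class CountingDenoiser:
    """Toy stand-in for a DiT forward pass; it only counts its own calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, x: Vector, prompt: Optional[str]) -> Vector:
        self.calls += 1
        # Hypothetical behavior: the prompt shifts the prediction by a constant.
        bias = 0.0 if prompt is None else 1.0
        return [0.5 * v + bias for v in x]


def sample(model: CountingDenoiser, x: Vector, prompt: str,
           steps: int, scale: float, use_cfg: bool = True) -> Vector:
    """Run a toy iterative sampler; the update rule is illustrative only."""
    for _ in range(steps):
        cond = model(x, prompt)          # conditional pass
        if use_cfg:
            uncond = model(x, None)      # extra unconditional pass
            eps = cfg_combine(cond, uncond, scale)
        else:
            eps = cond
        x = [v - 0.1 * e for v, e in zip(x, eps)]
    return x


with_cfg = CountingDenoiser()
sample(with_cfg, [1.0, 2.0], "a cat", steps=50, scale=7.5)

without_cfg = CountingDenoiser()
sample(without_cfg, [1.0, 2.0], "a cat", steps=50, scale=7.5, use_cfg=False)

print(with_cfg.calls, without_cfg.calls)
```

With 50 steps, the CFG run makes twice as many model calls as the run without guidance, which is the doubling described above.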

Gian Marti

Gian Marti is a Research Scientist at NVIDIA. He earned his B.Sc. and M.Sc. degrees in Electrical Engineering from ETH Zurich in 2017 and 2019, and completed his Ph.D. there in 2025. From 2019 to early 2026, he worked at ETH Zurich’s Signal and Information Processing Laboratory and later at the Integrated Information Processing Group. He has also interned at ABB, Kistler, and NVIDIA.

Jef Packer

Jef Packer is a Research Engineer who has helped bring many projects from research to production. After receiving his B.S. and M.Eng. in Mechanical and Aerospace Engineering from UC Davis, he spent over a decade building autonomous driving systems. He started at Tesla, where he worked on Autopilot v1, then spent six years at Zoox, progressing from firmware and classical planning to leading collision avoidance and ML-based planning research.