Andrew Tao

Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models
Wolf: Dense Video Captioning with a World Summarization Framework
FeatSharp: Your Vision Model Features, Sharper
RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
VILA: On pretraining for vision language models
FasterViT: Fast Vision Transformers with Hierarchical Attention
Partial Convolution for Padding, Inpainting, and Image Synthesis
Dual Contrastive Loss and Attention for GANs
View Generalization for Single Image Textured 3D Models
Neural FFTs for Universal Texture Image Synthesis
Transposer: Universal Texture Synthesis Using Feature Maps as Transposed Convolution Filter