The fragmentation of existing autonomous vehicle (AV) datasets hinders the development of generalizable driving policies that can handle complex and infrequent events. To overcome this, we introduce the Open-X AV (OXAV) repository, an initiative designed to aggregate a wide variety of AV datasets and enable models to learn from these diverse sources. We propose a two-stage training workflow built on OXAV: a pre-training phase on perception-focused data, followed by post-training on challenging planning-centric scenarios. Our method, DiffusionLTF, a simple end-to-end policy trained on OXAV, ranked second in the 2025 Waymo vision-based end-to-end driving challenge, demonstrating the benefits of diverse, aggregated data.