Co-Design Hardware and Algorithm for Vector Search
Description
Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by evaluating vector similarities between encoded query texts and web documents. As performance demands for vector search systems surge, accelerated hardware offers a promising solution in the post-Moore's Law era. We introduce FANNS, an end-to-end and scalable vector search framework on FPGAs. Given a user-provided recall requirement on a dataset and a hardware resource budget, FANNS automatically co-designs hardware and algorithm, subsequently generating the corresponding accelerator. The framework also supports scale-out by incorporating a hardware TCP/IP stack in the accelerator. FANNS attains up to 23.0x and 37.2x speedup compared to FPGA and CPU baselines, respectively, and demonstrates superior scalability to GPUs, achieving 5.5x and 7.6x speedup in median and 95th percentile latency within an eight-accelerator configuration.
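To make the "recall requirement" in the description concrete, the following is a minimal sketch of how recall is commonly measured for approximate vector search: the approximate result is compared against an exact brute-force top-k. This is not FANNS code. The function names, the toy data, and the choice of L2 distance are all assumptions made for illustration.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_top_k(query, database, k):
    """Indices of the k database vectors closest to the query (brute force)."""
    return heapq.nsmallest(
        k, range(len(database)), key=lambda i: l2_distance(query, database[i])
    )


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true nearest neighbours present in the approximate result."""
    exact = set(exact_ids)
    return len(exact.intersection(approx_ids)) / len(exact)


# Toy data (hypothetical): five 2-D vectors and one query.
database = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [0.5, 0.5]]
query = [0.1, 0.1]

truth = exact_top_k(query, database, k=3)
approx = [0, 4, 3]  # a made-up approximate result that misses one true neighbour
recall = recall_at_k(approx, truth)
print(truth, recall)
```

In this toy case the approximate result contains two of the three true neighbours, so its recall at k=3 is 2/3. A recall requirement sets the minimum value of this ratio that a search configuration has to reach.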
Event Type
Paper
Time
Thursday, 16 November 2023, 2:30pm - 3pm MST
Location
401-402
Tags
Accelerators
Artificial Intelligence/Machine Learning
Codesign
Fault Handling and Tolerance
Performance Measurement, Modeling, and Tools
Post-Moore Computing
Registration Categories
TP