"Lessons from building GitHub code search" by Luke Francl (Strange Loop 2023)

In this talk, I'll share lessons we learned building a high-performance code search engine designed to operate at GitHub's scale. GitHub code search is the world's largest publicly available code search engine, with more than 60 million repositories and over 160 TB of content indexed. To build it, we had to turn the unique content-addressable nature of Git repositories to our advantage. I'll cover the key strategies we used, including deduplication and repository similarity to reduce indexing workload, full index compaction to remove deleted documents, multiple levels of sharding, and load balancing. Come discover how we turned code search from a frustrating experience into a powerful feature for our users.
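
To make the deduplication idea concrete, here is a minimal sketch in Python. It is not GitHub's implementation. It relies on one fact about Git: a blob's ID is the SHA-1 of a short header plus the file content, so identical files in different repositories share the same ID. The `DedupIndexer` class, the `shard_for` routing rule, and the shard count are hypothetical names and choices made only for illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def shard_for(blob_id: str, num_shards: int) -> int:
    """Hypothetical routing rule: derive a shard from the leading hex digits."""
    return int(blob_id[:8], 16) % num_shards


class DedupIndexer:
    """Toy indexer that stores each distinct blob once, however many repos hold it."""

    def __init__(self, num_shards: int = 4) -> None:
        self.num_shards = num_shards
        # One dict per shard, mapping blob ID to content (assumed layout).
        self.shards = [dict() for _ in range(num_shards)]
        # Blob ID to the set of (repo, path) pairs where that content appears.
        self.locations = {}

    def add(self, repo: str, path: str, content: bytes) -> bool:
        """Record a file; return True only if its content had to be indexed."""
        blob_id = git_blob_id(content)
        is_new = blob_id not in self.locations
        self.locations.setdefault(blob_id, set()).add((repo, path))
        if is_new:
            self.shards[shard_for(blob_id, self.num_shards)][blob_id] = content
        return is_new
```

In this sketch, adding the same license file from a repository and from its fork indexes the content once and only records a second location, which is the kind of saving that deduplication aims for.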

Luke Francl
@lof

Luke Francl works on code search at GitHub. He's excited to build software that makes developers happier and more productive. Prior to joining GitHub, he worked at a search-as-a-service startup, as a freelance developer, and spent a lot of time doing XML sit-ups in the Java world. He lives in San Francisco with his family.

----
Recorded Sept 21, 2023 at Strange Loop 2023 in St. Louis, MO.
https://thestrangeloop.com
