LINCC Frameworks LSDB & TAPE – Scalable Analysis Across Large Datasets

Today (June 13, 2024) we are joined by Neven Caplar and Doug Branton of LINCC Frameworks, both at the University of Washington.

Abstract: This presentation introduces LSDB, a software package that facilitates large-scale analytics across multiple datasets. LSDB’s efficacy lies in its ability to shard expansive datasets and index sources within the HEALPix space, leveraging catalog density per index. We will briefly describe the project history and the design decisions we have made (such as using Dask). We will discuss current and future plans, including supporting science cases, International Virtual Observatory Alliance standardization, and collaboration with NASA. We will then discuss our effort to enable time-domain science, present a use case of searching for rare events and anomalies in existing datasets, and showcase how it can be extended to Rubin-size datasets. We will discuss the challenges of working with LSST-like object/source tables and then describe our current plans to refactor the code to implement a “nested” structure that keeps objects and sources tightly coupled. We will demonstrate the current capabilities with a Jupyter notebook demo.

LINCC Tech Talks are held on the second Thursday of every month. Events are also advertised on our web page (https://lsstdiscoveryalliance.org/pro...) and the #lincc-tech-talks LSSTC Slack channel, which is always available for discussions before, during, and after the talks.
