STEAM: Observability-Preserving Trace Sampling
Large Scale Intelligent Microservices – IEEE Big Data 2020 Paper Presentation
Deploying Machine Learning (ML) algorithms within databases is a challenge due to the varied computational footprints of modern ML algorithms and the myriad of database technologies each with their own restrictive syntax. We introduce an…
Demonstration of CORNET: Learning Spreadsheet Formatting Rules by Example
Abstract: Data management and analysis tasks are often carried out using spreadsheet software. A popular feature in most spreadsheet platforms is the ability to define data-dependent formatting rules. These rules can express actions such as…
Query Acceleration for Data Lakes
Accelerating query processing on open data formats As businesses become more data-driven, there is an increasing interest in adopting data lakes (e.g., Microsoft Fabric) in large enterprises. A data lake is a large storage repository…
Self-service Data Preparation
It is often cited that data scientists spend a significant portion of their time (up to 80%), cleaning and preparing data. For less-technical users, who may be less proficient in writing code (e.g., in Excel,…
FRA: Flexible Resource Allocation in Multi-Tenant Relational Database-as-a-Service
Oversubscription is an essential cost management strategy in multi-tenant, cloud Database-as-a-Service (DBaaS), and its importance is magnified by the emergence of serverless databases. In the FRA project, we have developed novel resource management techniques that…