StarDB: a large-scale DBMS for strings

Majed Sahli, Essam Mansour, Panos Kalnis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations


Strings and applications using them are proliferating in science and business. Currently, strings are stored in file systems and processed using ad-hoc procedural code. Existing techniques are not flexible and cannot efficiently handle complex queries or large datasets. In this paper, we demonstrate StarDB, a distributed database system for analytics on strings. StarDB hides data and system complexities and allows users to focus on analytics. It uses a comprehensive set of parallel string operations and provides a declarative query language to solve complex queries. StarDB automatically tunes itself and runs with over 90% efficiency on supercomputers, public clouds, clusters, and workstations. We test StarDB using real datasets that are 2 orders of magnitude larger than the datasets reported by previous works.
Original language: English (US)
Title of host publication: Proceedings of the VLDB Endowment
Publisher: VLDB Endowment
Number of pages: 4
State: Published - Aug 1 2015

Bibliographical note

KAUST Repository Item: Exported on 2020-10-01

