Databricks Runtime Health and Life Sciences

The Databricks Runtime Health and Life Sciences (HLS) is a version of the Databricks Runtime optimized for working with genomic and biomedical data. It is a component of the Unified Analytics Platform for Genomics.


The Databricks Runtime HLS is currently in Beta. Interfaces and pricing are subject to change before general availability. Sign up for access.

What’s inside?

  • A fast, scalable DNASeq pipeline
  • Spark SQL optimizations for common query patterns
  • Hail 0.2 integration
  • Popular open source libraries, optimized for performance and reliability
    • ADAM
    • GATK
    • Hadoop-BAM
  • Reference data (GRCh37 or GRCh38, known SNP sites)