Canopy is an open-source platform for FAIR-aligned scientific data hubs, supporting data sharing, harmonization, discovery, and reuse across research studies. Canopy is derived from the NIH RADx Data Hub (https://radxdatahub.nih.gov/), a cloud-based platform originally developed for the NIH Rapid Acceleration of Diagnostics (RADx) program. RADx Data Hub is available on GitHub. Rather than presenting a one-size-fits-all data hub, Canopy enables customization of RADx Data Hub technology for the needs of specific scientific domains.
Live demo: A demonstration instance of Canopy is publicly available at canopy.stanford.edu. All studies, datasets, and files on that site are synthetic and intended for demonstration purposes only.
Deploying Canopy to AWS?
Start here → Deployment Guide
Exploring the codebase?
Start here → Repositories — links to every service, tool, and guide
Want to contribute?
Start here → Contributing Guide
Canopy runs on AWS as a microservices platform:
- 7 Spring Boot microservices on ECS Fargate, behind an Application Load Balancer
- Next.js / React frontend with server-side rendering
- PostgreSQL (RDS) for relational data persistence
- OpenSearch for full-text and faceted search
- AWS Lambda for asynchronous email processing and search reindexing
- S3 for dataset file storage
- Keycloak for authentication and authorization
- CloudFormation (IaC) for repeatable, auditable AWS deployments
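
To make the request flow through the load balancer concrete, the sketch below models how path-based routing could dispatch incoming requests to the services listed in this README. The service names come from the repository tables. The `/api/...` path prefixes, the `ROUTES` table, and the `resolve_target` function are hypothetical illustrations, not Canopy's actual listener rules, which are defined in the CloudFormation templates and may differ. Longest-prefix matching stands in here for the load balancer's rule priorities.

```python
# Hypothetical path prefixes, for illustration only. The real ALB listener
# rules are defined in the CloudFormation templates and may differ.
ROUTES = {
    "/api/entity": "datahub-service-entity",
    "/api/search": "datahub-service-search",
    "/api/user": "datahub-service-user",
    "/api/submission": "datahub-service-submission",
    "/api/report": "datahub-service-report",
    "/api/download": "datahub-service-download",
}

# Assumption: anything that is not an API call is served by the frontend.
DEFAULT_TARGET = "datahub-ui-main"


def resolve_target(path: str) -> str:
    """Return the target for the longest matching prefix, else the frontend."""
    # Match whole path segments only, so "/api/searching" does not match
    # the "/api/search" prefix.
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    if not matches:
        return DEFAULT_TARGET
    return ROUTES[max(matches, key=len)]


if __name__ == "__main__":
    for example in ("/api/search/studies", "/api/download/files/1", "/about"):
        print(example, "->", resolve_target(example))
```

Keeping the routing table as plain data makes it easy to compare against the deployed listener rules when a service is added or renamed.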

Backend services and libraries

| Repository | Description |
|---|---|
| datahub-service-entity | Direct retrieval of database entities |
| datahub-service-search | Search across studies and variables |
| datahub-service-user | User info, profiles, and support requests |
| datahub-service-submission | Data and study ingestion workflows |
| datahub-service-report | Metrics dashboard and reporting |
| datahub-service-download | Controlled dataset file downloads |
| datahub-service-email | Lambda-based email notifications via AWS SES |
| datahub-lib-keycloak-auth | Shared Keycloak authentication library |
| datahub-project | Maven parent POM for all Java services |

Frontend

| Repository | Description |
|---|---|
| datahub-ui-main | Next.js / React web application |

Infrastructure and documentation

| Repository | Description |
|---|---|
| datahub-cloud-replication | AWS CloudFormation templates |
| datahub-development | PostgreSQL schema scripts, seed data, OpenSearch Lambda, Keycloak Docker Compose |
| datahub-docs | Deployment guide, limitations, and operator documentation |
| datahub-deployment-scripts | Automation scripts supporting deployment and operations |

Developer tools

| Repository | Description |
|---|---|
| datahub-cli | CLI for local development and server management |
| datahub-utility-scripts | Automation helpers and publication utilities |