Introducing Cyanea

2026-03-12T00:00:00+00:00

Cyanea is live at app.cyanea.bio. You can sign up, create a space, upload datasets, write protocols, run notebooks, and share your work. Everything described in this post is deployed and working today.

This post walks through what we built, how it works, and the decisions behind it. It is long. If you just want to try it, go to app.cyanea.bio and create an account. Come back here when you want to understand what is going on under the hood.

The situation

If you work in biology, you already know the tooling situation. Your sequencing data lives on a shared drive or an S3 bucket with cryptic folder names. Your protocols are in Google Docs, Word files, or someone’s lab notebook. Your analysis code is scattered across Jupyter notebooks that nobody can reproduce because the environment is gone. Your results live in a slide deck.

There is no single place where a research group can put their datasets, protocols, notebooks, and results, version them, share them with collaborators, and make them discoverable by the wider community. GitHub solved this for code. Hugging Face solved it for ML models. Biology has nothing equivalent.

The tools that do exist fall into two camps. Enterprise LIMS/ELN systems like Benchling are built for internal compliance workflows, not public sharing or community discovery. General-purpose tools like Jupyter and Google Drive were never designed for biological data and have no understanding of FASTA files, protein structures, or experimental protocols.

We built Cyanea to fill this gap.

What Cyanea is
Cyanea is a platform where researchers organize their work into spaces. A space is like a GitHub repository for research. It contains datasets, protocols, notebooks, and discussions, all versioned and all shareable.
A space belongs to a user or an organization. It can be public (anyone can see it), internal (only your org members), consortium-level (shared with partner organizations), or private (only explicit collaborators). You control what is visible and to whom.
Inside a space you will find:
Datasets with files, metadata, and tags. Upload a FASTA file and Cyanea extracts metadata automatically, recognizes the format, computes a content hash, and stores it in content-addressed storage. We support metadata extraction for 21 file formats including FASTA, FASTQ, VCF, BED, GFF3, CSV, Parquet, GenBank, EMBL, Newick, NEXUS, SDF, PDB, mmCIF, Stockholm, Clustal, Phylip, bigWig, and bedGraph.
Protocols with structured steps, reagent lists, equipment requirements, quality control checkpoints, and troubleshooting notes. Protocols are versioned, so you can track exactly what changed between runs. We ship 18 protocol templates covering common wet-lab workflows (TRIzol RNA extraction, CRISPR-Cas9 knockout, 10x Chromium scRNA-seq, PCR-free WGS, nanopore adaptive sampling, ChIP-seq, ATAC-seq, bisulfite sequencing, Hi-C, long-read amplicon) and dry-lab workflows (RNA-seq DESeq2, GATK variant calling, 16S amplicon metagenomics, scRNA-seq clustering, genome assembly QC, phylogenetic analysis, structural variant calling, epigenome peak calling).
Notebooks where you write and run code. More on these below, because they are unusual.
Discussions for conversation around the work. Ask questions, report problems, request features, or review results. We provide templates for common discussion types so you do not start from a blank page.
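
The dataset item above mentions computing a content hash and placing the file in content-addressed storage. As a rough illustration of that idea, here is a minimal Python sketch. It is not Cyanea's implementation: the ContentStore class, its method names, and the in-memory dictionary are all hypothetical, chosen only to show how a SHA-256 digest can serve as a blob's identity.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}  # hex SHA-256 digest -> raw bytes

    def put(self, data: bytes) -> str:
        """Store data under its SHA-256 digest and return that digest."""
        digest = hashlib.sha256(data).hexdigest()
        # Identical content always maps to the same key, so it is kept once.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Return a blob after checking that it still matches its address."""
        data = self._blobs[digest]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored blob does not match its content hash")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
fasta = b">seq1\nATGGCTAGCGTACGATCG\n"
first = store.put(fasta)
second = store.put(fasta)  # the same bytes uploaded a second time
```

Both uploads return the same digest and the store holds a single copy, which is the deduplication property. Re-hashing on read is one simple way to get the integrity check.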

Notebooks with two execution engines
Cyanea notebooks are not Jupyter. They have two execution environments that run simultaneously.
The first is a browser-native WASM runtime. When you write code in a Cyanea cell and press Shift+Enter, it executes immediately in your browser through WebAssembly. No kernel startup, no server round-trip, no waiting. The WASM module loads once and stays warm. You get instant feedback.
```
let dna = "ATGGCTAGCGTACGATCG"
let protein = Seq.translate(dna)
print(protein)
```

That runs entirely in your browser. The output renders with type-aware formatting: DNA sequences get color-coded bases, alignments get match/mismatch highlighting, tables get sortable grids, protein sequences get amino acid coloring.

The second engine is server-side Elixir execution. For heavier work, like processing large datasets, running database queries, or operations that need server resources, you write Elixir cells that execute in a sandboxed environment on the server. The sandbox validates your code against an AST blocklist (no System, File, IO, Port, Process, or Code module access) and enforces a 30-second timeout. Results are persisted to the database and available to all collaborators.

The notebook automatically routes each cell to the right engine based on the language tag. Cyanea cells go to WASM, Elixir cells go to the server.

Multiple researchers can work on the same notebook simultaneously. You see other users’ cursors and cell updates in real time, powered by Phoenix Presence and PubSub.

Every significant change creates a version snapshot. You can browse version history, diff any two versions at the cell level (additions, modifications, removals, moves), and restore previous states.

The Rust bioinformatics ecosystem

The compute layer underneath all of this is not Python. It is Rust.

We have a workspace called Cyanea Labs containing 15 Rust crates with over 3,700 tests. These crates cover the core domains of bioinformatics:
cyanea-seq handles DNA, RNA, and protein sequences. FASTA and FASTQ parsing at 2 GB/s, roughly 10x faster than BioPython. K-mer extraction, quality scores, sequence manipulation.
cyanea-align does pairwise and multiple sequence alignment with affine gap penalties. Needleman-Wunsch, Smith-Waterman, semi-global alignment, banded alignment for long sequences, and GPU dispatch for batch operations.
cyanea-omics provides expression matrices (dense and sparse), genomic coordinates with interval trees, variant representation, and single-cell data structures. The single-cell module handles MTX/10X formats, RNA velocity computation, and batch correction (ComBat, LISI, ARI), all feature-gated behind a single-cell flag.
cyanea-stats covers descriptive statistics, correlation, t-tests, distributions, multiple testing correction, and PCA. These are the statistical methods you actually use in life sciences, not a general-purpose stats library.
cyanea-ml has clustering (k-means, DBSCAN, hierarchical), distance metrics, embeddings, dimensionality reduction (PCA, t-SNE), and k-nearest neighbors.
cyanea-chem parses SMILES and SDF, computes Morgan fingerprints, calculates molecular properties, and does substructure search.
cyanea-struct parses PDB files, assigns secondary structure (simplified DSSP), does Kabsch superposition, and builds contact maps.
cyanea-phylo handles Newick and NEXUS I/O, distance models (JC69, K2P, and others), and tree building with UPGMA and neighbor-joining. Fitch and Sankoff parsimony reconstruction.
cyanea-meta is the metagenomics crate. Taxonomy and LCA assignment, community profiling, alpha and beta diversity metrics, compositional data transforms (CLR, ILR, ALDEx2, ANCOM), functional annotation, binning, and assembly QC.
cyanea-epi covers epigenomics. MACS2-style peak calling, pileup computation, motif discovery with PWM and MEME-format support, ChromHMM state modeling, differential binding analysis, nucleosome positioning, and ATAC-seq QC.
cyanea-io is the unified file format parser. 21 formats behind feature flags, so you only compile what you need.
cyanea-gpu abstracts over CUDA and Metal, providing a single API for GPU-accelerated computation. Same code runs on NVIDIA and Apple Silicon.
cyanea-wasm compiles all the pure-Rust crates to WebAssembly with JavaScript bindings. This is what powers the browser-side execution in notebooks.
cyanea-core has the shared primitives: traits, error types, SHA-256 hashing, zstd compression, memory-mapped file utilities.

Every crate is open source. You can use them in your own projects without touching the Cyanea platform. cargo add cyanea-seq and you are set.

Why Rust

Four reasons.
Performance. Rust compiles to native code with zero-cost abstractions. When you are processing terabytes of genomic data, the difference between 200 MB/s and 2 GB/s matters.
Safety. Memory safety without a garbage collector. No segfaults in production when you are halfway through a large alignment job.
Portability. The same codebase compiles to native binaries, WASM for browsers, Python bindings via PyO3, and Elixir NIFs via Rustler. Write the algorithm once, deploy it everywhere.
GPU readiness. The Rust ecosystem has solid CUDA and Metal support. As GPU-accelerated bioinformatics becomes standard, our compute layer is ready.

How Rust connects to the platform

The platform itself is Elixir/Phoenix. Rust powers the compute through two paths:
In the browser (WASM). When you preview a FASTA file or run an alignment in a notebook, that is cyanea-seq and cyanea-align compiled to WebAssembly. No server needed.
On the server (NIFs). Heavy operations like file checksumming, metadata extraction, and batch alignment run as Rust NIFs inside the Elixir runtime through Rustler. You get Rust speed with Erlang’s concurrency model and fault tolerance.

REST API and CLI

Not everything belongs in a web browser. Pipelines need to push data programmatically. Automation scripts need to create spaces and upload datasets. CI systems need to validate outputs.

The Cyanea API covers everything you can do in the web interface. Spaces, notebooks, protocols, datasets, search, webhooks, all through a standard REST API.

Authentication works two ways. API keys with the cyn_ prefix, scoped to read, write, or admin permissions, good for service accounts and automation.
JWT tokens issued via email and password, valid for one hour, good for interactive scripts. Rate limiting is 1,000 requests per 15 minutes for API keys and 5,000 for JWT sessions.

We also built a CLI tool called cyn:

```
cyn login
cyn spaces list
cyn spaces create --name "RNA-Seq Analysis" --visibility public
cyn datasets upload my-space/counts.csv ./data/counts.csv
cyn notebooks import my-space ./analysis.ipynb
```

It is a standalone Elixir escript, no dependencies beyond the binary. Configuration lives in ~/.config/cyn/config.json or environment variables.

Webhooks let you subscribe to events across the platform. Space created, dataset updated, protocol modified, notebook changed. Each delivery is HMAC-SHA256 signed and retried up to 5 times with exponential backoff.

Federation

This is the part that makes Cyanea structurally different from other platforms.

Cyanea is federated. Your institution can run its own Cyanea node, a self-contained instance with full functionality. Data stays on your infrastructure, under your control. When you are ready to share, you selectively publish datasets, protocols, or notebooks to the Cyanea network. Other nodes and the public hub can discover and reference your published work. Attribution and provenance travel with the data.

Think of it like email. Your organization runs its own mail server, but you can send messages to anyone.

This matters because centralized platforms have a fundamental tension with science. Researchers need to share data openly, but institutions need to control where sensitive data lives. Regulatory compliance, data sovereignty laws, and institutional policies all create constraints that a single SaaS platform cannot accommodate.

Every file in Cyanea is content-addressed using SHA-256 hashes.
This gives you deduplication (two researchers uploading the same reference genome results in one stored copy) and integrity (a content hash is a permanent, verifiable proof of what the data contained). Federated nodes can verify that synced data has not been tampered with without trusting the source.

Federation is live today. Nodes can register, sync manifests, and publish to the hub. We are actively building incremental sync, signed manifests using organizational keys, cross-node citations, and pull mirroring.

The tech stack

The platform is Elixir/Phoenix end to end. LiveView for real-time UI updates without writing JavaScript. Phoenix Channels for collaboration features. Oban for reliable background job processing (file uploads, webhook deliveries, metadata extraction, notebook execution).

The database layer supports both SQLite and PostgreSQL through a compile-time adapter. The open-source version defaults to SQLite for simple single-file deployment. The hosted version at app.cyanea.bio can use PostgreSQL. Both adapters share the same schema and migrations.

File storage is content-addressed. Every blob gets a SHA-256 hash, and that hash is its identity. Blobs are deduplicated automatically. Storage quotas are enforced per user and per organization.

Authentication supports email/password with Guardian JWT tokens and API key authentication. Organizations have role-based access with owner, admin, member, and viewer levels. Site admins have a global bypass for all authorization checks.

The deployment runs on Hetzner, built with Vela, with SQLite on a mounted volume. It is simple on purpose. One server, one file, no external database process to manage.

Plans and pricing

Cyanea is free for public work. Free accounts get 1 GB of storage, 50 MB max file size, 20 versions per notebook, and WASM notebook execution. Pro accounts ($39/month) get 50 GB storage, 200 MB max file size, unlimited versions, server-side Elixir execution, and private spaces.
Organizations start at $499/workspace/month with 200 GB storage, unlimited members, and all Pro features. The open-source version is free to self-host with no restrictions.

What we are building next

The platform is live but we are not done. Here is what is coming:
Community-contributed protocol templates. If you have a well-tested protocol, we want to include it in the template library.
More crate development. The Rust ecosystem keeps growing. Python bindings via PyO3, broader GPU acceleration for batch operations, and more file format support.
Incremental federation sync. Right now nodes can publish and sync, but the sync is full rather than incremental. We are building efficient delta updates.
Signed manifests. Organizational keys for attestation, so you can verify who published what and when.
Better single-cell workflows. The single-cell feature in cyanea-omics already handles MTX loading, RNA velocity, and batch correction. We are expanding this with more integration methods and visualization.

Try it

Go to app.cyanea.bio and create an account. Create a space, upload some data, start a notebook, write a protocol. Everything works today.

If you want to use the Rust libraries directly, they are on GitHub. Every crate is documented and tested.

If you want to self-host, the open-source version is at github.com/cyanea-bio/cyanea.

If you have questions, feedback, or protocols you would like to contribute, reach out. We read everything.
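
One practical footnote for anyone consuming the webhooks described earlier: since each delivery is HMAC-SHA256 signed, a receiver can check the signature before trusting the payload. The Python sketch below shows the general technique only. How Cyanea transmits the signature (header name, hex versus base64 encoding) is not covered in this post, so the hex encoding, the function names, the example secret, and the event payload are all assumptions made for illustration.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body (encoding assumed)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_hex: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), received_hex.encode())


# Hypothetical secret and payload, not real Cyanea values.
secret = b"example-webhook-secret"
body = b'{"event": "dataset.updated"}'

signature = sign_payload(secret, body)
ok = verify_signature(secret, body, signature)
tampered = verify_signature(secret, body + b" ", signature)
```

Verification should run against the exact bytes received, before any JSON parsing, because re-serializing the payload can change whitespace and invalidate the signature.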