ElasticGraph 1.0 is Here

We're thrilled to announce the 1.0.0 release of ElasticGraph, a project we've been developing and battle-testing within Block, now ready for the broader community! ElasticGraph is a schema-driven, scalable, cloud-native, batteries-included GraphQL framework, backed by Elasticsearch or OpenSearch.

In a nutshell, ElasticGraph provides a robust GraphQL API that unlocks the immense power of Elasticsearch/OpenSearch (ES/OS), without the headaches and risks of exposing your search clusters directly to client applications.

A Common Conundrum: Bridging Client Needs and Datastore Power

Elasticsearch and OpenSearch are tremendously powerful datastores, but teams that use them commonly expose more limited APIs tailored to specific product needs. As their needs evolve, they must evolve their APIs and maintain complex mapping logic to translate API requests into appropriate ES/OS queries. In this environment, it can be tempting to directly expose your ES/OS datastore to clients, allowing them to query anything. However, that approach comes with serious downsides: it hinders schema evolution, allows clients to run inefficient queries, deeply couples clients and infrastructure, and raises significant security concerns.

A backend API that mediates access to ES/OS remains essential. ElasticGraph provides a backend query API which exposes the core search and aggregation functionality of ES/OS while avoiding the downsides of direct access, and its schema-driven nature avoids the need to maintain mapping logic in an API layer.

A Powerful GraphQL Abstraction for Search and Reporting

ElasticGraph was born from a common need at Block to provide a powerful, near real-time secondary index on diverse datasets. We first used it to replace a bespoke search system built on MySQL, and its success has led to its adoption for many other use cases:

Seller dashboard reporting for our largest, most critical datasets
Public GraphQL endpoints (large portions of the Square Public Graph are served directly by ElasticGraph)
Numerous other internal search and reporting use cases from many teams inside Block

What Sets ElasticGraph Apart

GraphQL is particularly well suited for flexible backend APIs: it provides a type system with excellent tooling, supports operations of arbitrary complexity using field arguments, and is designed to support schema evolution. ElasticGraph's query API offers a much richer query interface than most GraphQL APIs provide.

Query Almost Anything: Full Datastore Power via GraphQL

ElasticGraph exposes a flexible GraphQL API which allows you to:

Filter on any field in your schema, with the ability to arbitrarily combine and negate filters for complex search scenarios.
Sort and paginate by any field, giving you precise control over result ordering.
Group and aggregate over any field, unlocking powerful analytical and reporting capabilities directly through the API.
Highlight search results to see why particular results matched a query.

Imagine building complex search UIs or detailed reporting dashboards without having to write custom endpoint after custom endpoint. With ElasticGraph, if the data is in your schema, it's available for these rich interactions.
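As a rough illustration of these capabilities, the Ruby sketch below holds two GraphQL documents of the kind such an API might accept: one that filters, sorts, and paginates, and one that aggregates. It assumes an indexed `Artist` type with `name` and `lifetimeSales` fields. The root fields (`artists`, `artistAggregations`), the predicates (`gte`, `equalToAnyOf`, `not`), the `lifetimeSales_DESC` sort value, and the `aggregatedValues`/`exactSum` fields are assumptions about the generated API, so check your own generated schema for the exact names.

```ruby
require "json"

# Filter + sort + paginate. Every field, argument, and enum name here is an
# assumption about what a generated schema exposes for an `Artist` type.
SEARCH_QUERY = <<~GRAPHQL
  query TopArtists {
    artists(
      filter: {
        lifetimeSales: {gte: 1000000}
        not: {name: {equalToAnyOf: ["Unknown Artist"]}}
      }
      orderBy: [lifetimeSales_DESC]
      first: 10
    ) {
      nodes {
        name
        lifetimeSales
      }
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
GRAPHQL

# Aggregate over a numeric field (names are likewise assumed).
AGGREGATION_QUERY = <<~GRAPHQL
  query SalesTotals {
    artistAggregations {
      nodes {
        count
        aggregatedValues {
          lifetimeSales {
            exactSum
          }
        }
      }
    }
  }
GRAPHQL

# A GraphQL-over-HTTP request body is a JSON object with a "query" key.
payloads = [SEARCH_QUERY, AGGREGATION_QUERY].map do |query|
  JSON.generate("query" => query)
end

payloads.each { |payload| puts payload }
```

Both operations are plain GraphQL documents, so the same UI or dashboard code can switch between searching and reporting without a new endpoint.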

Build Efficient GraphQL APIs Without Writing Resolvers

For backend engineers, writing and maintaining resolver functions (the code that fetches the data for each field in your schema) can be quite time-consuming.

ElasticGraph takes a different approach. Because it's schema-driven and deeply integrated with ES/OS, it automatically provides the GraphQL resolvers. You define an ElasticGraph schema, point it at your ES/OS cluster, and it handles the translation of GraphQL queries into efficient ES/OS queries.

This means:

Faster Development Cycles: Get your GraphQL API up and running much more quickly.
Reduced Boilerplate: Less repetitive resolver code to write and maintain.
Optimized Queries Out-of-the-Box: ElasticGraph resolvers are designed to generate efficient Elasticsearch/OpenSearch queries, saving you the effort of manual optimization and helping ensure good performance.
Focus on Your Data Model: Spend your time defining what data is available, not how to fetch every piece of it.

This "resolver-less" nature, combined with its optimized query generation, significantly lowers the barrier to adopting GraphQL on top of a search datastore, allowing you to leverage its full power with less custom code and greater confidence in performance.
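To make the idea of translating a GraphQL filter into a datastore query concrete, here is a deliberately tiny Ruby sketch. It is not ElasticGraph's implementation: the `translate_filter` helper is hypothetical, the predicate names (`equalToAnyOf`, `gt`, `gte`, `lt`, `lte`, `not`) are assumptions, and it handles only those predicates. The output uses the standard ES/OS query DSL (`bool`, `filter`, `must_not`, `range`, `terms`).

```ruby
require "json"

# Toy translator from a GraphQL-style filter hash to an ES/OS query DSL body.
# Illustration only: predicates other than the ones handled here are ignored.
RANGE_OPERATORS = %w[gt gte lt lte].freeze

def translate_filter(filter)
  clauses = {"filter" => [], "must_not" => []}

  filter.each do |field, predicates|
    if field == "not"
      # Negation: the nested filter becomes a must_not clause.
      clauses["must_not"] << translate_filter(predicates)
      next
    end

    # Range predicates on one field collapse into a single range clause.
    ranges = predicates.select { |operator, _| RANGE_OPERATORS.include?(operator) }
    clauses["filter"] << {"range" => {field => ranges}} unless ranges.empty?

    # Set membership maps onto a terms clause.
    if predicates.key?("equalToAnyOf")
      clauses["filter"] << {"terms" => {field => predicates["equalToAnyOf"]}}
    end
  end

  # Drop empty clause lists so the generated query stays minimal.
  {"bool" => clauses.reject { |_, list| list.empty? }}
end

filter = {
  "lifetimeSales" => {"gte" => 1_000_000},
  "not" => {"name" => {"equalToAnyOf" => ["Unknown Artist"]}}
}

puts JSON.pretty_generate("query" => translate_filter(filter))
```

A real translation layer also has to deal with nested fields, sorting, pagination, and aggregations, which is the work a schema-driven framework takes off your hands.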

Architecture

ElasticGraph operates in distinct ways in local development vs production. Let's go over those in turn.

Local Development: Generating Schema Artifacts

In local development, an ElasticGraph application owner authors an ElasticGraph schema. From it, ElasticGraph generates multiple schema artifacts, including datastore_config.yaml, json_schemas.yaml, and schema.graphql.

Here's what an ElasticGraph schema definition looks like:


ruby
ElasticGraph.define_schema do |schema|
  schema.json_schema_version 1

  schema.object_type "Artist" do |t|
    t.field "id", "ID"
    t.field "name", "String"
    t.field "lifetimeSales", "JsonSafeLong"

    t.field "albums", "[Album!]!" do |f|
      f.mapping type: "nested"
    end

    t.index "artists"
  end

  schema.object_type "Album" do |t|
    t.field "name", "String"
    t.field "releasedOn", "Date"
  end
end

The produced schema artifacts are designed to be committed to source control, and are then used in production in a few different ways.

Production: Configuring, Indexing, and Querying the Datastore

In production, ElasticGraph sits in the middle, mediating access between the various clients and the datastore:

Application Owners use ElasticGraph to administer the Elasticsearch/OpenSearch clusters. ElasticGraph uses the generated datastore_config.yaml artifact to configure the ES/OS datastore including all index mappings, configuration settings, and Painless scripts.
Data Publishers (that is, the source-of-record systems for the data) are responsible for publishing events as an asynchronous data stream using an appropriate technology (e.g. AWS SQS, Kafka, or similar). The generated json_schemas.yaml artifact is available to data publishers to use for code generation and event validation. In addition, ElasticGraph uses the json_schemas.yaml artifact when ingesting events to validate the data before indexing it in ES/OS.
GraphQL Clients query an ElasticGraph GraphQL endpoint. The GraphQL endpoint uses the generated schema.graphql artifact to power its GraphQL engine. Internally, it translates GraphQL queries into performant queries against the ES/OS datastore, and returns a GraphQL-formatted response. The schema.graphql artifact is also available for use by the GraphQL client (e.g. for code generation or other purposes).
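From a client's point of view, this is an ordinary GraphQL-over-HTTP exchange. The sketch below shows one way a Ruby client might build and send such a request using only the standard library. The endpoint URL is a placeholder, the `build_graphql_request` and `execute_graphql` helpers are hypothetical, and the `artists` and `nodes` field names are assumptions about the generated schema.

```ruby
require "json"
require "net/http"
require "uri"

# Placeholder endpoint: substitute the URL where your GraphQL endpoint is served.
GRAPHQL_ENDPOINT = URI("http://localhost:9393/graphql")

# Builds a GraphQL-over-HTTP POST request without sending it.
def build_graphql_request(endpoint, query, variables = {})
  request = Net::HTTP::Post.new(endpoint)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate("query" => query, "variables" => variables)
  request
end

# Sends the request and returns the parsed JSON response body.
def execute_graphql(endpoint, query, variables = {})
  request = build_graphql_request(endpoint, query, variables)
  use_ssl = endpoint.scheme == "https"
  response = Net::HTTP.start(endpoint.hostname, endpoint.port, use_ssl: use_ssl) do |http|
    http.request(request)
  end
  JSON.parse(response.body)
end

# `artists` and `nodes` are assumed field names for an indexed Artist type.
QUERY = "query { artists(first: 5) { nodes { name } } }"

request = build_graphql_request(GRAPHQL_ENDPOINT, QUERY)
puts request.body

# Against a live endpoint, you would instead call:
#   pp execute_graphql(GRAPHQL_ENDPOINT, QUERY)
```

Keeping request construction separate from sending makes the client easy to check without a running datastore.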

Try ElasticGraph in 1 Minute!

Curious to learn more? Quickly boot an example ElasticGraph application locally using this command:


bash
curl -s https://block.github.io/elasticgraph/dc.yml | docker compose -f - up

Then visit the local GraphiQL UI to try some example queries.

Start a New Project in 5 Minutes

Here's how to bootstrap a new ElasticGraph project:


bash
gem exec elasticgraph new path/to/project
cd path/to/project
bundle exec rake boot_locally

Visit our Getting Started guide for a full tutorial.

Celebrating 1.0.0: A Milestone and a Beginning

The release of ElasticGraph 1.0.0 marks a significant milestone. ElasticGraph is a mature, battle-tested solution that has grown from an internal Block initiative to a general-purpose framework now powering numerous systems.

While 1.0.0 signifies stability and readiness for wider adoption, it's also just the beginning. We're excited about the future of ElasticGraph and will continue to evolve it based on community feedback.

We encourage you to dive in and see how you can build more powerful GraphQL APIs with less code using ElasticGraph. We welcome your feedback, contributions, and bug reports on GitHub and the #elasticgraph channel on the Block Open Source Discord server.