From our own and our users experience building indexers, we noticed common patterns when it comes to integrate onchain data with the services that power modern applications.
Apibara provides built-in integrations that simplify integrating onchain data with common services such as web API, databases, and data analytics tools.
All integrations follow these three steps:
- stream data from a DNA stream by using the user-provided filter.
- finally, data is sent to the downstream integration. This step looks different based on the type of integration.
Notice that all integrations support starting from historical data and then continue with live (real-time) blocks. This means you can use Apibara to build both your indexer (which requires historical data), but also notification services that require the most recent real-time data.
Types of integrations#
Apibara goal is to bring onchain data to any application. At the moment, we offer three types of integrations:
- serverless functions: invoke serverless functions for each batch of data, both historical and live data. Functions are also invoked when a chain reorganization happens, so that your application can manage them.
- webhooks: invoke a webhook for each batch of data, with exactly the payload you provide. This integration doesn't invoke the HTTP webhook in case of chain reorganization.
Apibara can mirror all onchain data you select to a database of your choice. This is the easiest and fastest way to build an indexer for your application, all data is synced automatically and you can focus on other important parts of your application.
While some details vary between each database implementation, all database integrations work as follows:
- the records returned by the transform step are inserted in the database. This step is required to return an array of objects.
- Apibara adds the cursor that generated each piece of data.
- When a chain reorganization happens, Apibara removes all records that have been invalidated.
We provide integrations for the following two databases:
- PostgreSQL: write data to the table specified by the user. Batch data
is converted to PostgreSQL records using the
json_populate_recordsetfunction. Apibara requires a
_cursorcolumn in the table to keep track of each batch's cursor, so that data can be invalidated in case of chain reorganizations.
- MongoDB: write data to the collection specified by the user. Data is
converted to BSON and then written to the collection. Apibara adds a
_cursorcolumn to each record so that data can be invalidated in case of chain reorganizations.
If you'd like us to add a specific database, feel free to open an issue on GitHub.
Apibara is the easiest and fastest way to generate all the datasets that your analytics team needs. Generate exactly the data you need using the filter and transform steps. After that, the integration will start streaming all historical data and will keep your dataset updated as the chain moves forward.
At the moment, datasets can only be generated locally. In the future, we plan to add the ability to automatically upload the datasets to any S3-compatible storage.
Apibara supports generating datasets in the following formats:
- Apache Parquet: generate Parquet files, the schema is automatically deduced from the first batch of data. Apibara groups multiple blocks of data into a single file.