
Stream Live Chain Data into Your Analytics Stack with Amp
Real-Time, Reorg-Safe Blockchain Event Indexing with SQL
Reliable analytics starts with accurate, queryable blockchain event data. But indexed event data breaks the moment a reorg happens.
Most teams end up building custom indexers and ingestion pipelines to transform logs into SQL tables. That means handling reorgs, managing schemas, running backfills, and maintaining infrastructure.
Amp provides a structured pipeline for smart contract events. When a contract emits events, Amp indexes them, reconciles reorgs, and exposes the results as SQL tables. The data stays consistent without custom parsers, reorg logic, or ingestion pipelines.
You write contracts and queries. Amp maintains the data.
Real-Time User Engagement Metrics on a DEX
Analysts often ask questions like:
- “Where do our new users come from?”
- “Which liquidity pools are growing fastest?”
- “How many unique addresses interacted with token X this week vs last?”
These aren't hard questions. But getting reliable answers requires a pipeline most teams don't have. What you actually need:
- Real-time ingestion
- Guaranteed correctness through reorgs
- Queryable SQL tables with semantic meaning (events, entities)
- A pipeline where engineers don't write ABI parsers for every contract
Problem Discovery: Why Traditional Pipelines Fail
Without Amp, teams typically run into the following failure modes.
Example: polling an RPC from AWS Lambda
You write code to fetch blocks, decode logs against local ABIs, and coordinate shards with DynamoDB leases. But:
- Devs spend more time fixing parsers than building BI
- Reorgs cause duplicates and miss reversions
- Every new contract means fresh parsing work
At some point your devs are spending more time on the pipeline than on the product.
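To make that failure mode concrete, here is a minimal TypeScript sketch (all names hypothetical, not taken from any real pipeline) of just the parent-hash bookkeeping a hand-rolled poller has to get right before it can even begin rollback and replay:

```typescript
// Hand-rolled reorg detection, for illustration only. A polling indexer
// must compare each new block's parentHash against the hash it stored
// for the previous height; a mismatch means the chain reorganized and
// previously indexed rows must be rolled back and replayed.

interface BlockHeader {
  number: number;
  hash: string;
  parentHash: string;
}

// Returns the height to rewind to, or null if the chain is consistent.
function detectReorg(
  stored: Map<number, string>, // height -> block hash we indexed
  incoming: BlockHeader
): number | null {
  const parent = stored.get(incoming.number - 1);
  if (parent !== undefined && parent !== incoming.parentHash) {
    return incoming.number - 1; // everything from this height on is suspect
  }
  return null;
}

// Example: we indexed block 100 with hash 0xaaa, but block 101 arrives
// pointing at a different parent 0xbbb -> reorg detected, rewind to 100.
const stored = new Map([[100, "0xaaa"]]);
const rewindTo = detectReorg(stored, {
  number: 101,
  hash: "0xccc",
  parentHash: "0xbbb",
});
console.log(rewindTo); // 100
```

And this is only detection: the actual rollback, replay, and shard coordination is where most of the maintenance burden lives.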
Example: Firehose + raw tables
You index everything into raw tables, and analysts write complex SQL for every metric. Problems:
- Hard to maintain
- Slow queries
- No schema guarantees
Analysts end up reverse-engineering the data instead of using it.
Amp is the Solution
Teams choose Amp for three big reasons:
- Automatic event → SQL table conversion: no ABI parsing code. Amp introspects ABIs and auto-generates table schemas via `eventTables(abi)`.
- Built-in reorg handling: Amp's reorg awareness lets you ingest continuously while still trusting the results.
- Derived tables for transformations: push transformations into Amp instead of scattering them across Airflow tasks.
Amp Architecture Overview
For a full breakdown of how Amp's architecture works, see the docs. In short, the ampd daemon runs as a server that exposes query interfaces for clients, including:
| Interface | Protocol | Use Case |
|---|---|---|
| Arrow Flight | gRPC | High-performance binary streaming for analytics tools |
Data flows through extractors into Apache Parquet files stored in object storage (S3/GCS/local), with metadata tracked in PostgreSQL.
Prerequisites
To run Amp, you first need a PostgreSQL database for metadata:
```shell
docker compose up -d
```

This runs the metadata DB at `postgresql://postgres:postgres@localhost:5432/amp`. Configure it in your config file:

```toml
[metadata_db]
url = "postgresql://postgres:postgres@localhost:5432/amp"
```

Implementation Example: Step by Step
The examples in this guide use Amp-managed staging and production datasets, which require approved access. To get started with Amp, reach out to the Edge & Node team. They'll help you set up access and connect quickly. Once it's enabled, you can run the examples as shown and start querying live data.
Step 1: Define Raw Event Tables
Here's a quick example using a DEX Router and ERC-20 tokens. In Amp, users define datasets using defineDataset() in a TypeScript config file:

```typescript
// amp.config.ts
import { defineDataset, eventTables } from "@edgeandnode/amp";
import DEXRouterABI from "./abis/DEXRouter.json";
import ERC20ABI from "./abis/ERC20.json";

export default defineDataset(() => {
  const dexTables = eventTables(DEXRouterABI);
  const erc20Tables = eventTables(ERC20ABI);

  return {
    namespace: "analytics",
    name: "dex",
    network: "mainnet",
    description: "DEX swap and token transfer events",
    dependencies: {
      mainnet: "_/eth_rpc@latest",
    },
    tables: {
      ...dexTables,
      ...erc20Tables,
    },
  };
});
```

Amp instantly generates tables from contract events:
| Table | Description |
|---|---|
| `swap` | Swap events from the DEX Router |
| `transfer` | ERC-20 transfer events |
| `add_liquidity` | Liquidity addition events |
No hand-crafting. No string copy/paste errors.
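To build intuition for what `eventTables`-style introspection does, here is a simplified, hypothetical sketch that maps ABI event definitions to table schemas. This is not Amp's implementation, and the real generated schemas may differ:

```typescript
// Illustrative only: a toy model of deriving SQL table schemas from an
// ABI. Names and type mappings here are hypothetical simplifications.

interface AbiEventInput { name: string; type: string; }
interface AbiEvent { type: "event"; name: string; inputs: AbiEventInput[]; }

// Map Solidity types to SQL column types (deliberately simplified).
function sqlType(solidityType: string): string {
  if (solidityType.startsWith("uint") || solidityType.startsWith("int")) return "NUMERIC";
  if (solidityType === "bool") return "BOOLEAN";
  return "TEXT"; // address, bytes, string, ...
}

// Derive one table definition per event, snake_cased like Amp's tables,
// with standard block/transaction context columns added up front.
function tableFor(event: AbiEvent): { table: string; columns: Record<string, string> } {
  const columns: Record<string, string> = { block_num: "BIGINT", tx_hash: "TEXT" };
  for (const input of event.inputs) columns[input.name] = sqlType(input.type);
  const table = event.name.replace(/([a-z])([A-Z])/g, "$1_$2").toLowerCase();
  return { table, columns };
}

const swapEvent: AbiEvent = {
  type: "event",
  name: "Swap",
  inputs: [
    { name: "sender", type: "address" },
    { name: "amount_in", type: "uint256" },
  ],
};
console.log(tableFor(swapEvent));
// { table: "swap",
//   columns: { block_num: "BIGINT", tx_hash: "TEXT", sender: "TEXT", amount_in: "NUMERIC" } }
```

The point is that the schema is a pure function of the ABI: add a contract, get its tables, with no per-contract parsing work.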
Step 2: Build Derived Tables
Once you have raw events, you'll want filtered views without writing the same SQL over and over.
Important: Amp derived tables are limited to streaming SQL. GROUP BY, LIMIT, and ORDER BY are not supported in table definitions because data is processed as a continuous stream organized by block ranges. Aggregations must be performed at query time.
Here's how users add derived tables for filtering:
```typescript
// amp.config.ts
import { defineDataset, eventTables } from "@edgeandnode/amp";
import DEXRouterABI from "./abis/DEXRouter.json";

export default defineDataset(() => {
  const baseTables = eventTables(DEXRouterABI);

  return {
    namespace: "analytics",
    name: "dex",
    network: "mainnet",
    description: "DEX analytics with filtered views",
    dependencies: {
      mainnet: "_/eth_rpc@latest",
    },
    tables: {
      ...baseTables,
      // Derived table: filter for large swaps (streaming-compatible)
      large_swaps: {
        sql: `
          SELECT
            block_num,
            tx_hash,
            sender,
            token_in,
            token_out,
            amount_in,
            amount_out,
            timestamp
          FROM swap
          WHERE amount_in > 1000000
        `,
      },
      // Derived table: swaps involving WETH
      weth_swaps: {
        sql: `
          SELECT *
          FROM swap
          WHERE token_in = '0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2'
             OR token_out = '0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2'
        `,
      },
    },
  };
});
```

For aggregations like daily metrics, run queries at request time through Amp's query APIs:
```sql
-- Query via Arrow Flight
SELECT
  DATE(timestamp) AS day,
  COUNT(*) AS total_swaps,
  COUNT(DISTINCT sender) AS unique_traders
FROM "analytics/dex@latest".swap
GROUP BY 1
ORDER BY 1 DESC
LIMIT 30;
```

Step 3: Reorg Handling Out of the Box
Blockchains sometimes undergo chain reorganizations, often called reorgs, where recently confirmed blocks are replaced by a different canonical chain. When this happens, downstream systems must roll back incorrect data and re-apply corrected transaction data.
With batch pipelines, you have to manually track and reconcile reorgs, pause ingestion, and run reconciliation jobs to roll back and replay affected transactions. This increases operational complexity and risks double-counting or missing events.
Amp handles chain reorgs automatically through block range organization and revision-based execution. You don't need to configure, build, or maintain any reorg-handling logic.
Traditional databases face a tradeoff: reorg handling typically requires costly reconciliation. Amp avoids this by organizing data into immutable Parquet files and serving queries from consistent revisions. Reorgs are processed in parallel, and a single metadata update automatically switches the active view when the corrected version is ready.
This delivers continuous ingestion, consistent queries, and built-in reorg correctness without operational overhead.
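The mechanism can be illustrated with a toy model: data files are immutable, and a single mutable pointer in metadata selects the active revision. This is a sketch of the idea, not Amp's actual internals, and all names are illustrative:

```typescript
// Toy model of revision-based serving: immutable file sets plus one
// mutable "active revision" pointer. Readers always see a consistent
// revision; a reorg fix is published in parallel and swapped in with
// a single metadata update.

interface Revision { id: string; files: string[]; } // immutable Parquet set

class Catalog {
  private revisions = new Map<string, Revision>();
  private active: string | null = null;

  publish(rev: Revision): void {
    this.revisions.set(rev.id, rev); // write new files; old ones untouched
  }

  // One atomic pointer flip switches readers to the corrected view.
  activate(id: string): void {
    if (!this.revisions.has(id)) throw new Error(`unknown revision ${id}`);
    this.active = id;
  }

  read(): string[] {
    if (this.active === null) return [];
    return this.revisions.get(this.active)!.files;
  }
}

// Reorg scenario: v1 indexed the orphaned fork; v2 is rebuilt in
// parallel from the canonical chain, then activated in one step.
const catalog = new Catalog();
catalog.publish({ id: "v1", files: ["blocks_0_100.parquet"] });
catalog.activate("v1");
catalog.publish({ id: "v2", files: ["blocks_0_98.parquet", "blocks_98_100_fixed.parquet"] });
catalog.activate("v2"); // readers now see only the corrected revision
console.log(catalog.read());
```

Because queries in flight keep reading whichever revision they started on, ingestion never pauses and no reader observes a half-applied rollback.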
Step 4: Dataset Versioning & Namespacing
Amp uses content-addressable manifests with tags pointing to specific versions:

```
"local/counter@dev"      # Local development
"analytics/dex@0.1.0"    # Specific semantic version
"analytics/dex@latest"   # Latest published version
"system/anvil@0.0.1"     # Built-in chain data
```

Users should always specify a valid namespace. Registering under `_` is not allowed, so all references must use an explicit, valid namespace.
This provides:
- Immutability: Manifests never change once stored
- Deduplication: Identical manifests stored only once
- Reproducibility: Pin exact versions for models and dashboards
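Immutability and deduplication fall naturally out of content addressing. Here is a small sketch (hypothetical format, not Amp's real manifest schema) of how hashing a canonicalized manifest yields a stable ID:

```typescript
// Sketch of content addressing: hash a canonicalized manifest so that
// identical manifests map to the same ID (dedup) and a stored manifest
// cannot change without changing its ID (immutability).
// Illustrative only; Amp's actual manifest format is not shown here.
import { createHash } from "node:crypto";

function manifestId(manifest: Record<string, unknown>): string {
  // Sort keys so semantically identical manifests hash identically.
  const canonical = JSON.stringify(manifest, Object.keys(manifest).sort());
  return createHash("sha256").update(canonical).digest("hex").slice(0, 12);
}

const a = manifestId({ name: "dex", namespace: "analytics" });
const b = manifestId({ namespace: "analytics", name: "dex" });
console.log(a === b); // true: same content, same ID, stored once
```

Tags like `@latest` are then just mutable pointers onto these immutable, hash-identified manifests.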
Teams can query datasets using this format:
```shell
# Local development
pnpm amp query 'SELECT * FROM "local/dex@dev".swap LIMIT 10'

# Published dataset
pnpm amp query 'SELECT * FROM "analytics/dex@latest".large_swaps LIMIT 10'
```

Results: What Teams Can Expect
Here's the potential impact Amp brings:
| Metric | Before Amp | After Amp |
|---|---|---|
| Time to get fresh data | 24+ hours | < 3 minutes |
| Reorg-related dashboard errors | Frequent | Zero |
| Engineering hours spent on pipeline | High | Minimal |
| Trust in data correctness | Low | High |
Analysts spend their time on insights, not debugging parsers.
Client Options
Amp provides multiple ways to query your data:
| Client | Use Case |
|---|---|
| @edgeandnode/amp (TypeScript SDK) | Application integration, programmatic queries |
| Rust CLI | Administrative operations, scripting |
| Python Client | Interactive notebooks, data science workflows |
| Arrow Flight API | High-performance binary streaming for analytics tools |
| Amp Studio | Visual query builder and dataset browser |
All clients use the same SQL query language and connect to the same Amp server.
Installation
The easiest way to install Amp is using ampup, the official version manager. See the Amp repo.
Conclusion
Amp turns what used to be a messy, brittle, and slow ETL project into a robust, low-maintenance, real-time analytics pipeline.
With:
- Automatic event → SQL tables via `eventTables(abi)`
- Built-in reorg correctness
- Derived tables for streaming-compatible filtered views
- Content-addressable dataset versioning (`namespace/name@version`)
- Query APIs built for analytics (Arrow Flight for high-performance streaming)
Teams can spend time building BI and ML models rather than maintaining bespoke parsers and data wrangling scripts.
If your next product needs live blockchain analytics you can trust, Amp is worth a close look.
The repo and demo are linked below.