Platform API Reference

Clean datasets programmatically through the REST API: submit jobs, poll their status, and download the results. Your API key's tier determines your rate limits and quotas.

Quick Start

Three steps to clean your first dataset:

1. Create an API key in Dashboard > Settings > API Keys.
2. Submit a cleaning job with your file URL.
3. Poll the job status, then download the cleaned result.

bash
# 1. Submit a cleaning job
curl -X POST https://augea.org/api/platform/v1/clean \
  -H "Authorization: Bearer aug_live_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"file_url": "https://my-bucket.s3.amazonaws.com/data.csv", "modality": "tabular"}'

# Response: {"job_id": "abc-123", "status": "queued", "request_id": "req_..."}

# 2. Check job status
curl https://augea.org/api/platform/v1/clean/abc-123 \
  -H "Authorization: Bearer aug_live_YOUR_KEY_HERE"

# 3. Download cleaned result (when status is "completed")
curl https://augea.org/api/platform/v1/clean/abc-123/download \
  -H "Authorization: Bearer aug_live_YOUR_KEY_HERE" -o cleaned_data.csv

Authentication

All API requests require a valid API key in the Authorization header.

http
Authorization: Bearer aug_live_YOUR_KEY_HERE

Security

  • Keys are shown only once, at creation. Store them securely.
  • Never expose keys in frontend code, git repos, or client-side JavaScript.
  • Load the key from an environment variable such as AUGEA_API_KEY.
  • If a key is compromised, rotate it immediately in Dashboard > Settings > API Keys.
  • Each request is logged, rate-limited, and traced with a unique request_id.
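
As a minimal sketch of the environment-variable advice above, the Python helper below builds the Authorization header from AUGEA_API_KEY. The function name auth_headers is illustrative only and is not part of any official Augea client.

```python
import os


def auth_headers(env=os.environ):
    """Build the Authorization header from the AUGEA_API_KEY variable.

    `auth_headers` is a hypothetical helper, not an official client function.
    """
    key = env.get("AUGEA_API_KEY", "").strip()
    if not key:
        # Fail early rather than sending an unauthenticated request.
        raise RuntimeError("AUGEA_API_KEY is not set")
    return {"Authorization": f"Bearer {key}"}
```

Passing the environment as a parameter keeps the key out of source code and makes the helper easy to exercise with a plain dict.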

Base URL

text
https://augea.org/api/platform/v1

All endpoints are relative to this base.

Endpoints

POST /clean

Submit a new data cleaning job.

Request Body

file_url (string, required)

Public URL of the file to clean. Must be HTTPS. Localhost, private IPs, and file:// are blocked.

modality (string)

File type hint. One of: tabular, text, image, audio, video, model, pointcloud

webhook_url (string)

URL to receive a POST when the job completes. Signed with HMAC-SHA256.

metadata (object)

Arbitrary JSON (max 4KB) stored with the job. Returned in GET responses.
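
The constraints listed above can be checked on the client before a request is sent. The sketch below is one way to do that in Python; validate_clean_request is a hypothetical helper, and it assumes "4KB" means 4096 bytes of serialized JSON. The server's own validation may be stricter, for example about hostnames that resolve to private addresses.

```python
import ipaddress
import json
from urllib.parse import urlparse

MODALITIES = {"tabular", "text", "image", "audio", "video", "model", "pointcloud"}
MAX_METADATA_BYTES = 4 * 1024  # assumption: "4KB" = 4096 bytes of serialized JSON


def validate_clean_request(body):
    """Return a list of problems found in a POST /clean body (empty if none)."""
    problems = []
    url = urlparse(body.get("file_url") or "")
    host = url.hostname or ""
    if url.scheme != "https" or not host:
        problems.append("file_url must be an HTTPS URL")
    elif host == "localhost":
        problems.append("file_url must not point at localhost")
    else:
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            ip = None  # a hostname, not an IP literal
        if ip is not None and (ip.is_private or ip.is_loopback):
            problems.append("file_url must not point at a private IP")
    modality = body.get("modality")
    if modality is not None and modality not in MODALITIES:
        problems.append(f"unsupported modality: {modality}")
    metadata = body.get("metadata")
    if metadata is not None:
        if not isinstance(metadata, dict):
            problems.append("metadata must be a JSON object")
        elif len(json.dumps(metadata).encode("utf-8")) > MAX_METADATA_BYTES:
            problems.append("metadata exceeds 4KB")
    return problems
```

An empty list means the body passed these local checks; the API can still reject it with 400 bad_request or invalid_modality.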

bash
curl -X POST https://augea.org/api/platform/v1/clean \
  -H "Authorization: Bearer aug_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "file_url": "https://my-bucket.s3.amazonaws.com/raw.csv",
    "modality": "tabular",
    "webhook_url": "https://my-app.com/hooks/cleaning",
    "metadata": {"project": "sensor-v2", "batch": 42}
  }'

Response 201

json
{
  "job_id": "3a471fec-925e-4a55-80ef-313d3d42fe48",
  "status": "queued",
  "request_id": "req_e8b6815367a0444b962966de65340a8f"
}

Response Headers

X-Request-ID: Unique request trace ID

X-RateLimit-Limit-Daily: Your daily job quota

X-RateLimit-Remaining-Daily: Jobs remaining today

X-RateLimit-Limit-Monthly: Your monthly quota

X-RateLimit-Remaining-Monthly: Jobs remaining this month
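
The quota headers above can be read after each submission to decide whether to keep sending jobs. This Python sketch assumes the header values are plain integers, which the page does not state explicitly; remaining_quota is a hypothetical helper name.

```python
def remaining_quota(headers):
    """Extract daily/monthly quota counters from response headers.

    Header names are matched case-insensitively; missing or non-numeric
    values come back as None. Assumes values are plain integers.
    """
    lowered = {k.lower(): v for k, v in headers.items()}

    def as_int(name):
        try:
            return int(lowered[name.lower()])
        except (KeyError, TypeError, ValueError):
            return None

    return {
        "daily_limit": as_int("X-RateLimit-Limit-Daily"),
        "daily_remaining": as_int("X-RateLimit-Remaining-Daily"),
        "monthly_limit": as_int("X-RateLimit-Limit-Monthly"),
        "monthly_remaining": as_int("X-RateLimit-Remaining-Monthly"),
    }
```

With the requests library, resp.headers can be passed in directly, since it behaves like a mapping of header names to string values.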

GET /clean

List your cleaning jobs with optional filtering and cursor-based pagination.

Query Parameters

status (string)

Filter by status: queued, running, completed, failed

limit (number)

Results per page (1–100, default 20)

cursor (string)

ISO 8601 timestamp taken from the next_cursor of the previous page

bash
curl "https://augea.org/api/platform/v1/clean?status=completed&limit=10" \
  -H "Authorization: Bearer aug_live_..."

Response 200

json
{
  "jobs": [
    {
      "id": "3a471fec-...",
      "status": "completed",
      "modality": "tabular",
      "quality_score": 94,
      "created_at": "2026-02-25T12:00:00Z",
      "updated_at": "2026-02-25T12:01:30Z"
    }
  ],
  "has_more": false,
  "next_cursor": null,
  "request_id": "req_..."
}
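
Cursor-based pagination with the has_more and next_cursor fields of this response can be sketched in Python as follows. The fetch_page callable is an assumption of this sketch: a function you supply that performs GET /clean with the given query parameters and returns the decoded JSON body.

```python
def iter_jobs(fetch_page, status=None, limit=20):
    """Yield every job, following next_cursor until has_more is false.

    `fetch_page(params)` is a caller-supplied (hypothetical) function that
    calls GET /clean with `params` and returns the decoded JSON response.
    """
    params = {"limit": limit}
    if status is not None:
        params["status"] = status
    while True:
        page = fetch_page(dict(params))  # copy so callers cannot mutate our state
        yield from page.get("jobs", [])
        cursor = page.get("next_cursor")
        if not page.get("has_more") or cursor is None:
            break
        params["cursor"] = cursor  # resume from where the last page ended
```

With the requests library, fetch_page could be a lambda that calls requests.get on the /clean endpoint with params=params and returns .json().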

GET /clean/:job_id

Get the status and details of a specific cleaning job.

bash
curl https://augea.org/api/platform/v1/clean/3a471fec-925e-4a55-80ef-313d3d42fe48 \
  -H "Authorization: Bearer aug_live_..."

Response 200

json
{
  "job_id": "3a471fec-...",
  "status": "completed",
  "modality": "tabular",
  "quality_score": 94,
  "created_at": "2026-02-25T12:00:00Z",
  "completed_at": "2026-02-25T12:01:30Z",
  "error": null,
  "request_id": "req_..."
}

GET /clean/:job_id/download

Download the cleaned output file. Only available when status is 'completed'.

bash
curl https://augea.org/api/platform/v1/clean/3a471fec-.../download \
  -H "Authorization: Bearer aug_live_..." \
  -o cleaned_output.csv

Returns 422 job_not_ready if the job is not yet completed.
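
To avoid hitting job_not_ready, a client can wait for a terminal status before requesting the download. The Python sketch below polls with a timeout and a gradually increasing delay. The get_status callable, the 300-second timeout, and the 1.5x backoff factor are assumptions of this sketch, not documented behavior.

```python
import time


def wait_for_job(get_status, timeout=300.0, interval=5.0, max_interval=30.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job is "completed" or "failed", or the timeout expires.

    `get_status()` is a caller-supplied (hypothetical) function that calls
    GET /clean/:job_id and returns the decoded JSON body.
    """
    deadline = clock() + timeout
    while True:
        job = get_status()
        if job["status"] in ("completed", "failed"):
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job still {job['status']} after {timeout}s")
        sleep(interval)
        interval = min(interval * 1.5, max_interval)  # back off gradually
```

Only call the download endpoint when the returned job has status "completed"; a "failed" job carries its reason in the error field.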

Webhooks

When you provide a webhook_url, we'll POST to it when the job finishes. Each webhook is signed so you can verify it came from Augea.

Verifying signatures

The X-Augea-Signature header contains a timestamp and HMAC-SHA256 signature:

text
X-Augea-Signature: t=1708900000,v1=5af3e2c8b9d1f4...
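
javascript
import crypto from "crypto";

// payload: the parsed JSON body; header: the X-Augea-Signature value.
function verifyWebhook(payload, header, secret) {
  const [tPart = "", sigPart = ""] = header.split(",");
  const timestamp = tPart.replace("t=", "");
  const signature = Buffer.from(sigPart.replace("v1=", ""), "hex");

  // Recompute the HMAC over "<timestamp>.<body>". Re-serializing with
  // JSON.stringify only matches if it reproduces the bytes that were sent;
  // sign the raw request body instead if your framework exposes it.
  const expected = crypto
    .createHmac("sha256", secret)
    .update(`${timestamp}.${JSON.stringify(payload)}`)
    .digest();

  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return (
    signature.length === expected.length &&
    crypto.timingSafeEqual(signature, expected)
  );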
}
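
For Python services, the same check might look like the sketch below. It assumes the signed string is "<timestamp>.<raw request body>", mirroring the JavaScript example above. The 300-second replay window and the verify_webhook name are this sketch's own choices, not documented requirements.

```python
import hashlib
import hmac
import time


def verify_webhook(raw_body, header, secret, tolerance=300, now=None):
    """Check an X-Augea-Signature header of the form "t=<ts>,v1=<hex>".

    `raw_body` is the request body as bytes. Assumes the signature covers
    "<timestamp>.<raw body>"; the replay window is an assumption.
    """
    parts = dict(p.strip().split("=", 1) for p in header.split(",") if "=" in p)
    timestamp, signature = parts.get("t", ""), parts.get("v1", "")
    if not timestamp.isdigit() or not signature:
        return False
    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > tolerance:
        return False  # too old or too far in the future: possible replay
    expected = hmac.new(
        secret.encode("utf-8"),
        f"{timestamp}.".encode("utf-8") + raw_body,
        hashlib.sha256,
    ).hexdigest()
    # compare_digest runs in constant time and tolerates unequal lengths.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Verifying against the raw bytes, before any JSON parsing, avoids mismatches caused by key ordering or whitespace differences.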

Error Codes

Status  Error Code              Description
400     bad_request             Missing or invalid field (file_url, modality, etc.)
400     invalid_modality        Unsupported modality value
401     missing_key             No Authorization header
401     invalid_key             API key not found or malformed
401     revoked_key             API key has been revoked
404     job_not_found           Job ID does not exist or is not yours
422     job_not_ready           Job not yet completed (for download)
429     rate_limit_exceeded     Too many requests per minute
429     daily_limit_exceeded    Daily job quota reached
429     monthly_limit_exceeded  Monthly job quota reached
500     internal_error          Server error. Retry or contact support.
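
One way to act on these codes is to map each to a client-side action. The Python sketch below is a suggestion only: the action names are invented for illustration, and which errors are worth retrying is a judgment call rather than documented API behavior.

```python
RETRYABLE = {"rate_limit_exceeded", "internal_error"}
WAIT_FOR_RESET = {"daily_limit_exceeded", "monthly_limit_exceeded"}


def classify_error(status, code):
    """Map an HTTP status and error code to a suggested client action.

    The returned action names are this sketch's own labels.
    """
    if code == "job_not_ready":
        return "poll_again"  # the job has not finished; keep polling
    if code in RETRYABLE:
        return "retry_with_backoff"
    if code in WAIT_FOR_RESET:
        return "wait_for_quota_reset"  # retrying sooner will not help
    if status == 401:
        return "fix_credentials"
    if 400 <= status < 500:
        return "fix_request"
    return "retry_with_backoff" if status >= 500 else "ok"
```

Separating quota exhaustion from per-minute rate limiting matters because a short backoff recovers from the latter but not the former.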

Rate Limits & Tiers

Your API key tier is set by your subscription plan. Upgrade in Dashboard > Settings > Billing.

Tier               Rate     Daily    Monthly   Max Rows
Free               10/min   5/day    50/mo     1,000
Pro (Creator)      30/min   50/day   500/mo    10,000
Enterprise (Team)  100/min  500/day  5,000/mo  100,000
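
The per-minute rates translate into a minimum spacing between requests, which is useful when choosing a polling interval. The Python sketch below encodes the table and does that arithmetic. It assumes requests are spaced evenly and that Max Rows applies to each submitted file; neither is spelled out on this page, and the dictionary keys are illustrative names.

```python
# Limits copied from the tier table; the keys are illustrative names.
TIERS = {
    "free": {"per_minute": 10, "daily": 5, "monthly": 50, "max_rows": 1_000},
    "pro": {"per_minute": 30, "daily": 50, "monthly": 500, "max_rows": 10_000},
    "enterprise": {"per_minute": 100, "daily": 500, "monthly": 5_000, "max_rows": 100_000},
}


def min_request_interval(tier):
    """Seconds to leave between requests to stay under the per-minute rate."""
    return 60.0 / TIERS[tier]["per_minute"]


def fits_tier(tier, row_count):
    """True if a dataset's row count is within the tier's Max Rows limit."""
    return row_count <= TIERS[tier]["max_rows"]
```

For example, on the Free tier a single client should leave at least 6 seconds between requests, so the 5-second polling interval used in the examples on this page would be slightly too aggressive if every request counts toward the limit.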

Code Examples

Python

python
import requests, time, os

API_KEY = os.environ["AUGEA_API_KEY"]
BASE = "https://augea.org/api/platform/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Submit job
resp = requests.post(f"{BASE}/clean", headers=HEADERS, json={
    "file_url": "https://my-bucket.s3.amazonaws.com/raw.csv",
    "modality": "tabular",
})
resp.raise_for_status()  # surface 4xx/5xx errors instead of a KeyError below
job_id = resp.json()["job_id"]
print(f"Job {job_id} queued")

# Poll until the job reaches a terminal state
while True:
    status = requests.get(f"{BASE}/clean/{job_id}", headers=HEADERS).json()
    if status["status"] in ("completed", "failed"):
        break
    time.sleep(5)

# Download on success; report the error otherwise
if status["status"] == "completed":
    data = requests.get(f"{BASE}/clean/{job_id}/download", headers=HEADERS)
    data.raise_for_status()
    with open("cleaned.csv", "wb") as f:
        f.write(data.content)
    print("Done!")
else:
    print(f"Job failed: {status.get('error')}")

TypeScript / Node.js

typescript
const API_KEY = process.env.AUGEA_API_KEY!;
const BASE = "https://augea.org/api/platform/v1";
const headers = { Authorization: `Bearer ${API_KEY}` };

// Submit job
const { job_id } = await fetch(`${BASE}/clean`, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify({
    file_url: "https://my-bucket.s3.amazonaws.com/raw.csv",
    modality: "tabular",
  }),
}).then(r => r.json());

console.log(`Job ${job_id} queued`);

// Poll until done
let status: string;
do {
  await new Promise(r => setTimeout(r, 5000));
  const job = await fetch(`${BASE}/clean/${job_id}`, { headers }).then(r => r.json());
  status = job.status;
} while (status !== "completed" && status !== "failed");

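// Download (only if the job succeeded)
if (status !== "completed") throw new Error(`Job ${job_id} failed`);
const file = await fetch(`${BASE}/clean/${job_id}/download`, { headers });
const buffer = await file.arrayBuffer();
await Bun.write("cleaned.csv", buffer); // or fs.writeFileSync("cleaned.csv", Buffer.from(buffer)) on Node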

cURL (shell script)

bash
# Submit, get job_id, poll, download
JOB=$(curl -s -X POST https://augea.org/api/platform/v1/clean \
  -H "Authorization: Bearer $AUGEA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"file_url":"https://my-bucket.s3.amazonaws.com/raw.csv"}' | jq -r .job_id)

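echo "Polling job $JOB..."
# Stop on either terminal state so a failed job does not loop forever
while true; do
  STATUS=$(curl -s "https://augea.org/api/platform/v1/clean/$JOB" \
    -H "Authorization: Bearer $AUGEA_API_KEY" | jq -r .status)
  [ "$STATUS" = "completed" ] && break
  [ "$STATUS" = "failed" ] && { echo "Job failed" >&2; exit 1; }
  sleep 5
done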

curl -o cleaned.csv https://augea.org/api/platform/v1/clean/$JOB/download \
  -H "Authorization: Bearer $AUGEA_API_KEY"
