Corpus Product API Integration Guide

This guide provides everything you need to integrate with the Intrace Corpus Product API.

Getting Started

1. Obtain API Key

Contact your Intrace representative to receive an API key. The API key is required for all requests.

2. Base URLs

  • Production: https://api.corpus.intrace.ai
  • Development: http://localhost:8000 (for testing)

3. Authentication

All requests must include your API key in the X-API-Key header:
curl -H "X-API-Key: your-api-key-here" \
  https://api.corpus.intrace.ai/v1/events
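
The snippets in this guide hard-code the key for brevity. As a minimal sketch, the header can instead be built from an environment variable so the key stays out of source control. The variable name INTRACE_API_KEY and the helper build_headers are assumptions made for illustration, not part of the API.

```python
import os
from typing import Dict, Optional

# Hypothetical environment variable name; choose whatever fits your deployment.
API_KEY_ENV_VAR = "INTRACE_API_KEY"


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return the X-API-Key header, reading the key from the environment if not given."""
    key = api_key or os.environ.get(API_KEY_ENV_VAR)
    if not key:
        raise RuntimeError(f"No API key provided; set {API_KEY_ENV_VAR} or pass api_key")
    return {"X-API-Key": key}
```

The returned dictionary can be passed as the headers argument of any request shown in this guide.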

Common Use Cases

Available Event Filters

The /v1/events endpoint supports the following filter parameters:
Categorical filters (all accept multiple values; repeat the parameter to filter by multiple values):
  • event_category - Filter by category (e.g., “conflict”, “cyber”, “crime”)
  • event_type - Filter by type within a category
  • event_subtype - Filter by subtype within a type
  • country - Filter by country code
Date filters:
  • date_from - Inclusive start date (YYYY-MM-DD)
  • date_to - Inclusive end date (YYYY-MM-DD)
Numeric range filters:
  • salience_score_min / salience_score_max - Filter by salience score
  • article_count_min / article_count_max - Filter by number of source articles
  • actor_count_min / actor_count_max - Filter by number of actors involved
Text search:
  • search - Full-text search across title and description
Sorting:
  • sort_by - Sort column: event_date (default), salience_score, article_count, title, created_at
  • sort_order - Sort direction: “asc” or “desc” (default: “desc”)
Note: An unrecognised sort_by value silently falls back to event_date. No validation error is returned — double-check spellings if results appear sorted unexpectedly.
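
Because a misspelled sort_by is accepted silently, a client-side check can catch the mistake before the request is sent. The sketch below validates against the sort columns listed above; the helper name check_event_sort is hypothetical, and the column list will need updating if the API adds sort columns.

```python
# Sort columns documented for /v1/events in this guide.
EVENT_SORT_COLUMNS = {"event_date", "salience_score", "article_count", "title", "created_at"}
SORT_ORDERS = {"asc", "desc"}


def check_event_sort(sort_by: str = "event_date", sort_order: str = "desc") -> None:
    """Raise ValueError for sort values outside the documented lists."""
    if sort_by not in EVENT_SORT_COLUMNS:
        raise ValueError(
            f"Unknown sort_by {sort_by!r}; expected one of {sorted(EVENT_SORT_COLUMNS)}"
        )
    if sort_order not in SORT_ORDERS:
        raise ValueError(f"Unknown sort_order {sort_order!r}; expected 'asc' or 'desc'")


check_event_sort("salience_score", "desc")  # passes silently
```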

Multi-Value Filtering

Categorical filters (event_category, event_type, event_subtype, country) accept multiple values by repeating the parameter:
import requests

# Filter by multiple categories and multiple countries at once
response = requests.get(
    "https://api.corpus.intrace.ai/v1/events",
    headers={"X-API-Key": "your-api-key"},
    params=[
        ("event_category", "conflict"),
        ("event_category", "crime"),
        ("country", "UA"),
        ("country", "RU"),
    ]
)

cURL equivalent:
curl -H "X-API-Key: your-api-key" \
  "https://api.corpus.intrace.ai/v1/events?event_category=conflict&event_category=crime&country=UA&country=RU"
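
To see how repeated parameters end up in the query string, the sketch below expands list values into repeated key/value pairs and encodes them with the standard library. The helper to_param_pairs is a hypothetical name; the resulting list can also be passed directly as params to requests.get.

```python
from typing import Dict, List, Tuple, Union
from urllib.parse import urlencode

FilterValue = Union[str, int, float, List[str]]


def to_param_pairs(filters: Dict[str, FilterValue]) -> List[Tuple[str, str]]:
    """Expand list values into repeated (key, value) pairs, keeping insertion order."""
    pairs: List[Tuple[str, str]] = []
    for key, value in filters.items():
        values = value if isinstance(value, list) else [value]
        pairs.extend((key, str(v)) for v in values)
    return pairs


pairs = to_param_pairs({"event_category": ["conflict", "crime"], "country": ["UA", "RU"]})
query = urlencode(pairs)
print(query)  # event_category=conflict&event_category=crime&country=UA&country=RU
```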

Query Events by Category

Filter events by category to focus on specific domains:
import requests

response = requests.get(
    "https://api.corpus.intrace.ai/v1/events",
    headers={"X-API-Key": "your-api-key"},
    params={
        "event_category": "cyber",
        "page": 1,
        "page_size": 50
    }
)

events = response.json()
print(f"Found {events['total']} cyber events")
for event in events['items']:
    print(f"- {event['title']}")

Filter by Date Range

Get events within a specific time period:
from datetime import date, timedelta

import requests

end_date = date.today()
start_date = end_date - timedelta(days=30)

response = requests.get(
    "https://api.corpus.intrace.ai/v1/events",
    headers={"X-API-Key": "your-api-key"},
    params={
        "date_from": start_date.isoformat(),
        "date_to": end_date.isoformat(),
        "event_category": "conflict",
        "country": "UA"
    }
)
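
Since date_from and date_to are both inclusive, a full calendar month runs from its first to its last day. The following sketch builds the two parameters and rejects an inverted range locally. How the API itself treats an inverted range is not stated in this guide, so the check is a client-side assumption, and date_range_params is a hypothetical helper.

```python
from datetime import date
from typing import Dict


def date_range_params(start: date, end: date) -> Dict[str, str]:
    """Build date_from/date_to values (both inclusive) in YYYY-MM-DD form."""
    if start > end:
        raise ValueError(f"date_from {start} is after date_to {end}")
    return {"date_from": start.isoformat(), "date_to": end.isoformat()}


params = date_range_params(date(2026, 1, 1), date(2026, 1, 31))
print(params)  # {'date_from': '2026-01-01', 'date_to': '2026-01-31'}
```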

Search Events

Search across event titles and descriptions:
response = requests.get(
    "https://api.corpus.intrace.ai/v1/events",
    headers={"X-API-Key": "your-api-key"},
    params={
        "search": "ransomware attack",
        "event_category": "cyber"
    }
)

Get Event Details

Retrieve full details including category-specific data:
event_id = "550e8400-e29b-41d4-a716-446655440000"

response = requests.get(
    f"https://api.corpus.intrace.ai/v1/events/{event_id}",
    headers={"X-API-Key": "your-api-key"}
)

event = response.json()
print(f"Event: {event['title']}")
print(f"Category: {event['eventCategory']}")
print(f"Location: {event['location']} ({event['locationCountry']})")

# Category-specific fields (e.g., fatalities for conflict events)
if event.get('categorySpecificData'):
    print(f"Details: {event['categorySpecificData']}")

Export Data

Export filtered events in various formats:
# Export as CSV — use stream=True and iter_content; do not call .json()
response = requests.get(
    "https://api.corpus.intrace.ai/v1/events/export",
    headers={"X-API-Key": "your-api-key"},
    params={
        "format": "csv",
        "event_category": "political",
        "date_from": "2026-01-01",
        "date_to": "2026-01-31"
    },
    stream=True
)
response.raise_for_status()
with open("political_events_jan2026.csv", "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)

# Export as GeoJSON — same streaming pattern; do not call .json()
response = requests.get(
    "https://api.corpus.intrace.ai/v1/events/export",
    headers={"X-API-Key": "your-api-key"},
    params={
        "format": "geojson",
        "country": "SY",
        "event_category": "conflict"
    },
    stream=True
)
response.raise_for_status()
with open("syria_conflict.geojson", "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)

Discover Filter Values

Use metadata endpoints to find valid filter values:
# Get all available categories
categories = requests.get(
    "https://api.corpus.intrace.ai/v1/metadata/event-categories",
    headers={"X-API-Key": "your-api-key"}
).json()

print("Available categories:")
for cat in categories['categories']:
    print(f"- {cat['value']}: {cat['count']} events")

# Get types within a category
types = requests.get(
    "https://api.corpus.intrace.ai/v1/metadata/event-types",
    headers={"X-API-Key": "your-api-key"},
    params={"category": "cyber"}
).json()

print("\nCyber event types:")
for typ in types['types']:
    print(f"- {typ['value']}: {typ['count']} events")
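
An unknown categorical value returns an empty result set rather than an error, so it can help to compare requested values against the metadata response before querying. This is a sketch that assumes the response shape used in the snippet above (a list of objects with a value field); the helper names and the sample data are made up for illustration.

```python
from typing import Any, Dict, Iterable, List, Set


def known_values(metadata: Dict[str, Any], key: str) -> Set[str]:
    """Collect the 'value' field from a metadata response list such as 'categories'."""
    return {entry["value"] for entry in metadata.get(key, [])}


def unknown_filters(requested: Iterable[str], metadata: Dict[str, Any], key: str) -> List[str]:
    """Return requested values that the metadata response does not list."""
    allowed = known_values(metadata, key)
    return [v for v in requested if v not in allowed]


# Illustrative data in the shape used by the snippet above; the counts are made up.
sample = {"categories": [{"value": "conflict", "count": 10}, {"value": "cyber", "count": 4}]}
print(unknown_filters(["cyber", "cybre"], sample, "categories"))  # ['cybre']
```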

Available Actor Filters

The /v1/actors endpoint supports the following filter parameters:
Categorical filters (all accept multiple values; repeat the parameter to filter by multiple values):
  • actor_type - Filter by actor category (e.g., “rebel_group”, “state_government”)
  • country - Filter by actor’s primary country
Numeric range filters:
  • event_count_min / event_count_max - Filter by number of events the actor appears in
  • article_count_min / article_count_max - Filter by number of source articles
Text search:
  • search - Search in canonical actor name
Sorting:
  • sort_by - Sort column: canonical_name (default), event_count, article_count, created_at
  • sort_order - Sort direction: “asc” (default) or “desc”
Note: An unrecognised sort_by value silently falls back to canonical_name. No validation error is returned.
The /v1/actors/{id}/events endpoint accepts the same filter and sort parameters as /v1/events.
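
As a sketch of how the /v1/actors filters combine, the helper below assembles repeated-key query pairs and checks the sort column and the event-count range locally. build_actor_params is a hypothetical name, and the range check is a client-side assumption rather than documented API behaviour.

```python
from typing import List, Optional, Tuple

# Sort columns documented for /v1/actors in this guide.
ACTOR_SORT_COLUMNS = {"canonical_name", "event_count", "article_count", "created_at"}


def build_actor_params(
    actor_type: Optional[List[str]] = None,
    country: Optional[List[str]] = None,
    event_count_min: Optional[int] = None,
    event_count_max: Optional[int] = None,
    sort_by: str = "canonical_name",
    sort_order: str = "asc",
) -> List[Tuple[str, str]]:
    """Build repeated-key query pairs for /v1/actors from the documented filters."""
    if sort_by not in ACTOR_SORT_COLUMNS:
        raise ValueError(f"Unknown sort_by {sort_by!r}")
    if (event_count_min is not None and event_count_max is not None
            and event_count_min > event_count_max):
        raise ValueError("event_count_min is greater than event_count_max")
    pairs = [("actor_type", v) for v in (actor_type or [])]
    pairs += [("country", v) for v in (country or [])]
    if event_count_min is not None:
        pairs.append(("event_count_min", str(event_count_min)))
    if event_count_max is not None:
        pairs.append(("event_count_max", str(event_count_max)))
    pairs += [("sort_by", sort_by), ("sort_order", sort_order)]
    return pairs


actor_params = build_actor_params(
    actor_type=["rebel_group", "militia"],
    country=["NG"],
    event_count_min=10,
    sort_by="event_count",
    sort_order="desc",
)
print(actor_params)
```

The resulting list can be passed as params to a GET request against /v1/actors.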

Pagination Best Practices

Handle large result sets with pagination:
def get_all_events(category: str):
    """Fetch all events for a category using pagination."""
    all_events = []
    page = 1

    while True:
        response = requests.get(
            "https://api.corpus.intrace.ai/v1/events",
            headers={"X-API-Key": "your-api-key"},
            params={
                "event_category": category,
                "page": page,
                "page_size": 500  # Max page size
            }
        )

        response.raise_for_status()  # Stop on HTTP errors instead of parsing an error body
        data = response.json()
        all_events.extend(data['items'])

        if page >= data['totalPages']:
            break

        page += 1

    return all_events

# Usage
all_cyber_events = get_all_events("cyber")
print(f"Retrieved {len(all_cyber_events)} cyber events")
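
When the full result set is too large to hold in a list, the same loop can be written as a generator. The sketch below takes the page-fetching step as a callable so it can be exercised offline; iter_items and fake_fetch are hypothetical names, and only the items and totalPages keys come from the example above.

```python
from typing import Any, Callable, Dict, Iterator


def iter_items(fetch_page: Callable[[int], Dict[str, Any]]) -> Iterator[Any]:
    """Yield items page by page until the reported totalPages is reached."""
    page = 1
    while True:
        data = fetch_page(page)
        yield from data["items"]
        if page >= data["totalPages"]:
            break
        page += 1


# Stand-in for a real request so the sketch runs offline; the data is made up.
fake_pages = {1: ["a", "b"], 2: ["c"]}


def fake_fetch(page: int) -> Dict[str, Any]:
    return {"items": fake_pages[page], "totalPages": len(fake_pages)}


print(list(iter_items(fake_fetch)))  # ['a', 'b', 'c']
```

In real use, fetch_page would wrap the requests.get call shown above and return response.json().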

Paginating Actors

The same pagination pattern applies to actors:
def get_all_actors(actor_types: list[str] = None, country: list[str] = None) -> list:
    """Fetch all actors matching optional filters."""
    all_actors = []
    page = 1
    params = {"page_size": 500}
    if actor_types:
        params["actor_type"] = actor_types
    if country:
        params["country"] = country

    while True:
        params["page"] = page
        response = requests.get(
            "https://api.corpus.intrace.ai/v1/actors",
            headers={"X-API-Key": "your-api-key"},
            params=params
        )
        response.raise_for_status()  # Stop on HTTP errors instead of parsing an error body
        data = response.json()
        all_actors.extend(data["items"])

        if page >= data["totalPages"]:
            break
        page += 1

    return all_actors

# All rebel groups and militias in Nigeria
actors = get_all_actors(actor_types=["rebel_group", "militia"], country=["NG"])
print(f"Retrieved {len(actors)} actors")

End-to-End Workflow Example

This example shows a complete analyst workflow: discover valid filter values, query high-salience conflict events in a region, then pull the event history for the most active actor involved.
import requests

BASE_URL = "https://api.corpus.intrace.ai"
HEADERS = {"X-API-Key": "your-api-key"}

# Step 1: Discover available event types for conflict
types = requests.get(
    f"{BASE_URL}/v1/metadata/event-types",
    headers=HEADERS,
    params={"category": "conflict"}
).json()
print("Conflict types:", [t["value"] for t in types["types"]])

# Step 2: Query high-salience conflict events in the Sahel region
response = requests.get(
    f"{BASE_URL}/v1/events",
    headers=HEADERS,
    params=[
        ("event_category", "conflict"),
        ("country", "ML"),
        ("country", "BF"),
        ("country", "NE"),
        ("date_from", "2025-01-01"),
        ("salience_score_min", "0.6"),
        ("sort_by", "salience_score"),
        ("sort_order", "desc"),
        ("page_size", "20"),
    ]
)
events = response.json()
print(f"Found {events['total']} high-salience conflict events in the Sahel")

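# Step 3: Find the most-mentioned actor across these events
from collections import Counter
actor_counts = Counter()
for event in events["items"]:
    event_detail = requests.get(
        f"{BASE_URL}/v1/events/{event['id']}",
        headers=HEADERS
    ).json()
    # Assumption: the detail response lists actors under an "actors" key, each
    # with a "canonicalName" field. This guide does not document that shape, so
    # adjust the keys to match the actual response.
    for actor in event_detail.get("actors") or []:
        actor_counts[actor["canonicalName"]] += 1
print("Most-mentioned actors:", actor_counts.most_common(3))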

# Step 4: Find actors with high event counts in the region
actors = requests.get(
    f"{BASE_URL}/v1/actors",
    headers=HEADERS,
    params=[
        ("actor_type", "rebel_group"),
        ("actor_type", "militia"),
        ("country", "ML"),
        ("event_count_min", "10"),
        ("sort_by", "event_count"),
        ("sort_order", "desc"),
    ]
).json()

top_actor = actors["items"][0] if actors["items"] else None
if top_actor:
    print(f"Top actor: {top_actor['canonicalName']} ({top_actor['eventCount']} events)")

    # Step 5: Get that actor's conflict events since 2025, sorted by salience
    actor_events = requests.get(
        f"{BASE_URL}/v1/actors/{top_actor['id']}/events",
        headers=HEADERS,
        params={
            "event_category": "conflict",
            "date_from": "2025-01-01",
            "sort_by": "salience_score",
            "sort_order": "desc",
            "page_size": "50",
        }
    ).json()
    print(f"  {actor_events['total']} conflict events found")
    for e in actor_events["items"][:5]:
        print(f"  - [{e['eventDate']}] {e['title']}")

    # Step 6: Export conflict events as CSV for offline analysis.
    # Note: these filters are not scoped to the actor; this guide documents no
    # actor filter for /v1/events/export.
    csv_response = requests.get(
        f"{BASE_URL}/v1/events/export",
        headers=HEADERS,
        params={
            "format": "csv",
            "event_category": "conflict",
            "date_from": "2025-01-01",
            "sort_by": "salience_score",
            "sort_order": "desc",
        },
        stream=True
    )
    csv_response.raise_for_status()
    with open("conflict_events.csv", "wb") as f:
        for chunk in csv_response.iter_content(chunk_size=8192):
            f.write(chunk)
    print("Exported to conflict_events.csv")

Code Examples

Python

Complete example using the requests library:
import requests
from typing import Any, Dict, List, Optional

class IntraceCorpusClient:
    def __init__(self, api_key: str, base_url: str = "https://api.corpus.intrace.ai"):
        self.api_key = api_key
        self.base_url = base_url
        self.headers = {"X-API-Key": api_key}

    def list_events(
        self,
        event_category: Optional[List[str]] = None,
        event_type: Optional[List[str]] = None,
        country: Optional[List[str]] = None,
        date_from: Optional[str] = None,
        date_to: Optional[str] = None,
        salience_score_min: Optional[float] = None,
        sort_by: str = "event_date",
        sort_order: str = "desc",
        page: int = 1,
        page_size: int = 50,
    ) -> Dict[str, Any]:
        """List events with optional filters.

        Multi-value params (event_category, event_type, country) are passed
        as a list of tuples so requests sends repeated query params correctly.
        """
        params: List[tuple] = [
            ("page", page),
            ("page_size", page_size),
            ("sort_by", sort_by),
            ("sort_order", sort_order),
        ]
        for v in (event_category or []):
            params.append(("event_category", v))
        for v in (event_type or []):
            params.append(("event_type", v))
        for v in (country or []):
            params.append(("country", v))
        if date_from:
            params.append(("date_from", date_from))
        if date_to:
            params.append(("date_to", date_to))
        if salience_score_min is not None:
            params.append(("salience_score_min", salience_score_min))

        response = requests.get(
            f"{self.base_url}/v1/events",
            headers=self.headers,
            params=params,
        )
        response.raise_for_status()
        return response.json()

    def get_event(self, event_id: str) -> Dict[str, Any]:
        """Get a single event by ID."""
        response = requests.get(
            f"{self.base_url}/v1/events/{event_id}",
            headers=self.headers
        )
        response.raise_for_status()
        return response.json()

    def export_events(
        self,
        output_path: str,
        format: str = "csv",
        **filters
    ) -> None:
        """Export events in specified format, streaming response to a file.

        The export endpoint returns a file download — do not call .json() on the
        response. Always stream the body to disk using iter_content.
        """
        params = {"format": format, **filters}

        response = requests.get(
            f"{self.base_url}/v1/events/export",
            headers=self.headers,
            params=params,
            stream=True
        )
        response.raise_for_status()
        with open(output_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)

# Usage
client = IntraceCorpusClient(api_key="your-api-key")
events = client.list_events(
    event_category=["cyber", "crime"],
    country=["US"],
    salience_score_min=0.5,
    sort_by="salience_score",
    sort_order="desc",
)
print(f"Found {events['total']} events")

JavaScript/TypeScript

Example using the fetch API. Note that multi-value filters require append, not set, on URLSearchParams, because set overwrites any earlier value for the same key:
class IntraceCorpusClient {
  constructor(
    private apiKey: string,
    private baseUrl: string = "https://api.corpus.intrace.ai"
  ) {}

  private get headers() {
    return { "X-API-Key": this.apiKey };
  }

  async listEvents(params: {
    eventCategory?: string | string[];
    eventType?: string | string[];
    eventSubtype?: string | string[];
    country?: string | string[];
    dateFrom?: string;
    dateTo?: string;
    search?: string;
    salienceScoreMin?: number;
    salienceScoreMax?: number;
    articleCountMin?: number;
    articleCountMax?: number;
    actorCountMin?: number;
    actorCountMax?: number;
    sortBy?: string;
    sortOrder?: "asc" | "desc";
    page?: number;
    pageSize?: number;
  }): Promise<any> {
    const q = new URLSearchParams();

    // Multi-value params — always use append so repeated keys work correctly
    const appendMulti = (key: string, val?: string | string[]) => {
      if (!val) return;
      (Array.isArray(val) ? val : [val]).forEach(v => q.append(key, v));
    };

    appendMulti("event_category", params.eventCategory);
    appendMulti("event_type", params.eventType);
    appendMulti("event_subtype", params.eventSubtype);
    appendMulti("country", params.country);

    if (params.dateFrom) q.set("date_from", params.dateFrom);
    if (params.dateTo) q.set("date_to", params.dateTo);
    if (params.search) q.set("search", params.search);
    if (params.salienceScoreMin != null) q.set("salience_score_min", String(params.salienceScoreMin));
    if (params.salienceScoreMax != null) q.set("salience_score_max", String(params.salienceScoreMax));
    if (params.articleCountMin != null) q.set("article_count_min", String(params.articleCountMin));
    if (params.articleCountMax != null) q.set("article_count_max", String(params.articleCountMax));
    if (params.actorCountMin != null) q.set("actor_count_min", String(params.actorCountMin));
    if (params.actorCountMax != null) q.set("actor_count_max", String(params.actorCountMax));
    if (params.sortBy) q.set("sort_by", params.sortBy);
    if (params.sortOrder) q.set("sort_order", params.sortOrder);
    q.set("page", String(params.page ?? 1));
    q.set("page_size", String(params.pageSize ?? 50));

    const response = await fetch(`${this.baseUrl}/v1/events?${q}`, {
      headers: this.headers,
    });
    if (!response.ok) throw new Error(`API error ${response.status}: ${response.statusText}`);
    return response.json();
  }

  async getEvent(eventId: string): Promise<any> {
    const response = await fetch(`${this.baseUrl}/v1/events/${eventId}`, {
      headers: this.headers,
    });
    if (!response.ok) throw new Error(`API error ${response.status}: ${response.statusText}`);
    return response.json();
  }

  async exportEvents(params: {
    format?: "csv" | "json" | "geojson" | "acled" | "flat";
    eventCategory?: string | string[];
    country?: string | string[];
    dateFrom?: string;
    dateTo?: string;
    sortBy?: string;
    sortOrder?: "asc" | "desc";
  }): Promise<Blob> {
    const q = new URLSearchParams();
    if (params.format) q.set("format", params.format);
    const appendMulti = (key: string, val?: string | string[]) => {
      if (!val) return;
      (Array.isArray(val) ? val : [val]).forEach(v => q.append(key, v));
    };
    appendMulti("event_category", params.eventCategory);
    appendMulti("country", params.country);
    if (params.dateFrom) q.set("date_from", params.dateFrom);
    if (params.dateTo) q.set("date_to", params.dateTo);
    if (params.sortBy) q.set("sort_by", params.sortBy);
    if (params.sortOrder) q.set("sort_order", params.sortOrder);

    const response = await fetch(`${this.baseUrl}/v1/events/export?${q}`, {
      headers: this.headers,
    });
    if (!response.ok) throw new Error(`API error ${response.status}: ${response.statusText}`);
    return response.blob(); // write to disk or pass to a download link
  }
}

// Usage
const client = new IntraceCorpusClient("your-api-key");

// Multi-value filter
const events = await client.listEvents({
  eventCategory: ["conflict", "crime"],
  country: ["UA", "RU"],
  salienceScoreMin: 0.5,
  sortBy: "salience_score",
  sortOrder: "desc",
});
console.log(`Found ${events.total} events`);

// Export as CSV (returns a Blob, not JSON)
const blob = await client.exportEvents({
  format: "csv",
  eventCategory: "conflict",
  dateFrom: "2025-01-01",
  sortBy: "salience_score",
  sortOrder: "desc",
});

cURL

Basic cURL examples:
# List events
curl -H "X-API-Key: your-api-key" \
  "https://api.corpus.intrace.ai/v1/events?event_category=conflict&page=1&page_size=10"

# Get specific event
curl -H "X-API-Key: your-api-key" \
  "https://api.corpus.intrace.ai/v1/events/550e8400-e29b-41d4-a716-446655440000"

# Export to CSV
curl -H "X-API-Key: your-api-key" \
  "https://api.corpus.intrace.ai/v1/events/export?format=csv&event_category=cyber" \
  -o cyber_events.csv

# Get metadata
curl -H "X-API-Key: your-api-key" \
  "https://api.corpus.intrace.ai/v1/metadata/event-categories"

Export Formats

CSV Format

Comma-separated values with headers. Suitable for spreadsheet analysis.
  • Use case: Import into Excel, Google Sheets, or data analysis tools
  • File extension: .csv
  • MIME type: text/csv

JSON Format

Array of complete event objects with actors and metadata.
  • Use case: Programmatic processing, full data fidelity
  • File extension: .json
  • MIME type: application/json

GeoJSON Format

Geographic data format following the RFC 7946 specification.
  • Use case: Mapping applications, GIS tools, Leaflet/Mapbox
  • File extension: .geojson
  • MIME type: application/geo+json

ACLED Format

Compatible with ACLED (Armed Conflict Location & Event Data Project) schema. Only available for conflict category events.
  • Use case: Integration with existing ACLED workflows and tools
  • File extension: .json
  • MIME type: application/json
Actor Position Fields:
  • actor1, actor2: Principal and secondary actors
  • assoc_actor_1, assoc_actor_2: Internal sub-units (semicolon-separated)
  • supporting_actor_1, supporting_actor_2: External supporting actors (comma-separated)
  • inter1, inter2: Inter-group codes (1=state, 2=rebel, 3=political militia, etc.)
  • interaction: Combined interaction code (e.g., “1-2” for state vs rebel)
Example:
import json

import requests

# Export conflict events in ACLED format
response = requests.get(
    "https://api.corpus.intrace.ai/v1/events/export",
    headers={"X-API-Key": "your-api-key"},
    params={
        "format": "acled",
        "event_category": "conflict",
        "country": "NG",
        "date_from": "2026-01-01",
        "date_to": "2026-01-31"
    },
    stream=True
)
response.raise_for_status()
with open("nigeria_conflict_acled.json", "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)

# Parse after saving
with open("nigeria_conflict_acled.json") as f:
    events = json.load(f)
for event in events:
    print(f"{event['event_date']}: {event['actor1']} vs {event['actor2']}")
    if event['supporting_actor_1']:
        print(f"  Supporting actor1: {event['supporting_actor_1']}")
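
The delimited actor fields and the interaction code need a little parsing before analysis. The sketch below splits the semicolon- and comma-separated fields and labels an interaction code using the three inter-group codes listed above. The helper names and the sample record are made up for illustration, and a comma split will mis-handle actor names that themselves contain commas.

```python
from typing import Any, Dict, List

# Inter-group codes listed in this guide; the guide's "etc." implies more exist.
INTER_CODES = {1: "state", 2: "rebel", 3: "political militia"}


def split_list(value: Any, sep: str) -> List[str]:
    """Split a delimited string into trimmed, non-empty parts; tolerate None or empty."""
    if not value:
        return []
    return [part.strip() for part in str(value).split(sep) if part.strip()]


def describe_interaction(code: str) -> str:
    """Translate an interaction code such as '1-2' using the known inter codes."""
    labels = []
    for part in code.split("-"):
        if part.isdigit():
            labels.append(INTER_CODES.get(int(part), f"code {part}"))
        else:
            labels.append(part)
    return " vs ".join(labels)


# Illustrative record; the values are made up, only the field names come from the guide.
record: Dict[str, Any] = {
    "assoc_actor_1": "7th Division; 3rd Brigade",
    "supporting_actor_1": "Ally A, Ally B",
    "interaction": "1-2",
}
print(split_list(record["assoc_actor_1"], ";"))       # ['7th Division', '3rd Brigade']
print(split_list(record["supporting_actor_1"], ","))  # ['Ally A', 'Ally B']
print(describe_interaction(record["interaction"]))    # state vs rebel
```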

Flat Format

Denormalized JSON with actor positions as explicit columns, suitable for spreadsheet import and tabular analysis.
  • Use case: Spreadsheet import where actor relationships need to be explicit
  • File extension: .json
  • MIME type: application/json
Actor Position Columns:
  • principal_actor, secondary_actor: Main actors
  • principal_actor_category, secondary_actor_category: Actor categories (state_forces, rebel_group, etc.)
  • sub_principal_actors, sub_secondary_actors: Internal sub-units (semicolon-separated)
  • supporting_principal_actor, supporting_secondary_actor: External supporting actors (comma-separated)
Key Distinction:
  • Sub-actors: Internal organizational units (e.g., “7th Division” within “Nigerian Army”)
  • Supporting actors: External organizations providing support (e.g., “U.S. Special Forces” supporting “Colombian Army”)
Example:
import json

import requests

# Export in flat format for spreadsheet analysis
response = requests.get(
    "https://api.corpus.intrace.ai/v1/events/export",
    headers={"X-API-Key": "your-api-key"},
    params={
        "format": "flat",
        "event_category": "conflict",
        "date_from": "2026-01-01"
    },
    stream=True
)
response.raise_for_status()
with open("conflict_events_flat.json", "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)

with open("conflict_events_flat.json") as f:
    events = json.load(f)
for event in events:
    print(f"Event: {event['title']}")
    print(f"  Principal: {event['principal_actor']} ({event['principal_actor_category']})")
    if event['sub_principal_actors']:
        print(f"    Internal units: {event['sub_principal_actors']}")
    if event['supporting_principal_actor']:
        print(f"    External support: {event['supporting_principal_actor']}")
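
The five formats described in this section can be summarised in a small lookup table for choosing output filenames. This is a sketch: the extensions and MIME types are the ones listed above, while ExportFormat and export_filename are hypothetical names, and the ACLED check mirrors the stated conflict-only restriction on the client side.

```python
from typing import Dict, NamedTuple, Optional


class ExportFormat(NamedTuple):
    extension: str
    mime_type: str


# Extensions and MIME types as listed in the format descriptions above.
EXPORT_FORMATS: Dict[str, ExportFormat] = {
    "csv": ExportFormat(".csv", "text/csv"),
    "json": ExportFormat(".json", "application/json"),
    "geojson": ExportFormat(".geojson", "application/geo+json"),
    "acled": ExportFormat(".json", "application/json"),
    "flat": ExportFormat(".json", "application/json"),
}


def export_filename(stem: str, fmt: str, event_category: Optional[str] = None) -> str:
    """Return stem plus extension, rejecting unknown formats and non-conflict ACLED exports."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"Unknown export format {fmt!r}; expected one of {sorted(EXPORT_FORMATS)}")
    if fmt == "acled" and event_category != "conflict":
        raise ValueError("The acled format is only available for conflict events")
    return stem + EXPORT_FORMATS[fmt].extension


print(export_filename("syria_conflict", "geojson"))  # syria_conflict.geojson
```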

Error Handling

The API uses standard HTTP status codes. All error responses share the same body shape:
{
  "detail": "human-readable message"
}
| Code | Meaning | Example detail |
|------|---------|----------------|
| 200 | Success | |
| 401 | Unauthorized | "Invalid or missing API key" |
| 404 | Not Found | "Event not found" |
| 422 | Validation Error | "value is not a valid integer" |
| 429 | Rate Limited | "Rate limit exceeded" |
| 500 | Server Error | "Internal server error" |
401 responses are returned when the X-API-Key header is absent, empty, or does not match any active key. The response body does not distinguish a missing key from an invalid one.
422 responses are returned for type validation failures (e.g., passing "abc" as article_count_min). They are not returned for unrecognised categorical values: passing an unknown event_category returns an empty result set, not an error.
Silent fallbacks (no 422 is returned):
  • An unrecognised sort_by value silently falls back to the default sort column for that endpoint (event_date for events, canonical_name for actors).
import requests

api_key = "your-api-key"
event_id = "550e8400-e29b-41d4-a716-446655440000"

try:
    response = requests.get(
        f"https://api.corpus.intrace.ai/v1/events/{event_id}",
        headers={"X-API-Key": api_key}
    )
    response.raise_for_status()
    event = response.json()
except requests.exceptions.HTTPError as e:
    status = e.response.status_code
    try:
        # Error bodies are documented as {"detail": "..."}
        detail = e.response.json().get("detail", "unknown error")
    except ValueError:
        # Fall back to the raw text if the body is not valid JSON
        detail = e.response.text or "unknown error"
    if status == 401:
        raise RuntimeError(f"Authentication failed: {detail}")
    elif status == 404:
        raise RuntimeError(f"Not found: {detail}")
    elif status == 422:
        raise ValueError(f"Invalid parameters: {detail}")
    elif status == 429:
        raise RuntimeError(f"Rate limited, retry after backoff: {detail}")
    else:
        raise RuntimeError(f"API error {status}: {detail}")

Rate Limits

Default rate limits:
  • 100 requests per minute per API key
  • 10,000 requests per day per API key
The API does not currently return rate limit headers (X-RateLimit-Remaining, Retry-After). Implement exponential backoff on 429 responses rather than relying on header-based scheduling.
import time

import requests

def get_with_backoff(url: str, headers: dict, params: dict, max_retries: int = 3) -> dict:
    """GET request with exponential backoff on 429."""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers, params=params)
        if response.status_code == 429:
            wait = 2 ** attempt
            time.sleep(wait)
            continue
        response.raise_for_status()
        return response.json()
    raise RuntimeError("Exceeded max retries due to rate limiting")
Contact support if you need higher limits for your use case.
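
Backoff reacts to a 429 after the fact; spacing requests on the client can avoid most of them. At 100 requests per minute the average spacing is 60 / 100 = 0.6 seconds. The sketch below enforces that minimum interval. MinIntervalThrottle is a hypothetical helper, it does not track the daily limit, and how the server windows its per-minute count is not documented here, so treat the spacing as a conservative assumption.

```python
import time
from typing import Callable


class MinIntervalThrottle:
    """Space calls at least `interval` seconds apart."""

    def __init__(
        self,
        per_minute: int = 100,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Schedule the next slot relative to when this call was allowed to proceed.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay


throttle = MinIntervalThrottle(per_minute=100)
throttle.wait()  # call before each request; the first call does not sleep
```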

Next Steps