Batched Inference Results

Introduction

Efficiently retrieve multiple detection results in batches

The Batched Inference Results endpoint allows you to retrieve multiple detection results in a single API call. This is particularly useful for bulk data processing, analytics, and historical data retrieval. The endpoint supports pagination through checkpoints, enabling you to efficiently process large volumes of inference results.

Endpoint Overview

The batched results endpoint returns inference results (fingerprinting and geolocation data) for a specific project within a given time range. Results are returned in batches of up to 5,000 records per request.

GET /api/v1/innerworks/inference/batch

Query Parameters

Required Parameters

x-api-key (header): Your API key for authentication. This is sent as an HTTP request header, not as a query parameter.

Optional Parameters

batch_size (integer): Number of results to return per request. Must be between 1 and 5000. Defaults to 1000.

start_time (integer): Unix timestamp (in milliseconds) to start retrieving results from. Must be earlier than the current time. If not provided, retrieval starts from the beginning of available data.

checkpoint (string): Base64-encoded checkpoint string from a previous response. Used for pagination to retrieve the next batch of results. URL-encode the value when placing it in the query string, since Base64 padding (`=`) should be sent as `%3D`.
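
As an illustration of how these parameters combine, here is a minimal Python sketch that assembles the request URL. The helper name `build_batch_url` is hypothetical and not part of any official SDK; the range check mirrors the documented limits, and `start_time` is dropped when a checkpoint is supplied because the endpoint ignores it in that case.

```python
from typing import Optional
from urllib.parse import urlencode

# Base URL taken from the request examples in this document.
BASE_URL = "https://api.prod.innerworks.me/api/v1/innerworks/inference/batch"
MAX_BATCH_SIZE = 5000


def build_batch_url(
    batch_size: int = 1000,
    start_time: Optional[int] = None,
    checkpoint: Optional[str] = None,
) -> str:
    """Hypothetical helper: assemble the request URL from the optional parameters."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError("batch_size must be between 1 and 5000")
    params = {"batch_size": batch_size}
    if checkpoint is not None:
        # start_time is ignored when a checkpoint is sent, so leave it out.
        params["checkpoint"] = checkpoint
    elif start_time is not None:
        params["start_time"] = start_time
    # urlencode percent-encodes reserved characters such as the Base64 "=" padding.
    return f"{BASE_URL}?{urlencode(params)}"
```

Validating `batch_size` on the client avoids a round trip that would only return the "Invalid Batch Size" error.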

Making a Request

Basic Request

Retrieve the first batch of results with default batch size (1000):

curl --request GET \
--url 'https://api.prod.innerworks.me/api/v1/innerworks/inference/batch' \
--header 'x-api-key: YOUR_API_KEY'

Custom Batch Size

Retrieve a specific number of results:

curl --request GET \
--url 'https://api.prod.innerworks.me/api/v1/innerworks/inference/batch?batch_size=2000' \
--header 'x-api-key: YOUR_API_KEY'

Time Range Query

Retrieve results starting from a specific time:

curl --request GET \
--url 'https://api.prod.innerworks.me/api/v1/innerworks/inference/batch?start_time=1704067200000&batch_size=500' \
--header 'x-api-key: YOUR_API_KEY'
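
Because `start_time` is expressed in milliseconds rather than seconds, it is easy to be off by a factor of 1000. The following Python sketch shows one way to derive the value from a calendar date; the function name `to_epoch_millis` is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to a Unix timestamp in milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time ambiguity")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


start_time = to_epoch_millis(datetime(2024, 1, 1, tzinfo=timezone.utc))
print(start_time)  # 1704067200000, the value used in the request above
```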

Pagination with Checkpoint

Use the checkpoint from a previous response to get the next batch:

curl --request GET \
--url 'https://api.prod.innerworks.me/api/v1/innerworks/inference/batch?checkpoint=CHECKPOINT_STRING&batch_size=1000' \
--header 'x-api-key: YOUR_API_KEY'

Response Format

The endpoint returns a JSON object containing the batch size, an array of results, and a checkpoint for pagination.

{
  "batchSize": 1000,
  "results": [
    {
      "requestId": "550e8400-e29b-41d4-a716-446655440000",
      "projectId": "project-123",
      "timestamp": "2024-01-01T10:00:00.000Z",
      "result": {
        "fingerprinting": {
          "browser": "Chrome",
          "fingerprint": "abc123def456",
          "deviceOs": "Windows",
          "socialId": "user-789"
        },
        "geolocation": {
          "vpnDetectionReasons": "IP analysis",
          "vpnIsEnabled": true,
          "trueGeoLocationName": "United States",
          "trueGeoLocationCode": "US",
          "vpnProvider": "NordVPN",
          "vpnLocationCode": "CA",
          "stateUS": "NY",
          "trueUserIpAddress": "192.168.1.1",
          "requestIpAddress": "10.0.0.1"
        }
      }
    }
  ],
  "checkpoint": "eyJ2ZXJzaW9uIjoxLCJzdGF0ZSI6WzE3MDQxMDMyMDAwMDAsInJlcS0xMDAiXX0="
}

Response Fields

Top-Level Fields

batchSize (integer): The number of results requested in this batch.

results (array): Array of inference result objects.

checkpoint (string): Base64-encoded checkpoint for retrieving the next batch. Use this value in the `checkpoint` query parameter for subsequent requests.

Result Object Fields

requestId (string): Unique identifier for the inference request.

projectId (string): Project identifier.

timestamp (string): ISO 8601 formatted timestamp of when the inference was performed.

result (object): Contains the detection results.

Fingerprinting Fields (when available)

fingerprint (string): The unique browser fingerprint.

browser (string): Detected browser name.

deviceOs (string): Detected operating system.

socialId (string): User-provided social identifier.

Geolocation Fields (when available)

vpnIsEnabled (boolean): Whether VPN was detected.

vpnDetectionReasons (string): Explanation of how VPN was detected.

trueGeoLocationName (string): Actual geographic location name.

trueGeoLocationCode (string): Two-letter country code.

vpnProvider (string): Detected VPN provider name (if VPN is enabled).

vpnLocationCode (string): Apparent location code when using VPN.

stateUS (string): US state code (when applicable).

trueUserIpAddress (string): The actual IP address of the user.

requestIpAddress (string): The IP address from the request headers.
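
Since the fingerprinting and geolocation sections are only present "when available", client code should not assume either key exists. The Python sketch below reads a trimmed copy of the sample response and collects the request IDs where a VPN was detected. The second record and the function name `vpn_request_ids` are made up for illustration.

```python
import json
from typing import Any, Dict, List

# Trimmed copy of the sample response, plus one hypothetical record
# that has no geolocation section.
SAMPLE = json.loads("""
{
  "batchSize": 1000,
  "results": [
    {"requestId": "550e8400-e29b-41d4-a716-446655440000",
     "projectId": "project-123",
     "timestamp": "2024-01-01T10:00:00.000Z",
     "result": {
       "fingerprinting": {"browser": "Chrome", "fingerprint": "abc123def456"},
       "geolocation": {"vpnIsEnabled": true, "vpnProvider": "NordVPN"}}},
    {"requestId": "hypothetical-second-record",
     "projectId": "project-123",
     "timestamp": "2024-01-01T10:05:00.000Z",
     "result": {"fingerprinting": {"browser": "Firefox", "fingerprint": "fff000"}}}
  ],
  "checkpoint": "eyJ2ZXJzaW9uIjoxLCJzdGF0ZSI6WzE3MDQxMDMyMDAwMDAsInJlcS0xMDAiXX0="
}
""")


def vpn_request_ids(response: Dict[str, Any]) -> List[str]:
    """Return the requestId of every record where a VPN was detected."""
    flagged = []
    for record in response.get("results", []):
        # The geolocation section may be absent, so fall back to an empty dict.
        geolocation = record.get("result", {}).get("geolocation") or {}
        if geolocation.get("vpnIsEnabled") is True:
            flagged.append(record["requestId"])
    return flagged


print(vpn_request_ids(SAMPLE))
```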

Pagination Workflow

To retrieve all results for a time range, follow this workflow:

1. Make an initial request with `batch_size` and optionally `start_time`.

2. Process the results from the response.

3. Extract the `checkpoint` value from the response.

4. Make a subsequent request using the `checkpoint` query parameter.

5. Repeat steps 2-4 until the results array is empty, indicating no more data is available.
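
The workflow above can be sketched as a Python generator. This is a minimal illustration, not an official client: the names `iter_results`, `Fetcher`, and `fake_fetch` are hypothetical, and the HTTP call is abstracted behind a `fetch` callable so the loop can be exercised without network access. `fake_fetch` stands in for the real endpoint and uses a plain integer offset as its checkpoint, which is not the real checkpoint format.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# A fetcher takes (checkpoint, batch_size) and returns the decoded JSON response.
Fetcher = Callable[[Optional[str], int], Dict[str, Any]]


def iter_results(fetch: Fetcher, batch_size: int = 1000) -> Iterator[Dict[str, Any]]:
    """Yield every result, following checkpoints until a batch comes back empty."""
    checkpoint: Optional[str] = None  # step 1: the first request has no checkpoint
    while True:
        page = fetch(checkpoint, batch_size)
        results = page.get("results", [])
        if not results:  # step 5: an empty array means no more data
            return
        yield from results  # step 2: hand the results to the caller
        checkpoint = page["checkpoint"]  # steps 3-4: reuse it on the next request


def fake_fetch(checkpoint: Optional[str], batch_size: int) -> Dict[str, Any]:
    """Stand-in for the real endpoint, serving five records from memory."""
    data = [{"requestId": f"req-{i}"} for i in range(5)]
    start = int(checkpoint) if checkpoint else 0
    chunk = data[start:start + batch_size]
    return {"batchSize": batch_size, "results": chunk, "checkpoint": str(start + len(chunk))}


print([r["requestId"] for r in iter_results(fake_fetch, batch_size=2)])
# ['req-0', 'req-1', 'req-2', 'req-3', 'req-4']
```

A real fetcher would issue the GET request with the `x-api-key` header and return the parsed JSON body.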

Example Pagination

# First request
curl --request GET \
--url 'https://api.prod.innerworks.me/api/v1/innerworks/inference/batch?batch_size=1000' \
--header 'x-api-key: YOUR_API_KEY'

# Response contains checkpoint: "eyJ2ZXJzaW9uIjoxLCJzdGF0ZSI6WzE3MDQxMDMyMDAwMDAsInJlcS0xMDAiXX0="

# Second request using checkpoint
curl --request GET \
--url 'https://api.prod.innerworks.me/api/v1/innerworks/inference/batch?checkpoint=eyJ2ZXJzaW9uIjoxLCJzdGF0ZSI6WzE3MDQxMDMyMDAwMDAsInJlcS0xMDAiXX0%3D&batch_size=1000' \
--header 'x-api-key: YOUR_API_KEY'
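
Note that the second request encodes the checkpoint's trailing `=` as `%3D`. A small Python sketch of the same step, using only the standard library, is shown below. The function name `checkpoint_request` is an assumption for illustration; it builds the request object without sending it.

```python
from urllib.parse import quote
from urllib.request import Request

# Base URL taken from the request examples in this document.
BASE_URL = "https://api.prod.innerworks.me/api/v1/innerworks/inference/batch"


def checkpoint_request(checkpoint: str, api_key: str, batch_size: int = 1000) -> Request:
    """Build (but do not send) the follow-up request for a given checkpoint."""
    # safe="" forces every reserved character, including "=", "+" and "/",
    # to be percent-encoded.
    encoded = quote(checkpoint, safe="")
    url = f"{BASE_URL}?checkpoint={encoded}&batch_size={batch_size}"
    return Request(url, headers={"x-api-key": api_key}, method="GET")


request = checkpoint_request(
    "eyJ2ZXJzaW9uIjoxLCJzdGF0ZSI6WzE3MDQxMDMyMDAwMDAsInJlcS0xMDAiXX0=",
    "YOUR_API_KEY",
)
print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call.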

Error Responses

Invalid Batch Size

{
  "cause": "batch_size",
  "error": "batch_size must be between 1 and 5000"
}

Invalid Start Time

{
  "cause": "start_time",
  "error": "start_time must be before the current time"
}

Invalid Checkpoint

{
  "cause": "checkpoint",
  "error": "invalid checkpoint string"
}

Missing API Key

{
  "statusCode": 401,
  "message": "Header x-api-key is required"
}
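
The samples above show two different error shapes: parameter errors carry `cause` and `error`, while the missing-key error carries `statusCode` and `message`. The following Python sketch handles both; the function name `describe_error` is hypothetical, and the fallback branch assumes other shapes may exist that this page does not list.

```python
from typing import Any, Dict


def describe_error(body: Dict[str, Any]) -> str:
    """Summarise either documented error shape in one line."""
    if "cause" in body:
        # Validation errors name the offending parameter in "cause".
        return f"invalid {body['cause']}: {body.get('error', 'unknown error')}"
    if "statusCode" in body:
        # Authentication errors carry an HTTP status code and a message.
        return f"HTTP {body['statusCode']}: {body.get('message', 'unknown error')}"
    # Assumption: anything else is an undocumented shape, so show it verbatim.
    return f"unrecognised error body: {body!r}"


print(describe_error({"cause": "checkpoint", "error": "invalid checkpoint string"}))
print(describe_error({"statusCode": 401, "message": "Header x-api-key is required"}))
```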

Best Practices

• Use appropriate batch sizes based on your needs. Larger batches reduce the number of API calls but increase response time.

• Always persist the checkpoint value for reliable pagination, especially when processing large datasets.

• When `checkpoint` is provided, the `start_time` parameter is ignored, because the checkpoint already encodes the position to resume from.

• An empty `results` array indicates that you have reached the end of the available data.

• The checkpoint mechanism ensures consistent pagination even if new data is being added concurrently.

• Results are returned in chronological order (oldest first).
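
One way to follow the advice about persisting checkpoints is to write the latest value to disk after each batch has been processed. The Python sketch below is an assumption about how a client might do this, not a documented feature; the function names and the file-based storage are illustrative choices.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def save_checkpoint(path: Path, checkpoint: str) -> None:
    """Write the checkpoint via a temp file so a crash never leaves a partial file."""
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), prefix=path.name, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(checkpoint)
    # os.replace swaps the file in a single step when both paths are on the same filesystem.
    os.replace(tmp_name, path)


def load_checkpoint(path: Path) -> Optional[str]:
    """Return the saved checkpoint, or None when no run has been recorded yet."""
    try:
        return path.read_text(encoding="utf-8").strip() or None
    except FileNotFoundError:
        return None
```

On restart, a job would call `load_checkpoint` first and send the stored value as the `checkpoint` parameter, falling back to `start_time` only when nothing has been saved.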

Use Cases

Analytics and Reporting: Bulk retrieve historical data for analysis and visualization.

Data Warehousing: Export detection results to your own data warehouse or analytics platform.

Audit and Compliance: Retrieve complete audit trails of detection events.

Machine Learning: Extract training data for custom ML models.

Backup and Archival: Create backups of your detection data.