Installation
Install the SpinDynamics SDK for your language of choice. All SDKs are published to their respective package registries and receive weekly releases aligned with platform updates.
Python
Node.js
The SDKs require Python 3.9+ or Node.js 18+. All SDKs are generated from our OpenAPI spec and ship with full type annotations.
Authentication
All API requests require a valid API key. You can generate keys from the SpinDynamics dashboard under Settings → API Keys. Keys are scoped to your organization and support granular permission controls.
API keys follow the format sd_live_* for production and sd_test_* for sandbox environments. Test keys route to a simulated inference backend and are free to use during development.
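As a minimal sketch, the prefix convention can be checked locally before a key is sent anywhere. The helper names `key_environment` and `load_api_key` are hypothetical and not part of the SDK; the `SPINDYNAMICS_API_KEY` variable name is the one the Cortex client is documented to read.

```python
import os


def key_environment(api_key: str) -> str:
    """Classify a key by its documented prefix (hypothetical helper, not an SDK call)."""
    if api_key.startswith("sd_live_"):
        return "production"
    if api_key.startswith("sd_test_"):
        return "sandbox"
    raise ValueError("API key must start with sd_live_ or sd_test_")


def load_api_key() -> str:
    """Read the key from the environment and fail early if the prefix is wrong."""
    key = os.environ.get("SPINDYNAMICS_API_KEY", "")
    key_environment(key)  # raises ValueError on a missing or malformed key
    return key
```

A check like this can stop a production key from being used in a test suite by accident: assert that `key_environment(...)` returns `"sandbox"` before running anything.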
Quick Start
Deploy a model with adaptive routing and run your first inference request in under 30 seconds. This example uses the Python SDK, but the same flow applies to all languages.
The strategy="adaptive" parameter tells the RL router to continuously optimize placement across the specified regions. The constraints object enforces hard limits: in this case, a 50 ms maximum latency and EU GDPR data residency.
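The flow described here can be sketched as follows, using only the `deploy` and `infer` parameters listed in the Python SDK reference. This is an illustration under stated assumptions, not the official sample: the function takes an already constructed client, the region list and `max_tokens` value are placeholders, and `autoscale` and `stream` are omitted on the assumption that they are optional.

```python
from typing import Any


def quick_start(client: Any, model: str, prompt: str) -> str:
    """Deploy a model with adaptive routing, then run one inference request.

    `client` is expected to behave like a Cortex instance as described in
    the SDK reference; nothing here is verified against the real SDK.
    """
    deployment = client.deploy(
        model=model,
        strategy="adaptive",
        regions=["eu-west-1", "edge-eu-*"],  # placeholder region selection
        constraints={
            "max_latency_ms": 50,          # hard latency limit
            "data_residency": "eu-gdpr",   # compliance profile
        },
    )
    response = client.infer(
        deployment_id=deployment.id,
        prompt=prompt,
        max_tokens=64,  # placeholder value
    )
    return response.text
```

With the real SDK, the client would presumably be created with something like `Cortex()` after importing it from the `spindynamics` package; that import path is an assumption.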
Deployments
A Deployment represents a model that has been registered, compiled, and distributed across your infrastructure fleet. Deployments have a lifecycle: they are created, become active, can be updated, and eventually deleted.
Creating a deployment
Use client.deploy() to create a new deployment. The platform handles model compilation, quantization, and distribution to the target regions automatically.
Managing deployments
Deployments transition through the following states: provisioning → compiling → distributing → active. The full lifecycle typically completes in under 90 seconds for supported model architectures.
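Because a deployment is not usable until it reaches active, callers may want to wait for that state. A minimal polling sketch follows, assuming that `client.deployments.get(id)` returns an object whose `.status` reflects the current state. The helper name, timeout, and poll interval are illustrative choices, not SDK defaults.

```python
import time
from typing import Any


def wait_until_active(
    client: Any,
    deployment_id: str,
    timeout_s: float = 120.0,
    poll_interval_s: float = 2.0,
) -> str:
    """Poll the deployment status until it is 'active' or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = client.deployments.get(deployment_id).status
        if status == "active":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"deployment {deployment_id} still '{status}' after {timeout_s}s"
            )
        time.sleep(poll_interval_s)
```

The default timeout is set slightly above the typical 90-second lifecycle so that a normal rollout does not trigger a false alarm.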
Routing Policies
Routing policies define how the RL engine makes placement decisions. Each policy specifies optimization objectives, relative weights, and hard constraints. The RL agent learns an optimal policy within these bounds.
Available objectives include minimize_latency, minimize_cost, maximize_throughput, and maximize_availability. Weights are normalized and must sum to 1.0. The RL agent typically converges within 200 requests of a policy change.
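Since weights must sum to 1.0, it can help to normalize and validate them client-side before creating a policy. The helpers below are a hypothetical sketch and are not SDK functions; only the objective names are taken from the text above.

```python
import math

ALLOWED_OBJECTIVES = {
    "minimize_latency",
    "minimize_cost",
    "maximize_throughput",
    "maximize_availability",
}


def normalize_weights(raw: dict[str, float]) -> dict[str, float]:
    """Scale arbitrary positive weights so that they sum to 1.0."""
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {name: value / total for name, value in raw.items()}


def validate_weights(weights: dict[str, float]) -> None:
    """Raise ValueError for unknown objectives or weights that do not sum to 1.0."""
    unknown = set(weights) - ALLOWED_OBJECTIVES
    if unknown:
        raise ValueError(f"unknown objectives: {sorted(unknown)}")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1.0")
```

For example, relative priorities of 3 for latency and 1 for cost normalize to 0.75 and 0.25.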
Regions & Constraints
SpinDynamics supports 200+ regions across edge PoPs, cloud providers, and on-prem clusters. Region identifiers follow the pattern provider-geography-zone and support glob matching for flexible targeting.
- us-east-1, eu-west-1, ap-southeast-1: standard cloud regions
- edge-na-*, edge-eu-*: edge point-of-presence groups
- onprem-*: on-premises clusters registered via the SpinDynamics agent
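To preview which regions a set of glob patterns would select, the matching can be approximated locally. This sketch assumes shell-style glob semantics as implemented by Python's `fnmatch` module; the platform's exact matching rules are not specified here, and the region names in the example list are made up for illustration.

```python
from fnmatch import fnmatchcase


def match_regions(patterns: list[str], regions: list[str]) -> list[str]:
    """Return the regions that match at least one glob pattern, preserving order."""
    return [
        region
        for region in regions
        if any(fnmatchcase(region, pattern) for pattern in patterns)
    ]


# Hypothetical region identifiers, for illustration only.
known_regions = ["us-east-1", "eu-west-1", "edge-na-sea", "edge-eu-fra", "onprem-dc1"]
selected = match_regions(["edge-eu-*", "onprem-*"], known_regions)
```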
Constraints are hard limits enforced at the routing layer before the RL agent evaluates placement options. Available constraint types:
| Constraint | Type | Description |
|---|---|---|
| max_latency_ms | int | Maximum acceptable p99 latency in milliseconds |
| data_residency | string | Compliance profile: eu-gdpr, us-hipaa, ca-pipeda, etc. |
| excluded_regions | string[] | Glob patterns for regions to exclude from routing |
| preferred_providers | string[] | Prefer specific cloud providers: aws, gcp, azure |
| encryption | string | Encryption standard: aes-256-gcm, fips-140-3 |
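The table above can be turned into a lightweight local check that catches misspelled keys or wrong value types before a request is made. The helper below is a hypothetical sketch, not an SDK function. It checks names and types only, and does not verify that values such as compliance profiles are ones the platform accepts.

```python
# Constraint names and Python types derived from the table above.
EXPECTED_TYPES = {
    "max_latency_ms": int,
    "data_residency": str,
    "excluded_regions": list,
    "preferred_providers": list,
    "encryption": str,
}


def constraint_problems(constraints: dict) -> list[str]:
    """Return human-readable problems; an empty list means the dict looks valid."""
    problems = []
    for name, value in constraints.items():
        expected = EXPECTED_TYPES.get(name)
        if expected is None:
            problems.append(f"unknown constraint: {name}")
        elif isinstance(value, bool) or not isinstance(value, expected):
            # bool is rejected explicitly because it is a subclass of int.
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
        elif expected is list and not all(isinstance(item, str) for item in value):
            problems.append(f"{name}: every entry must be a string")
    return problems
```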
Python SDK Reference
The Python SDK (spindynamics) is the primary interface for interacting with the SpinDynamics platform. It is fully typed, async-native with sync wrappers, and supports streaming responses out of the box.
Cortex
The main client class. All platform operations are accessed through a Cortex instance.
| Member | Kind | Description |
|---|---|---|
| Cortex(api_key=None) | class | Initialize the client. Reads SPINDYNAMICS_API_KEY from environment if api_key is not provided. |
| .deploy(model, strategy, regions, constraints, autoscale) | method | Create a new deployment. Returns a Deployment object. |
| .infer(deployment_id, prompt, max_tokens, stream) | method | Run inference against a deployment. Returns InferenceResponse. |
| .deployments | property | Access the deployments manager: .list(), .get(id), .update(id), .delete(id). |
| .routing | property | Access the routing manager: .create_policy(), .list_policies(), .get_policy(id). |
| .telemetry | property | Access telemetry data: .query(), .export(), .alerts(). |
Deployment
Represents a live model deployment on the SpinDynamics platform.
| Member | Kind | Description |
|---|---|---|
| .id | property | Unique deployment identifier (e.g. dep_3kx9f2a...). |
| .model | property | Model name string. |
| .status | property | Current lifecycle state: provisioning, active, draining, deleted. |
| .regions | property | List of active regions where the model is deployed. |
| .replicas | property | Current replica count across all regions. |
| .metrics | property | Live metrics: .p50, .p99, .rps, .cost_per_1k. |
InferenceResponse
| Member | Kind | Description |
|---|---|---|
| .text | property | The generated text output. |
| .region | property | Region where the inference was executed. |
| .latency_ms | property | End-to-end latency in milliseconds. |
| .tokens_used | property | Total tokens consumed (prompt + completion). |
| .trace_id | property | Distributed trace ID for observability. |
RoutingPolicy
| Member | Kind | Description |
|---|---|---|
| .id | property | Unique policy identifier. |
| .objectives | property | List of optimization objectives. |
| .weights | property | Objective weights (normalized to sum to 1.0). |
| .constraints | property | Hard constraints dictionary. |
| .convergence_status | property | RL convergence state: exploring, converging, converged. |
Node.js SDK
The Node.js SDK (@spindynamics/sdk) provides a TypeScript-first interface with full type definitions. It uses native fetch under the hood and supports both ESM and CommonJS.
The Node.js SDK mirrors the Python SDK API surface. All methods return typed promises and support both callback and async/await patterns. Streaming responses use ReadableStream natively.
REST API
All platform capabilities are available via the REST API at https://api.spindynamics.net/v1. Authenticate by passing your API key in the Authorization header as a Bearer token.
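As a sketch of the authentication scheme using only the Python standard library, the following builds an authenticated request without sending it. The `/telemetry` path is the one mentioned in the Monitoring section. The helper name and the example key are placeholders.

```python
import urllib.request

BASE_URL = "https://api.spindynamics.net/v1"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request carrying the API key as a Bearer token (not sent here)."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


req = build_request("/telemetry", "sd_test_example")
# Passing `req` to urllib.request.urlopen would perform the actual call.
```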
Endpoints
Example: cURL
The REST API enforces rate limits per API key: 1,000 requests/second for inference, 100 requests/second for management endpoints. Contact us for higher limits on Enterprise plans.
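Clients that issue many management calls may want to throttle themselves to stay under these limits. One common approach is a token bucket, sketched below. This is a client-side illustration only; how the server enforces its limits, and how it signals that a limit was exceeded, is not described here.

```python
import time


class TokenBucket:
    """Client-side limiter: allows `rate_per_s` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate_per_s: float, capacity: float, clock=time.monotonic):
        self.rate = rate_per_s
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self._clock()
        # Refill in proportion to the elapsed time, capped at the bucket capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


# Sized to match the documented limit for management endpoints.
management_limiter = TokenBucket(rate_per_s=100, capacity=100)
```

Call `try_acquire()` before each request, and sleep briefly or queue the work when it returns `False`.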
Custom Routing
Beyond the default adaptive strategy, SpinDynamics supports fully custom routing configurations. You can define multi-objective policies, set exploration rates for the RL agent, and create region-pinned deployments for strict compliance scenarios.
The exploration_rate parameter controls how often the RL agent tries non-optimal routes to discover better placements. Set it to 0 for pure exploitation (production-safe), or as high as 0.1 for aggressive exploration during testing.
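To build intuition for what an exploration rate means, here is an epsilon-greedy sketch: with probability `exploration_rate` a non-best region is chosen at random, otherwise the best-known region is used. This is a textbook stand-in rather than the platform's actual RL algorithm, which is not documented here. The function name and latency figures are illustrative.

```python
import random


def choose_region(
    latency_ms: dict[str, float],
    exploration_rate: float,
    rng: random.Random,
) -> str:
    """Epsilon-greedy choice: mostly the lowest-latency region, occasionally another one."""
    if not 0.0 <= exploration_rate <= 0.1:
        raise ValueError("exploration_rate must be between 0 and 0.1")
    best = min(latency_ms, key=latency_ms.get)
    others = [region for region in latency_ms if region != best]
    if others and rng.random() < exploration_rate:
        return rng.choice(others)  # explore a non-optimal route
    return best  # exploit the best-known route


# Illustrative latency observations, not real measurements.
observed = {"us-east-1": 42.0, "eu-west-1": 31.0, "ap-southeast-1": 88.0}
```

With `exploration_rate=0`, the function always returns the best-known region, which matches the pure-exploitation setting described above.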
Auto-scaling
SpinDynamics provides predictive auto-scaling powered by time-series forecasting. The system provisions capacity 2-5 minutes ahead of demand spikes, eliminating cold-start cascades and over-provisioning.
When spot_fallback is enabled, the scheduler automatically falls back to spot instances during demand spikes, reducing compute costs by up to 70% without impacting latency SLAs.
Monitoring
SpinDynamics exposes full-stack observability through the telemetry API. Query latency distributions, cost attribution, model drift scores, and routing decisions programmatically.
SpinDynamics natively integrates with Datadog, Prometheus, Grafana, and OpenTelemetry. All metrics are also available through the REST API at /v1/telemetry for custom dashboards and automation workflows.