Streaming

Amazon Kinesis

Real-time data streaming and analytics

Kinesis Data Streams (KDS)

Mental model

  • Durable ordered log with partitioning (shards) for custom real-time consumers.
  • Use it when you need multiple consumers, sub-second-ish pipelines, or you want to build your own stream processing (Flink, custom apps, Lambda consumers).

Capacity knobs (the ones that bite)

  • Shard count = capacity.

    • Per shard: writes up to ~1 MB/s or 1,000 records/s; reads up to ~2 MB/s (shared across consumers unless using enhanced fan-out). ([AWS Documentation][1])
  • Partition key strategy: the key is hashed to pick a shard, so low-cardinality or skewed keys ⇒ hot shards that throttle while the rest sit idle.

  • Retention: default 24h; extendable up to 365 days, at extra cost. ([Amazon Web Services, Inc.][2])

  • Enhanced fan-out (EFO): dedicated 2 MB/s per shard per consumer; costs extra per consumer-shard-hour (+ per GB retrieved). ([AWS Documentation][3])

  • Consumers: shared throughput vs EFO. Shared readers split each shard's 2 MB/s, so choose based on the number of readers and the latency SLA.
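The capacity knobs above can be turned into a rough sizing check. The following is a minimal sketch, not an official sizing tool: the function names are hypothetical, the limits are the approximate per-shard figures quoted above (plus the 1,000 records/s write limit), and `shard_index` assumes the stream's hash-key ranges are split evenly across shards, which may not hold after resharding. Kinesis hashes partition keys with MD5 into a 128-bit key space, which is what the second half imitates to show how a single key lands on a single shard.

```python
import hashlib
import math
from collections import Counter

WRITE_MB_PER_SHARD = 1.0         # ~1 MB/s ingest per shard
WRITE_RECORDS_PER_SHARD = 1000   # 1,000 records/s ingest per shard
READ_MB_PER_SHARD = 2.0          # ~2 MB/s egress per shard, shared by polling consumers


def shards_needed(write_mb_s, records_s, shared_consumers):
    """Smallest shard count that satisfies the write and shared-read limits."""
    by_write_bytes = write_mb_s / WRITE_MB_PER_SHARD
    by_write_count = records_s / WRITE_RECORDS_PER_SHARD
    # Each shared consumer reads the whole stream out of the same 2 MB/s budget.
    by_shared_read = write_mb_s * shared_consumers / READ_MB_PER_SHARD
    return max(1, math.ceil(max(by_write_bytes, by_write_count, by_shared_read)))


def shard_index(partition_key, shard_count):
    """Map a key to a shard, assuming evenly split hash-key ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")   # 128-bit unsigned integer
    return (hash_key * shard_count) >> 128     # scale into [0, shard_count)


def spread(keys, shard_count):
    """Count records per shard for a sequence of partition keys."""
    return Counter(shard_index(k, shard_count) for k in keys)


if __name__ == "__main__":
    print(shards_needed(write_mb_s=3.0, records_s=2000, shared_consumers=4))
    print(spread(["tenant-1"] * 1000, 4))                 # one key: one hot shard
    print(spread((f"user-{i}" for i in range(10000)), 4)) # many keys: even-ish
```

With four shared consumers the read side, not the write side, drives the shard count; moving those readers to EFO would drop the third term from the calculation.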

Pricing mental model

  • Provisioned mode bills primarily:

    • Shard-hours (always-on capacity)
    • PUT payload units (producer side; each record is billed in 25 KB units, rounded up)
    • Optional: extended retention, EFO consumer-shard-hours + data retrieved ([Amazon Web Services, Inc.][2])
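  • Back-of-envelope: monthly KDS cost ≈ (shards × 24×30 h × shard-hour rate) + (PUT payload units × per-unit rate) + (EFO consumers × shards × 24×30 h × consumer-shard-hour rate) + (EFO GB retrieved × per-GB rate).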
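As a worked version of the back-of-envelope estimate, here is a small calculator sketch. The function names and the `rates` dictionary keys are hypothetical, and the values in `EXAMPLE_RATES` are placeholders, not AWS prices; substitute the current figures for your region from the pricing page. It also assumes that "25 KB" means 25 × 1024 bytes and that PUT payload units are priced per million.

```python
import math

HOURS_PER_MONTH = 24 * 30     # 24 x 30, as in the estimate above
PUT_UNIT_BYTES = 25 * 1024    # assumption: "25 KB" means 25 x 1024 bytes


def put_payload_units(record_bytes):
    """Billing units for one record: size rounded up to whole 25 KB units."""
    return max(1, math.ceil(record_bytes / PUT_UNIT_BYTES))


def monthly_cost(shards, records_per_s, record_bytes, rates,
                 efo_consumers=0, efo_gb_retrieved=0.0):
    """Break one month of provisioned-mode cost into its main line items."""
    seconds = HOURS_PER_MONTH * 3600
    units = records_per_s * seconds * put_payload_units(record_bytes)
    cost = {
        "shard_hours": shards * HOURS_PER_MONTH * rates["shard_hour"],
        "put_units": units / 1_000_000 * rates["million_put_units"],
        "efo": (efo_consumers * shards * HOURS_PER_MONTH
                * rates["efo_consumer_shard_hour"]
                + efo_gb_retrieved * rates["efo_gb"]),
    }
    cost["total"] = sum(cost.values())
    return cost


# Placeholder rates for illustration only; these are NOT real AWS prices.
EXAMPLE_RATES = {
    "shard_hour": 0.02,
    "million_put_units": 0.02,
    "efo_consumer_shard_hour": 0.02,
    "efo_gb": 0.02,
}

if __name__ == "__main__":
    print(monthly_cost(4, 500, 2_000, EXAMPLE_RATES,
                       efo_consumers=2, efo_gb_retrieved=100))
```

Because every record is rounded up to a whole unit, many tiny records cost the same in PUT units as records close to 25 KB, which is one reason producers often batch small events into a single record.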

Agentic/GenAI usage patterns

  • Streaming agent telemetry (tool calls, traces), clickstream, prompt/response events for near-real-time analytics, online feature/event pipelines.
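For the agent telemetry pattern, one possible shape for a producer-side record is sketched below. The event fields (`event_id`, `ts_ms`, `session_id`, `type`, `payload`) and the helper name are illustrative assumptions, not a schema from these notes. The returned dictionary uses the parameter names of the Kinesis `PutRecord` API (`StreamName`, `Data`, `PartitionKey`). Using the session ID as the partition key is a design choice that keeps one session's events on one shard and therefore in order.

```python
import json
import time
import uuid


def telemetry_record(stream_name, session_id, event_type, payload):
    """Build PutRecord arguments for one agent event (hypothetical schema)."""
    event = {
        "event_id": str(uuid.uuid4()),     # lets consumers de-duplicate retries
        "ts_ms": int(time.time() * 1000),  # producer-side timestamp
        "session_id": session_id,
        "type": event_type,                # e.g. "tool_call", "prompt", "response"
        "payload": payload,
    }
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event, separators=(",", ":")).encode("utf-8"),
        # Same session -> same shard -> events stay ordered within the session.
        "PartitionKey": session_id,
    }


if __name__ == "__main__":
    record = telemetry_record("events", "sess-42", "tool_call",
                              {"tool": "search", "latency_ms": 120})
    print(record["PartitionKey"], len(record["Data"]), "bytes")
```

With boto3 installed, the result could be sent with `boto3.client("kinesis").put_record(**record)`. A single very chatty session can still overload its shard, so a higher-cardinality key may be preferable when per-session ordering is not required.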

Terraform (Kinesis stream)

resource "aws_kinesis_stream" "events" {
  name             = var.name
  shard_count      = var.shards
  retention_period = 24 # hours (increase if you need replay)

  stream_mode_details {
    stream_mode = "PROVISIONED"
    # or "ON_DEMAND"; in that mode omit shard_count (needs a provider version that supports stream_mode_details)
  }

  encryption_type = "KMS"
  kms_key_id      = var.kms_key_arn

  tags = var.tags
}

variable "name" {
  type = string
}

variable "shards" {
  type    = number
  default = 2
}

variable "kms_key_arn" {
  type = string
}

variable "tags" {
  type    = map(string)
  default = {}
}