Metrics Overview

Learn about the metrics system in Scrapest, what data is collected, and how to interpret performance measurements.

Metrics Architecture

Data Collection

Scrapest collects metrics at multiple points in the data processing pipeline:
  • Source Metrics: Time to fetch data from external sources
  • Processing Metrics: Time to process and enrich data
  • Delivery Metrics: Time to deliver data to clients
  • System Metrics: Platform performance and resource usage

Storage and Retention

  • Real-time Storage: Redis for current and recent metrics
  • Time Window: 24-hour rolling window for detailed metrics
  • Aggregation: Percentile calculations for performance analysis
  • Cleanup: Automatic cleanup of metrics older than 24 hours
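The rolling-window model above can be sketched in plain Python. This is an illustration only: the production system stores metrics in Redis, and the `RollingWindow` class and `record` method here are invented for this example.

```python
import time

class RollingWindow:
    """Keep (timestamp, value) samples and drop any older than the window."""

    def __init__(self, window_seconds=24 * 3600):
        self.window_seconds = window_seconds
        self.samples = []  # list of (timestamp, latency_ms)

    def record(self, latency_ms, now=None):
        now = time.time() if now is None else now
        self.samples.append((now, latency_ms))
        self._cleanup(now)

    def _cleanup(self, now):
        # Automatic cleanup: discard samples older than the window.
        cutoff = now - self.window_seconds
        self.samples = [(t, v) for t, v in self.samples if t >= cutoff]

    def values(self):
        return [v for _, v in self.samples]
```

In Redis the same idea is typically done with a sorted set keyed by timestamp, pruned on write; the in-memory list here just makes the 24-hour cutoff explicit.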

Available Metrics

Latency Metrics

Source Latency

  • Definition: Time from request to receiving data from external source
  • Measurement: End-to-end time including network and API processing
  • Units: Milliseconds (ms)
  • Sources: Twitter/X API, other social media platforms

Internal Latency

  • Definition: Time for internal processing and enrichment
  • Measurement: Data processing, filtering, and transformation time
  • Units: Milliseconds (ms)
  • Components: Parsing, validation, enrichment, formatting

Performance Metrics

Request Volume

  • Definition: Number of requests processed per time period
  • Measurement: Count of successful requests
  • Granularity: Per minute, per hour, per day
  • Breakdown: By endpoint, by API key, by data source

Success/Error Rates

  • Definition: Percentage of successful vs failed operations
  • Measurement: Successful requests divided by total requests
  • Error Types: HTTP errors, validation errors, timeouts
  • Granularity: Real-time monitoring with historical trends

Metric Percentiles

Understanding Percentiles

Percentiles help you understand how performance values are distributed across all requests:
  • P50 (Median): 50% of requests complete faster than this value
  • P95: 95% of requests complete faster than this value
  • P99: 99% of requests complete faster than this value

Why Percentiles Matter

  • P50: Represents typical user experience
  • P95: Shows performance for most users (excluding outliers)
  • P99: Indicates worst-case performance for nearly all users

Example Interpretation

Source Latency:
- P50: 150ms (typical request takes 150ms)
- P95: 450ms (95% of requests complete within 450ms)
- P99: 1200ms (99% of requests complete within 1.2s)
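The interpretation above can be reproduced from raw samples. The sketch below uses the nearest-rank method, which is one common way to compute percentiles; the actual aggregation Scrapest uses may differ, and the sample latencies are made up.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Hypothetical source-latency samples in milliseconds.
latencies = [120, 130, 150, 160, 180, 400, 450, 500, 900, 1200]
p50 = percentile(latencies, 50)
p95 = percentile(latencies, 95)
p99 = percentile(latencies, 99)
```

Note how a single 1200 ms outlier barely moves P50 but dominates P99, which is exactly why tail percentiles are reported separately.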

Data Sources

Twitter/X Metrics

  • API Response Times: Twitter/X API performance
  • Rate Limit Monitoring: Remaining API quota and reset times
  • Data Quality: Completeness and accuracy metrics

Platform Metrics

  • WebSocket Performance: Connection establishment and message delivery
  • SSE Performance: Stream initiation and data delivery
  • Database Performance: Query execution and connection times

Metric Access Methods

HTTP API

GET /metrics
Authorization: Bearer YOUR_API_KEY

Source-Specific Metrics

GET /metrics/{source}
Authorization: Bearer YOUR_API_KEY
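Both endpoints can be called with nothing beyond the standard library. In this sketch the base URL `https://api.example.com` and the source name `twitter` are placeholders, not documented values; substitute your actual endpoint and key.

```python
import json
import urllib.request

def metrics_request(base_url, api_key, source=None):
    """Build an authenticated request for GET /metrics or GET /metrics/{source}."""
    path = "/metrics" if source is None else f"/metrics/{source}"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = metrics_request("https://api.example.com", "YOUR_API_KEY", source="twitter")
# To actually fetch:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     metrics = json.load(resp)
```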

Response Format

{
  "window_hours": 24,
  "count": {
    "source": 15420,
    "internal": 15420
  },
  "source_latency_ms": {
    "p50": 150,
    "p95": 450,
    "p99": 1200
  },
  "internal_latency_ms": {
    "p50": 25,
    "p95": 75,
    "p99": 150
  }
}

Performance Benchmarks

Target Performance

  • Source Latency P50: < 200ms
  • Source Latency P95: < 500ms
  • Source Latency P99: < 1500ms
  • Internal Latency P50: < 50ms
  • Internal Latency P95: < 100ms
  • Internal Latency P99: < 200ms
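A metrics response in the format shown earlier can be checked mechanically against these ceilings. This is a sketch, not a shipped tool; `check_targets` is a made-up helper, and it treats each target as a strict upper bound (a value equal to the ceiling counts as a miss).

```python
# Target ceilings from the list above (milliseconds).
TARGETS = {
    "source_latency_ms": {"p50": 200, "p95": 500, "p99": 1500},
    "internal_latency_ms": {"p50": 50, "p95": 100, "p99": 200},
}

def check_targets(metrics, targets=TARGETS):
    """Return (metric, percentile) pairs whose value meets or exceeds the ceiling."""
    violations = []
    for metric, ceilings in targets.items():
        for pct, ceiling in ceilings.items():
            if metrics.get(metric, {}).get(pct, 0) >= ceiling:
                violations.append((metric, pct))
    return violations
```

Run against the current-performance numbers below, this returns an empty list, since every percentile is under its target.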

Current Performance

  • Source Latency P50: 150ms ✅
  • Source Latency P95: 450ms ✅
  • Source Latency P99: 1200ms ✅
  • Internal Latency P50: 25ms ✅
  • Internal Latency P95: 75ms ✅
  • Internal Latency P99: 150ms ✅

Using Metrics for Optimization

Identifying Bottlenecks

  1. High Source Latency: External API performance issues
  2. High Internal Latency: Processing pipeline optimization needed
  3. High P99 Latency: Outlier investigation required

Performance Tuning

  1. Caching: Reduce source API calls for repeated requests
  2. Connection Pooling: Optimize database and external API connections
  3. Load Balancing: Distribute load across multiple instances

Capacity Planning

  1. Volume Trends: Monitor request growth patterns
  2. Resource Utilization: Track CPU, memory, and network usage
  3. Scaling Decisions: Use metrics to inform scaling triggers

Alerting and Monitoring

Alert Thresholds

  • Source Latency P95: Alert if > 1000ms
  • Internal Latency P95: Alert if > 200ms
  • Error Rate: Alert if > 5%
  • Request Volume: Alert if unusual spikes or drops
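The first three thresholds above are simple comparisons and can be sketched directly (volume anomaly detection needs a baseline and is omitted here). The `alerts` function is illustrative, not part of the platform.

```python
def alerts(source_p95_ms, internal_p95_ms, error_rate):
    """Evaluate the fixed alert thresholds; error_rate is a fraction (0.05 == 5%)."""
    fired = []
    if source_p95_ms > 1000:
        fired.append("source latency p95 above 1000 ms")
    if internal_p95_ms > 200:
        fired.append("internal latency p95 above 200 ms")
    if error_rate > 0.05:
        fired.append("error rate above 5%")
    return fired
```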

Monitoring Dashboards

  • Real-time Metrics: Live performance monitoring
  • Historical Trends: 24-hour performance patterns
  • Service Health: Component-level health status
  • Resource Usage: Infrastructure utilization metrics

Best Practices

For Developers

  • Monitor Your Usage: Track your API usage patterns
  • Handle Latency: Implement appropriate timeout handling
  • Error Recovery: Build robust error handling and retry logic
  • Rate Limiting: Respect API rate limits and quotas
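The timeout-handling and retry advice above usually combines into one wrapper: bounded retries with exponential backoff plus jitter. A minimal sketch (the `with_retries` helper and its defaults are assumptions, not a Scrapest API; in practice you would retry only on transient errors such as timeouts and 5xx responses):

```python
import random
import time

def with_retries(fetch, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fetch(), retrying transient failures with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the last error
            # Backoff doubles each attempt (0.5s, 1s, 2s), +/- up to 25% jitter
            # so that many clients do not retry in lockstep.
            delay = base_delay * (2 ** attempt)
            sleep(delay * (1 + random.uniform(-0.25, 0.25)))
```

The injectable `sleep` parameter keeps the helper testable without real delays.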

For Operations

  • Set Up Alerts: Configure appropriate alert thresholds
  • Monitor Trends: Watch for performance degradation over time
  • Plan Capacity: Use metrics for capacity planning
  • Document Incidents: Track and analyze performance incidents