Metrics Overview
Learn about the metrics system in Scrapest, what data is collected, and how to interpret performance measurements.

Metrics Architecture
Data Collection
Scrapest collects metrics at multiple points in the data processing pipeline:

- Source Metrics: Time to fetch data from external sources
- Processing Metrics: Time to process and enrich data
- Delivery Metrics: Time to deliver data to clients
- System Metrics: Platform performance and resource usage
Storage and Retention
- Real-time Storage: Redis for current and recent metrics
- Time Window: 24-hour rolling window for detailed metrics
- Aggregation: Percentile calculations for performance analysis
- Cleanup: Automatic cleanup of metrics older than 24 hours
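The retention model above can be sketched as follows. This is a minimal illustration using an in-memory list of samples rather than Redis; the function name and sample shape are assumptions, not the service's actual storage code.

```python
import time

# 24-hour rolling retention window for detailed metrics.
WINDOW_SECONDS = 24 * 60 * 60

def record(samples, value, now=None):
    """Append a (timestamp, value) sample and prune anything older
    than the 24-hour window, mimicking the automatic cleanup step."""
    now = time.time() if now is None else now
    samples.append((now, value))
    cutoff = now - WINDOW_SECONDS
    # Drop samples that have aged out of the rolling window.
    samples[:] = [(t, v) for t, v in samples if t >= cutoff]
    return samples
```

In production this pruning would typically be done inside Redis itself (for example, by expiring or range-deleting old entries) rather than in application code.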
Available Metrics
Latency Metrics
Source Latency
- Definition: Time from request to receiving data from external source
- Measurement: End-to-end time including network and API processing
- Units: Milliseconds (ms)
- Sources: Twitter/X API, other social media platforms
Internal Latency
- Definition: Time for internal processing and enrichment
- Measurement: Data processing, filtering, and transformation time
- Units: Milliseconds (ms)
- Components: Parsing, validation, enrichment, formatting
Performance Metrics
Request Volume
- Definition: Number of requests processed per time period
- Measurement: Count of successful requests
- Granularity: Per minute, per hour, per day
- Breakdown: By endpoint, by API key, by data source
Success/Error Rates
- Definition: Percentage of successful vs failed operations
- Measurement: Success count / Total requests
- Error Types: HTTP errors, validation errors, timeouts
- Granularity: Real-time monitoring with historical trends
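The success-rate calculation above (success count divided by total requests) can be illustrated like this; the outcome labels are hypothetical examples of the error types listed.

```python
from collections import Counter

def error_rate(outcomes):
    """Return (success_rate, error_counts) for a list of outcome labels,
    e.g. "success", "http_error", "validation_error", "timeout"."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    successes = counts.get("success", 0)
    # Break failures down by error type for the granular view.
    errors = {k: v for k, v in counts.items() if k != "success"}
    return (successes / total if total else 0.0), errors
```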
Metric Percentiles
Understanding Percentiles
Percentiles help you understand the distribution of performance data:

- P50 (Median): 50% of requests complete faster than this value
- P95: 95% of requests complete faster than this value
- P99: 99% of requests complete faster than this value
Why Percentiles Matter
- P50: Represents typical user experience
- P95: Shows performance for most users (excluding outliers)
- P99: Indicates worst-case performance for nearly all users
Example Interpretation
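As a worked example, here is a simple nearest-rank percentile over synthetic latency samples. The data is illustrative, and the production aggregation (computed in Redis) may use a different interpolation method.

```python
def percentile(values, p):
    """Nearest-rank percentile: the value below which roughly p% of
    samples fall."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

# 100 hypothetical latency samples, 1..100 ms.
latencies = list(range(1, 101))
p50 = percentile(latencies, 50)  # 50 ms: the typical request
p95 = percentile(latencies, 95)  # 95 ms: all but the slowest 5%
p99 = percentile(latencies, 99)  # 99 ms: near worst case
```

Reading these numbers: half of all requests finish within the P50 value, while the gap between P95 and P99 shows how much slower the outliers are than the typical request.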
Data Sources
Twitter/X Metrics
- API Response Times: Twitter/X API performance
- Rate Limit Monitoring: Remaining API quota and reset times
- Data Quality: Completeness and accuracy metrics
Platform Metrics
- WebSocket Performance: Connection establishment and message delivery
- SSE Performance: Stream initiation and data delivery
- Database Performance: Query execution and connection times
Metric Access Methods
HTTP API
Source-Specific Metrics
Response Format
Performance Benchmarks
Target Performance
- Source Latency P50: < 200ms
- Source Latency P95: < 500ms
- Source Latency P99: < 1500ms
- Internal Latency P50: < 50ms
- Internal Latency P95: < 100ms
- Internal Latency P99: < 200ms
Current Performance
- Source Latency P50: 150ms ✅
- Source Latency P95: 450ms ✅
- Source Latency P99: 1200ms ✅
- Internal Latency P50: 25ms ✅
- Internal Latency P95: 75ms ✅
- Internal Latency P99: 150ms ✅
Using Metrics for Optimization
Identifying Bottlenecks
- High Source Latency: External API performance issues
- High Internal Latency: Processing pipeline optimization needed
- High P99 Latency: Outlier investigation required
Performance Tuning
- Caching: Reduce source API calls for repeated requests
- Connection Pooling: Optimize database and external API connections
- Load Balancing: Distribute load across multiple instances
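The caching point above can be sketched as a small TTL cache: repeated requests within the TTL reuse the cached value instead of calling the source API again. The class, TTL value, and `fetch_fn` are illustrative assumptions.

```python
import time

class TTLCache:
    """Minimal time-to-live cache to reduce repeated source API calls."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_timestamp, value)

    def get_or_fetch(self, key, fetch_fn, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(key)
        if entry and entry[0] > now:
            return entry[1]  # cache hit: skip the source API call
        value = fetch_fn(key)
        self._store[key] = (now + self.ttl, value)
        return value
```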
Capacity Planning
- Volume Trends: Monitor request growth patterns
- Resource Utilization: Track CPU, memory, and network usage
- Scaling Decisions: Use metrics to inform scaling triggers
Alerting and Monitoring
Alert Thresholds
- Source Latency P95: Alert if > 1000ms
- Internal Latency P95: Alert if > 200ms
- Error Rate: Alert if > 5%
- Request Volume: Alert on unusual spikes or drops
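Evaluating these thresholds against a metrics snapshot might look like the following. The threshold values come from this page; the metric dictionary shape is an illustrative assumption.

```python
# Alert thresholds from the list above.
THRESHOLDS = {
    "source_latency_p95_ms": 1000,
    "internal_latency_p95_ms": 200,
    "error_rate": 0.05,
}

def firing_alerts(metrics):
    """Return the names of metrics that exceed their alert thresholds."""
    return [name for name, limit in THRESHOLDS.items()
            if metrics.get(name, 0) > limit]
```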
Monitoring Dashboards
- Real-time Metrics: Live performance monitoring
- Historical Trends: 24-hour performance patterns
- Service Health: Component-level health status
- Resource Usage: Infrastructure utilization metrics
Best Practices
For Developers
- Monitor Your Usage: Track your API usage patterns
- Handle Latency: Implement appropriate timeout handling
- Error Recovery: Build robust error handling and retry logic
- Rate Limiting: Respect API rate limits and quotas
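The timeout-handling and retry advice above can be sketched as a retry wrapper with exponential backoff; the exception types, attempt count, and delays here are illustrative choices, not prescribed values.

```python
import time

def with_retries(fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff: 0.5s, 1s, 2s, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Combining this with respect for rate limits (for example, backing off further when the API reports quota exhaustion) keeps clients robust without hammering the source.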
For Operations
- Set Up Alerts: Configure appropriate alert thresholds
- Monitor Trends: Watch for performance degradation over time
- Plan Capacity: Use metrics for capacity planning
- Document Incidents: Track and analyze performance incidents