Instrumentation
Note
Starting slide 11
- Software instrumentation: a technique to insert measurement or monitoring mechanisms into software
- e.g. adding timers around key functions, counting database queries/API calls, measuring resource usage (CPU, memory, …)
- Measurement: defines what to collect
- Instrumentation: defines how to collect it
- Intrusive instrumentation: modifying the original source code by inserting chunks of source code to collect data for analytical purposes (e.g. logging frameworks)
- problems:
- requires access to and understanding of the source code
- requires significant upfront planning so that instrumentation code is implemented as part of the normal system implementation:
- requirements define what data needs to be captured
- design must incorporate the instrumentation and integrate it into the overall system
- part of the normal dev process → not an afterthought
- hard to remove/modify when the instrumentation is no longer needed and/or requirements change (the instrumentation code is interleaved with the business logic and scattered across the codebase, so changing it means editing and retesting many places)
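A minimal sketch of intrusive instrumentation (class and method names are hypothetical, not from the slides): timing code inserted directly into a business method, mixed with the logic it measures:

```java
// Hypothetical example of intrusive instrumentation:
// the timing code is inserted directly into the business method
// and must be written, maintained, and removed by hand.
public class OrderService {
    public int processOrder(int quantity, int unitPrice) {
        long start = System.nanoTime();           // <- inserted measurement code
        int total = quantity * unitPrice;         // actual business logic
        long elapsed = System.nanoTime() - start; // <- inserted measurement code
        System.out.println("processOrder took " + elapsed + " ns");
        return total;
    }
}
```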
- Unguarded instrumentation: instrumentation inserted directly into the source code without any restrictions/constraints on when/how it is executed
- Guarded instrumentation: instrumentation placed directly into source code, but with restrictions/constraints on when/how it is executed
- Proxy instrumentation: place instrumentation code in a proxy object and proxy calls real implementation
- problems:
- extra boilerplate: a proxy class must be written and maintained for each instrumented component
- only calls that go through the proxy are captured (e.g. internal calls within the real implementation bypass the instrumentation)
// ... guarded instrumentation
public void doSomething() {
    if (INSTRUMENT)                        // <- this is the guard
        log.i("Counter: " + this.counter_);
}
// ...

// ... proxy instrumentation
private final SC sc_impl_;                 // wrapped real implementation

public SC_Inst(SC sc_impl) {               // <- instrumentation object
    this.sc_impl_ = sc_impl;
}

public void doSomething() {                // <- instrumentation code
    log.i("Counter: " + this.sc_impl_.getCounter());
    this.sc_impl_.doSomething();           // proxy calls real implementation
}
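The proxy snippet above can be expanded into a self-contained sketch. The names (Service, ServiceImpl, ServiceProxy) are hypothetical, and System.out stands in for the logging framework; the key point is that the proxy shares an interface with the real implementation, so it is a drop-in replacement:

```java
// Hypothetical, self-contained sketch of proxy instrumentation.
interface Service {
    void doSomething();
    int getCounter();
}

class ServiceImpl implements Service {      // real implementation, no instrumentation
    private int counter = 0;
    public void doSomething() { counter++; }
    public int getCounter() { return counter; }
}

class ServiceProxy implements Service {     // all instrumentation lives here
    private final Service impl;
    ServiceProxy(Service impl) { this.impl = impl; }
    public void doSomething() {
        System.out.println("Counter before call: " + impl.getCounter());
        impl.doSomething();                 // delegate to real implementation
    }
    public int getCounter() { return impl.getCounter(); }
}

public class ProxyDemo {
    public static void main(String[] args) {
        Service s = new ServiceProxy(new ServiceImpl());
        s.doSomething();
        s.doSomething();
        System.out.println("Counter: " + s.getCounter()); // prints "Counter: 2"
    }
}
```

Removing the instrumentation means swapping the proxy for the real implementation at the construction site, without touching ServiceImpl.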
- Logs: record of events or messages that occur in a software system
- Application logs: provides information about the behaviour of the system (e.g. errors, warnings, status updates)
- System logs: generated by the operating system or infrastructure components, such as log messages from system services, system events, or performance metrics
- Metrics: numerical value that describes the performance or behaviour of a software system (e.g. response time, system resources, user engagement/conversion rate, …)
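As an illustration (hypothetical class, not from the slides), a response-time metric can be as simple as collecting timing samples and aggregating them:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a hand-rolled response-time metric.
public class ResponseTimeMetric {
    private final List<Long> samplesNanos = new ArrayList<>();

    // record one measured response time, in nanoseconds
    public void record(long nanos) { samplesNanos.add(nanos); }

    // aggregate the samples into a single metric value
    public double averageNanos() {
        if (samplesNanos.isEmpty()) return 0.0;
        long sum = 0;
        for (long s : samplesNanos) sum += s;
        return (double) sum / samplesNanos.size();
    }
}
```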
- Trace: provides a complete picture of the path taken by a request as it travels through a system → used for debugging (helps us understand control flow, timing, interactions)
- Event (system-level): low level system/application events to analyze performance and behaviour on a single machine
- Distributed tracing: tracks a request as it moves across multiple services to understand end-to-end flow and latency in distributed systems
- Problems with instrumentation & data analysis:
- different tracing tools use incompatible formats, making cross-system analysis hard
- trace data grows fast and is expensive to store and query
- large trace datasets are difficult to visualize and interpret
- traces are hard to link with logs and metrics
- many tools cannot provide instant feedback
- hard to track events across multiple nodes or services
- Distributed tracing: a process of collecting end-to-end transaction graphs in near real time
- Trace: represents the entire journey of a request
- Span: represents a single operation call
- Tags/logs: to annotate the spans with some contextual information (tags apply to the whole span, logs represent some event that happened during the span)
- a log always has a timestamp that falls within the span’s start-end time interval
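A minimal span model (hypothetical names, loosely following OpenTracing-style concepts) that holds whole-span tags and timestamped log events:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a span: one operation call within a trace.
public class Span {
    static class LogEvent {
        final long timestamp;   // must fall within [start, end]
        final String message;
        LogEvent(long timestamp, String message) {
            this.timestamp = timestamp;
            this.message = message;
        }
    }

    final String operation;
    final long start;
    long end;
    final Map<String, String> tags = new HashMap<>(); // annotate the whole span
    final List<LogEvent> logs = new ArrayList<>();    // events during the span

    Span(String operation, long start) {
        this.operation = operation;
        this.start = start;
    }

    void log(long timestamp, String message) {
        logs.add(new LogEvent(timestamp, message));
    }

    void finish(long end) { this.end = end; }
}
```

Usage: create a span at the start of an operation, attach tags and log events while it runs, and finish it when the operation completes; a tracer would then export the span as part of the trace.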
- Monolith architecture: single, tightly coupled application structure (all components in one unit)
- pros:
- simple to build, test, and deploy
- easy to use for small systems
- cons:
- difficult to scale or modify as codebase grows
- a change in one part can break the whole system
- slow deployment cycles and limited flexibility
- performance bottlenecks due to shared resources
- Microservice architecture: application is split into independent, loosely coupled services, each service has its own process and communicates via APIs
- pros:
- scale only the parts that need it (scalability)
- failures are isolated
- use different languages or technologies (flexibility)
- update services independently (fast deployment)
- easy to test and maintain (modularity)
Note
Ended slide 44