Runtime Code Sensor Deep-dive

Introduction

Hud is redefining how software teams detect and resolve production issues. Installed with a single line of code, Hud’s production code sensor alerts engineers to problems before they impact customers—pinpointing the root cause instantly, with the exact context needed to fix it.

Hud currently supports Node.js and Python backends; Java and Go support is coming soon.

Top Level View: Hud gives engineers a bird’s-eye view of code behavior in production, highlighting what matters most—instantly.

Production-Aware Development: See code like never before. Hud puts real-time function call volumes, error spikes, and performance trends right where the engineer works. Engineers can navigate real-time production code paths and follow “hot” functions directly to the root cause.

Enterprise Grade: Hud is built with enterprise-grade technology and processes. Hud is SOC 2 Type II, ISO 27001 and GDPR compliant, and fully supports SAML authentication and SCIM provisioning.

Setting up Hud 

Hud is installed with a single SDK init line. The moment the service starts, data begins flowing, issues are detected, and root causes are reported. Hud automatically tracks code changes and adapts to each new version, letting users compare behavior across code versions in production.

To get started, add Hud to the codebase and install the IDE extension.

Detect production issues with root cause already figured out 

Hud’s sensor runs alongside the application, identifying issues in critical service components, such as errors and slowdowns in endpoints and queue consumers, while simultaneously pinpointing their root causes at the function level. When Hud reports an issue, it reports it with the root cause.

For example, when Hud detects that a new version has been deployed to production, it starts comparing the new version's behavior to the baseline so it can quickly alert on any negative business impact, together with its root cause. Alerts are sent automatically, with no configuration needed, whenever endpoints or event consumers experience errors or degradations.

Clicking the alert’s deeplink takes the engineer to that function’s page at Hud, where they can see that function’s behavior. They can also click the IDE icon on the web and jump straight to the right place in the codebase. 

Hud’s IDE extension augments the code with real-time context from production. Above each function, the engineer sees a “Hudder” showing that function’s real-time behavior in production. The bottom panel shows the real-time production call graph, where red lines mark exception flows. Hud takes the engineer to the specific function that is the root cause of the issue.

Explore services from a production point of view

Hud highlights the top issues in each service, guiding engineers directly to relevant code.

Examples of quick service-level views:

  • Find the slowest endpoints, queue consumers and functions
  • Find the ones that have been slowing down the most recently (performance trend)
  • Find the endpoints, queue consumers and functions with the most errors
  • Find the endpoints, queue consumers and functions whose error trend is the highest

See code in a new way: production-aware experience in the IDE

Hudders give engineers real-time actionable insights into the behavior of the functions they’re working on. While coding, engineers can see real-time production data, such as call volumes, error spikes, and duration trends - directly above each function in their IDE (VS Code, JetBrains, Cursor).

Some functions are more sensitive than others; they could be part of many different product flows, or be invoked millions of times a minute, or cost a lot, etc. Hud lets engineers easily understand the potential ramifications of making changes to the function. 

Engineers can navigate the code by following which functions call which in production, including the “hot” code paths along which slowdowns and exceptions propagate, to find either the deep root cause or the business problem it creates.

Take testing and gradual rollouts to the next level

Hud knows what code actually runs in production and what the “hottest” code paths are. When engineers install Hud in their testing environment, they receive a comparison between the test coverage and the actual activity in production. When Hud is installed in a CI/CD pipeline with tagged PRs and baselines, engineers can easily identify whether any endpoints were degraded by the PR and pinpoint the code root cause (with an optional GitHub integration to help engineers quickly navigate to the root cause).

Hud can also help engineers identify issues with gradual rollout versions before full rollout. When Hud runs on both versions in production, it compares errors and degradations automatically and alerts engineers to the root causes of those differences, preventing the full rollout of erroneous versions.
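The comparison described above can be pictured with a minimal sketch. This is not Hud's algorithm; the EndpointStats type, the regressed_endpoints function, and the min_increase threshold are hypothetical names chosen for illustration. It only shows the general idea of comparing per-endpoint error rates between a baseline version and a candidate version.

```python
from dataclasses import dataclass


@dataclass
class EndpointStats:
    """Aggregated call and error counts for one endpoint (hypothetical shape)."""
    calls: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Guard against division by zero for endpoints with no traffic.
        return self.errors / self.calls if self.calls else 0.0


def regressed_endpoints(baseline, candidate, min_increase=0.01):
    """Return endpoints whose error rate rose by more than min_increase."""
    flagged = []
    for name, new in candidate.items():
        old = baseline.get(name)
        if old is None:
            continue  # no baseline to compare against
        if new.error_rate - old.error_rate > min_increase:
            flagged.append(name)
    return sorted(flagged)


# Illustrative numbers only, not real measurements.
baseline = {"/checkout": EndpointStats(1000, 5), "/login": EndpointStats(2000, 10)}
candidate = {"/checkout": EndpointStats(100, 4), "/login": EndpointStats(200, 1)}
flagged = regressed_endpoints(baseline, candidate)
print(flagged)
```

A real comparison would likely also account for traffic volume and statistical noise before alerting; the fixed threshold here is only a placeholder.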

How Hud works

Hud has the following major components: 

  1. SDK
  2. Backend
  3. Web Application 
  4. IDE Extension

The SDK is integrated within the codebase, periodically collecting metadata and aggregated behavior data about endpoints, queue consumers and functions. This information is securely transmitted via TLS to our servers, hosted in AWS’s eu-central-1 region (Frankfurt).

Once the metadata reaches our servers, it is processed to generate actionable insights, which are stored in an RDS database and a ClickHouse Cloud instance. Hud’s IDE extension and web application use Auth0 for user authentication and communicate with the backend over TLS.

Here’s an illustration of Hud’s high level architecture:

Hud’s SDK

The SDK is a library installed like any other package from the language’s package registry (pypi.org for Python and npmjs.com for Node.js). It is initialized with a single line of code that passes the application API key (per environment) and the service name.

When the service spins up, our SDK maps the server’s metadata and all the function signatures, and starts gathering aggregated statistical data about every function’s behavior (e.g. frequency, duration, errors, callers, costs). Hud does not send any of the code to our servers, but rather function metadata, hashes, and aggregated statistics.

The SDK is initialized with the following parameters:

The HUD_ENABLE environment variable

When Hud’s SDK loads, the first thing it does is look for the HUD_ENABLE environment variable. If it can’t be found, Hud immediately and gracefully shuts down without affecting the application. This can be monitored in the service’s stdout. The SDK starts working only if HUD_ENABLE is present. 
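The gating behavior described above can be sketched as follows. This is not the SDK's actual code; sensor_should_start and init_sensor are hypothetical names, and the printed messages are illustrative. The only thing taken from the text is the rule itself: the sensor starts only when the HUD_ENABLE variable is present, and otherwise exits quietly.

```python
import os


def sensor_should_start(environ=None):
    """Return True only when HUD_ENABLE is present in the environment."""
    env = os.environ if environ is None else environ
    return "HUD_ENABLE" in env


def init_sensor(environ=None):
    """Hypothetical init wrapper: start only when the flag is present."""
    if not sensor_should_start(environ):
        # Report to stdout and return without touching the application.
        print("HUD_ENABLE not found; sensor disabled, application unaffected")
        return False
    print("HUD_ENABLE found; sensor starting")
    return True


# Explicit dictionaries are passed so the example does not depend on the real environment.
started_without_flag = init_sensor({})
started_with_flag = init_sensor({"HUD_ENABLE": "true"})
```

Checking for presence rather than a specific value follows the wording above; whether the real SDK also inspects the value is not stated.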

SDK Communication

The SDK has the following communication channels:

  1. Stdout: The SDK reports key status events in the service’s stdout
  2. Local log file: The SDK maintains a small local log file that can be examined to audit its activity; Hud can also operate without a log file if needed
  3. Communication with Hud’s servers: The SDK sends events via HTTPS (over TLS) to https://api-prod.hud.io and authenticates to Hud’s backend using the provided API key

For example, the function metadata sent to Hud’s servers may look like this:


{
    "start_line": 33,
    "end_line": 45,
    "name": "handleUserMetadata",
    "is_async": false,
    "file": "/app/user-package/server.py",
    "function_id": "d679eeb3-d79a-4f72-ab2d-800b114ae540",
    "source_code_hash": "2801f83f4a058501b9c7bdadee5b36ba93cf0521"
}
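To illustrate how a record like this can describe a function without carrying its code, here is a minimal sketch. It is not the SDK's implementation: build_function_metadata is a hypothetical helper, the use of SHA-1 is an assumption (picked only because the sample hash is 40 hexadecimal characters long), and the random UUID stands in for whatever identifier scheme Hud actually uses.

```python
import hashlib
import uuid


def build_function_metadata(name, file, start_line, source):
    """Describe a function without including its source text."""
    lines = source.splitlines()
    return {
        "name": name,
        "file": file,
        "start_line": start_line,
        "end_line": start_line + len(lines) - 1,
        "is_async": source.lstrip().startswith("async def"),
        # Random identifier; how Hud assigns function IDs is not stated.
        "function_id": str(uuid.uuid4()),
        # SHA-1 is an assumption; only the digest leaves the process.
        "source_code_hash": hashlib.sha1(source.encode("utf-8")).hexdigest(),
    }


# Illustrative two-line function body.
source = "def handle_user_metadata(user):\n    return {'id': user}\n"
metadata = build_function_metadata(
    "handle_user_metadata", "/app/user-package/server.py", 33, source
)
print(metadata)
```

Because the digest changes whenever the source text changes, a hash like this is one way a backend could notice that a function differs between versions without ever seeing the code itself.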

An aggregated function behavior metric sent to Hud’s servers may look like this:


{
    "sum_squared_duration": 34026365544,
    "count": 54,
    "caller_function_id": [
        ["", "user-package/main.py", 102]
    ],
    "timestamp": "2024-08-19T09:14:38.042113+00:00",
    "sum_duration": 184462368,
    "timeslice": 4537126917,
    "sampled_count": 15,
    "function_id": "d679eeb3-d79a-4f72-ab2d-800b114ae540"
}
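Fields such as count, sum_duration, and sum_squared_duration are the kind of running totals from which a mean and a standard deviation can be derived without keeping individual call timings. The sketch below shows that general technique under stated assumptions: DurationAggregate is a hypothetical class, the durations are made-up values in arbitrary units, and the exact units and sampling semantics of Hud's own fields are not specified in this document.

```python
import math
from dataclasses import dataclass


@dataclass
class DurationAggregate:
    """Running totals for one function; no individual timings are kept."""
    count: int = 0
    sum_duration: int = 0
    sum_squared_duration: int = 0

    def record(self, duration):
        self.count += 1
        self.sum_duration += duration
        self.sum_squared_duration += duration * duration

    def mean(self):
        return self.sum_duration / self.count

    def stddev(self):
        # Population variance: E[x^2] - (E[x])^2, clamped against rounding error.
        variance = self.sum_squared_duration / self.count - self.mean() ** 2
        return math.sqrt(max(variance, 0.0))


# Made-up durations, not taken from the sample payload.
agg = DurationAggregate()
for d in (10, 20, 30):
    agg.record(d)
print(agg.mean(), agg.stddev())
```

Totals like these can also be added together across time slices or instances, which is one reason aggregates are a compact alternative to shipping raw per-call data.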

Hud’s IDE Extension

Hud’s IDE extension provides the engineer with actionable production insights straight in their IDE. It is available for the following IDEs through their official extension marketplaces: 

  • VS Code
  • JetBrains suite (e.g. WebStorm, PyCharm)
  • Cursor

All extensions authenticate users through Auth0, using either Google social login or SSO, and communicate with the backend over a secure TLS connection with TLS 1.2 enforced.
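For readers unfamiliar with what enforcing a TLS version looks like on the client side, here is a minimal sketch using Python's standard ssl module. It assumes that "enforced" means TLS 1.2 is the minimum accepted version; make_client_context is a hypothetical helper name, and this is not code from Hud's extensions.

```python
import ssl


def make_client_context():
    """Build a client TLS context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()  # verifies certificates and hostnames
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
print(context.minimum_version)
```

A context like this can be passed to standard clients, for example as the context argument of urllib.request.urlopen, so that connections negotiating an older protocol version fail during the handshake.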
