Sculpt
Background
Not all uncertainty can be eliminated by collecting more data. Sometimes the data itself contains implicit (and irreducible) noise that is not relevant to the pattern being measured. For example, in the real world, the sensors we measure our data with are rarely perfect (e.g., cameras have noise in the color readings of pixels). When humans label and annotate data, there is similar noise in our annotations (e.g., two doctors may diagnose the exact same patient differently). These conflicting measurements cause irreducible error and can confuse an AI model during training.
This uncertainty, known as aleatoric uncertainty, can be accurately quantified and modeled using the Capsa Sculpt wrapper to prevent it from harming training. While we typically assume our training data is perfect and noise-free, this wrapper lets us estimate and account for our data's imperfections by "sculpting" the model's output into a desired probability distribution that captures the remaining irreducible noise. Users select the type of distribution (e.g., Normal) they want to fit to their output, and Capsa takes care of the rest!
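To make this concrete, here is a minimal synthetic example (independent of Capsa; all names are illustrative) of data with input-dependent aleatoric noise. No amount of additional sampling removes the per-point spread:

import torch

# Synthetic regression data with input-dependent (heteroscedastic) label noise.
# The noise term is irreducible: collecting more x values never removes it.
x = torch.rand(1000, 1) * 4 - 2               # inputs in [-2, 2]
true_y = x ** 3                               # the underlying pattern
noise_std = 0.1 + 0.5 * x.abs()               # noise grows with |x|
y = true_y + noise_std * torch.randn_like(x)  # observed, noisy labels
# Two nearly identical inputs can receive very different labels; this spread
# is the aleatoric uncertainty that the Sculpt wrapper learns to model.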
Usage
Wrapping your model with capsa_torch.sculpt
from torch import nn
from capsa_torch import sculpt
# Define your model
model = nn.Sequential(...)
# Specify the distribution to fit your output to.
# Note: you don't need to instantiate an object, just pass the class itself to the wrapper
dist = sculpt.Normal
# Build a wrapper with this distribution and wrap!
wrapper = sculpt.Wrapper(dist, n_layers=3)
wrapped_model = wrapper(model)
# or in one line
wrapped_model = sculpt.Wrapper(distribution=dist, n_layers=3)(model)
Calling your wrapped model
# By default, your wrapped model returns a prediction, just like normal
prediction = wrapped_model(input_batch)
# But if you pass `return_risk=True`, you automatically get uncertainty too!
prediction, uncertainty = wrapped_model(input_batch, return_risk=True)
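As a sketch of how you might act on the extra output (assuming a regression model with the Normal distribution, a risk tensor shaped like the prediction, and larger values meaning noisier outputs; the cutoff below is purely illustrative):

prediction, uncertainty = wrapped_model(input_batch, return_risk=True)
# Flag outputs whose estimated noise is unusually high (illustrative cutoff)
threshold = uncertainty.mean() + 2 * uncertainty.std()
flagged = uncertainty > threshold
print(f"{flagged.sum().item()} of {flagged.numel()} outputs flagged as high-noise")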
Training your wrapped model
A model wrapped with the Sculpt wrapper must be trained differently in order to produce accurate measures of uncertainty. Here's how a typical training loop changes with the Sculpt wrapper: the wrapped version is shown first, followed by the original loop for comparison.
for (x_batch, y_batch) in train_dataloader:
    # Zero your gradients for every batch!
    optimizer.zero_grad()

    # Predict output and uncertainty for this batch
    y_pred, y_risk = model(x_batch, return_risk=True)

    # Compute the original loss
    loss = loss_fn(y_batch, y_pred)
    # Option 1: add the wrapper-specific loss so that y_risk is also trained
    # (`scale` is a weighting hyperparameter you choose)
    loss += scale * dist.loss_function(y_batch, y_pred, y_risk)

    # Option 2 (alternative to Option 1): use the reparameterization trick to
    # train y_risk by sampling from the predicted distribution and applying
    # the original loss_fn to the sample:
    # sampled_y_pred = dist.sample((y_pred, y_risk))
    # loss = loss_fn(y_batch, sampled_y_pred)

    # Adjust learning weights
    loss.backward()
    optimizer.step()
The original training loop, for comparison:

for (x_batch, y_batch) in train_dataloader:
    # Zero your gradients for every batch!
    optimizer.zero_grad()
    # Predict output for this batch
    y_pred = model(x_batch)
    # Compute the loss and its gradients
    loss = loss_fn(y_batch, y_pred)
    # Adjust learning weights
    loss.backward()
    optimizer.step()
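Putting the pieces together, here is a minimal end-to-end sketch using Option 1 on hypothetical synthetic data. It assumes the wrapped model exposes its parameters like a regular nn.Module, and `scale` is a weighting hyperparameter you would tune:

import torch
from torch import nn
from capsa_torch import sculpt

# Hypothetical noisy 1-D regression data
x = torch.rand(1024, 1) * 4 - 2
y = x ** 3 + (0.1 + 0.5 * x.abs()) * torch.randn_like(x)

dist = sculpt.Normal
model = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
wrapped_model = sculpt.Wrapper(dist, n_layers=3)(model)

optimizer = torch.optim.Adam(wrapped_model.parameters(), lr=1e-3)  # assumes nn.Module-like API
loss_fn = nn.MSELoss()
scale = 1.0  # weighting for the wrapper loss (hyperparameter)

for epoch in range(10):
    optimizer.zero_grad()
    y_pred, y_risk = wrapped_model(x, return_risk=True)
    loss = loss_fn(y, y_pred) + scale * dist.loss_function(y, y_pred, y_risk)
    loss.backward()
    optimizer.step()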
Wrapping your model with capsa_tf.sculpt
import tensorflow as tf
from capsa_tf import sculpt
# Define your model
def model(...):
    ...
# Specify the distribution to fit your output to.
# Note: you don't need to instantiate an object, just pass the class itself to the wrapper
dist = sculpt.Normal
# Build a wrapper with this distribution and wrap!
wrapper = sculpt.Wrapper(dist, n_layers=3)
wrapped_model = wrapper(model)
# or in one line
wrapped_model = sculpt.Wrapper(distribution=dist, n_layers=3)(model)
Calling your wrapped model
# By default, your wrapped model returns a prediction, just like normal
prediction = wrapped_model(input_batch)
# But if you pass `return_risk=True`, you automatically get uncertainty too!
prediction, uncertainty = wrapped_model(input_batch, return_risk=True)
Training your wrapped model
Warning
The steps below are specific to the Sculpt wrapper and do not apply to models wrapped with other wrappers. Please visit the corresponding wrapper's page for details on how to train models wrapped with the other wrappers.
When training your Sculpt-wrapped model, you will need to use one of the following techniques to train the probability distribution predictions. Note that the following code reuses the `dist` distribution defined in the previous step.
In addition to your original loss function, use a second loss function that we provide:
# Before
y_hat = model(x)
loss = original_loss_fn(y_hat, y)
# After
y_hat, y_risk = model(x, return_risk=True)
loss = original_loss_fn(y_hat, y) + dist.loss_function(y, y_hat, y_risk)
Or, sample your output values from the predicted distribution before passing them to the original loss function (a conceptual sketch of this follows the snippet):
# Before
y_hat = model(x)
loss = original_loss_fn(y_hat, y)
# After
y_hat_and_risk = model(x, return_risk=True)
y_hat_sampled = dist.sample(y_hat_and_risk)
loss = original_loss_fn(y_hat_sampled, y)
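For intuition, the reparameterization trick works because a sample from a Normal distribution can be written as the mean plus scaled standard normal noise, so the sample depends differentiably on both outputs. A conceptual sketch of what a call like dist.sample might do internally (assuming y_risk acts as a standard deviation; Capsa's actual implementation may differ):

import tensorflow as tf

def reparameterized_sample(y_hat, y_risk):
    # epsilon ~ N(0, 1); gradients flow through both y_hat and y_risk, so the
    # original loss applied to the sample trains the risk estimate as well
    epsilon = tf.random.normal(tf.shape(y_hat))
    return y_hat + y_risk * epsilon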