Inspections
Hidden Bias Analysis
class relai.vision.classification.inspect.analyze_bias.AnalyzeBias(model: RELAIModel, dataset: RELAIClassificationDataset, sorter: RELAIImageSorter, fabric: Fabric = None, tagger: RELAIImageTagger = None, classes_list: list[str] = None, batch_size: int = 64, num_workers: int = 4, metric: str = 'accuracy')
Bases: RELAIAlgorithm
Analyze bias in a dataset. The algorithm takes a model and a dataset and returns a list of potentially biased concepts in the dataset. It then asks the user to provide feedback on those concepts and returns the final results using the feedback file and other filtering parameters.
- Parameters:
- model (RELAIModel) – Model to use for analyzing bias.
- dataset (RELAIClassificationDataset) – Dataset to use for analyzing bias.
- sorter (RELAIImageSorter) – Sorter to use for sorting the images.
- fabric (L.Fabric, optional) – Fabric to use for the algorithm. If not provided, a new fabric will be created.
- tagger (RELAIImageTagger, optional) – Tagger to use for annotating the dataset. If not provided, the concepts will have to be specified manually. Default: None.
- classes_list (list[str], optional) – List of classes to use for analyzing bias. If not provided, the algorithm will run over all classes in the dataset.
- batch_size (int, optional) – Batch size to use for the dataloader. Default: 64.
- num_workers (int, optional) – Number of workers to use for the dataloader. Default: 4.
- metric (str, optional) – Metric to use for analyzing bias. Must be either 'accuracy' or 'confidence'. Default: 'accuracy'.
Example:
```python
from relai.models import get_model
from relai.datasets import get_dataset
from relai.vision.data_utils import get_sorter, get_tagger
from relai.vision.classification.inspect import AnalyzeBias

model = get_model('resnet50', ckpt_path='/path/to/model.ckpt')
dataset = get_dataset('imagefolder', '/path/to/dataset', 'val')
sorter = get_sorter('clip', 'openai/clip-vit-base-patch32')
tagger = get_tagger('ram', 'swin_l', '/path/to/ram_swin_large_14m.pth')

bias_analyzer = AnalyzeBias(model, dataset, sorter, tagger=tagger)

# First pass: find potentially biased concepts and write a query file.
query_file = bias_analyzer.get_potential_biases(save_to_path='/path/to/save/query')

# Use the relai frontend to view the query file and provide feedback, after
# which a feedback file will be generated by the relai frontend.
feedback_file = ...  # provide the path to the feedback file here

bias_analyzer.get_final_results(feedback_file, save_to_path='/path/to/save/results')
```

get_final_results(feedback_file: str = None, sig_thresh: float = 0.2, acc_thresh: float = 0.0, top_k: int = 5, display_top_k: int = 4, save_results: bool = True, save_to_path: str = None)
The second method to be called by the user. Returns the final results using the feedback file (if present) and the other filtering parameters.
- Parameters:
- feedback_file (str, optional) – Path to the feedback file (if present). Default: None.
- sig_thresh (float, optional) – A ceiling threshold on the probability that the difference in scores when a concept is present versus absent is due to chance. Default: 0.2.
- acc_thresh (float, optional) – A floor threshold on the difference in scores when a concept is present versus absent for the concept to be considered significant enough to include in the final results. Default: 0.0.
- top_k (int, optional) – Number of images to use for computing the spur gap. Default: 5.
- display_top_k (int, optional) – Number of images to display in the final results. Default: 4.
- save_results (bool, optional) – If True, the results will be saved in a .relai file for further display. Default: True.
- save_to_path (str, optional) – Path to save the results. Auto-generated by RELAIAlgorithm._save if not provided.
- Returns: A dictionary containing the final results
- Return type: dict
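A minimal sketch of the feedback-driven second pass, continuing the bias_analyzer instance from the example above; the feedback-file path and threshold values are hypothetical and illustrative, not recommendations:

```python
# Hypothetical path to the feedback file produced by the relai frontend.
feedback_file = '/path/to/feedback_file'

results = bias_analyzer.get_final_results(
    feedback_file,
    sig_thresh=0.1,    # ceiling on the probability that a score gap is chance
    acc_thresh=0.05,   # floor on the score gap for a concept to be reported
    top_k=5,
    display_top_k=4,
    save_to_path='/path/to/save/results',
)
# `results` is a dictionary containing the final (filtered) results.
```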
get_potential_biases(concept_tags: dict[str, list[str]] = None, rankings_per_concept_class: dict[str, dict[str, Tensor]] = None, top_concepts: int = 4, top_k: int = 5, ask_feedback: bool = True, save_to_path: str = None)
The first method to be called in the algorithm. Finds a list of potentially biased concepts and, if requested, asks the user to provide feedback on them.
- Parameters:
- concept_tags (dict[str, list[str]], optional) – A dictionary of concepts to be used for biases. If not provided, the algorithm will use a tagger to extract concepts from the dataset.
- rankings_per_concept_class (dict[str, dict[str, torch.Tensor]], optional) – A dictionary of rankings for each concept per class. If not provided, the algorithm will use the sorter to compute rankings.
- top_concepts (int, optional) – Number of top concepts to use for analyzing bias. Default: 4.
- top_k (int, optional) – Number of images to use for getting user feedback. Default: 5.
- ask_feedback (bool, optional) – If True, the algorithm will ask the user to provide feedback on the concepts. Default: True.
- save_to_path (str, optional) – Path to save the query file. Auto-generated by RELAIAlgorithm._save if not provided.
- Returns: A dictionary of image rankings for each concept per class
- Return type: dict[str, dict[str, torch.Tensor]]
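When no tagger is available, concepts can be supplied manually via concept_tags. A minimal sketch, again continuing the example above; the class names and concept strings are hypothetical:

```python
# Hypothetical classes and concepts; substitute those of your own dataset.
concept_tags = {
    'dog': ['grass', 'indoors', 'leash'],
    'cat': ['sofa', 'outdoors'],
}

rankings = bias_analyzer.get_potential_biases(
    concept_tags=concept_tags,
    top_concepts=4,
    top_k=5,
    ask_feedback=True,
    save_to_path='/path/to/save/query',
)
# `rankings` holds the image rankings for each concept per class,
# as a dict[str, dict[str, torch.Tensor]].
```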
Complex Bias Analysis
class relai.vision.classification.inspect.analyze_complex_bias.AnalyzeComplexBias(model: RELAIModel, dataset: RELAIClassificationDataset, tagger: RELAIImageTagger, fabric: Fabric = None, num_workers: int = 4, classes_list: list[str] = None, batch_size: int = 32, metric: str = 'accuracy')
Bases: RELAIAlgorithm
Extract complex biases from a model and dataset. The complex biases are combinations of tags that are associated with a significant change in model performance.
- Parameters:
- model (RELAIModel) – Model to use for extracting complex biases.
- dataset (RELAIClassificationDataset) – Dataset to use for extracting complex biases.
- tagger (RELAIImageTagger) – Tagger to use for annotating the dataset.
- fabric (L.Fabric, optional) – Fabric to use for the algorithm. If not provided, a new fabric will be created.
- num_workers (int, optional) – Number of workers to use for the dataloader. Default: 4.
- classes_list (list[str], optional) – List of classes to use for extracting complex biases. Default: None.
- batch_size (int, optional) – Batch size to use for the model. Default: 32.
- metric (str, optional) – Metric to use for evaluating model performance. Default: 'accuracy'.
Example:
```python
from relai.vision.classification.inspect import AnalyzeComplexBias
from relai.models.utils import get_model
from relai.datasets.utils import get_dataset
from relai.vision.data_utils import get_tagger

model = get_model('resnet50', ckpt_path='/path/to/resnet50_20m.pth')
dataset = get_dataset('imagefolder', '/path/to/dataset/val')
tagger = get_tagger('ram', 'swin_l', '/path/to/ram_swin_large_14m.pth')

bias_analyzer = AnalyzeComplexBias(model, dataset, tagger=tagger)
results_file = bias_analyzer.extract_complex_biases(save_to_path='/path/to/save/results')
```

extract_complex_biases(mode: str = 'exhaustive', max_tags_in_combo: int = 3, min_pop_thresh: int = 10, acc_diff_thresh: float = 0.1, sig_thresh: float = 0.05, minimal_criteria: list[float] = None, display_k: int = 5, save_results: bool = True, save_to_path: str = None)
Extract complex biases (composed of more than one concept) from the model and dataset.
- Parameters:
- mode (str, optional) – Algorithm for extracting complex biases. Must be one of ['exhaustive']. Default: 'exhaustive'.
- max_tags_in_combo (int, optional) – Maximum number of tags in a combination. Default: 3.
- min_pop_thresh (int, optional) – Minimum population threshold for a tag to be considered. Default: 10.
- acc_diff_thresh (float, optional) – Minimum accuracy difference threshold for a tag to be considered. Default: 0.1.
- sig_thresh (float, optional) – Significance threshold for a tag to be considered. Default: 0.05.
- minimal_criteria (list[float], optional) – List of criteria for a concept to be minimal. A concept $c$ is minimal if, for every tag $c' \in c$, the sub-concept $c \setminus c'$ is minimal and the accuracy on $c$ is lower (or greater) than the accuracy on $c \setminus c'$ by a threshold $b_{|c|}$ that depends on $|c|$. minimal_criteria is thus a list whose $i$-th element is the threshold for a concept with $i$ tags. If provided, it must be a list of accuracy thresholds of length at least the maximum number of tags in a combination (see the sketch following this method's documentation). Not applied by default. Default: None.
- display_k (int, optional) – Number of images to display for each concept. Default: 5.
- save_results (bool, optional) – If True, the results will be saved in a .relai file for further display. Default: True.
- save_to_path (str, optional) – Path to save the results. Auto-generated by RELAIAlgorithm._save if not provided.
- Returns: Dictionary containing the complex biases extracted from the model and dataset
- Return type: dict
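A sketch of a tuned call, continuing the bias_analyzer instance from the example above; all threshold values are illustrative only and should be tuned per dataset:

```python
# minimal_criteria[i] is the accuracy-gap threshold for a concept with i tags,
# so the list must cover sizes up to max_tags_in_combo.
results = bias_analyzer.extract_complex_biases(
    mode='exhaustive',
    max_tags_in_combo=2,
    min_pop_thresh=20,          # ignore tags appearing on fewer than 20 images
    acc_diff_thresh=0.15,       # require a 15-point accuracy gap
    sig_thresh=0.05,
    minimal_criteria=[0.0, 0.05, 0.05],  # illustrative per-size minimality thresholds
    save_to_path='/path/to/save/results',
)
```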
Saliency Maps
class relai.vision.classification.inspect.saliency_map.AppendSaliency(dataset, model, feature_idx, alpha=0.25, fabric=None)
Bases: object
class relai.vision.classification.inspect.saliency_map.SaliencyMap(model: Module, method: str, fabric: Fabric = None)
Bases: RELAIAlgorithm
Generate saliency maps for a model. Wrapper around the Captum library.
- Parameters:
- model (nn.Module) – Model to generate saliency maps for.
- method (str) – Method to use for generating the saliency maps. Available options: ["integrated_gradients", "gradient_shap", "deep_lift", "deep_lift_shap", "layer_conductance", "neuron_conductance", "noise_tunnel", "layer_grad_cam"]. Default: "integrated_gradients".
- fabric (Fabric, optional) – Fabric to use for the algorithm. If not provided, a new fabric will be created.
Example:
```python
from relai.vision.classification.inspect import SaliencyMap

model = ...  # set up the model here
image = ...  # get the image here

saliency_map = SaliencyMap(model, method='integrated_gradients')
saliency_map.generate_saliency_map(image)  # returns a saliency map for the image
```

generate_saliency_map(imgs: Tensor, target: int = None, baseline: Tensor = None, n_samples: int = 5, n_steps: int = 50, internal_batch_size: int = 1, stdevs: float = 0.0001, neuron_index: int = None, layer: Module = None, save_maps: bool = False, save_to_path: str | Path = None)
Generate a saliency map for a set of images.
- Parameters:
- imgs (torch.Tensor) – Images to generate the saliency maps for.
- target (int) – Target class index to generate the saliency map for. Default: None.
- baseline (torch.Tensor) – Baseline to use for the saliency map. Default: None.
- n_samples (int) – Number of samples to use for sampling-based methods such as gradient_shap and noise_tunnel. Default: 5.
- n_steps (int) – Number of steps to use for the saliency map. Default: 50.
- internal_batch_size (int) – Batch size to use for the saliency map. Default: 1.
- stdevs (float) – Standard deviation to use for the saliency map. Default: 0.0001.
- neuron_index (int) – Neuron index to use for the saliency map. Default: None.
- layer (nn.Module) – Layer to use for the saliency map. Default: None.
- save_maps (bool) – Whether to save the saliency maps. Default: False.
- save_to_path (str | Path) – Path to save the saliency maps. Will be automatically generated if not provided. Default: None.
- Returns: Saliency maps for the images
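A minimal sketch of a full call, continuing the saliency_map instance from the example above; the input shape, target index, and all-zero baseline are illustrative assumptions:

```python
import torch

# Hypothetical batched input: one RGB image at 224x224; target index 0 is illustrative.
imgs = torch.randn(1, 3, 224, 224)
baseline = torch.zeros_like(imgs)  # an all-zero baseline, a common choice

maps = saliency_map.generate_saliency_map(
    imgs,
    target=0,
    baseline=baseline,
    n_steps=50,
    save_maps=True,
    save_to_path='/path/to/save/maps',
)
```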