mmpretrain.visualization
This package includes a visualizer and some helper functions for visualization.
Visualizer
- class mmpretrain.visualization.UniversalVisualizer(name='visualizer', image=None, vis_backends=None, save_dir=None, fig_save_cfg={'frameon': False}, fig_show_cfg={'frameon': False})
- Universal Visualizer for multiple tasks.
- Parameters:
- name (str) – Name of the instance. Defaults to ‘visualizer’. 
- image (np.ndarray, optional) – The original image to draw on. The format should be RGB. Defaults to None. 
- vis_backends (list, optional) – Visual backend config list. Defaults to None. 
- save_dir (str, optional) – Directory where all storage backends save their files. If None, the backends do not save any data. Defaults to None. 
- fig_save_cfg (dict) – Keyword parameters of the figure used for saving. Defaults to dict(frameon=False). 
- fig_show_cfg (dict) – Keyword parameters of the figure used for showing. Defaults to dict(frameon=False). 
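In OpenMMLab-style config files, a visualizer is usually declared as a dict rather than constructed directly. The fragment below is a minimal sketch under that assumption. The `LocalVisBackend` backend type and the `work_dirs/vis` directory are illustrative choices, not values taken from this page.

```python
# Hypothetical config fragment: keys mirror the constructor parameters above.
visualizer = dict(
    type='UniversalVisualizer',
    name='visualizer',
    # Assumed backend: a local backend that writes results to disk.
    vis_backends=[dict(type='LocalVisBackend')],
    # Illustrative directory; with save_dir=None the backends save nothing.
    save_dir='work_dirs/vis',
    # Same values as the defaults shown in the class signature.
    fig_save_cfg=dict(frameon=False),
    fig_show_cfg=dict(frameon=False),
)
```

Leaving out `vis_backends` and `save_dir` keeps their documented `None` defaults.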
 
 - visualize_cls(image, data_sample, classes=None, draw_gt=True, draw_pred=True, draw_score=True, resize=None, rescale_factor=None, text_cfg={}, show=False, wait_time=0, out_file=None, name='', step=0)
- Visualize image classification result.
- This method draws a text box on the input image to show classification information, such as the ground-truth label and the predicted label.
- Parameters:
- image (np.ndarray) – The image to draw. The format should be RGB. 
- data_sample (DataSample) – The annotation of the image.
- classes (Sequence[str], optional) – The category names. Defaults to None. 
- draw_gt (bool) – Whether to draw ground-truth labels. Defaults to True. 
- draw_pred (bool) – Whether to draw prediction labels. Defaults to True. 
- draw_score (bool) – Whether to draw the prediction scores of prediction categories. Defaults to True. 
- resize (int, optional) – Resize the short edge of the image to the specified length before visualization. Defaults to None. 
- rescale_factor (float, optional) – Rescale the image by the rescale factor before visualization. Defaults to None. 
- text_cfg (dict) – Extra text settings, passed as arguments to mmengine.Visualizer.draw_texts(). Defaults to an empty dict.
- show (bool) – Whether to display the drawn image in a window. Make sure you have access to a graphical interface. Defaults to False. 
- wait_time (float) – The display time (s). Defaults to 0, which means “forever”. 
- out_file (str, optional) – Extra path to save the visualization result. If specified, the visualizer will only save the result image to the out_file and ignore its storage backends. Defaults to None. 
- name (str) – The image identifier. It’s useful when using the storage backends of the visualizer to save or display the image. Defaults to an empty string. 
- step (int) – The global step value. It’s useful to record a series of visualization results for the same image with the storage backends. Defaults to 0. 
 
- Returns:
- The visualization image. 
- Return type:
- np.ndarray 
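To make the interaction of `classes`, `draw_gt`, `draw_pred` and `draw_score` concrete, here is a minimal sketch of how the text-box content could be assembled. `build_cls_text` is a hypothetical helper, and the exact line format is an assumption for illustration; it is not the format the library is confirmed to use.

```python
from typing import Optional, Sequence


def build_cls_text(gt_label: Optional[int],
                   pred_label: Optional[int],
                   pred_score: Optional[float],
                   classes: Optional[Sequence[str]] = None,
                   draw_gt: bool = True,
                   draw_pred: bool = True,
                   draw_score: bool = True) -> str:
    """Compose text-box content; the line format is an illustrative assumption."""

    def describe(label: int) -> str:
        # Append the class name only when a class list is available.
        if classes is not None:
            return f'{label} ({classes[label]})'
        return str(label)

    lines = []
    if draw_gt and gt_label is not None:
        lines.append(f'Ground truth: {describe(gt_label)}')
    if draw_pred and pred_label is not None:
        line = f'Prediction: {describe(pred_label)}'
        if draw_score and pred_score is not None:
            line += f', score {pred_score:.2f}'
        lines.append(line)
    return '\n'.join(lines)


text = build_cls_text(gt_label=1, pred_label=2, pred_score=0.8,
                      classes=['cat', 'bird', 'dog'])
print(text)
```

Turning off a flag simply drops the corresponding line or suffix, so `draw_gt=False` leaves only the prediction line.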
 
 - visualize_i2t_retrieval(image, data_sample, prototype_dataset, topk=1, draw_score=True, resize=None, text_cfg={}, show=False, wait_time=0, out_file=None, name='', step=0)
- Visualize Image-To-Text retrieval result.
- This method draws the input image and the texts retrieved from the prototype dataset.
- Parameters:
- image (np.ndarray) – The image to draw. The format should be RGB. 
- data_sample (DataSample) – The annotation of the image.
- prototype_dataset (Sequence[str]) – The prototype dataset. It should be a list of texts. 
- topk (int) – To visualize the topk matching items. Defaults to 1. 
- draw_score (bool) – Whether to draw the match scores of the retrieved texts. Defaults to True. 
- resize (int, optional) – Resize the short edge of the image to the specified length before visualization. Defaults to None. 
- text_cfg (dict) – Extra text settings, passed as arguments to mmengine.Visualizer.draw_texts(). Defaults to an empty dict.
- show (bool) – Whether to display the drawn image in a window. Make sure you have access to a graphical interface. Defaults to False. 
- wait_time (float) – The display time (s). Defaults to 0, which means “forever”. 
- out_file (str, optional) – Extra path to save the visualization result. If specified, the visualizer will only save the result image to the out_file and ignore its storage backends. Defaults to None. 
- name (str) – The image identifier. It’s useful when using the storage backends of the visualizer to save or display the image. Defaults to an empty string. 
- step (int) – The global step value. It’s useful to record a series of visualization results for the same image with the storage backends. Defaults to 0. 
 
- Returns:
- The visualization image. 
- Return type:
- np.ndarray 
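The role of `topk` and `prototype_dataset` can be sketched in a few lines: rank the prototype texts by match score and keep the best `topk`. The helper name `topk_texts` and the plain list of scores are assumptions for illustration; in practice the scores would come from the data sample.

```python
from typing import List, Sequence, Tuple


def topk_texts(scores: Sequence[float],
               prototype_dataset: Sequence[str],
               topk: int = 1) -> List[Tuple[str, float]]:
    """Return the topk (text, score) pairs, best match first."""
    # Rank prototype indices by descending match score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(prototype_dataset[i], scores[i]) for i in order[:topk]]


texts = ['a cat on a sofa', 'a dog in a park', 'a bird on a branch']
matches = topk_texts([0.1, 0.7, 0.2], texts, topk=2)
print(matches)
```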
 
 - visualize_image_caption(image, data_sample, resize=None, text_cfg={}, show=False, wait_time=0, out_file=None, name='', step=0)
- Visualize image caption result.
- This method draws the input image and the image caption.
- Parameters:
- image (np.ndarray) – The image to draw. The format should be RGB. 
- data_sample (DataSample) – The annotation of the image.
- resize (int, optional) – Resize the long edge of the image to the specified length before visualization. Defaults to None. 
- text_cfg (dict) – Extra text settings, passed as arguments to plt.text(). Defaults to an empty dict.
- show (bool) – Whether to display the drawn image in a window. Make sure you have access to a graphical interface. Defaults to False. 
- wait_time (float) – The display time (s). Defaults to 0, which means “forever”. 
- out_file (str, optional) – Extra path to save the visualization result. If specified, the visualizer will only save the result image to the out_file and ignore its storage backends. Defaults to None. 
- name (str) – The image identifier. It’s useful when using the storage backends of the visualizer to save or display the image. Defaults to an empty string. 
- step (int) – The global step value. It’s useful to record a series of visualization results for the same image with the storage backends. Defaults to 0. 
 
- Returns:
- The visualization image. 
- Return type:
- np.ndarray 
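Some methods on this page resize the long edge while others resize the short edge, which changes the output size for non-square images. The sketch below shows the arithmetic for both cases. `resize_by_edge` is a hypothetical helper, and rounding to the nearest integer is an assumption; the library may round differently.

```python
from typing import Tuple


def resize_by_edge(width: int, height: int, target: int,
                   edge: str = 'long') -> Tuple[int, int]:
    """Return (width, height) after scaling the chosen edge to `target`."""
    reference = max(width, height) if edge == 'long' else min(width, height)
    scale = target / reference
    # Both sides share one scale factor, so the aspect ratio is preserved.
    return round(width * scale), round(height * scale)


long_size = resize_by_edge(640, 480, 224, edge='long')
short_size = resize_by_edge(640, 480, 224, edge='short')
print(long_size, short_size)
```

With long-edge resizing the whole image fits inside a `target` by `target` square, while short-edge resizing can leave the other side larger than `target`.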
 
 - visualize_image_retrieval(image, data_sample, prototype_dataset, topk=1, draw_score=True, resize=None, text_cfg={}, show=False, wait_time=0, out_file=None, name='', step=0)
- Visualize image retrieval result.
- This method draws the input image and the images retrieved from the prototype dataset.
- Parameters:
- image (np.ndarray) – The image to draw. The format should be RGB. 
- data_sample (DataSample) – The annotation of the image.
- prototype_dataset (BaseDataset) – The prototype dataset. It should have a get_data_info method that returns a dict including img_path.
- topk (int) – To visualize the topk matching items. Defaults to 1. 
- draw_score (bool) – Whether to draw the match scores of the retrieved images. Defaults to True. 
- resize (int, optional) – Resize the long edge of the image to the specified length before visualization. Defaults to None. 
- text_cfg (dict) – Extra text settings, passed as arguments to plt.text(). Defaults to an empty dict.
- show (bool) – Whether to display the drawn image in a window. Make sure you have access to a graphical interface. Defaults to False. 
- wait_time (float) – The display time (s). Defaults to 0, which means “forever”. 
- out_file (str, optional) – Extra path to save the visualization result. If specified, the visualizer will only save the result image to the out_file and ignore its storage backends. Defaults to None. 
- name (str) – The image identifier. It’s useful when using the storage backends of the visualizer to save or display the image. Defaults to an empty string. 
- step (int) – The global step value. It’s useful to record a series of visualization results for the same image with the storage backends. Defaults to 0. 
 
- Returns:
- The visualization image. 
- Return type:
- np.ndarray 
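The only requirement stated for `prototype_dataset` is a `get_data_info` method whose result includes `img_path`. The sketch below shows that interface with a hypothetical stand-in class, `PathListDataset`. The parameter is documented as a `BaseDataset`, so whether a duck-typed object like this one is accepted is an assumption, not something this page confirms.

```python
from typing import Dict, List, Sequence


class PathListDataset:
    """Hypothetical stand-in exposing only the documented interface."""

    def __init__(self, img_paths: Sequence[str]) -> None:
        self._img_paths = list(img_paths)

    def __len__(self) -> int:
        return len(self._img_paths)

    def get_data_info(self, idx: int) -> Dict[str, str]:
        # The returned dict must include an `img_path` entry.
        return {'img_path': self._img_paths[idx]}


def retrieved_paths(dataset: PathListDataset,
                    indices: Sequence[int]) -> List[str]:
    """Look up the image path of each retrieved prototype index."""
    return [dataset.get_data_info(i)['img_path'] for i in indices]


gallery = PathListDataset(['gallery/0.jpg', 'gallery/1.jpg', 'gallery/2.jpg'])
paths = retrieved_paths(gallery, [2, 0])
print(paths)
```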
 
 - visualize_masked_image(image, data_sample, resize=224, color='black', alpha=0.8, show=False, wait_time=0, out_file=None, name='', step=0)
- Visualize masked image.
- This method draws an image with a binary mask.
- Parameters:
- image (np.ndarray) – The image to draw. The format should be RGB. 
- data_sample (DataSample) – The annotation of the image.
- resize (int | Tuple[int]) – Resize the input image to the specified shape. Defaults to 224. 
- color (str | Tuple[int]) – The color of the binary mask. Defaults to “black”. 
- alpha (int | float) – The transparency of the mask. Defaults to 0.8. 
- show (bool) – Whether to display the drawn image in a window. Make sure you have access to a graphical interface. Defaults to False. 
- wait_time (float) – The display time (s). Defaults to 0, which means “forever”. 
- out_file (str, optional) – Extra path to save the visualization result. If specified, the visualizer will only save the result image to the out_file and ignore its storage backends. Defaults to None. 
- name (str) – The image identifier. It’s useful when using the storage backends of the visualizer to save or display the image. Defaults to an empty string. 
- step (int) – The global step value. It’s useful to record a series of visualization results for the same image with the storage backends. Defaults to 0. 
 
- Returns:
- The visualization image. 
- Return type:
- np.ndarray 
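The effect of `color` and `alpha` can be illustrated with plain alpha blending. The sketch below assumes that `alpha` is the weight given to the mask color, so `alpha=1` hides the masked pixels completely. That reading, the per-pixel mask layout and the helper names are all assumptions for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def blend_pixel(pixel: Pixel, color: Pixel, alpha: float) -> Pixel:
    """Mix one RGB pixel with the mask color using weight `alpha`."""
    return tuple(round((1 - alpha) * p + alpha * c)
                 for p, c in zip(pixel, color))


def apply_mask(image: List[List[Pixel]],
               mask: List[List[int]],
               color: Pixel = (0, 0, 0),
               alpha: float = 0.8) -> List[List[Pixel]]:
    """Blend `color` into every pixel whose mask value is non-zero."""
    return [[blend_pixel(px, color, alpha) if m else px
             for px, m in zip(row, mask_row)]
            for row, mask_row in zip(image, mask)]


image = [[(200, 100, 50), (200, 100, 50)]]
masked = apply_mask(image, mask=[[1, 0]])
print(masked)
```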
 
 - visualize_t2i_retrieval(text, data_sample, prototype_dataset, topk=1, draw_score=True, text_cfg={}, fig_cfg={}, show=False, wait_time=0, out_file=None, name='', step=0)
- Visualize Text-To-Image retrieval result.
- This method draws the input text and the images retrieved from the prototype dataset.
- Parameters:
- text (str) – The input text to draw. 
- data_sample (DataSample) – The annotation of the image.
- prototype_dataset (BaseDataset) – The prototype dataset. It should have a get_data_info method that returns a dict including img_path.
- topk (int) – To visualize the topk matching items. Defaults to 1. 
- draw_score (bool) – Whether to draw the match scores of the retrieved images. Defaults to True. 
- text_cfg (dict) – Extra text settings, passed as arguments to plt.text(). Defaults to an empty dict.
- fig_cfg (dict) – Extra figure settings, passed as arguments to plt.Figure(). Defaults to an empty dict.
- show (bool) – Whether to display the drawn image in a window. Make sure you have access to a graphical interface. Defaults to False. 
- wait_time (float) – The display time (s). Defaults to 0, which means “forever”. 
- out_file (str, optional) – Extra path to save the visualization result. If specified, the visualizer will only save the result image to the out_file and ignore its storage backends. Defaults to None. 
- name (str) – The image identifier. It’s useful when using the storage backends of the visualizer to save or display the image. Defaults to an empty string. 
- step (int) – The global step value. It’s useful to record a series of visualization results for the same image with the storage backends. Defaults to 0. 
 
- Returns:
- The visualization image. 
- Return type:
- np.ndarray 
 
 - visualize_visual_grounding(image, data_sample, resize=None, text_cfg={}, show=False, wait_time=0, out_file=None, name='', line_width=3, bbox_color='green', step=0)
- Visualize visual grounding result.
- This method draws the input image, the bounding box and the object.
- Parameters:
- image (np.ndarray) – The image to draw. The format should be RGB. 
- data_sample (DataSample) – The annotation of the image.
- resize (int, optional) – Resize the long edge of the image to the specified length before visualization. Defaults to None. 
- text_cfg (dict) – Extra text settings, passed as arguments to plt.text(). Defaults to an empty dict.
- show (bool) – Whether to display the drawn image in a window. Make sure you have access to a graphical interface. Defaults to False. 
- wait_time (float) – The display time (s). Defaults to 0, which means “forever”. 
- out_file (str, optional) – Extra path to save the visualization result. If specified, the visualizer will only save the result image to the out_file and ignore its storage backends. Defaults to None. 
- name (str) – The image identifier. It’s useful when using the storage backends of the visualizer to save or display the image. Defaults to an empty string. 
- line_width (int) – The line width of the drawn bounding box. Defaults to 3. 
- bbox_color (str) – The color of the drawn bounding box. Defaults to ‘green’. 
- step (int) – The global step value. It’s useful to record a series of visualization results for the same image with the storage backends. Defaults to 0. 
 
- Returns:
- The visualization image. 
- Return type:
- np.ndarray 
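When `resize` changes the image size, box coordinates have to be scaled by the same factor to stay aligned with the object. The sketch below shows that arithmetic. `scale_bbox` is a hypothetical helper, and the `(x1, y1, x2, y2)` pixel format is an assumption, since this page does not state the box format.

```python
from typing import Sequence, Tuple


def scale_bbox(bbox: Sequence[float], width: int, height: int,
               target: int) -> Tuple[float, ...]:
    """Scale an (x1, y1, x2, y2) box for an image whose long edge becomes `target`."""
    # One shared factor keeps the box aligned with the resized image.
    scale = target / max(width, height)
    return tuple(round(v * scale, 1) for v in bbox)


scaled = scale_bbox((100, 50, 300, 250), width=800, height=600, target=400)
print(scaled)
```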
 
 - visualize_vqa(image, data_sample, resize=None, text_cfg={}, show=False, wait_time=0, out_file=None, name='', step=0)
- Visualize visual question answering result.
- This method draws the input image, the question and the answer.
- Parameters:
- image (np.ndarray) – The image to draw. The format should be RGB. 
- data_sample (DataSample) – The annotation of the image.
- resize (int, optional) – Resize the long edge of the image to the specified length before visualization. Defaults to None. 
- text_cfg (dict) – Extra text settings, passed as arguments to plt.text(). Defaults to an empty dict.
- show (bool) – Whether to display the drawn image in a window. Make sure you have access to a graphical interface. Defaults to False. 
- wait_time (float) – The display time (s). Defaults to 0, which means “forever”. 
- out_file (str, optional) – Extra path to save the visualization result. If specified, the visualizer will only save the result image to the out_file and ignore its storage backends. Defaults to None. 
- name (str) – The image identifier. It’s useful when using the storage backends of the visualizer to save or display the image. Defaults to an empty string. 
- step (int) – The global step value. It’s useful to record a series of visualization results for the same image with the storage backends. Defaults to 0. 
 
- Returns:
- The visualization image. 
- Return type:
- np.ndarray