
Module description: Pre-processing tools for DL/ML applications. General tools.

Computes the mean image from a list of image batches.

  • X (list) – List containing image batches. That is, an element of the list is of shape (N, W, H), where N: #samples, W: image width, H: image height. Arrays of size (N, W, H, C) with C: #channels are also accepted. If you only have a single batch (e.g. B), you just need to encapsulate it in a list (i.e. [B]).
  • display (bool, default False) – Set to True to print information while the function executes.

Mean image (size W x H).

Return type:

numpy.array
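As a rough illustration of the computation described above, the following sketch averages every sample across all batches into a single image. The name `mean_image_sketch` is hypothetical and this is not the library's implementation; it only assumes that every batch shares the same per-image shape.

```python
import numpy as np

def mean_image_sketch(batches):
    """Average all samples in a list of (N, W, H[, C]) batches into one image.

    Hypothetical helper for illustration only.
    """
    total = None
    count = 0
    for batch in batches:
        batch = np.asarray(batch, dtype=np.float64)
        batch_sum = batch.sum(axis=0)  # sum over the sample axis
        total = batch_sum if total is None else total + batch_sum
        count += batch.shape[0]
    return total / count

# Two batches of two 4 x 3 images each.
batch_a = np.zeros((2, 4, 3))
batch_b = np.full((2, 4, 3), 2.0)
mean_img = mean_image_sketch([batch_a, batch_b])
print(mean_img.shape)
```

Accumulating per-batch sums, rather than concatenating the batches first, keeps memory use bounded by one batch plus one image.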

Computes the mean pixel from all samples in the list of image batches X.

  • X (list) – List containing image batches. That is, an element of the list is of shape (N, W, H), where N: #samples, W: image width, H: image height. Arrays of size (N, W, H, C) with C: #channels are also accepted. If you only have a single batch (e.g. B), you just need to encapsulate it in a list (i.e. [B]).
  • display (bool, default False) – Set to True to print information while the function executes.

Pixel mean (scalar).

Return type:

float

Computes the pixel standard deviation from all samples in the list of image batches X.

  • X (list) – List containing image batches. That is, an element of the list is of shape (N, W, H), where N: #samples, W: image width, H: image height. Arrays of size (N, W, H, C) with C: #channels are also accepted. If you only have a single batch (e.g. B), you just need to encapsulate it in a list (i.e. [B]).
  • pmean (float) – Pixel mean of batch list X (see get_mean_pixel()).
  • display (bool, default False) – Set to True to print information while the function executes.

Pixel standard deviation (scalar).

Return type:

float
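The pixel mean and standard deviation can be computed in two passes over the batch list, which is presumably why the standard deviation takes `pmean` as an input. The sketch below uses hypothetical helper names and assumes the population standard deviation (dividing by the total number of pixels); the library may use a different convention.

```python
import numpy as np

def pixel_mean_sketch(batches):
    """Scalar mean over every pixel of every sample (hypothetical helper)."""
    total = sum(float(np.sum(b, dtype=np.float64)) for b in batches)
    n_pixels = sum(np.size(b) for b in batches)
    return total / n_pixels

def pixel_std_sketch(batches, pmean):
    """Scalar standard deviation around a precomputed pixel mean.

    Assumes the population form (divide by the pixel count, no Bessel correction).
    """
    sq_dev = sum(
        float(np.sum((np.asarray(b, dtype=np.float64) - pmean) ** 2))
        for b in batches
    )
    n_pixels = sum(np.size(b) for b in batches)
    return float(np.sqrt(sq_dev / n_pixels))

# Two single-image batches: one all zeros, one all twos.
batches = [np.zeros((1, 2, 2)), np.full((1, 2, 2), 2.0)]
pmean = pixel_mean_sketch(batches)
pstd = pixel_std_sketch(batches, pmean)
print(pmean, pstd)
```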

Gets the maximum and minimum pixel values of a list of image batches.

  • X (list) – List containing image batches. That is, an element of the list is of shape (N, W, H), where N: #samples, W: image width, H: image height. Arrays of size (N, W, H, C) with C: #channels are also accepted. If you only have a single batch (e.g. B), you just need to encapsulate it in a list (i.e. [B]).
  • display (bool, default False) – Set to True to print information while the function executes.

Maximum (first element) and minimum (second element) values.

Return type:

tuple
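A minimal sketch of the same idea, returning the maximum first and the minimum second as documented above. The helper name `max_min_sketch` is hypothetical.

```python
import numpy as np

def max_min_sketch(batches):
    """Return (maximum, minimum) over all pixels in a list of image batches.

    Hypothetical helper for illustration only.
    """
    max_val = max(float(np.max(b)) for b in batches)
    min_val = min(float(np.min(b)) for b in batches)
    return max_val, min_val

# Two batches, each holding a single 2 x 2 image.
batches = [
    np.array([[[0, 5], [3, 1]]]),
    np.array([[[9, 2], [-4, 7]]]),
]
pmax, pmin = max_min_sketch(batches)
print(pmax, pmin)
```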

Resizes the image according to size using cv2.resize with bilinear interpolation.

  • X (numpy.array) – Batch of images of shape (N, W, H), where N: #samples, W: image width, H: image height.
  • size (tuple) – Target size of the resized images, e.g. (2, 2).
  • ignore_last_axis (bool, default False) – Set to True if images have dimensionality (W, H, 1).

List containing image batches with resized images.

Return type:



Bases: object

Parent class for image preprocessing classes. This class does not implement any of its methods; refer to its child classes.


Use the mode according to your DL framework. Available modes:

  • keras: Reshape to have an extra axis for Keras.

Applies the preprocessing pipeline to class attribute data.

Parameters: X (numpy.array) – Array with images.

Reshapes the dimensions of the list so that it is suitable for the specified DL framework. So far, only the ‘keras’ option is available.

Parameters: X (numpy.ndarray) – List of images.
Returns: List of reshaped images.
Return type: list
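The ‘keras’ mode is described only as adding an extra axis. One plausible reading, shown below as an assumption rather than the library's actual behaviour, is that a trailing channel axis is appended so that greyscale batches of shape (N, W, H) become (N, W, H, 1). The helper name is hypothetical, and it returns an array rather than a list for brevity.

```python
import numpy as np

def reshape_for_keras_sketch(X):
    """Append a trailing channel axis to a (N, W, H) batch.

    Assumption: the extra axis is a channels-last axis of size 1.
    Batches that already have a channel axis are returned unchanged.
    """
    X = np.asarray(X)
    if X.ndim == 3:
        return X[..., np.newaxis]
    return X

X = np.zeros((8, 32, 32))
X_keras = reshape_for_keras_sketch(X)
print(X_keras.shape)
```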


Child of ImagePreprocessor. Assuming an input image \(X\), this preprocessor first centres and normalises it as

\[\frac{X - \mu}{\sigma}\]

where \(\mu\) and \(\sigma\) denote the pixel mean and standard deviation, respectively. Next, it resizes the image using the method resize().

  • resize_factor (default None) – To resize the image. For instance, halve the dimensions by setting this parameter equal to 2.
  • mean – Pixel mean, used to centre the data.
  • std – Pixel standard deviation, used to normalise the data.
  • reshape_mode – Reshape mode for the target DL framework. See ImagePreprocessor.

Processes an array of images, scaling and normalising them as required, and finally reshapes the list to be suitable for a specific DL framework.

Parameters: X (numpy.ndarray) – List with image data (images as numpy arrays).
Returns: Updated, preprocessed list of images.
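The centring and normalisation step of this preprocessor amounts to subtracting a scalar mean and dividing by a scalar standard deviation. The sketch below covers only that step (no resizing or reshaping), and the function name is hypothetical.

```python
import numpy as np

def standardise_sketch(X, mean, std):
    """Centre by the pixel mean and scale by the pixel standard deviation.

    Hypothetical helper; resizing and reshaping are deliberately left out.
    """
    return (np.asarray(X, dtype=np.float64) - mean) / std

# One 2 x 2 image; statistics taken from the same data for the example.
X = np.array([[[0.0, 2.0], [4.0, 6.0]]])
mean, std = X.mean(), X.std()
X_norm = standardise_sketch(X, mean, std)
print(X_norm.mean(), X_norm.std())
```

In practice the statistics would normally be computed once on the training set and reused for validation and test data.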


Child of ImagePreprocessor. Assuming an input image \(X\), this preprocessor first centres and normalises it as

\[\frac{X \ominus \hat{X}}{s}\]

where \(\hat{X}\) denotes the mean image (same size as \(X\)), \(s\) is the scale factor (scalar) and \(\ominus\) is the pixel-wise subtraction operation. Next, it resizes the image using the method resize().

  • mean_image – Mean image (2D matrix).
  • scale_factor – Used to normalise the data.
  • resize_factor (default None) – To resize the image. Defines the new size of the images.
  • reshape_mode – Reshape mode for the target DL framework. See ImagePreprocessor.

Processes an array of images, scaling and normalising them as required, and finally reshapes the list to be suitable for a specific DL framework.

Parameters: X (numpy.ndarray) – List with image data (images as numpy arrays).
Returns: Updated, preprocessed list of images.
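For the mean-image variant, NumPy broadcasting applies the pixel-wise subtraction to every sample in a batch at once. The sketch below is an illustration with a hypothetical function name; it again omits the resize and reshape steps.

```python
import numpy as np

def subtract_mean_image_sketch(X, mean_image, scale_factor):
    """Subtract a (W, H) mean image from each sample of a (N, W, H) batch,
    then divide by a scalar scale factor. Hypothetical helper.
    """
    return (np.asarray(X, dtype=np.float64) - mean_image) / scale_factor

# Two 2 x 2 images: all zeros and all 510, so the mean image is 255 everywhere.
X = np.stack([np.zeros((2, 2)), np.full((2, 2), 510.0)])
mean_image = X.mean(axis=0)
X_centred = subtract_mean_image_sketch(X, mean_image, scale_factor=255.0)
print(X_centred[0, 0, 0], X_centred[1, 0, 0])
```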

Reads a chunk of data stored as h5.

  • path_to_file (str) – Path to the H5 file to read.
  • shuffle (bool, default False) – Set to True if data should be shuffled.

Array of images (X), array of labels (Y)

Return type:

list

Loads a set of h5 files as individual arrays in a list.

  • dataset_dir (str) – Directory containing the chunk files.
  • chunk_filenames (list) – Filenames of the data chunks.
  • features (list) – Features to retrieve from the h5 data chunks as string names.
  • ignore_classes (list, default None) – List of class labels to ignore. Labels as ints. By default it considers all classes.
  • display (bool, default False) – Set to True to have some informative messages printed out.

List with the data chunks as numpy.arrays.

Return type:

list

Generates batches of data from samples X and labels Y.

  • X (list) – Sample data.
  • Y (list) – Label data.
  • batch_sz (int) – Batch size.

Generator of batches of samples, labels and weights (importance of samples).
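A batch generator of this kind can be sketched as follows. The function name is hypothetical, and two details are assumptions not stated above: every sample gets a weight of 1, and the generator makes a single pass over the data (generators fed to Keras training loops often cycle indefinitely instead).

```python
import numpy as np

def batch_generator_sketch(X, Y, batch_sz):
    """Yield (samples, labels, weights) tuples of at most batch_sz items.

    Assumptions: uniform sample weights and a single pass over the data.
    """
    X, Y = np.asarray(X), np.asarray(Y)
    for start in range(0, len(X), batch_sz):
        xb = X[start:start + batch_sz]
        yb = Y[start:start + batch_sz]
        weights = np.ones(len(xb))  # equal importance for every sample
        yield xb, yb, weights

X = np.arange(10).reshape(5, 2)  # five samples with two features each
Y = np.arange(5)
batches = list(batch_generator_sketch(X, Y, batch_sz=2))
print([len(xb) for xb, _, _ in batches])
```

The last batch is smaller than `batch_sz` when the number of samples is not a multiple of it.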

Return type:



Bases: keras.callbacks.ModelCheckpoint

Keras callback used to store the model weights after each epoch. See the Keras documentation for more details.

  • filepath (str) – Folder to store the weights HDF5 files.
  • monitor (str, default 'val_loss') – Model metric to rely on when storing the weights.
  • verbose (int, default 0) – Set to 1 to display execution information.
  • save_best_only (bool, default True) – Set to True to only store weights when model performance (according to the metric monitor) improves.
  • save_weights_only (bool, default False) – Set to True to only store the weights.
  • mode (str, default 'auto') –
  • period (int, default 1) –

Bases: keras.callbacks.Callback

Plot a scatterplot illustrating the ground truth values and the network predictions after each epoch.

  • X (numpy.array) – Batch of samples. Shape (N, W, H, C), where N: #samples, W: width, H: height and C: #channels.
  • Y (numpy.array) – Target values.

plot_regression(y_true, y_pred, show=False, save=False, filename='none')

Plot scatterplot with network estimation and ground truth values.

  • y_true (numpy.array) – Ground truth values.
  • y_pred (numpy.array) – Network estimation values.
  • show (bool) – Set to True to plot a figure.
  • save (bool) – Set to True to store the image.
  • filename (str) – Name of the file if image is stored.
on_epoch_end(epoch, logs={})

Executed after each epoch.

  • epoch – Current epoch.
  • logs