Spatially disjoint ground truth splits

The ground truth of the Toulouse Hyperspectral Data Set is a shapefile containing polygons, each associated with a land cover class. Spatially close polygons are grouped together, resulting in a few hundred groups. Splitting the ground truth consists of assigning each group to one of four sets (the labeled training set, the unlabeled training set, the validation set or the test set) such that the proportion of pixels in each set satisfies given constraints. Standard splits of the ground truth are provided with the TlseHypDataSet class described in the Dataset section.
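To make the constraint concrete, the sketch below assigns toy groups to the four sets by random search until the pixel proportions of the labeled training, validation and test sets exceed given lower bounds. It is only an illustration: the group sizes, the set names and the `assign_groups` helper are assumptions made for this example, and random search stands in for the solver used by the library.

```python
import random

# Hypothetical set names, used only in this sketch.
SETS = ["labeled", "unlabeled", "validation", "test"]


def assign_groups(group_sizes, min_props, max_tries=10_000, seed=0):
    """Randomly assign each group to a set until every lower bound holds.

    group_sizes: number of pixels in each group (toy values).
    min_props: dict mapping a set name to its minimum share of pixels.
    Returns a list with one set name per group, or None if no valid
    assignment is found within max_tries.
    """
    rng = random.Random(seed)
    total = sum(group_sizes)
    for _ in range(max_tries):
        # A whole group goes to a single set, which keeps the sets
        # spatially disjoint at the group level.
        assignment = [rng.choice(SETS) for _ in group_sizes]
        pixels = {name: 0 for name in SETS}
        for size, name in zip(group_sizes, assignment):
            pixels[name] += size
        if all(pixels[name] / total > p for name, p in min_props.items()):
            return assignment
    return None


group_sizes = [120, 80, 200, 50, 150, 90, 110, 60, 140, 100]
min_props = {"labeled": 0.1, "validation": 0.1, "test": 0.2}
assignment = assign_groups(group_sizes, min_props)
```

Because groups are assigned whole, pixels of spatially close polygons never end up in different sets, which is the point of a spatially disjoint split.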

class TlseHypDataSet.utils.dataset.DisjointDataSplit(dataset, split=None, proportions=None, file=None, n_solutions=1000)

A class to produce spatially disjoint train / test splits of the ground truth as described in …

Parameters
  • dataset – A TlseHypDataSet object

  • split – An array of shape (1, n_groups) specifying the set to which each group is assigned

  • proportions – A list in the format [p_labeled, p_val, p_test]. If the split argument is not given, a split is computed such that the proportions of pixels in the labeled training set, the validation set and the test set are greater than p_labeled, p_val and p_test, respectively.

  • file – Path to a file in which a split is saved in pickle format

  • n_solutions – The maximum number of solutions for the SAT solver (used only with the proportions argument)
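As an illustration of how the split and proportions arguments relate, the following sketch computes the share of pixels that a given split places in each set, so that it can be compared with [p_labeled, p_val, p_test]. The integer codes in `SET_CODES`, the `pixel_proportions` helper and the group sizes are hypothetical; the encoding actually used by the library may differ.

```python
# Hypothetical integer encoding of the sets (an assumption of this sketch).
SET_CODES = {"labeled": 0, "unlabeled": 1, "validation": 2, "test": 3}


def pixel_proportions(split, group_sizes):
    """Return the share of pixels assigned to each set.

    split: nested list of shape (1, n_groups) holding one set code per group.
    group_sizes: number of pixels in each group.
    """
    codes = split[0]
    if len(codes) != len(group_sizes):
        raise ValueError("split and group_sizes must describe the same groups")
    total = sum(group_sizes)
    return {
        name: sum(s for s, c in zip(group_sizes, codes) if c == code) / total
        for name, code in SET_CODES.items()
    }


split = [[0, 1, 3, 2, 0, 1, 3, 2]]  # shape (1, n_groups)
group_sizes = [100, 300, 200, 100, 100, 100, 50, 50]
props = pixel_proportions(split, group_sizes)

# Compare against lower bounds in the [p_labeled, p_val, p_test] order.
p_labeled, p_val, p_test = 0.1, 0.1, 0.2
is_valid = (
    props["labeled"] > p_labeled
    and props["validation"] > p_val
    and props["test"] > p_test
)
```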

property groups_
Returns

A dict whose keys are sets and whose values are the lists of groups assigned to them

property indices_
Returns

A dict whose keys are sets and values are sample indices in the TlseHypDataSet

property sets_
Returns

A dict whose keys are sets and whose values are the corresponding PyTorch datasets (labeled training, unlabeled training, validation and test)
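The relationship between a group-level dict such as groups_ and a sample-level dict such as indices_ can be sketched as follows. This is not the library's implementation: the `indices_from_groups` helper, the `sample_groups` list (the group of each sample) and the toy values are assumptions made for illustration.

```python
def indices_from_groups(groups, sample_groups):
    """Map each set to the indices of the samples belonging to its groups.

    groups: dict mapping a set name to the list of group ids assigned to it.
    sample_groups: group id of each sample, in dataset order.
    """
    # Invert the dict so that each group id points to its set.
    group_to_set = {g: name for name, ids in groups.items() for g in ids}
    indices = {name: [] for name in groups}
    for idx, g in enumerate(sample_groups):
        indices[group_to_set[g]].append(idx)
    return indices


groups = {"labeled": [0, 3], "unlabeled": [1], "validation": [2], "test": [4]}
sample_groups = [0, 0, 1, 2, 3, 4, 4, 1]
indices = indices_from_groups(groups, sample_groups)
```

Index lists of this kind are what one would typically pass to a subset wrapper to obtain one dataset per set.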