selfclean_audio.datasets.label_error_dataset

Members

LabelErrorDataset

Dataset that creates label errors by changing labels of selected samples.

class selfclean_audio.datasets.label_error_dataset.LabelErrorDataset(dataset: Dataset, frac_error: float = 0.1, n_errors: int | None = None, change_for_every_label: bool = False, random_state: int = 42, name: str | None = None)

Dataset that creates label errors by changing labels of selected samples. This follows the same approach as the image domain implementation.

Parameters:
  • dataset – Original clean dataset

  • frac_error – Fraction of samples with label errors

  • n_errors – Exact number of errors (overrides frac_error)

  • change_for_every_label – If True, change labels for each class separately

  • random_state – Random seed for reproducibility

  • name – Dataset name for logging
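
The parameters above can be illustrated with a minimal, standalone sketch. It is not the library's implementation: `inject_label_errors` and `num_classes` are hypothetical names, labels are assumed to be integers in `range(num_classes)`, and the per-class behaviour of `change_for_every_label` is not modelled.

```python
import random


def inject_label_errors(labels, num_classes, frac_error=0.1,
                        n_errors=None, random_state=42):
    """Hypothetical helper (not part of selfclean_audio).

    Returns a noisy copy of `labels` plus a 0/1 indicator list marking
    which samples had their label changed. Requires num_classes >= 2.
    """
    rng = random.Random(random_state)

    # An explicit n_errors takes precedence over frac_error.
    if n_errors is None:
        n_errors = int(round(frac_error * len(labels)))
    n_errors = min(n_errors, len(labels))

    # Choose which samples to corrupt, without replacement.
    flipped = set(rng.sample(range(len(labels)), n_errors))

    noisy = list(labels)
    for i in flipped:
        # Draw a replacement label that differs from the original one.
        candidates = [c for c in range(num_classes) if c != labels[i]]
        noisy[i] = rng.choice(candidates)

    errors = [1 if i in flipped else 0 for i in range(len(labels))]
    return noisy, errors
```

In this sketch, calling the function twice with the same `random_state` produces the same corrupted samples, which is the role the seed plays in the parameter list above.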

get_errors() → list[int]

Return ground truth label error indicators. This matches the interface from the image domain.

Returns:

List of 0/1 indicating whether each sample has a label error
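
One way a 0/1 indicator list of this shape might be used is to score a detector's ranking of suspicious samples. The sketch below is an assumption about usage, not something the library prescribes: `precision_at_k` and `ranked_indices` are hypothetical names, and `errors` stands in for the list returned by `get_errors()`.

```python
def precision_at_k(ranked_indices, errors, k):
    """Hypothetical metric: fraction of the top-k ranked samples
    that are true label errors according to the 0/1 indicator list."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_indices[:k]
    return sum(errors[i] for i in top_k) / k


# Toy data: samples 1 and 3 carry injected label errors.
errors = [0, 1, 0, 1, 0]
# A detector's ranking, most suspicious sample first (made up for illustration).
ranked_indices = [1, 3, 0, 2, 4]

top2 = precision_at_k(ranked_indices, errors, 2)
top4 = precision_at_k(ranked_indices, errors, 4)
```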

info()

Print dataset information.