Python API¶

This section includes information for using the pure Python API of bob.ap.

bob.ap.get_config()¶: Returns a string containing the configuration information.

class bob.ap.Ceps(sampling_frequency[, win_length_ms=20.[, win_shift_ms=10.[, n_filters=24[, n_ceps=19[, f_min=0.[, f_max=4000.[, delta_win=2[, pre_emphasis_coeff=0.95[, mel_scale=True[, dct_norm=True[, normalize_mean=True[, rect_filter=False[, inverse_filter=False[, normalize_spectrum=False[, ssfc_features=False[, scfc_features=False[, scmc_features=False]]]]]]]]]]]]]]]]]) → new Ceps¶

Bases: bob.ap.Spectrogram

Ceps(other) -> new Ceps

Objects of this class, after configuration, can extract the cepstral coefficients from a 1D audio array/signal.

Parameters:

sampling_frequency: [float] the sampling frequency/frequency rate
win_length_ms: [float] the window length in milliseconds
win_shift_ms: [float] the window shift in milliseconds
n_filters: [int] the number of filter bands
n_ceps: [int] the number of cepstral coefficients
f_min: [double] the minimum frequency of the filter bank
f_max: [double] the maximum frequency of the filter bank
delta_win: [int] The integer delta value used for computing the first and second order derivatives
pre_emphasis_coeff: [double] the coefficient used for the pre-emphasis
mel_scale: [bool] tells whether cepstral features are extracted on a linear scale (LFCC, set it to False) or on the Mel scale (MFCC, set it to True, the default)
dct_norm: [bool] tells whether the cepstral coefficients are multiplied by a normalization factor
normalize_mean: [bool] tells whether each frame is normalized by subtracting its mean (True) or by dividing by max_range (False). True is the default value.
rect_filter: [bool] tells whether cepstral features are extracted using a rectangular filter (set it to True), i.e., RFCC features, instead of the default filter. False is the default value.
inverse_filter: [bool] tells whether to apply the filter in inverse order, i.e., from high frequencies to low (set it to True). False is the default value.
normalize_spectrum: [bool] Tells whether to normalize the power spectrum of the signal. The default value is False.
ssfc_features: [bool] set it to True to compute Subband Spectral Flux Coefficients (SSFC), which measure the frame-by-frame change in the power spectrum
scfc_features: [bool] set it to True to compute Spectral Centroid Frequency Coefficients (SCFC), which capture detailed information about subbands, similar to formant frequencies
scmc_features: [bool] set it to True to compute Spectral Centroid Magnitude Coefficients (SCMC), which capture detailed information about subbands, similar to SCFC features
other: [Ceps] an object that is, or inherits from, Ceps; it is deep-copied into a new instance.

dct_norm¶: A factor by which the cepstral coefficients are multiplied

delta_win¶: The integer delta value used for computing the first and second order derivatives
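
To make the role of delta_win concrete, here is a minimal sketch of the regression formula commonly used for first-order derivatives of feature frames. It is an illustration only: the function name delta and the edge handling (repeating the first and last frame) are assumptions, and the page does not state that bob.ap computes its derivatives exactly this way.

```python
def delta(features, delta_win=2):
    """First-order derivatives of a sequence of feature frames.

    `features` is a list of frames, each frame a list of floats.
    Hypothetical helper, not part of bob.ap.
    """
    n_frames = len(features)
    # Normalization term of the regression formula: 2 * sum(k^2).
    norm = 2 * sum(k * k for k in range(1, delta_win + 1))
    out = []
    for t in range(n_frames):
        row = []
        for d in range(len(features[t])):
            acc = 0.0
            for k in range(1, delta_win + 1):
                # Assumption: indices outside the signal are clamped to the edges.
                nxt = features[min(t + k, n_frames - 1)][d]
                prv = features[max(t - k, 0)][d]
                acc += k * (nxt - prv)
            row.append(acc / norm)
        out.append(row)
    return out


# A ramp with slope 1 has a first derivative of 1 away from the edges.
ramp = [[float(t)] for t in range(6)]
first = delta(ramp, delta_win=2)
# Second-order derivatives are obtained by applying the same operator again.
second = delta(first, delta_win=2)
```

A larger delta_win smooths the derivative over more neighbouring frames at the cost of temporal resolution.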

energy_bands¶: Tells whether we compute a spectrogram or energy bands

energy_filter¶: Tells whether we use the energy or the square root of the energy

energy_floor¶: The energy flooring threshold

f_max¶: The maximum frequency of the filter bank

f_min¶: The minimum frequency of the filter bank

get_shape(input) → tuple¶

Computes the shape of the output features, given either the size of an input array or the input array itself.

Parameters:

input: [int|array] Either an integral value or an array for which the output shape of this extractor is going to be computed.

This method always returns a 2-tuple containing the shape of output features produced by this extractor.
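
The 2-tuple can be reasoned about without running the extractor. The sketch below assumes the usual framing convention (one frame per full window, no padding) and assumes that energy and derivative blocks are appended column-wise; both are assumptions for illustration, and the helper names are hypothetical rather than part of bob.ap.

```python
def frame_count(n_samples, win_length, win_shift):
    """Number of full windows that fit in the signal (assumed: no padding)."""
    if n_samples < win_length:
        return 0
    return 1 + (n_samples - win_length) // win_shift


def ceps_shape(n_samples, win_length, win_shift, n_ceps=19,
               with_energy=False, with_delta=False, with_delta_delta=False):
    """Assumed output shape (frames, features) of a cepstral extractor."""
    width = n_ceps + (1 if with_energy else 0)
    # Assumption: second derivatives imply that first derivatives are present too.
    blocks = 3 if with_delta_delta else (2 if with_delta else 1)
    return (frame_count(n_samples, win_length, win_shift), width * blocks)


# One second of audio at 8 kHz, 20 ms windows (160 samples), 10 ms shift (80 samples).
static_shape = ceps_shape(8000, 160, 80)
full_shape = ceps_shape(8000, 160, 80, with_energy=True,
                        with_delta=True, with_delta_delta=True)
```

Under these assumptions static_shape is (99, 19) and full_shape is (99, 60).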

inverse_filter¶: Tells whether the filter is applied in inverse order (from high frequencies to low) when cepstral features are extracted

log_filter¶: Tells whether we use the log triangular filter or the triangular filter

mel_scale¶: Tells whether cepstral features are extracted on a linear (LFCC) or Mel (MFCC) scale
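
The difference between the two scales shows up in how the filter band edges are spaced between f_min and f_max. The sketch below uses the widely used conversion mel = 2595 * log10(1 + f / 700); the page does not say which Mel formula bob.ap uses, so treat the constants and the helper names as assumptions.

```python
import math


def hz_to_mel(f):
    """Common Hz-to-Mel conversion (assumed formula)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def band_edges(n_filters, f_min, f_max, mel_scale=True):
    """Return n_filters + 2 edge frequencies in Hz.

    Edges are equally spaced on the Mel scale when mel_scale is True
    and equally spaced in Hz otherwise.
    """
    lo, hi = (hz_to_mel(f_min), hz_to_mel(f_max)) if mel_scale else (f_min, f_max)
    step = (hi - lo) / (n_filters + 1)
    points = [lo + i * step for i in range(n_filters + 2)]
    return [mel_to_hz(p) for p in points] if mel_scale else points


mel_edges = band_edges(24, 0.0, 4000.0, mel_scale=True)
lin_edges = band_edges(24, 0.0, 4000.0, mel_scale=False)
```

On the Mel scale the bands are narrow at low frequencies and widen towards f_max, whereas the linear bands all have the same width (160 Hz in this example).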

n_ceps¶: The number of cepstral coefficients

n_filters¶: The number of filter bands

normalize_mean¶: Tells whether each frame is normalized by subtracting its mean (True) or by dividing by max_range (False)

normalize_spectrum¶: Tells whether the power spectrum of the signal is normalized

pre_emphasis_coeff¶: The coefficient used for the pre-emphasis
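
Pre-emphasis is usually a first-order high-pass filter, y[n] = x[n] - a * x[n-1], with a being the coefficient. The following sketch shows that filter on a single frame. How the first sample is treated varies between toolkits; scaling it by (1 - a) is one common convention and is an assumption here, as is the helper name.

```python
def pre_emphasis(frame, coeff=0.95):
    """Apply y[n] = x[n] - coeff * x[n-1] to one frame (hypothetical helper)."""
    if not frame:
        return []
    # Assumption: the first sample has no predecessor and is scaled by (1 - coeff).
    out = [frame[0] * (1.0 - coeff)]
    for n in range(1, len(frame)):
        out.append(frame[n] - coeff * frame[n - 1])
    return out


# A constant (zero-frequency) signal is strongly attenuated...
flat = pre_emphasis([1.0, 1.0, 1.0, 1.0])
# ...while a signal alternating at the highest frequency is boosted.
alternating = pre_emphasis([1.0, -1.0, 1.0, -1.0])
```

With coeff=0.95 every sample of the constant signal drops to about 0.05, while the alternating signal grows to a magnitude of about 1.95 after the first sample.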

rect_filter¶: Tells whether cepstral features are extracted using a rectangular filter

sampling_frequency¶: The sampling frequency/frequency rate

scfc_features¶: Set to True to compute SCFC features

scmc_features¶: Set to True to compute SCMC features

ssfc_features¶: Set to True to compute SSFC features

win_length¶: The normalized window length w.r.t. the sample frequency

win_length_ms¶: The window length of the cepstral analysis in milliseconds

win_shift¶: The normalized window shift w.r.t. the sample frequency

win_shift_ms¶: The window shift of the cepstral analysis in milliseconds
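
The millisecond attributes and their normalized counterparts (win_length, win_shift) are related through the sampling frequency. The sketch below assumes that the normalized values are sample counts obtained by truncating to an integer; the rounding rule and the helper name are assumptions, not something this page specifies.

```python
def ms_to_samples(sampling_frequency, duration_ms):
    """Convert a duration in milliseconds to a number of samples.

    Assumption: the result is truncated towards zero.
    """
    return int(sampling_frequency * duration_ms / 1000.0)


# Default window settings (20 ms length, 10 ms shift) at two sampling rates.
win_length_8k = ms_to_samples(8000.0, 20.0)
win_shift_8k = ms_to_samples(8000.0, 10.0)
win_length_44k = ms_to_samples(44100.0, 20.0)
```

At 8 kHz the defaults correspond to 160 and 80 samples; at 44.1 kHz a 20 ms window spans 882 samples.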

with_delta¶: Tells if we add the first derivatives to the output feature

with_delta_delta¶: Tells if we add the second derivatives to the output feature

with_energy¶: Tells if we add the energy to the output feature

class bob.ap.Energy(sampling_frequency[, win_length_ms=20.[, win_shift_ms=10.[, normalize_mean=True]]]) → new Energy¶

Bases: bob.ap.FrameExtractor

Energy(other) -> new Energy

Objects of this class, after configuration, can extract the energy of frames extracted from a 1D audio array/signal.

Parameters:

sampling_frequency: [float] the sampling frequency/frequency rate
win_length_ms: [float] the window length in milliseconds
win_shift_ms: [float] the window shift in milliseconds
normalize_mean: [bool] tells whether each frame is normalized by subtracting its mean (True) or by dividing by max_range (False). True is the default value.
other: [Energy] an object that is, or inherits from, Energy; it is deep-copied into a new instance.

energy_floor¶: The energy flooring threshold
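
A flooring threshold typically guards the logarithm against silent frames. The sketch below computes a log-energy per frame and clamps the energy to the floor before taking the log. That the energy is the sum of squared samples and that a natural logarithm is used are assumptions for illustration; the helper name is hypothetical.

```python
import math


def log_energy(frame, energy_floor=1.0):
    """Log of the frame energy, with the energy clamped to energy_floor."""
    energy = sum(x * x for x in frame)
    # Without the floor, an all-zero frame would require log(0).
    return math.log(max(energy, energy_floor))


silent = log_energy([0.0, 0.0, 0.0])
loud = log_energy([3.0, 4.0])
```

With a floor of 1.0 the silent frame yields 0.0 instead of an error, and the second frame yields log(25).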

get_shape(input) → tuple¶

Computes the shape of the output features, given either the size of an input array or the input array itself.

Parameters:

input: [int|array] Either an integral value or an array for which the output shape of this extractor is going to be computed.

This method always returns a 2-tuple containing the shape of output features produced by this extractor.

normalize_mean¶: Tells whether each frame is normalized by subtracting its mean (True) or by dividing by max_range (False)
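
The two normalization modes can be sketched as follows. The value of max_range is not given on this page; 32768.0 (the magnitude range of 16-bit PCM samples) is used below purely as a placeholder assumption, and the helper name is hypothetical.

```python
def normalize_frame(frame, normalize_mean=True, max_range=32768.0):
    """Normalize one frame in one of the two documented ways.

    max_range=32768.0 is an assumed placeholder, not a documented value.
    """
    if normalize_mean:
        mean = sum(frame) / len(frame)
        return [x - mean for x in frame]
    return [x / max_range for x in frame]


centered = normalize_frame([1.0, 2.0, 3.0])                    # mean removed
scaled = normalize_frame([16384.0], normalize_mean=False)      # range scaling
```

Mean subtraction removes any DC offset from the frame, while dividing by max_range only rescales the samples and leaves the offset in place.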

sampling_frequency¶: The sampling frequency/frequency rate

win_length¶: The normalized window length w.r.t. the sample frequency

win_length_ms¶: The window length of the analysis in milliseconds

win_shift¶: The normalized window shift w.r.t. the sample frequency

win_shift_ms¶: The window shift of the analysis in milliseconds

class bob.ap.FrameExtractor(sampling_frequency[, win_length_ms=20.[, win_shift_ms=10.[, normalize_mean=True]]]) → new FrameExtractor¶

Bases: object

FrameExtractor(other) -> new FrameExtractor

This class is a base type for classes that perform audio processing on a frame basis. It can be instantiated from Python.

Objects of this class, after configuration, can extract audio frames from a 1D audio array/signal. You can instantiate objects of this class by passing a set of construction parameters or another object whose base type is FrameExtractor.

Parameters:

sampling_frequency: [float] the sampling frequency/frequency rate
win_length_ms: [float] the window length in milliseconds
win_shift_ms: [float] the window shift in milliseconds
normalize_mean: [bool] tells whether each frame is normalized by subtracting its mean (True) or by dividing by max_range (False). True is the default value.
other: [FrameExtractor] an object that is, or inherits from, FrameExtractor; it is deep-copied into a new instance.
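
Frame-based processing means cutting the signal into overlapping windows of win_length samples, each starting win_shift samples after the previous one. A minimal sketch of that slicing, assuming that trailing samples which do not fill a whole window are dropped (the function name is hypothetical and not part of bob.ap):

```python
def extract_frames(signal, win_length, win_shift):
    """Cut a 1D sequence into overlapping frames of win_length samples."""
    if len(signal) < win_length:
        return []
    n_frames = 1 + (len(signal) - win_length) // win_shift
    return [signal[i * win_shift:i * win_shift + win_length]
            for i in range(n_frames)]


# Ten samples, windows of 4 samples, shifted by 2 samples.
frames = extract_frames(list(range(10)), win_length=4, win_shift=2)
```

Here frames holds four windows, [0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7] and [6, 7, 8, 9], each overlapping its neighbour by half.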

get_shape(input) → tuple¶

Computes the shape of the output features, given either the size of an input array or the input array itself.

Parameters:

input: [int|array] Either an integral value or an array for which the output shape of this extractor is going to be computed.

This method always returns a 2-tuple containing the shape of output features produced by this extractor.

normalize_mean¶: Tells whether each frame is normalized by subtracting its mean (True) or by dividing by max_range (False)

sampling_frequency¶: The sampling frequency/frequency rate

win_length¶: The normalized window length w.r.t. the sample frequency

win_length_ms¶: The window length of the analysis in milliseconds

win_shift¶: The normalized window shift w.r.t. the sample frequency

win_shift_ms¶: The window shift of the analysis in milliseconds

class bob.ap.Spectrogram(sampling_frequency[, win_length_ms=20.[, win_shift_ms=10.[, n_filters=24[, f_min=0.[, f_max=4000.[, pre_emphasis_coeff=0.95[, mel_scale=True[, normalize_mean=True[, rect_filter=False[, inverse_filter=False[, normalize_spectrum=False[, ssfc_features=False[, scfc_features=False[, scmc_features=False]]]]]]]]]]]]]]) → new Spectrogram¶

Bases: bob.ap.Energy

Spectrogram(other) -> new Spectrogram

Objects of this class, after configuration, can extract the spectrogram from a 1D audio array/signal.

Parameters:

sampling_frequency: [float] the sampling frequency/frequency rate
win_length_ms: [float] the window length in milliseconds
win_shift_ms: [float] the window shift in milliseconds
n_filters: [int] the number of filter bands
f_min: [double] the minimum frequency of the filter bank
f_max: [double] the maximum frequency of the filter bank
pre_emphasis_coeff: [double] the coefficient used for the pre-emphasis
mel_scale: [bool] tells whether cepstral features are extracted on a linear scale (LFCC, set it to False) or on the Mel scale (MFCC, set it to True, the default)
normalize_mean: [bool] tells whether each frame is normalized by subtracting its mean (True) or by dividing by max_range (False). True is the default value.
rect_filter: [bool] tells whether cepstral features are extracted using a rectangular filter (set it to True), i.e., RFCC features, instead of the default filter. False is the default value.
inverse_filter: [bool] tells whether to apply the filter in inverse order, i.e., from high frequencies to low (set it to True). False is the default value.
normalize_spectrum: [bool] Tells whether to normalize the power spectrum of the signal. The default value is False.
ssfc_features: [bool] set it to True to compute Subband Spectral Flux Coefficients (SSFC), which measure the frame-by-frame change in the power spectrum
scfc_features: [bool] set it to True to compute Spectral Centroid Frequency Coefficients (SCFC), which capture detailed information about subbands, similar to formant frequencies
scmc_features: [bool] set it to True to compute Spectral Centroid Magnitude Coefficients (SCMC), which capture detailed information about subbands, similar to SCFC features
other: [Spectrogram] an object that is, or inherits from, Spectrogram; it is deep-copied into a new instance.

energy_bands¶: Tells whether we compute a spectrogram or energy bands

energy_filter¶: Tells whether we use the energy or the square root of the energy

energy_floor¶: The energy flooring threshold

f_max¶: The maximum frequency of the filter bank

f_min¶: The minimum frequency of the filter bank

get_shape(input) → tuple¶

Computes the shape of the output features, given either the size of an input array or the input array itself.

Parameters:

input: [int|array] Either an integral value or an array for which the output shape of this extractor is going to be computed.

This method always returns a 2-tuple containing the shape of output features produced by this extractor.

inverse_filter¶: Tells whether the filter is applied in inverse order (from high frequencies to low) when cepstral features are extracted

log_filter¶: Tells whether we use the log triangular filter or the triangular filter

mel_scale¶: Tells whether cepstral features are extracted on a linear (LFCC) or Mel (MFCC) scale

n_filters¶: The number of filter bands

normalize_mean¶: Tells whether each frame is normalized by subtracting its mean (True) or by dividing by max_range (False)

normalize_spectrum¶: Tells whether the power spectrum of the signal is normalized

pre_emphasis_coeff¶: The coefficient used for the pre-emphasis

rect_filter¶: Tells whether cepstral features are extracted using a rectangular filter
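
The difference between a rectangular filter and a triangular one (the usual choice for filter banks, assumed here to be the default) lies in the weight each frequency receives inside a band. The sketch below is illustrative only: the function name and the choice to exclude the band edges themselves are assumptions, not documented bob.ap behaviour.

```python
def filter_weight(f, left, center, right, rect_filter=False):
    """Weight given to frequency f by one band spanning (left, right).

    Triangular: rises linearly from left to center, then falls to right.
    Rectangular: constant weight of 1 inside the band.
    """
    if f <= left or f >= right:
        return 0.0
    if rect_filter:
        return 1.0
    if f <= center:
        return (f - left) / (center - left)
    return (right - f) / (right - center)


# A band from 100 Hz to 300 Hz centred at 200 Hz, probed at 150 Hz.
tri = filter_weight(150.0, 100.0, 200.0, 300.0)
rect = filter_weight(150.0, 100.0, 200.0, 300.0, rect_filter=True)
```

The triangular filter gives the probe frequency half weight (0.5) because it lies halfway up the rising slope, while the rectangular filter gives it full weight (1.0).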

sampling_frequency¶: The sampling frequency/frequency rate

scfc_features¶: Set to True to compute SCFC features

scmc_features¶: Set to True to compute SCMC features

ssfc_features¶: Set to True to compute SSFC features
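
As a rough illustration of what "frame-by-frame change in the power spectrum" can mean, the sketch below computes a textbook spectral flux between two consecutive power spectra: each spectrum is scaled to unit length, then the squared differences are summed. Whether bob.ap normalizes in this way, and how it splits the spectrum into subbands, is not stated on this page, so this is an assumption-laden sketch with a hypothetical helper name.

```python
import math


def spectral_flux(prev_power, curr_power):
    """Squared distance between two unit-normalized power spectra."""
    def unit(values):
        norm = math.sqrt(sum(v * v for v in values))
        # An all-zero spectrum is left unchanged to avoid dividing by zero.
        return [v / norm for v in values] if norm > 0 else list(values)

    prev_unit, curr_unit = unit(prev_power), unit(curr_power)
    return sum((c - p) ** 2 for c, p in zip(curr_unit, prev_unit))


same_shape = spectral_flux([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # only louder
moved = spectral_flux([1.0, 0.0], [0.0, 1.0])                 # energy moved
```

Because of the normalization, a frame that is merely louder than its predecessor produces a flux of approximately 0, whereas energy moving entirely from one bin to another produces the maximum value of 2.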

win_length¶: The normalized window length w.r.t. the sample frequency

win_length_ms¶: The window length of the cepstral analysis in milliseconds

win_shift¶: The normalized window shift w.r.t. the sample frequency

win_shift_ms¶: The window shift of the cepstral analysis in milliseconds
