Tutorial 6: Audio feature analysis
By default, when creating a Corpus
, the feature analysis performed on the audio
sources is an MFCC (Mel Frequency Cepstral Coefficient) analysis.
This is meant to capture the timbral characteristics of the audio sources, and
use that as a criteria for matching each segment from the target in a Mosaic
,
with possible segments from the audio sources in a Corpus
.
However, we can set which features to use when creating a Corpus
. Currently,
the available features are:
"timbre"
: equivalent to an MFCC analysis."pitch"
: equivalent to a chroma analysis.
In short, we can create a Corpus
based on timbre, pitch, or both.
The decision on which features to use will greatly depend on the types of targets
you want to use.
Here’s a quick example:
from gamut.features import Corpus
# set audio source(s) for corpus
source = '/path/to/source/audio/folder-or-file'
# create corpus based on pitch content
pitch_based_corpus = Corpus(source=source, features=['pitch'])
# create corpus based on timbral content
timbre_based_corpus = Corpus(source=source, features=['timbre'])
# create corpus based on pitch AND timbral content
pitch_timbre_based_corpus = Corpus(source=source, features=['pitch', 'timbre'])
Note
Specifying which features to use has important implications, since Mosaic
instances are created
based on the same features used by the input Corpus
instance(s).
Similarly, when combining Corpus
instances as input for a Mosaic
, it will only work if all of them were created
on the same features.