This paper introduces Dynamic Timbre Tracking (DTT), a framework for modeling the temporal evolution of acoustic parameters in continuous speech. DTT extends traditional static acoustic parameters into continuous trajectories via a unified state-space formulation, comprising three components: feature extraction, continuous-state timbre modeling, and temporal dynamics characterization. Evaluated on benchmarks for speaker identification, emotion recognition, and voice quality assessment, the method achieves performance competitive with DNN embeddings.
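The paper's exact state-space formulation is not given in this summary. As an illustrative sketch only, continuous-state tracking of a single noisy acoustic parameter (e.g. a spectral centroid trajectory) can be framed as a 1-D random-walk Kalman filter; the process-noise `q` and observation-noise `r` values below are hypothetical, not taken from the paper.

```python
import numpy as np

def track_timbre(obs, q=1e-2, r=1e-1):
    """Sketch of continuous-state timbre tracking: a 1-D linear-Gaussian
    (Kalman) filter with a random-walk state model, smoothing one noisy
    frame-level acoustic parameter. q and r are illustrative values."""
    x, p = obs[0], 1.0            # initial state estimate and variance
    states = []
    for z in obs:
        p = p + q                 # predict: random-walk state transition
        k = p / (p + r)           # Kalman gain
        x = x + k * (z - x)       # correct with observation z
        p = (1 - k) * p
        states.append(x)
    return np.array(states)

# Toy "timbre trajectory": a slow sinusoid corrupted by frame-level noise
t = np.linspace(0, 1, 200)
rng = np.random.default_rng(0)
noisy = np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal(200)
smooth = track_timbre(noisy)
```

In steady state this reduces to exponential smoothing with gain `k`, which is one reason such filters have negligible inference cost compared to DNN embeddings.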
Key findings
DTT extends static acoustic parameters into continuous trajectories for timbre analysis.
The framework includes multi-resolution feature extraction and phoneme-aware modeling.
DTT characterizes rate-of-change, acceleration, and interaction patterns among timbre dimensions.
The method offers explicit interpretability and negligible inference cost compared to DNN embeddings.
DTT achieves competitive performance on established benchmarks for speech analysis tasks.
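The rate-of-change and acceleration patterns mentioned above are commonly computed as regression-based delta features over a short window of frames (the HTK-style formula); applying the operator twice yields acceleration (delta-delta) coefficients. This is a generic sketch, not the paper's own implementation, and the window half-width `N=2` is an assumed default.

```python
import numpy as np

def deltas(feat, N=2):
    """Regression-based delta (rate-of-change) of a (frames, dims)
    trajectory over a window of +/-N frames; deltas(deltas(x)) gives
    acceleration (delta-delta) features."""
    denom = 2 * sum(n * n for n in range(1, N + 1))
    padded = np.pad(feat, ((N, N), (0, 0)), mode="edge")  # replicate edges
    T = feat.shape[0]
    out = np.zeros_like(feat, dtype=float)
    for n in range(1, N + 1):
        out += n * (padded[N + n:N + n + T] - padded[N - n:N - n + T])
    return out / denom

traj = np.arange(100.0)[:, None]  # toy 1-D trajectory: a linear ramp
d = deltas(traj)                  # rate-of-change (slope 1 away from edges)
dd = deltas(d)                    # acceleration (0 for a linear ramp)
```

Cross-dimensional interaction patterns could then be summarized by, e.g., windowed covariances between the delta trajectories of different timbre dimensions.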
Limitations & open questions
The paper does not discuss the scalability of DTT for very large datasets.
The generalization of DTT to other languages and accents is not addressed.