A noninvasive methodology for studying the kinematics of speech production is presented. It is based on tracking very small, lightweight passive markers attached to the subject's face. Using a pair of TV cameras, dedicated hardware computes the 3-D positions of the markers in real time with subpixel accuracy. From these data, the time course of a set of parameters describing lip and jaw movement is computed; in addition, a semiautomatic procedure has been developed to identify the exact onset and offset of the investigated sequences. To compare results across different productions, a time normalization procedure based on a continuous inverse Fourier transform has been implemented.
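
The time normalization step can be illustrated with a minimal sketch. The abstract does not give the details of the procedure, so the code below is an assumption rather than the authors' implementation: it takes the discrete Fourier coefficients of a uniformly sampled trajectory and evaluates the inverse transform at arbitrary points on a normalized time axis [0, 1), which is ordinary trigonometric interpolation. The function name `fourier_time_normalize` and the output length of 100 samples are hypothetical choices.

```python
import numpy as np

def fourier_time_normalize(x, n_out):
    """Resample a uniformly sampled 1-D trajectory onto n_out points
    of a normalized time axis [0, 1) by evaluating its inverse
    Fourier series at continuous time values (assumed approach)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    coeffs = np.fft.rfft(x)           # one-sided spectrum of the real signal
    k = np.arange(coeffs.size)        # harmonic indices 0 .. n // 2
    t = np.arange(n_out) / n_out      # normalized time in [0, 1)

    # Each non-DC harmonic stands for a conjugate pair, so it counts twice;
    # the DC term and (for even n) the Nyquist term count once.
    weights = np.full(coeffs.size, 2.0)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[-1] = 1.0

    # Evaluate the complex exponentials at the requested times
    # and sum the weighted harmonics.
    basis = np.exp(2j * np.pi * np.outer(t, k))   # shape (n_out, n_harmonics)
    return (basis @ (weights * coeffs)).real / n

# Usage: two productions of the same gesture with different durations
# (40 and 65 samples) are mapped onto a common 100-point time axis.
short = np.cos(2 * np.pi * 2 * np.arange(40) / 40)
long_ = np.cos(2 * np.pi * 2 * np.arange(65) / 65)
short_norm = fourier_time_normalize(short, 100)
long_norm = fourier_time_normalize(long_, 100)
```

Because the Fourier series treats each segment as one period of a periodic signal, a real trajectory whose first and last samples differ would show ringing near the edges; removing a linear trend before the transform and restoring it afterwards is one common way to reduce this, although the abstract does not say whether such a step was used.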