Humans are accustomed to interacting with each other by sound, and our everyday listening skills are well developed for extracting information from our environment. In this paper, a technique for inferring the configuration of a clapper's hands from a hand clapping sound is described. The method was developed based on the analysis of synthetic and recorded hand clap sounds, labeled with the corresponding hand configurations. A naïve Bayes classifier was constructed to automatically classify the data using two different feature sets. The results indicate that the approach is applicable for inferring the hand configuration.

The overall performance of the filter-coefficient classification was affected by windowing the analysis frame: without windowing, the performance was 64.4 %. For real claps, we performed a randomized cross-validation procedure in which the recorded data was divided into separate training and validation sets, with the probability of a clap event belonging to the test set being 0.33. The classifier was trained with the training data and tested with the validation data in 20 successive runs with differently selected training and validation sets, and the obtained results were averaged. Although these results are worse than those for the synthetic data, they are still well above chance level. For test subject 2 the results are good, with a correct classification rate of 64 %. Comparison with the synthetic case shows that the systematic overlaps between the classes differ from those of the synthetic data; this discrepancy is presumably due to the synthesis model. The results for the reverberant signals were well aligned with those for the signals without reverberation, and the artificial reverberation did not noticeably affect the results in any of the cases. This is a promising result, considering that any real-life environment incorporates some degree of reverberation.
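The randomized cross-validation procedure described above can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the function names, the `train_and_score` callback, and the toy data are all assumptions; only the split probability (0.33) and the number of runs (20) come from the text.

```python
import random

def cross_validate(events, labels, train_and_score, p_test=0.33, runs=20, seed=0):
    """Randomized cross-validation: each clap event is assigned to the
    test set independently with probability p_test, the classifier is
    trained on the remaining events, and the score is averaged over
    `runs` differently selected splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        in_test = [rng.random() < p_test for _ in events]
        train = [(e, l) for e, l, t in zip(events, labels, in_test) if not t]
        test = [(e, l) for e, l, t in zip(events, labels, in_test) if t]
        if not train or not test:
            continue  # degenerate split; skip it
        scores.append(train_and_score(train, test))
    return sum(scores) / len(scores)
```

Any classifier can be plugged in through `train_and_score`, which receives the training and validation sets as lists of (event, label) pairs and returns a classification rate.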
We recorded sequences of real hand claps in an ITU-R BS.1116 standard listening room with a reverberation time of 0.3 s. Two male subjects, A and B, performed 20 claps of each type. In addition, a sequence of flamenco-type claps was recorded by one of the subjects, with hand configurations resembling the clap types A1+ and A3 in Fig. The training set was not part of the test data.

To provide a reference for the classifier performance, the results for the synthetic data are presented in Table 2 for both the FFT bins and the IIR coefficients as features. In the table, the rows correspond to the actual hand configuration and the columns to the automatic classification result; the numbers give the proportion of instances of each actual class assigned to each predicted class, so the diagonal elements show how often each clap type is labeled correctly as its own class. The overall performance of the magnitude spectrum bin classification was 71.7 %, and that of the filter coefficient classification was 69.9 %. From Table 2, we can see that the classification accuracy varies between classes. The best results are obtained with clap type A1+, which is classified correctly in over 90 % of all instances in both cases. The clap types P3, A3, and A1- also reach an accuracy of more than 80 % in the FFT bin case. There is systematic misclassification of class P1 as A2, and vice versa, which is in line with the original inspection of the class overlaps in Fig. A closer look at the results shows that if classes P1 and A2 were clustered into one class, and P2 and A1 into another, the results would improve. Indeed, even for the synthetic data, these classification results suggest a different taxonomy for the hand configurations. We leave the construction of such a taxonomy for future work.
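The reading of the confusion table, and the effect of merging the overlapping classes, can be illustrated with a small sketch. The counts below are invented purely to mimic the P1/A2 and P2/A1 overlaps discussed above; the real figures are those of Table 2, and the simplified six-class set is an assumption of this sketch.

```python
import numpy as np

classes = ["P1", "P2", "P3", "A1", "A2", "A3"]
# Hypothetical counts (rows: actual class, columns: predicted class).
conf = np.array([
    [11,  1,  1,  1,  5,  1],  # P1, often labeled A2
    [ 1, 12,  1,  5,  0,  1],  # P2, often labeled A1
    [ 0,  1, 17,  1,  0,  1],  # P3
    [ 1,  5,  0, 13,  0,  1],  # A1, often labeled P2
    [ 6,  0,  1,  1, 11,  1],  # A2, often labeled P1
    [ 1,  1,  1,  0,  1, 16],  # A3
])

# Diagonal over row totals: how often each type lands in its own class.
per_class = np.diag(conf) / conf.sum(axis=1)
overall = np.trace(conf) / conf.sum()

# Merge the systematically overlapping classes: {P1, A2} and {P2, A1}.
cluster = {"P1": 0, "A2": 0, "P2": 1, "A1": 1, "P3": 2, "A3": 3}
merged = np.zeros((4, 4))
for i, ci in enumerate(classes):
    for j, cj in enumerate(classes):
        merged[cluster[ci], cluster[cj]] += conf[i, j]
merged_overall = np.trace(merged) / merged.sum()
```

With overlapping classes fused, the off-diagonal P1/A2 and P2/A1 mass moves onto the diagonal, so `merged_overall` exceeds `overall`, which is exactly the observation motivating an alternative taxonomy.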
To evaluate the classification technique, several experiments were conducted. We first evaluated the classification approach with synthetic data and then proceeded with real hand clap recordings. To test the classification technique presented in Section 3, we generated synthetic data sets with the ClaPD synthesis engine presented in Section 2.1. As a training set, we used a 60-second sequence of synthetic claps without reverberation; the set consisted of 190 claps of randomized clap types. For testing, we also generated four 30-second sequences of randomized claps without reverberation and two 30-second sets with artificial reverberation (freeverb~). Recall that, given a sequence of observations y_1, ..., y_N, we can compute the log-likelihood of each conditional distribution p(Y | C = c) and select the maximum-likelihood class as the most likely class for these observations.
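This maximum-likelihood decision rule can be sketched as follows. For illustration only, the sketch assumes univariate Gaussian class-conditional densities with made-up class parameters; the paper's actual features are FFT magnitude bins or IIR filter coefficients, and the class names below are merely examples.

```python
import math

def log_likelihood(ys, mean, var):
    """Sum of log p(y_n | C = c) over the observations, assuming (for
    this sketch) a univariate Gaussian class-conditional density."""
    return sum(-0.5 * (math.log(2 * math.pi * var) + (y - mean) ** 2 / var)
               for y in ys)

def classify(ys, class_params):
    """Select the class c maximizing the log-likelihood of y_1..y_N."""
    return max(class_params,
               key=lambda c: log_likelihood(ys, *class_params[c]))

# Hypothetical (mean, variance) parameters for two clap classes.
params = {"A1+": (0.0, 1.0), "P1": (3.0, 1.0)}
```

For instance, `classify([2.8, 3.1, 2.9], params)` picks the class whose density best explains the whole observation sequence, rather than voting clap by clap.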