In the experiments, we employed one hundred films for training and testing the film-DNN. The films were arbitrarily retrieved from YouTube and were produced in many countries and regions, including Germany, America, and France. Each film belongs to one of five categories: horror thrillers, science fiction, action, literature-love (romance), and comedy. Every category includes twenty-five films, so the categories are evenly balanced. In the training phase, 75 films were used as the training set for the film-DNN. The remaining 25 films were used as the validation set to evaluate the performance of the proposed film-DNN in recognizing film types.
Initially, we employed 25 films for training and testing the film-DNN as a preliminary experiment. Fig. 9 shows the relationship between the number of epochs and the loss value in training the film-DNN. As the number of epochs increases, the loss decreases. The loss cannot be reduced significantly further once the number of epochs exceeds 4000. Therefore, the number of epochs is set to 4000 in the experiments. Fig. 10 shows the relationship between the number of epochs and the accuracy rate in training the film-DNN. The performance improves as the number of neurons in a layer increases. Once a layer has more than twenty neurons, the performance cannot be improved further. Accordingly, twenty neurons are used for each layer in the experiments.
Fig. 11 shows the recognition scores obtained for each film type by using the film-DNN. The best recognition scores in the categories of horror thrillers, action, science fiction, and literature-love (romance) films are much greater than those of the other film types. This indicates that the deviation and mean of HSV are suitable feature parameters for the recognition of film types.
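The mean and deviation features mentioned above can be illustrated with a minimal Python sketch. It assumes that each frame is available as a list of (r, g, b) pixels with components in 0..255 and that the deviation is the population standard deviation. The function name `hsv_features` and the toy frames are hypothetical, and the sketch covers only the mean and deviation, not any other feature used by the film-DNN.

```python
import colorsys
from statistics import mean, pstdev


def hsv_features(frames):
    """Return [mean_H, mean_S, mean_V, std_H, std_S, std_V] for one film.

    `frames` is an iterable of sampled frames; each frame is a list of
    (r, g, b) pixels with components in 0..255 (an assumed layout).
    """
    h_vals, s_vals, v_vals = [], [], []
    for frame in frames:
        for r, g, b in frame:
            # colorsys expects and returns values in the range 0..1.
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            h_vals.append(h)
            s_vals.append(s)
            v_vals.append(v)
    channels = (h_vals, s_vals, v_vals)
    # Note: hue is treated here as a plain number, not as a circular angle.
    return [mean(c) for c in channels] + [pstdev(c) for c in channels]


# Two tiny 2-pixel "frames": red and black, then red and white.
toy_film = [[(255, 0, 0), (0, 0, 0)], [(255, 0, 0), (255, 255, 255)]]
features = hsv_features(toy_film)
print(features)
```

In practice the frames would come from a video decoder and the statistics would be computed over far more pixels; the six-element vector is only one plausible layout for the network input.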
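The precision rate, recall rate, and F-measure are employed to assess the recognition performance of the proposed film-DNN. The precision rate of film type recognition is defined as

$$p = \frac{\text{number of films correctly recognized as a given type}}{\text{number of films recognized as that type}} \times 100\% \tag{9}$$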
In (9), the precision rate p (in %) is high when the number of correctly recognized film types increases. Conversely, the precision rate p% is low when the number of correctly recognized film types decreases.
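The recall rate reflects the fraction of films whose types are correctly recognized. It is defined by

$$r = \frac{\text{number of films correctly recognized as a given type}}{\text{number of films that belong to that type}} \times 100\% \tag{10}$$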
In (10), the higher the number of correctly recognized film types, the larger the recall rate r%, which indicates better performance in the recognition of film types. To consider precision and recall simultaneously when assessing the overall performance of a film type recognition system, we employ the F-measure, given as

$$\text{F-measure} = \frac{2\,p\,r}{p + r} \tag{11}$$
In (11), the F-measure is high if the recall rate r% in (10) and the precision rate p% in (9) are both high. Conversely, the F-measure decreases when the recall rate r% and the precision rate p% are low. Accordingly, the F-measure is utilized to evaluate the overall performance of the proposed film-DNN for the recognition of film types. The performance in terms of the recall rate r%, precision rate p%, and F-measure is presented in Table 1.
| Film type | Precision rate (%) | Recall rate (%) | F-measure (%) |
|---|---|---|---|
| Action | 83.33 | 100.00 | 90.91 |
| Comedy | 100.00 | 60.00 | 75.00 |
| Literature-love | 83.33 | 100.00 | 90.91 |
| Science fiction | 100.00 | 100.00 | 100.00 |
| Horror thrillers | 100.00 | 100.00 | 100.00 |
Table 1. Comparison of recognition results for various film types in terms of precision rate, recall rate, and F-measure
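The three measures can be computed from per-type counts, as the following Python sketch illustrates. The counts are hypothetical: they were chosen only because they reproduce the action and comedy rows of Table 1 under the assumption of five validation films per type, and they are not taken from the paper. The function name `precision_recall_f` is likewise illustrative.

```python
def precision_recall_f(correct, predicted, actual):
    """Return (precision %, recall %, F-measure %) for one film type.

    correct   -- films of this type that were recognized as this type
    predicted -- all films recognized as this type
    actual    -- all films that truly belong to this type
    """
    p = 100.0 * correct / predicted if predicted else 0.0
    r = 100.0 * correct / actual if actual else 0.0
    # Harmonic mean of precision and recall.
    f = 2.0 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f


# Hypothetical counts (assumed, not reported in the source).
action = precision_recall_f(correct=5, predicted=6, actual=5)
comedy = precision_recall_f(correct=3, predicted=3, actual=5)

print([round(x, 2) for x in action])  # [83.33, 100.0, 90.91]
print([round(x, 2) for x in comedy])  # [100.0, 60.0, 75.0]
```

The comedy row shows why the F-measure is informative: a perfect precision rate is pulled down to 75% overall because two of the five comedies are missed.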
In terms of the precision rate, horror thrillers, science fiction, and comedy films are recognized with a precision rate of 100%. Literature-love and action films are also recognized with a high precision rate of 83.33%. The proposed film-DNN successfully classifies horror thrillers, action, science fiction, and literature-love films; the recall rate of these types reaches 100%. The F-measure of the proposed film-DNN is at least 90.91% for every film type except comedy (75.00%), and it reaches 100% for horror thrillers and science fiction. Consequently, the performance of the proposed film-DNN is acceptable.
The results presented in Table 1 and Fig. 11 show that the proposed approach can effectively recognize the film types. Therefore, the film types can be recognized well by using the standard deviation and mean of HSV, together with the ratios of brightness and saturation, as the training and validation features given in (8) for the film-DNN. Finally, all films were separated into a training set of 75 films and a validation set of 25 films to test the performance of the film-DNN. Each type has twenty films as a training set and five films as a validation set. The number of films for each type is evenly balanced and distributed over the training and testing sets. Because a film has rich content, it could belong to two or more film types. Accordingly, we evaluated the performance of film type recognition by using the Top 1, Top 2, and Top 3 recognized results, where Top 1 denotes the category with the highest similarity. The recall rates are 52%, 76%, and 100% for the Top 1, Top 2, and Top 3 criteria, respectively. These results mean that the target film type is always found among the recognized results when the Top 3 criterion is used.
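The Top 1, Top 2, and Top 3 criteria can be made concrete with a short Python sketch. It assumes that the network produces one score per film type for every film and that a film counts as recognized under the Top k criterion when its target type is among the k highest scores. The function name `top_k_recall`, the three-type setup, and all scores below are hypothetical and are not taken from the experiments.

```python
def top_k_recall(score_rows, targets, k):
    """Percentage of films whose target type is among the k highest scores.

    score_rows -- one dict per film mapping film type -> output score
    targets    -- the target film type of each film, in the same order
    """
    hits = 0
    for scores, target in zip(score_rows, targets):
        # Film types ordered from the highest score to the lowest.
        ranked = sorted(scores, key=scores.get, reverse=True)
        if target in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(targets)


# Hypothetical output scores for three films (illustrative only).
score_rows = [
    {"action": 0.7, "comedy": 0.1, "romance": 0.2},
    {"action": 0.1, "comedy": 0.3, "romance": 0.6},
    {"action": 0.5, "comedy": 0.2, "romance": 0.3},
]
targets = ["action", "comedy", "comedy"]

results = {k: top_k_recall(score_rows, targets, k) for k in (1, 2, 3)}
print(results)
```

With these toy scores the recall rises from one film in three under Top 1 to all three under Top 3, mirroring the way the reported recall grows as the criterion is relaxed.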
As shown in Fig. 11, the recognition accuracy for comedy films is lower than that for the other film types. The errors are caused by the romance category, which obtains the highest score at the output of the film-DNN when the target film is a comedy. Because romance and comedy films have similar properties, their color styles are also similar. In addition, the HSV properties of films vary significantly, and the color style also depends on the director of the film. It is therefore hard to obtain a high accuracy rate for the recognition of film types, in particular for a large number of films. Accordingly, the proposed method of film type recognition can be regarded as an auxiliary tool that helps users quickly find films of interest.
Table 2 compares the performance of the rule-based method and the proposed film-DNN. The two methods utilize the same features as given in (8). The rule-based method creates an analysis tree for film type recognition and requires setting an appropriate threshold for each feature according to the distribution characteristics of the films. Its performance is very sensitive to the selected thresholds, so the thresholds are difficult to define. By contrast, the proposed film-DNN learns the statistical properties of film colors, and no threshold needs to be defined for each feature. Thus, the proposed film-DNN is more effective and robust than the rule-based method. Although the recall rate of the rule-based method reaches 100%, its precision rate reaches only 83.33%, so the two measures are not balanced. Its overall quality in terms of the F-measure is 90.71%. By contrast, the precision and recall rates of the proposed film-DNN are 93.33% and 92%, respectively, so the two measures are well balanced. The F-measure of the proposed film-DNN (91.36%) is higher than that of the rule-based method (90.71%). Consequently, the proposed film-DNN is better and more robust than the rule-based method.
| Measures | Rule-based method (%) | Film-DNN (%) |
|---|---|---|
| Precision rate | 83.33 | 93.33 |
| Recall rate | 100.00 | 92.00 |
| F-measure | 90.71 | 91.36 |
Table 2. Performance comparison of film type recognition
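The film-DNN column of Table 2 appears to coincide with the unweighted mean of the per-type scores in Table 1. The source does not state how the overall figures were obtained, so the following Python sketch is an observation under that assumption rather than the documented procedure. The variable names are illustrative, and the per-type values are copied from Table 1 in the order action, comedy, literature-love, science fiction, horror thrillers.

```python
# Per-type scores (%) copied from Table 1.
table1 = {
    "precision": [83.33, 100.00, 83.33, 100.00, 100.00],
    "recall": [100.00, 60.00, 100.00, 100.00, 100.00],
    "f_measure": [90.91, 75.00, 90.91, 100.00, 100.00],
}

# Unweighted (macro) average over the five film types, rounded to two decimals.
macro = {name: round(sum(vals) / len(vals), 2) for name, vals in table1.items()}
print(macro)  # {'precision': 93.33, 'recall': 92.0, 'f_measure': 91.36}
```

Because every film type contributes equally to such an average, the low recall for comedy films lowers the overall recall rate to 92% even though the other four types reach 100%.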
Recognition of Film Type Using HSV Features on Deep-Learning Neural Networks
- Received Date: 2019-09-04
- Rev Recd Date: 2019-11-15
- Available Online: 2020-05-06
- Publish Date: 2020-03-01
- Keywords: Deep-learning; film type recognition; hue, saturation, and brightness value (HSV) analysis; neural networks; video classification
Abstract: The films available over the Internet and multimedia sources are numerous, and their contents are complex. It is time consuming for a viewer to select a favorite film. This paper presents an automatic recognition system for film types. First, a film is sampled into a sequence of frames. The color space, including hue, saturation, and brightness value (HSV), is analyzed for each sampled frame, and the deviation and mean of HSV are computed for each film. These features are utilized as inputs to a deep-learning neural network (DNN) for the recognition of film types. One hundred films are utilized to train and validate the model parameters of the DNN. In the testing phase, the trained DNN recognizes a film as one of five categories: action, comedy, horror thriller, romance, and science fiction. The experimental results reveal that the film types can be effectively recognized by the proposed approach, enabling the viewer to select an interesting film accurately and quickly.
Citation: Ching-Ta Lu, Jia-An Lin, Chia-Yi Chang, Chia-Hua Liu, Ling-Ling Wang, Kun-Fu Tseng. Recognition of Film Type Using HSV Features on Deep-Learning Neural Networks[J]. Journal of Electronic Science and Technology, 2020, 18(1): 31-41. doi: 10.11989/JEST.1674-862X.90904223