The availability of large data sets from brake durability tests paves the way for predictions and decisions related to brake noise. This work presents a workflow for detecting brake squeal and its main characteristics. First, a uniform data set with a consistent structure and format is generated. From the raw data recorded by the vehicle data acquisition system, a spectrogram is computed for each braking event, relating sound pressure level and noise frequency to time. These spectrograms are used to train the machine learning algorithm, which then recognizes brake noise from the spectrogram images. The final objective is to detect squeal and to identify its frequency, sound pressure level, and subjective rating.
3. MACHINE LEARNING ALGORITHM
In recent years the use of deep learning algorithms in computer vision has increased significantly. Deep learning is characterized by the use of neural networks with many layers to solve complex problems. In computer vision, one specific type of neural network is widely used: the Convolutional Neural Network (ConvNet).
ConvNets are neural networks able to learn local patterns in images, which is why they outperform densely connected networks on image tasks. ConvNets are currently the state of the art for image classification and detection because of two main properties. The first is translation invariance: once the network has learnt a pattern, it can find that pattern anywhere in an image, even if the pattern always appeared in the bottom-right corner of the training images. The second is the ability to learn spatial hierarchies of patterns, from small to large: the first convolutional layer learns small local patterns, the second layer combines them into larger patterns, and so on.
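The translation property can be illustrated with a minimal sketch. It assumes a single hand-written 3x3 filter standing in for a learned pattern; the filter, the image size, and the helper names are hypothetical and are not taken from the trained network. The same filter responds most strongly wherever the pattern is placed.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation, the core operation of a conv layer."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Hypothetical "learned" pattern: a short horizontal line.
pattern = np.array([[0, 0, 0],
                    [1, 1, 1],
                    [0, 0, 0]], dtype=float)

def image_with_pattern(row, col, size=12):
    """Empty image with the pattern pasted with its top-left corner at (row, col)."""
    img = np.zeros((size, size))
    img[row:row + 3, col:col + 3] = pattern
    return img

def strongest_response(image):
    """Position (row, col) where the filter response is largest."""
    response = correlate2d_valid(image, pattern)
    return np.unravel_index(np.argmax(response), response.shape)

# The same filter finds the pattern near the top-left and near the bottom-right.
top_left = strongest_response(image_with_pattern(1, 2))
bottom_right = strongest_response(image_with_pattern(8, 7))
```

Because the filter weights are shared across all positions, nothing has to be relearned when the pattern moves; only the location of the peak response changes.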
This project aims to detect specific unwanted sounds generated while a vehicle is braking. Using computer vision to detect sounds may seem curious; however, given the rapid progress in the field, applying ConvNets to this problem is an interesting approach. As mentioned before, spectrograms of the braking process can be generated from the raw data, and these spectrograms can be used to detect the unwanted sound. In this project we implement an algorithm that detects only one specific undesired sound, called squeal.
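As an illustration of how a spectrogram can be computed from a raw recording, the following is a minimal sketch based on a short-time Fourier transform. It assumes the signal is a calibrated sound pressure in pascals and uses the standard reference pressure of 20 µPa; the frame length, hop size, sample rate, and test tone are hypothetical choices, not the settings used in this work.

```python
import numpy as np

def spectrogram_db(signal, sample_rate, frame_len=1024, hop=512, p_ref=20e-6):
    """Return (times, freqs, level_db) for a 1-D pressure signal in pascals."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Cut the signal into overlapping, windowed frames (rows are time steps).
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    # One-sided FFT of each frame; columns are frequency bins.
    spectrum = np.fft.rfft(frames, axis=1)
    # Scale so that a sine of amplitude A gives a peak of about A, then take RMS.
    amplitude = 2.0 * np.abs(spectrum) / window.sum()
    rms = amplitude / np.sqrt(2.0)
    # Sound pressure level in dB; the floor avoids log10(0) in silent bins.
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-12) / p_ref)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return times, freqs, level_db

# Synthetic example: 1 s of a 2 kHz tone with 0.2 Pa amplitude.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = 0.2 * np.sin(2 * np.pi * 2000.0 * t)
times, freqs, level_db = spectrogram_db(tone, sample_rate)
peak_freq = freqs[np.argmax(level_db.mean(axis=0))]
```

Plotting `level_db` with time on the horizontal axis and frequency on the vertical axis gives the kind of image the detector is trained on; a tonal noise then appears as a narrow band at `peak_freq`.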
The first strategy considered was image classification with ConvNets: for each spectrogram, a single output would state whether the image contains squeal or not. This approach is limited because it only indicates presence or absence. Besides the presence of a squeal, we may also want to know its frequency, its sound level (in decibels) and, within a braking action of 10 seconds, exactly where it starts and ends. Given these requirements, object detection and classification is more suitable, mainly because it locates the squeal on the spectrogram; from this location we can derive frequency, sound level, and duration. The network used for this task was the Region-Based Convolutional Neural Network (RCNN).
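To make the step from a detected box to physical quantities concrete, here is a minimal sketch. It assumes the spectrogram image has linear time and frequency axes that span the whole image, with the pixel origin at the top-left corner; the function name, image size, and axis ranges are hypothetical.

```python
def box_to_squeal_features(box, image_size, time_span_s, freq_span_hz):
    """Map a detected box in pixel coordinates to time and frequency limits.

    box: (xmin, ymin, xmax, ymax) in pixels, origin at the top-left corner.
    image_size: (width, height) of the spectrogram image in pixels.
    time_span_s: (t_min, t_max) covered by the horizontal axis.
    freq_span_hz: (f_min, f_max) covered by the vertical axis (linear scale).
    """
    xmin, ymin, xmax, ymax = box
    width, height = image_size
    t_min, t_max = time_span_s
    f_min, f_max = freq_span_hz
    seconds_per_px = (t_max - t_min) / width
    hz_per_px = (f_max - f_min) / height
    start_s = t_min + xmin * seconds_per_px
    end_s = t_min + xmax * seconds_per_px
    # Pixel rows grow downwards while frequency grows upwards.
    f_high = f_max - ymin * hz_per_px
    f_low = f_max - ymax * hz_per_px
    return {
        "start_s": start_s,
        "end_s": end_s,
        "duration_s": end_s - start_s,
        "freq_low_hz": f_low,
        "freq_high_hz": f_high,
        "freq_center_hz": 0.5 * (f_low + f_high),
    }

# Hypothetical detection on a 1000x400 image covering 10 s and 0 to 16 kHz.
features = box_to_squeal_features(
    box=(200, 100, 500, 130),
    image_size=(1000, 400),
    time_span_s=(0.0, 10.0),
    freq_span_hz=(0.0, 16000.0),
)
```

With the time and frequency limits known, the sound level of the event could be read from the underlying spectrogram values inside the box, for example as their maximum; that choice is an assumption and is not specified here.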
3.1. Data pre-processing
The most important step in implementing any computer vision algorithm is image processing, as the result of any machine learning algorithm depends on the quality of the input data.
For object detection and classification, the algorithm must be given both the image and the location of the object to be detected in it. We implement RCNN with TensorFlow, which takes two files as input for each sample: the image and an XML file containing information about the object to be located, such as its coordinates, class, and the image filename. The process of generating this file is called data annotation. Several software tools can be used for data annotation; a good one is LabelImg, which is free and written entirely in Python. Figure 2 shows an example of how data annotation is performed in LabelImg, and Figure 3 shows an example of the generated XML file.
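As a sketch of how such an annotation file can be read back, the snippet below parses a Pascal VOC style XML document, the format LabelImg writes by default, using only the Python standard library. The annotation content (file name, image size, class label, and box) is invented for illustration and does not come from the actual dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation; file name, image size, and box are invented.
EXAMPLE_XML = """<annotation>
    <filename>stop_0001.png</filename>
    <size><width>1000</width><height>400</height><depth>3</depth></size>
    <object>
        <name>squeal</name>
        <bndbox>
            <xmin>200</xmin><ymin>100</ymin><xmax>500</xmax><ymax>130</ymax>
        </bndbox>
    </object>
</annotation>"""

def read_annotation(xml_text):
    """Extract file name, image size, and labelled boxes from a VOC annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "label": obj.findtext("name"),
            "box": tuple(int(box.findtext(tag))
                         for tag in ("xmin", "ymin", "xmax", "ymax")),
        })
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }

annotation = read_annotation(EXAMPLE_XML)
```

Each `object` element carries one labelled box, so an image with several squeal events would simply contain several such elements.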
3.2. Faster RCNN (Region-Based Convolutional Neural Network)
In machine learning, the problem at hand has often already been partially or even completely solved by someone else. Most of the time it is not necessary to design machine learning algorithms from scratch, because models already trained on a similar task can be reused. This transfer of knowledge is called Transfer Learning. According to Andrew Ng, Transfer Learning will be the next driver of machine learning success. As stated by Mendes (2019), the future of machine learning relies on two main factors: Transfer Learning and open source code.
As stated before, we use TensorFlow to implement the object detection algorithm because it is an open source platform that offers several well-documented tools for implementing machine learning models.
The TensorFlow repository on GitHub provides several models pre-trained on existing datasets. Even though our task is quite different from the ones these models were trained for, we can take the pre-trained models and continue training them on our own dataset. After training a few models, the one with the best performance in our case was Faster RCNN with the Inception v2 feature extractor (Faster RCNN-Inception-v2).
RCNNs are ConvNets widely used in object detection, where the R refers to region proposals. A region proposal method suggests several regions of an image where objects could be. The original RCNN is computationally expensive, i.e., inference on an image takes a long time, because the convolutional layers are run for each proposed region. Faster RCNN, introduced in 2015, brought in the concept of the RPN (Region Proposal Network) and is reported to be about 250x faster than the original RCNN. This speed comes from the fact that the RPN shares its convolutional layers with the object detection network.
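The RPN scores a fixed set of reference boxes, called anchors, at every position of the shared feature map. The sketch below generates the anchors for one position, using the three scales and three aspect ratios described in the original Faster RCNN paper; the anchor settings actually used for the squeal detector are not stated here, so these values and the function name are assumptions.

```python
import math

def anchors_at(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchor boxes (xmin, ymin, xmax, ymax) centred on (cx, cy).

    Each anchor has an area of scale**2 pixels and a height/width ratio
    equal to `ratio`, so ratio 0.5 gives a wide box and 2.0 a tall one.
    """
    boxes = []
    for scale in scales:
        for ratio in ratios:
            w = scale / math.sqrt(ratio)
            h = scale * math.sqrt(ratio)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

# Nine anchors (3 scales x 3 ratios) around one hypothetical image position.
example_anchors = anchors_at(400.0, 300.0)
```

For each anchor, the RPN predicts whether it contains an object and how the box should be shifted and resized; only the best-scoring proposals are passed on to the detection head.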
As mentioned before, most of the time it is not necessary to build neural networks from scratch, since transfer learning can be used to solve a specific task. For our problem we use a ConvNet called Inception V2, designed in 2015 by Dr. Christian Szegedy, a Google researcher. It is a deep network that avoids drastically reducing the dimensions of its input and uses smart factorization of convolutions to increase computational efficiency.
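Assuming that "smart factorization" refers to replacing large convolution kernels with stacks of smaller ones, the saving can be checked with simple arithmetic. The sketch below counts weights only (biases are ignored), and the channel count of 64 is a hypothetical value.

```python
def conv_weights(kernel_h, kernel_w, channels_in, channels_out):
    """Number of weights in one convolutional layer, ignoring biases."""
    return kernel_h * kernel_w * channels_in * channels_out

channels = 64  # hypothetical number of input and output channels

# One 5x5 convolution versus two stacked 3x3 convolutions (same receptive field).
single_5x5 = conv_weights(5, 5, channels, channels)
two_3x3 = 2 * conv_weights(3, 3, channels, channels)
saving_5x5 = 1 - two_3x3 / single_5x5

# One 3x3 convolution versus a 1x3 convolution followed by a 3x1 convolution.
single_3x3 = conv_weights(3, 3, channels, channels)
asymmetric = conv_weights(1, 3, channels, channels) + conv_weights(3, 1, channels, channels)
saving_3x3 = 1 - asymmetric / single_3x3
```

In this example the two stacked 3x3 layers need 28% fewer weights than the single 5x5 layer, and the asymmetric pair needs one third fewer weights than the single 3x3 layer.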
The first step was to identify a data set to train the algorithm. A dedicated data set containing around 30,000 events was prepared for this purpose. The data was:
- Owned by Applus IDIADA (collected during internal development, investigation and internal quality assurance testing)
- Subjectively rated by master drivers properly qualified according to Applus IDIADA criteria
- Collected during standard brake noise durability testing in Mojacar
- Collected during testing on standard B, C, D and E segment passenger cars.
After training for almost 30,000 epochs, the model reached a mean average precision (mAP) of approximately 0.55 on the evaluation dataset. This result is quite satisfactory considering the complexity of the problem. Figure 4 shows the evolution of mAP over the epochs.
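For reference, the following is a minimal sketch of how average precision is commonly computed for a detector. A detection counts as correct when its intersection over union (IoU) with a ground-truth box exceeds a threshold; the exact evaluation protocol and threshold behind the 0.55 figure are not stated here, and the example detections are invented. With a single class such as squeal, mAP equals the average precision of that class.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def average_precision(detections, n_ground_truth):
    """Area under the interpolated precision-recall curve.

    detections: list of (confidence, is_true_positive) pairs.
    n_ground_truth: total number of annotated objects.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / n_ground_truth)
    # Interpolation: each precision becomes the best precision at that rank or later.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap

# Two overlapping boxes and three hypothetical detections for two annotated squeals.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], n_ground_truth=2)
```

In this toy case the false positive ranked second lowers the average precision below 1.0 even though both annotated events are eventually found.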
About Applus IDIADA
With more than 25 years of experience and 2,450 engineers specializing in vehicle development, Applus IDIADA is a leading engineering company providing design, testing, engineering, and homologation services to the automotive industry worldwide.
Applus IDIADA is located in California and Michigan, with further presence in 25 other countries, mainly in Europe and Asia.