Source: Applus IDIADA
ADELANTO, Calif. – The following is the second of three Technical Corner pieces by Antonio Rubio, Project Manager, Braking Systems at Applus IDIADA, on how the use of artificial intelligence can help in the development of braking systems NVH characteristics.
The first article:
This article presents how artificial intelligence can support the identification of brake noise in real time.
In particular, the validation of the algorithm is shown to cover not only brake squeal under standard conditions, but also different brake noises under different testing conditions (different standards, city and mountain driving, low and high ambient temperatures, different vehicle categories).
The first part of the article explained the mathematical background behind the solution; this new part will focus on the machine learning algorithm.
- Machine Learning Algorithm
In recent years, the use of deep learning algorithms has increased significantly in the area of computer vision. Deep learning refers to the use of multi-layer neural networks to solve complex problems. In computer vision, one type of neural network is especially widely used: the Convolutional Neural Network (ConvNet).
ConvNets are neural networks able to learn local patterns in images, which is why they perform better on images than neural networks built only from dense layers. ConvNets are currently the state of the art for image classification and detection, thanks to two main properties. The first is translation invariance: once the network has learnt a pattern, it can find that pattern anywhere in an image, even if the pattern always appeared in the bottom-right corner of the training images. The second is the ability to learn spatial hierarchies of patterns, from small to large: the first convolutional layer learns small patterns, and the second layer combines those small patterns into larger ones.
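The first property can be illustrated with a minimal sketch in plain Python. The 2x2 pattern, the 6x6 image size and all function names are hypothetical, and the filter is written by hand rather than learnt. Sliding the same filter over the whole image is what lets it respond to the pattern wherever it appears:

```python
def correlate2d(image, kernel):
    """Slide the kernel over the image ('valid' positions only) and
    return the grid of filter responses, as a convolutional layer does
    before bias and activation."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def place_pattern(size, pattern, top, left):
    """Return a size x size image of zeros with the pattern pasted in."""
    image = [[0] * size for _ in range(size)]
    for i, row in enumerate(pattern):
        for j, value in enumerate(row):
            image[top + i][left + j] = value
    return image


def argmax2d(grid):
    """Return (row, col) of the strongest response."""
    _, row, col = max((v, r, c)
                      for r, line in enumerate(grid)
                      for c, v in enumerate(line))
    return row, col


PATTERN = [[1, 0], [0, 1]]   # hypothetical pattern
KERNEL = PATTERN             # stands in for a filter that has learnt it

# Same filter, pattern in two different places.
peak_bottom_right = argmax2d(correlate2d(place_pattern(6, PATTERN, 4, 4), KERNEL))
peak_top_left = argmax2d(correlate2d(place_pattern(6, PATTERN, 0, 0), KERNEL))
print(peak_bottom_right, peak_top_left)
```

The strongest response follows the pattern to its new position without any retraining of the filter.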
This project relies on the idea of detecting specific unwanted sounds generated while a vehicle is braking. Using computer vision to detect sounds may seem curious; however, given the rapid progress in computer vision, using ConvNets to solve this problem is an interesting approach. As mentioned before, spectrograms of the vehicle braking process can be generated from the raw data, and these spectrograms can then be used to detect the unwanted sound. In this project we are implementing an algorithm able to detect only one specific undesired sound, called squeal.
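As a rough illustration of how a raw signal becomes a spectrogram, the sketch below computes a short-time Fourier transform using only the Python standard library. The sample rate, frame length, hop size and the synthetic 256 Hz tone are assumptions chosen to keep the example small and fast. They are not the acquisition settings used in the project, and the levels are in dB relative to an arbitrary reference, not calibrated acoustic decibels.

```python
import cmath
import math


def spectrogram(signal, frame_len, hop):
    """Return one magnitude spectrum (frame_len // 2 + 1 bins) per frame."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical toy settings (not the project's real ones).
SAMPLE_RATE = 1024   # Hz
FRAME_LEN = 64       # samples per frame
HOP = 32             # samples between frame starts
TONE_HZ = 256        # synthetic stand-in for a narrow-band noise

signal = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE) for n in range(512)]
frames = spectrogram(signal, FRAME_LEN, HOP)

# Level in dB (arbitrary reference); the small offset avoids log10(0).
level_db = [[20 * math.log10(m + 1e-12) for m in frame] for frame in frames]

# The strongest bin of each frame, converted back to Hz.
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in frames]
peak_hz = [b * SAMPLE_RATE / FRAME_LEN for b in peak_bins]
```

Plotting `level_db` with time on one axis and frequency on the other gives the kind of image a ConvNet can be trained on. In practice a library FFT would replace the direct DFT used here for brevity.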
The first strategy considered was to use ConvNets to classify images, so that each spectrogram would produce one output stating whether or not the image contains squeal. However, this approach is limited because it only says whether squeal is present. Besides the presence of a squeal, we may also want to know its frequency, its sound intensity (acoustic decibels) and, in a braking action of 10 seconds, exactly where it starts and ends. Given the information needed, object detection and classification is an interesting alternative, mainly because it provides exactly where the squeal is, and from that location we can obtain the frequency, acoustic decibel level and duration. The convolutional neural network used for this task was RCNN, which stands for Region-Based Convolutional Neural Network.
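A minimal sketch of how a detected bounding box could be turned into physical quantities is shown below. It assumes a spectrogram image with time running left to right over the whole braking event and frequency running linearly from 0 Hz at the bottom edge to a known maximum at the top edge. The function name, the image size and the 20 kHz limit are hypothetical. The sound level is not computed here, since it would have to be read from the spectrogram values inside the box.

```python
def box_to_squeal_info(box, image_size, duration_s, max_freq_hz):
    """Convert a pixel box (xmin, ymin, xmax, ymax) into time and frequency.

    Pixel origin is the top-left corner, so small y values correspond to
    high frequencies.
    """
    xmin, ymin, xmax, ymax = box
    width, height = image_size
    start_s = xmin * duration_s / width
    end_s = xmax * duration_s / width
    freq_high_hz = (height - ymin) * max_freq_hz / height
    freq_low_hz = (height - ymax) * max_freq_hz / height
    return {
        "start_s": start_s,
        "end_s": end_s,
        "duration_s": end_s - start_s,
        "freq_low_hz": freq_low_hz,
        "freq_high_hz": freq_high_hz,
        "freq_center_hz": (freq_low_hz + freq_high_hz) / 2,
    }


# Hypothetical detection on a 1000 x 500 pixel spectrogram of a 10 s stop.
info = box_to_squeal_info(
    box=(200, 100, 600, 150),
    image_size=(1000, 500),
    duration_s=10.0,
    max_freq_hz=20000.0,
)
print(info)
```

With these made-up numbers the box maps to a squeal lasting from 2 s to 6 s, centred on 15 kHz.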
- Data pre-processing
The most important step in implementing any computer vision algorithm is image processing, as the result of any machine learning algorithm depends on the quality of the input data.
For object detection and classification, the algorithm must be given both the image and the location of the object to be detected in that image. To implement RCNN we are using TensorFlow, which takes two files as input: the image, and an XML file containing information about the object to be located in the image, such as its coordinates, class and filename. The process of generating this file is called data annotation. Several software tools can be used for data annotation; a great one is LabelImg, which is free and written entirely in Python. Figure 2 shows an example of how data annotation is performed in LabelImg, and Figure 3 displays an example of the XML file generated.
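The sketch below reads an annotation of this kind with the Python standard library. It assumes the Pascal VOC layout that LabelImg writes by default. The sample file content (filename, image size, box coordinates) is invented for illustration and is not the annotation shown in Figure 3.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout.
SAMPLE_XML = """<annotation>
  <filename>brake_event_001.png</filename>
  <size><width>1000</width><height>500</height><depth>3</depth></size>
  <object>
    <name>squeal</name>
    <bndbox>
      <xmin>200</xmin><ymin>100</ymin><xmax>600</xmax><ymax>150</ymax>
    </bndbox>
  </object>
</annotation>"""


def read_annotation(xml_text):
    """Extract filename, image size and labelled boxes from one annotation."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        objects.append({
            "label": obj.findtext("name"),
            # Box as (xmin, ymin, xmax, ymax) in pixels.
            "box": tuple(int(bndbox.findtext(tag))
                         for tag in ("xmin", "ymin", "xmax", "ymax")),
        })
    return {
        "filename": root.findtext("filename"),
        "width": int(root.findtext("size/width")),
        "height": int(root.findtext("size/height")),
        "objects": objects,
    }


annotation = read_annotation(SAMPLE_XML)
print(annotation)
```

A quick pass like this over all annotation files is a cheap way to catch empty or misspelled class names before training.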
1.2 Faster RCNN (Region-Based Convolutional Neural Network)
In machine learning, the problem at hand has often already been partially or even completely solved by someone else. Most of the time it is not necessary to design machine learning algorithms from scratch, because algorithms already trained on a similar task can be reused for our own. This “transference of information” is called Transfer Learning. According to Andrew Ng, Transfer Learning will be the next driver of machine learning success. We can say that the future of machine learning relies on two main factors: Transfer Learning and Open-Source Code.
As stated before, we are using TensorFlow to implement the object detection algorithm, because it is an open-source platform offering several well-documented tools for implementing machine learning models.
The TensorFlow repository on GitHub contains several models pre-trained on various datasets. Even though our task is quite different from the ones these models were trained for, we can take the pre-trained models and train them on our own dataset. After training a few models, the one with the best performance in our case was the Faster Region-Based Convolutional Neural Network (RCNN)-Inception-v2.
RCNNs are ConvNets widely used in object detection, where the R stands for the region proposal method. This method suggests several regions of an image where objects could be. The original algorithm is computationally expensive, i.e., inference on an image takes a long time, because it runs the convolutional layers once for each proposed region. Faster RCNN was introduced in 2016. It introduces the concept of the RPN (Region Proposal Network), and this ConvNet is 250x faster than the original RCNN. The speed comes from the fact that the region proposal network shares its convolutional layers with the object detection network.
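In the published Faster RCNN design, the RPN scores a fixed set of reference boxes ("anchors") placed at every position of the shared feature map, so the expensive convolutional layers run only once per image. The sketch below generates such anchors in plain Python. The feature-map size, stride, scales and aspect ratios are hypothetical values, not the configuration used in this project.

```python
import math


def generate_anchors(feat_h, feat_w, stride, scales, ratios):
    """Return (xmin, ymin, xmax, ymax) anchors centred on every cell of a
    feat_h x feat_w feature map; stride is the size of one cell in pixels."""
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Centre of this feature-map cell in image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:          # ratio = width / height
                    w = scale * math.sqrt(ratio)
                    h = scale / math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


# Hypothetical settings: 2 x 3 feature map, 16-pixel stride.
anchors = generate_anchors(feat_h=2, feat_w=3, stride=16,
                           scales=(64, 128), ratios=(0.5, 1.0, 2.0))
print(len(anchors))   # 2 * 3 cells, 2 scales, 3 ratios each
```

Each anchor keeps an area of roughly `scale ** 2` while its shape changes with the ratio, which is relevant for spectrograms, where a squeal tends to appear as a long, thin horizontal band.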
As mentioned before, most of the time it is not necessary to build neural networks from scratch, as transfer learning can be used to solve a specific task. For our problem we are using a ConvNet called Inception V2, designed in 2015 by Dr. Christian Szegedy, a Google researcher. It is a deep network that avoids drastically reducing the dimensions of the input image and uses smart factorization to increase computational efficiency.
After training the model for almost 30,000 epochs, the model reached a mean average precision (mAP) of approximately 0.55 on the evaluation dataset. This result is quite satisfactory considering the complexity of the problem. Figure 4 shows the evolution of mAP through the epochs.
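For readers unfamiliar with the metric, the sketch below shows one common way to compute average precision for a single class. Detections are matched to ground-truth boxes by intersection over union (IoU), and the area under the interpolated precision-recall curve is taken. With only one class (squeal), mAP equals this single value. The 0.5 IoU threshold, the simplified matching rule and the example boxes are assumptions. The article does not state which evaluation protocol produced the 0.55 figure.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """detections: list of (score, box); ground_truth: list of boxes."""
    if not ground_truth:
        return 0.0
    matched, hits = set(), []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx not in matched and iou(box, gt) > best_iou:
                best_iou, best_idx = iou(box, gt), idx
        hit = best_idx is not None and best_iou >= iou_threshold
        if hit:
            matched.add(best_idx)   # each ground-truth box is used once
        hits.append(hit)

    # Precision and recall after each detection.
    precisions, recalls, true_pos = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        true_pos += hit
        precisions.append(true_pos / rank)
        recalls.append(true_pos / len(ground_truth))

    # Interpolate: precision at a recall level is the best precision
    # achieved at that recall or any higher one.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Hypothetical example: two true squeals, three detections (one spurious).
truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
found = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
ap = average_precision(found, truth)
```

In this toy case the spurious middle detection lowers the precision of the second true hit, so the score falls below 1 even though both squeals are found.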
ABOUT APPLUS IDIADA
With more than 25 years’ experience and 2,450 engineers specializing in vehicle development, Applus IDIADA is a leading engineering company providing design, testing, engineering, and homologation services to the automotive industry worldwide.
Applus IDIADA has locations in California and Michigan, with further presence in 25 other countries, mainly in Europe and Asia.