Pantograph Spark Fault Detection using YOLO

The pantograph-catenary system is now the dominant form of current collection for modern electric trains because it can be used at higher voltages. Faults in pantograph-catenary systems threaten the operation and safety of railway transportation, so these systems need to be continuously monitored and controlled. A pantograph may be damaged by extreme weather conditions, which can affect its normal operation and lead to failure of the pantograph and the overhead contact line. Poor contact between the pantograph and the overhead contact line causes thermal erosion of the wire. Because pantographs are made of metal, they can also deteriorate through electrochemical reaction with the environment when exposed to air. Movement of the catenary lines and the pantograph in high crosswinds has been found to cause the wire to become trapped in the pantograph. A further serious issue is the quality of the images generated by pantograph video monitoring systems on high-speed trains, which often show catenary faults inconsistently. Applications of traditional image processing and deep learning techniques have so far been unable to meet the requirements of spark detection. In this paper, a modern deep learning algorithm is proposed to detect sparks in the pantograph. Specifically, the YOLOv3 model is used to address this problem, which traditional image processing algorithms have been unable to solve. The results on a sample set of 2550 images show the efficiency and real-time performance of the proposed method, which meets the requirements of pantograph spark detection in high-speed railways.


Introduction
The catenary is an important part of the traction power supply system in high-speed railways. It consists of catenary support components such as insulators, rotary double ears, and brace sleeves. The normal state of these components is the basis for the normal operation of high-speed trains. However, fault states can occur because of the vibration caused by vehicles and the complex natural environment near the railway. Typical faults of catenary support components include breakage and fracture.

Figure 1: Pantograph Diagram
The electricity that runs electric trains comes from pantograph-catenary systems, so defects in these systems pose a hazard to the functioning and safety of railway transportation. Arcs created in pantograph-catenary systems indicate that the system's wires have become overheated (Xin, Roberts, Weston and Stewart, 2020). Overheating wires may be a symptom of a defect or of wear in the area. If defects in these systems are recognized early, steps can be taken to avoid accidents and high maintenance costs. In recent years, monitoring pantograph-catenary systems and diagnosing their faults has become a major concern, and avoiding pantograph-catenary failures has become increasingly important as train speeds have increased. Railway pantographs are used all over the world to collect electrical energy from the overhead catenary to power railway vehicles. Faults in the pantograph affect the dependability of railway operations by lowering the quality of the contact between the pantograph and the catenary (Huang, et al., 2018). Regular inspections are carried out at rolling stock depots to keep pantographs in good working order. However, these inspections are currently only useful for detecting large problems and have limited capacity for detecting or diagnosing incipient defects. Pantograph condition monitoring has the potential to increase pantograph performance while lowering maintenance costs. As a first step towards practical pantograph condition monitoring, a laboratory-based pantograph test rig has been constructed to study pantograph dynamic behaviour, particularly when incipient defects are present (Xin, et al., 2020).
Many railway operators carry out routine maintenance on their equipment to repair damaged parts and prevent breakdowns, and the transportation service must be stopped regularly for these repairs. This is unsatisfactory for everyone concerned. As a result, monitoring and detection systems for pantograph-catenary systems have been researched in recent years. Such systems make early detection of faults possible, which makes maintenance planning more efficient and timely. Service interruptions can therefore be kept to a minimum, and maintenance expenses are reduced because only equipment that has failed or is about to fail is maintained. Karakose et al. proposed an image processing based approach to the diagnosis of pantograph-catenary systems. The approach models the interaction between the pantograph and the catenary and classifies the pantograph as dangerous, safe, or defective using image processing techniques.
Hao et al. performed a dynamic analysis of the arc in the pantograph-catenary system during pantograph lowering and developed a pantograph-catenary arc model based on magnetohydrodynamic (MHD) theory. Mokrani et al. studied monitoring and control of the pantograph-catenary system. They addressed the regulation of the contact force between the pantograph and the catenary and proposed a linear time-varying model describing the evolution of the contact force, with satisfactory results. Barmada et al. proposed a method for detecting arcs in pantograph-catenary systems using support vector based classification. They determined when an arc occurred from the output of a prototype that used voltage and current information from the system. Yang et al. built a signal and image processing based experimental setup for pantograph inspection. In another study, the vibration signals of the catenary system were measured by a pantograph-mounted device. This method gave good results when the defective region was large, but it could not detect faults early. In this paper, a deep learning approach is proposed for the detection of arcs in pantograph-catenary systems, in addition to the signal and image processing methods in the literature. In the proposed study, arc detection is performed using a CNN (Convolutional Neural Network). Research in recent years shows that CNNs are very successful in complex machine vision problems such as classification, segmentation, and object detection.
Signal and image processing techniques are used for condition monitoring and automatic detection in pantograph-catenary systems. The vibration signal from the pantograph-catenary system can be analyzed to assess its health condition. Alternatively, information about the pantograph status can be obtained with an accelerometer. Condition monitoring techniques based on signal processing use the current and voltage signals obtained from the pantograph, so sensors must be installed on the train to measure current and voltage. Conventional or thermal cameras are used for condition monitoring and automatic detection based on image processing or computer vision. Monitoring with thermal cameras in particular has become very popular in recent years, because thermal cameras sense infrared radiation and the pantograph and catenary emit heat where they touch each other. In this way, the system can be observed even in the dark.

RELATED LITERATURE
2.1 Arc Detection Based on YOLO Architecture
2.1.1 Target Detection Algorithm
YOLO is a deep learning algorithm that has improved considerably since its introduction, and several versions have been developed over the years. The first, YOLOv1, treats object detection as a regression problem: a single convolutional network predicts multiple bounding boxes and class probabilities for all grid cells simultaneously. The input image is divided into an S × S grid. If the centre of an object falls into a grid cell, that grid cell is responsible for detecting that object. Each grid cell predicts N bounding boxes, a confidence score for each box, and c class probabilities. The confidence score reflects how confident the model is that the predicted box contains an object and how accurate it thinks the box is. It is defined as $\Pr(\text{Object}) \cdot \text{IOU}^{\text{truth}}_{\text{pred}}$. At test time, the class probabilities and the individual box confidence predictions are multiplied, resulting in class-specific confidence scores for each predicted box: $\Pr(\text{Class}_i \mid \text{Object}) \cdot \Pr(\text{Object}) \cdot \text{IOU}^{\text{truth}}_{\text{pred}} = \Pr(\text{Class}_i) \cdot \text{IOU}^{\text{truth}}_{\text{pred}}$. These scores encode both the probability of that class appearing in the box and how well the predicted box fits the object. Each bounding box consists of five predictions: the centre coordinates, the width (bw) and the height (bh), which are predicted relative to the whole image, and the confidence score. This is why N × 5 box values per cell enter the tensor, and the predictions are encoded as an S × S × (N × 5 + c) tensor (Du, 2018).
YOLOv1's network has 24 convolutional layers, whereas YOLOv2 has 19 [10]. For evaluating YOLO on the PASCAL VOC detection dataset, the following values are used: S = 7, giving a 7 × 7 grid; N = 2 bounding boxes per grid cell; and c = 20, because the PASCAL VOC dataset has 20 labelled classes. YOLOv1's final prediction is therefore a 7 × 7 × (5 × 2 + 20) = 7 × 7 × 30 tensor, and only 7 × 7 × 2 = 98 bounding boxes are predicted per image (Du, 2018; Redmon, Divvala, Girshick, & Farhadi, 2016).
YOLOv3, the third version of YOLO (You Only Look Once), is an end-to-end object detection network. YOLO divides the image into S × S grid cells, each of which is responsible for recognizing targets whose centre points lie inside it. Non-Maximum Suppression is used to remove redundant bounding boxes after detection. YOLOv3 features several advanced techniques, such as a residual-block-based backbone, a feature-pyramid-like network head for multi-scale prediction, batch normalization, and anchor box prediction. Over the years, many solutions to the object detection problem have been proposed, concentrating on different stages of the task. Recognition, classification, localisation, and object detection are the four fundamental steps. As the technology has progressed, these methods have faced challenges in output accuracy, resource cost, processing speed, and complexity [14].
There are two important criteria for evaluating a target detection algorithm: mAP (mean Average Precision) and FPS (Frames Per Second). mAP is the mean of the average precision (AP) over all categories, and FPS is the number of images the network can process per second. Fig. 2 compares the performance of several target detection algorithms on the PASCAL VOC 2007 and 2012 datasets. Considering both accuracy and speed, the YOLO series is more suitable for the requirements of this study. YOLOv3 improves on YOLOv2 in prediction accuracy while maintaining its speed advantage, and it is also better at detecting small targets. The YOLOv3 target detection algorithm was therefore selected in this study to detect spark faults.
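The tensor arithmetic and the Non-Maximum Suppression step described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the corner-coordinate box format (x1, y1, x2, y2), and the IoU threshold of 0.5 are assumptions chosen for illustration.

```python
def yolo_v1_output_shape(S, N, c):
    """Shape of the YOLOv1 prediction tensor: S x S x (N * 5 + c)."""
    return (S, S, N * 5 + c)


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the boxes kept.

    Boxes are visited in order of decreasing score, and a box is kept only
    if it does not overlap an already kept box by more than iou_threshold
    (an assumed value, not one taken from the paper).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

With S = 7, N = 2 and c = 20, `yolo_v1_output_shape` gives the 7 × 7 × 30 tensor discussed above, and S × S × N gives the 98 boxes per image that NMS then has to prune.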

YOLO Architecture
The YOLOv3 algorithm first divides an image into a grid. Each grid cell predicts a certain number of bounding boxes, each based on a predefined anchor box, around objects that score well for the predefined classes. Each bounding box has a confidence score that indicates how accurate the prediction is believed to be, and each bounding box identifies just one object. To determine the most frequent shapes and sizes for the anchor boxes, the dimensions of the ground-truth boxes in the training data are clustered.
Class Specificity
YOLOv3 uses independent logistic classifiers and binary cross-entropy loss for class predictions during training. These enhancements make it possible to train YOLOv3 models on difficult datasets such as the Open Images Dataset (OID). OID contains many overlapping labels for its images, such as "man" and "person". YOLOv3 uses a multi-label approach, which allows a bounding box to carry several, more descriptive class labels (Handalage & Kuganandamurthy, 2021). YOLOv2, in contrast, used a softmax function, which converts a vector of numbers into a vector of probabilities that sum to one, with each probability determined by the relative size of the corresponding value. With a softmax, each bounding box is forced to belong to exactly one class, which is not always appropriate, especially for datasets like OID.
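The difference between softmax and independent logistic classifiers can be seen in a short sketch. The logits, the class names, and the decision threshold of 0.5 are hypothetical values chosen for illustration; they do not come from the experiments in this paper.

```python
import math


def softmax(logits):
    """Convert logits into probabilities that sum to one (single-label)."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    """Independent logistic classifier for one class."""
    return 1.0 / (1.0 + math.exp(-v))


def multilabel_predict(logits, names, threshold=0.5):
    """Return every class whose independent probability reaches threshold."""
    return [n for n, v in zip(names, logits) if sigmoid(v) >= threshold]


def binary_cross_entropy(probs, targets):
    """Mean binary cross-entropy over the classes of one box."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for p, t in zip(probs, targets):
        total += t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return -total / len(probs)
```

For a box with equally strong evidence for "man" and "person", softmax has to split the probability mass between the two labels, whereas the independent sigmoid outputs can both be high, so both labels are reported.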

IMAGE PREPROCESSING
3.1.2 Image Graying Processing
The spark is generated by contact friction between the catenary and the pantograph carbon slide. To detect the spark accurately, the image can be segmented around the spark location. Detecting and locating the spark within the pantograph carbon slide area avoids interference from other areas of the image, which could otherwise lead to false detections in the later algorithm. The image of the carbon slide area is extracted using the pantograph detection and positioning model. To reduce the computational complexity and improve the speed of image processing, the original image is then converted to grayscale. In the graying formula, the gray value of the pixel at each image coordinate is computed from the three RGB component values of that pixel [23].
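As an illustration of the cropping and graying steps, the sketch below assumes the commonly used weighted-sum conversion with the ITU-R BT.601 luma weights (0.299, 0.587, 0.114). The weights actually used in this study are not recoverable from the text, so they are an assumption, as are the function names and the (x1, y1, x2, y2) box format.

```python
def crop(image, box):
    """Extract the region (x1, y1, x2, y2) from an image stored as rows of pixels."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def to_gray(rgb_image, weights=(0.299, 0.587, 0.114)):
    """Convert rows of (R, G, B) tuples to rows of gray values in 0..255.

    The default weights are the BT.601 luma coefficients, assumed here
    because the paper's own formula is not available.
    """
    wr, wg, wb = weights
    return [
        [int(round(wr * r + wg * g + wb * b)) for (r, g, b) in row]
        for row in rgb_image
    ]
```

Cropping to the carbon slide region before graying keeps the later threshold test from seeing bright pixels elsewhere in the frame.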

Grayscale Characteristics of Spark Image
The gray distributions of a normal image of the pantograph carbon slide area and of an image of the same area with a spark can be clearly distinguished from each other [24]. The image grayscale reflects the brightness and morphological characteristics of the spark [25]: the magnitude of the gray value reflects the brightness of the spark, and the number of pixels within the spark's gray value range reflects its size and shape. As shown in Fig. 3.b, there is an obvious difference in gray value between the two images, which is caused by their different radiation intensities [28]. Analysis of the gray histogram shows that the relatively high gray values belong to the spark. The presence of a spark also raises the overall gray value of the image: the number of pixels in the high gray value region increases, and the number of pixels in the low gray value region decreases. Analysis of the sample images further shows that, when a spark is present, its gray value generally lies within [245, 255].
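The grayscale criterion above can be sketched as a histogram count over the range [245, 255]. The minimum pixel count `min_pixels` is a hypothetical parameter introduced for illustration; the paper does not state how many pixels are required.

```python
def gray_histogram(gray_image):
    """Histogram of gray values 0..255 for an image stored as rows of ints."""
    hist = [0] * 256
    for row in gray_image:
        for v in row:
            hist[v] += 1
    return hist


def spark_pixel_count(gray_image, low=245, high=255):
    """Number of pixels whose gray value lies in [low, high]."""
    hist = gray_histogram(gray_image)
    return sum(hist[low:high + 1])


def has_spark(gray_image, min_pixels=1, low=245, high=255):
    """Flag a spark when enough pixels fall in the spark gray range.

    min_pixels is an assumed tuning parameter, not a value from the paper.
    """
    return spark_pixel_count(gray_image, low, high) >= min_pixels
```

The pixel count doubles as a rough size measure, in line with the observation that the number of pixels in the spark gray range reflects the size of the spark.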

Defect Detection
The line section under test includes two overlap sections (marked by a red box), ten expansion joints, and eleven midpoint anchors. Because an actual catenary defect inevitably causes impacts on both the front and rear collectors, an identified impact is treated as a real defect only when all four strain signals register the impact simultaneously. Furthermore, for two specific kinds of defect, the strain signal can be used to locate the defect and diagnose its cause.
(1) Detection of Defects Caused by Expansion Joints. To allow a smooth transition of the stagger between two adjacent conductor rails, installation requirements generally place the expansion joint where the stagger is zero. As a result, the measured left and right strain values are zero at this position, which helps to locate the expansion joint. When there is a defect in the expansion joint between two adjacent conductor rails, impacts are inevitably generated and a high peak appears in the strain signal. Based on these features, the positions in Figure 3.a whose stagger values are close to zero are identified as expansion joints. This is consistent with the construction plan.
(2) Detection of Defects Caused by Overlaps. When the pantograph passes through an overlap section, the measured left and right strains change rapidly from one direction to the other, generating an impulse in the strain signal. This feature helps to locate the overlap section from the measured strain signals, and the locations were verified by comparison with the construction plan of the catenary. If more than one impulse is present, the section is considered a defective overlap section. The extra impulse is caused by an overlap defect, such as a misaligned overlap.
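The two rules above can be sketched in a few lines: expansion joints are candidates where the stagger is close to zero, and an overlap section is flagged as defective when its strain signal contains more than one impulse. The amplitude threshold, the stagger tolerance, and the function names are hypothetical; the text gives no numerical values for them.

```python
def near_zero_stagger_positions(stagger, tolerance):
    """Indices where the measured stagger is within tolerance of zero."""
    return [i for i, s in enumerate(stagger) if abs(s) <= tolerance]


def count_impulses(strain, threshold):
    """Count separate runs of samples whose magnitude exceeds threshold."""
    count = 0
    inside = False
    for v in strain:
        above = abs(v) > threshold
        if above and not inside:
            count += 1  # a new impulse starts here
        inside = above
    return count


def overlap_is_defective(strain, threshold):
    """An overlap section with more than one impulse is treated as defective."""
    return count_impulses(strain, threshold) > 1
```

Counting runs rather than individual samples means that an impulse spanning several consecutive samples is still counted once.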

Fault Description of Pantograph
The pantograph can fail in many different ways, including upper frame cracks, suspension system damage, pull rod fractures, and operating system failure. The pull rod of the pantograph is generally modelled as a two-force member, so its strength and fatigue design considers only axial tension and compression loads. In other words, if there is only an axial load, the pull rod is strong enough to withstand heavy loading. However, pull rod failures are observed in practice, for example fractures of the outer rings and of the pretension bolts. Figure 3 shows the outer rings of the pull rod joint bearing, which were torn during operation on an underground line.

EXPERIMENT AND ANALYSIS
The experimental flow of spark detection based on the algorithm in this paper is shown in Figure 4. Video shot by a high-definition camera mounted on a high-speed train was converted into images to form the image sample set for the experiment. The 2550 images in the sample set were randomly divided into 1800 images for the training set and 750 images for the testing and validation set. The experiments were run on Windows 10 with an Intel Core i5-10400F CPU and an NVIDIA GeForce RTX 2060 GPU.
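A random split of this kind can be sketched as follows. The fixed seed and the function name are assumptions added for reproducibility of the illustration; the paper does not describe how the random division was implemented.

```python
import random


def split_samples(samples, n_train, seed=0):
    """Shuffle the samples and split them into training and test lists.

    seed is an assumed parameter that makes the split repeatable.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Image indices stand in for the 2550 extracted frames.
train_set, test_set = split_samples(range(2550), 1800)
```

Shuffling before slicing avoids placing consecutive video frames, which are highly similar, entirely in one subset by position alone.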

Training Model
To verify the performance of the algorithm model in this paper, a comparison model was trained with sparks added as labelled samples, giving a deep learning spark detection model. To reduce potential error in the comparison, the two models use the same training set and test set. Both sets include two types of pantograph, single bow and double bow, and cover daytime and night-time conditions in good weather as well as the train in a tunnel. In the deep learning spark detection model, images containing sparks are labelled with both the whole pantograph and the spark, while images without sparks are labelled with the whole pantograph only. In the algorithm model of this paper, by contrast, only the pantograph parts are labelled, whether or not a spark is present.

Test Model
The training set is used to obtain the YOLOv3-based pantograph positioning and detection model of the algorithm in this paper. This model is combined with a spark detection model based on image processing to give the complete algorithm model, which is then evaluated on the test set. The detection results of the deep learning spark detection model and of the algorithm model in this paper are shown in Table 2. Table 2 shows that the detection speed of the algorithm model in this paper is consistent with that of the deep learning spark detection model, and both meet the real-time detection requirements.
When the deep learning spark detection model over-fits, other objects in the image are mistakenly detected as sparks. In the training samples, the sparks are mainly blue and white and mostly round, but the test set also contains red and white sparks in the form of long strips. Because spark samples are scarce, the spark characteristics in the test set differ from those in the training samples. Over-fitting and too few samples lead to false and missed detections by the model. Some of the detection results of the deep learning spark detection model are shown in Fig. 5.
The false and missed detections of the algorithm model in this paper occur because some sparks in the images are small and weak, occupy a small proportion of pixels, and have gray values that do not reach the set threshold. Some undetected sparks can be detected by lowering the threshold, but setting the threshold too low increases false detections, which occur when the background gray value reaches the threshold. The algorithm in this paper sets an appropriate threshold after analysis of the samples, and some of the detection results are shown in Figure 6.
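The trade-off between missed and false detections when choosing the gray threshold can be illustrated with a small sketch. The sample values below are invented for illustration and are not measurements from the experiment; each sample is a hypothetical pair of the peak gray value in the carbon slide region and a ground-truth spark label.

```python
def evaluate_threshold(samples, threshold):
    """Return (missed, false) detection counts for a gray threshold.

    samples is a list of (peak_gray, is_spark) pairs; an image is reported
    as containing a spark when peak_gray >= threshold.
    """
    missed = sum(1 for peak, is_spark in samples if is_spark and peak < threshold)
    false = sum(1 for peak, is_spark in samples if not is_spark and peak >= threshold)
    return missed, false


# Hypothetical data: a weak spark at 238 and a bright background at 241.
samples = [(250, True), (238, True), (255, True), (200, False), (241, False)]

for t in (235, 240, 245):
    print(t, evaluate_threshold(samples, t))
```

In this made-up data, the weak spark and the bright background overlap in gray value, so no single threshold removes both kinds of error, which mirrors the behaviour described above.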

CONCLUSION
In this paper, a pantograph detection model based on the deep learning algorithm YOLOv3 was used to locate the pantograph parts, and the spark location between the catenary and the pantograph and the characteristics of the spark image were then analyzed. Finally, sparks generated in the carbon slide area were detected by image processing. The experimental results show that the proposed algorithm model can detect pantograph sparks on high-speed trains with a small number of samples, and that its speed and accuracy meet the detection requirements. In future work, the interference of sunlight, artificial light, weather, and tunnels with detection should be taken into account to further improve the performance of the algorithm model.