Power quality (PQ) problems include voltage sag, swell, unbalance, interruption, flicker and harmonics. The existence of PQ problems greatly affects the safe, reliable and economical operation of electric power systems. When the supply voltage is distorted, electrical devices draw non-sinusoidal current from the supply, which causes many technical problems such as extra losses, extra heating, misoperation and early aging of the devices.
Even a short power outage has a great economic impact on industrial consumers. A longer interruption harms practically all operations of a modern society [ 1 ]. PQ problems cannot be completely eliminated, but they can be minimized up to a limit with equipment such as custom power devices, power factor corrector circuits and filters. To identify the sources of power quality problems and make appropriate decisions for improving power quality, electric utilities should provide real-time monitoring systems that are capable of identifying different power quality problems.
For this, instruments should collect a huge amount of data, such as measured currents, voltages and occurrence times. The collected data must then be analyzed, online or offline, to classify the disturbances [ 4 , 5 , 6 , 7 ]. The vast and increasing volume of data obtained from power quality monitoring systems requires data mining techniques for its analysis.
Data mining technology is an effective tool for dealing with massive data and for detecting useful patterns in it. In power systems, data can be raw waveforms (voltages and currents) sampled at relatively high sampling frequencies, or pre-processed waveforms (e.g. …). Classification is an important task in the data mining process: it extracts models that describe classes and predicts the target class of data instances.
Today, several standard classifiers are available, among which decision trees are the most powerful and popular for both classification and prediction. Decision trees are flexible enough to handle items with a mixture of real-valued and categorical features, as well as items with some missing features.
They are expressive enough to model many partitions of the data that are not as easily achieved with classifiers that rely on a single decision boundary such as logistic regression or SVM. Decision trees naturally support classification problems with more than two classes and can be modified to handle regression problems.
Finally, once constructed, they classify new items quickly [ 10 ]. According to the experimental results, C5. J48 and MLP showed high accuracies with low as well as higher data sizes.
The performance of ANN and SVM is evaluated for the classification of sag, swell, interruption, harmonics and flicker [ 13 ]. Ten different types of disturbances, such as sag, swell and interruption with and without harmonics, are classified using SVM and a decision tree [ 14 ]. It is observed that the decision tree is faster and provides better classification accuracy in every case, with and without noise.
It is also easier to implement than SVM. Moreover, the decision tree worked satisfactorily with both synthesized and real signals. Random Forest is used for the classification of PQ disturbances [ 18 ] and fault record detection in data center of large power grid [ 19 ].
J48 is compared with Random Forest in the classification of power quality disturbances, and Random Forest is found to be more accurate than J48 [ 20 ]. It has been found that whenever the correct attributes are selected before classification, the accuracy of data mining algorithms improves significantly [ 23 , 24 ]. This paper focuses on how the J48, Random Tree and Random Forest decision tree techniques are applied to classify the power quality problems of voltage sag, swell, interruption and unbalance.
The effect of data attributes on the classification accuracy and on the time taken to train the decision trees is also discussed. The paper is organized as follows: Section 2 gives definitions and causes of power quality problems such as voltage sag, swell, interruption and unbalance, along with their typical figures. Section 3 deals with the basics of data mining and explains the J48, Random Tree and Random Forest algorithms. This section also briefly describes the WEKA software used to implement data mining for classification.
Finally, Section 6 gives conclusions of the work from the observed results.

Power quality problems

A power quality problem is defined as any power problem manifested in voltage, current, or frequency deviations that results in failure or misoperation of customer equipment.
Some of the commonly occurring power quality problems in a power system are voltage sag, swell, interruption and unbalance [ 25 ].

Voltage sag

Voltage sag is defined as a decrease in RMS voltage between 0. Voltage sags can occur due to short circuits, overloads and starting of large motors.
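Since a sag is defined in terms of RMS voltage, it may help to see how an RMS value is obtained from sampled waveform data. The following is a minimal sketch, not the paper's procedure: the 64 samples per cycle, the per-unit scaling and the 60 % sag depth are all hypothetical values chosen for illustration.

```python
import math

def cycle_rms(samples):
    """Root-mean-square value of one cycle of voltage samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def make_cycle(peak, samples_per_cycle=64):
    """One cycle of a sine wave with the given peak (hypothetical test signal)."""
    return [peak * math.sin(2 * math.pi * n / samples_per_cycle)
            for n in range(samples_per_cycle)]

# Nominal voltage is taken as 1.0 pu RMS, i.e. a peak of sqrt(2) pu.
nominal = make_cycle(math.sqrt(2))
# Hypothetical sag: the same waveform scaled to 60 % of nominal.
sagged = make_cycle(0.6 * math.sqrt(2))

rms_nominal = cycle_rms(nominal)
rms_sagged = cycle_rms(sagged)
print(round(rms_nominal, 3), round(rms_sagged, 3))
```

Computing the RMS value cycle by cycle turns a raw waveform into a short series of magnitudes, which is the kind of numeric attribute a classifier can work with.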
The causes of swell are switching off a large load, energizing a large capacitor bank and a temporary voltage rise on the unfaulted phases during a single line-to-ground fault. The voltage waveform of a swell is shown in Fig.

Interruptions can be the result of power system faults, lightning, equipment failures and control malfunctions. An interruption is illustrated in Fig.

The sources of voltage unbalance are unbalanced faults, single-phase loads on a three-phase circuit and blown fuses in one phase of a 3-phase capacitor bank. The three phase voltages during an unbalanced fault are shown in Fig.

These tools are a mixture of machine learning, statistics and database utilities.
Data mining has recently gained popularity over classical techniques in many research fields for analyzing data, due to (i) a vast increase in the size and number of databases, (ii) the decrease in storage device costs, and (iii) an ability to handle data that contains distortion (noise, missing values, etc.).
The ultimate goal of data mining is to discover useful information from large amounts of data in many different ways using rules, patterns and classification [ 27 ]. Data mining can be used to identify anomalies that occur as a result of network or load operation, which may not be acknowledged by standard reporting techniques.
It is proposed that data mining can provide answers to the end-users about PQ problems by converting raw data into useful knowledge [ 28 , 29 ].
The decision tree algorithm employs a top-down, greedy search through all possible branches to construct a decision tree that models the classification process. These algorithms therefore use a tree representation, hierarchically structured as a set of interconnected nodes, which helps in classifying patterns in data sets. By checking the attributes of a new instance and their values against those seen in the decision tree model, the target value of the new instance can be predicted. In a Random Forest, each node is split using the best among a subset of randomly chosen attributes at that node. In fact, the more data available for mining, the more accurate the result tends to be. In order to maintain good power quality, it is necessary to detect and monitor power quality problems.
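The random attribute selection used when a Random Forest splits a node can be sketched as follows. This is an illustrative sketch only: the attribute names, the score values and the `choose_split_attribute` helper are hypothetical, and the scoring function is passed in as a parameter rather than tied to any particular split criterion.

```python
import random

def choose_split_attribute(attributes, score, subset_size, rng):
    """Pick the best-scoring attribute among a random subset of candidates."""
    candidates = rng.sample(attributes, subset_size)
    return max(candidates, key=score)

# Hypothetical attribute names and split scores, for illustration only.
attributes = ["Va_rms", "Vb_rms", "Vc_rms", "V_min", "V_max", "V_avg"]
scores = {"Va_rms": 0.20, "Vb_rms": 0.25, "Vc_rms": 0.22,
          "V_min": 0.90, "V_max": 0.75, "V_avg": 0.40}

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Only 3 of the 6 attributes compete at this node.
chosen = choose_split_attribute(attributes, scores.get, 3, rng)

# With every attribute in the subset, the overall best one always wins.
best_overall = choose_split_attribute(attributes, scores.get, len(attributes), rng)
```

Because each node sees only part of the attribute set, the individual trees differ from one another, which is what makes combining their votes useful.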
So, whenever it encounters a set of items (a training set), it identifies the attribute that discriminates the various instances most clearly. In the second data set, three more numeric attributes (the minimum, maximum and average voltages) are added to the 3-phase RMS voltages. The performance of the algorithms is evaluated in both cases to determine the best classification algorithm, and the effect of adding the three attributes in the second case is studied, which shows the advantages in terms of classification accuracy and training time of the decision trees.
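Deriving the three additional attributes from the phase voltages is straightforward. The sketch below assumes per-unit RMS values and uses hypothetical field names (`Va`, `Vmin`, and so on); the paper's actual attribute names are not given in this section.

```python
def add_summary_attributes(va, vb, vc):
    """Return the three RMS phase voltages plus their minimum, maximum and average."""
    phases = [va, vb, vc]
    return {
        "Va": va,
        "Vb": vb,
        "Vc": vc,
        "Vmin": min(phases),
        "Vmax": max(phases),
        "Vavg": sum(phases) / len(phases),
    }

# Hypothetical per-unit RMS voltages with a dip on phase A.
record = add_summary_attributes(0.5, 1.0, 1.0)
print(record)
```

The minimum and maximum summarize the phase voltages in a way that does not depend on which phase is affected, which is one plausible reason such attributes could help a tree separate the classes with fewer splits.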
In order to classify a new item, it first needs to create a decision tree based on the attribute values of the available training data.

Methods: Data mining algorithms

There are many data mining algorithms available, among which the most widely used for classification are J48, Random Tree and Random Forest.
So, if hundreds of parameters are recorded and available for analysis, data mining can consider and use all the data that is collected. In the event that all the attributes are exhausted, or if an unambiguous result cannot be obtained from the available information, the branch is assigned the target value that the majority of the items under it possess.
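The stopping rule for a branch can be sketched as a small helper. This is an illustration under stated assumptions, not the WEKA implementation: `leaf_label` is a hypothetical name, and the function assumes a non-empty list of class labels.

```python
from collections import Counter

def leaf_label(labels, remaining_attributes):
    """Return a leaf label if the branch must stop, otherwise None.

    Assumes `labels` is non-empty.
    """
    if len(set(labels)) == 1:
        # Unambiguous: every item under this branch has the same class.
        return labels[0]
    if not remaining_attributes:
        # Attributes exhausted: fall back to the majority class.
        return Counter(labels).most_common(1)[0][0]
    # Mixed classes and attributes still available: keep splitting.
    return None

print(leaf_label(["sag", "sag", "swell"], []))
```

Here the branch holds two sags and one swell with no attributes left to test, so it becomes a leaf labelled with the majority class.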
The feature that tells us the most about the data instances, so that we can classify them best, is said to have the highest information gain.
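Information gain can be made concrete with a small calculation. The sketch below computes plain entropy-based information gain for categorical attributes on a hypothetical four-record data set; the attribute names and values are invented for illustration, and numeric attributes, which need a threshold search, are not handled.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on a categorical attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(group) / len(labels) * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy records: voltage level discretised into two bands.
rows = [
    {"level": "low", "phase": "A"},
    {"level": "low", "phase": "B"},
    {"level": "high", "phase": "A"},
    {"level": "high", "phase": "B"},
]
labels = ["sag", "sag", "swell", "swell"]

gain_level = information_gain(rows, labels, "level")
gain_phase = information_gain(rows, labels, "phase")
```

In this toy data the `level` attribute separates the two classes perfectly while `phase` tells us nothing, so a tree builder that ranks attributes by information gain would split on `level` first.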
Power quality monitoring requires storing a large amount of data for analysis. The preceding view shows data mining as one step in the knowledge discovery process, albeit an essential one because it uncovers hidden patterns for evaluation.
Introduction

Power quality (PQ) has been given increased attention all over the world over the past decade. This paper presents the classification of power quality problems such as voltage sag, swell, interruption and unbalance using data mining algorithms: the J48, Random Tree and Random Forest decision trees.
However, in industry, in media, and in the research milieu, the term data mining is often used to refer to the entire knowledge discovery process [ 30 ]. This algorithm can deal with both classification and regression problems [ 21 , 35 ].