

A regional energy consumption analysis model using a novel outlier removal algorithm and k-means clustering method


Yuchen Wang, Shuxiang Xu, Wei Liu


Vol. 16  No. 3  pp. 21-29


Regional energy-saving work is an important part of China’s energy conservation projects. In this paper, we developed a regional energy consumption analysis model using energy consumption data from a typical Chinese industrial city, Shaoxing. We incorporated a well-known data mining tool, the k-means clustering method, into our model to automatically classify the energy consumption data into low, medium and high clusters representing different energy consumption levels. This classification provides a basis for further analysis to help governments and enterprises use energy more efficiently. However, the data set contains a few potential outliers, and the k-means result can be strongly influenced by them. To reduce the impact of these extremely large data points, we proposed a distance-based outlier removal algorithm together with a corresponding parameter selection algorithm, which provides tuning parameters that strike a balance between keeping and removing faraway points. The experimental results show that our algorithms effectively reduce the influence of outliers and make the k-means results more meaningful. The relationship between consumption levels and industrial output was also examined as one possible direction for further analysis based on our model.
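The abstract does not spell out the removal rule or the parameter selection procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a generic distance-based rule (a point is kept only if at least `min_neighbors` other points lie within `radius` of it) followed by a plain one-dimensional k-means with k = 3. The function names, the `radius` and `min_neighbors` parameters, and the consumption figures are all hypothetical.

```python
def remove_distance_outliers(values, radius, min_neighbors):
    """Keep a point only if at least `min_neighbors` other points lie within `radius`.

    This is a generic distance-based rule assumed for illustration; it is not
    taken from the paper.
    """
    kept = []
    for i, v in enumerate(values):
        neighbors = sum(
            1 for j, w in enumerate(values) if j != i and abs(v - w) <= radius
        )
        if neighbors >= min_neighbors:
            kept.append(v)
    return kept


def kmeans_1d(values, k=3, iterations=100):
    """Lloyd's k-means on 1-D data; returns ascending centroids and per-point labels."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation: one starting centroid per quantile band.
    centroids = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    centroids.sort()
    labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
    return centroids, labels


# Hypothetical consumption figures with one extremely large value.
consumption = [10, 12, 11, 13, 50, 52, 55, 51, 100, 105, 102, 98, 5000]

cleaned = remove_distance_outliers(consumption, radius=10, min_neighbors=2)
centroids, labels = kmeans_1d(cleaned, k=3)

level_names = ["low", "medium", "high"]
for value, label in zip(cleaned, labels):
    print(value, level_names[label])
```

In this toy data the isolated value 5000 has no neighbours within the radius and is dropped before clustering. A larger `radius` or a smaller `min_neighbors` keeps more faraway points, which is the kind of trade-off the tuning parameters described in the abstract are meant to balance.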


machine learning, data mining, k-means, outlier removal, regional energy consumption analysis model