Abstract
Regional energy saving is an important part of China's energy conservation efforts. In this paper, we develop a regional energy consumption analysis model using energy consumption data from Shaoxing, a typical Chinese industrial city. We incorporate a well-known data mining tool, the k-means clustering method, into the model to automatically classify the energy consumption data into low, medium, and high clusters representing different consumption levels. This classification provides a basis for further analysis to help governments and enterprises use energy more efficiently. However, the data set contains a few potential outliers, and the k-means result can be strongly influenced by them. To reduce the impact of these extremely large data points, we propose a distance-based outlier removal algorithm together with a corresponding parameter selection algorithm, which provides tuning parameters that balance keeping and removing faraway points. The experimental results show that our algorithms effectively reduce the influence of outliers and make the k-means results more meaningful. As one possible direction for further analysis based on our model, we also examine the relationship between consumption levels and industrial output.
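The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the outlier removal or parameter selection algorithms, so the code below substitutes a generic distance-based rule (a value is kept only if enough other values lie within a given radius) followed by plain one-dimensional k-means with three clusters. The function names, the `radius` and `min_neighbors` parameters, and the consumption values are all hypothetical and chosen only for illustration; they are not taken from the Shaoxing data or from the paper's method.

```python
from statistics import mean


def remove_distance_outliers(values, radius, min_neighbors):
    """Generic distance-based filter (an assumed stand-in, not the paper's rule):
    keep a value only if at least `min_neighbors` other values lie within `radius`."""
    kept = []
    for i, v in enumerate(values):
        neighbors = sum(
            1 for j, w in enumerate(values) if j != i and abs(v - w) <= radius
        )
        if neighbors >= min_neighbors:
            kept.append(v)
    return kept


def assign(values, centroids):
    """Label each value with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
        for v in values
    ]


def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm on scalars (requires k >= 2).
    Initial centroids are evenly spaced order statistics, so runs are deterministic."""
    ordered = sorted(values)
    centroids = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    for _ in range(max_iter):
        labels = assign(values, centroids)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(values, centroids)


# Made-up consumption values with one extreme point (illustrative only).
consumption = [10.0, 12.0, 11.0, 50.0, 52.0, 49.0, 100.0, 98.0, 102.0, 1000.0]

cleaned = remove_distance_outliers(consumption, radius=5.0, min_neighbors=1)
centroids, labels = kmeans_1d(cleaned, k=3)
level_names = ["low", "medium", "high"]
levels = [level_names[lab] for lab in labels]

# For comparison: clustering the raw data, outlier included.
raw_centroids, raw_labels = kmeans_1d(consumption, k=3)
```

On this toy input, clustering the raw values gives the single extreme point a cluster of its own and merges the two upper groups, whereas clustering the filtered values separates the three groups. This is the kind of distortion the outlier removal step is meant to prevent, although the actual algorithm and its parameter selection may differ from this sketch.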