Efficient machine learning using partitioned restricted Boltzmann machines
Restricted Boltzmann Machines (RBMs) are energy-based models that are used as generative learning models as well as crucial components of Deep Belief Networks (DBNs). The most successful training method to date for RBMs is Contrastive Divergence. However, Contrastive Divergence is inefficient when the number of features is very high and the mixing rate of the Gibbs chain is slow. We develop a new training method that partitions a single RBM into multiple overlapping atomic RBMs. Each partition (itself an RBM) is trained on a section of the input vector. Because the model is partitioned into smaller RBMs, all available data can be used for training, and the individual RBMs can be trained in parallel. Moreover, as the number of dimensions increases, the number of partitions can be increased to significantly reduce the computational resources required at run time. All other recently developed methods for training RBMs suffer from a serious disadvantage under bounded computational resources: one is forced to use a subsample of the data, run fewer iterations (an early-stopping criterion), or both. Our Partitioned-RBM method provides an innovative scheme to overcome this shortcoming. By analyzing the role of spatial locality in DBNs, we show that spatially local information becomes diffused as the network becomes deeper. We demonstrate that deep learning based on partitioned RBMs is capable of retaining spatially local information. As a result, in addition to the computational improvement, the reconstruction and classification accuracy of the model are also improved by our Partitioned-RBM training method.
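To make the partitioning idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes binary inputs, contiguous partitions that overlap by a fixed number of visible units, and one step of Contrastive Divergence (CD-1) per partition per epoch. The names `partition_slices`, `SmallRBM`, and `train_partitioned`, along with every hyperparameter, are hypothetical choices made for illustration. How the trained partitions are recombined into a single RBM is not specified in the abstract and is not attempted here.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def partition_slices(n_visible, n_parts, overlap):
    """Cut range(n_visible) into n_parts contiguous slices, each widened by
    `overlap` units on both sides and clipped to the input boundaries.
    (Hypothetical partitioning scheme, assumed for illustration.)"""
    bounds = np.linspace(0, n_visible, n_parts + 1).astype(int)
    return [slice(max(0, lo - overlap), min(n_visible, hi + overlap))
            for lo, hi in zip(bounds[:-1], bounds[1:])]


class SmallRBM:
    """A binary RBM trained with one step of Contrastive Divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases
        self.rng = rng

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations given the data.
        h0_prob = sigmoid(v0 @ self.W + self.c)
        h0 = (self.rng.random(h0_prob.shape) < h0_prob).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1_prob = sigmoid(h0 @ self.W.T + self.b)
        h1_prob = sigmoid(v1_prob @ self.W + self.c)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / n
        self.b += lr * (v0 - v1_prob).mean(axis=0)
        self.c += lr * (h0_prob - h1_prob).mean(axis=0)
        # Mean squared reconstruction error, measured before the update.
        return float(np.mean((v0 - v1_prob) ** 2))


def train_partitioned(data, n_parts=4, overlap=2, hidden_per_part=8,
                      epochs=5, seed=0):
    """Train one small RBM per overlapping section of the input vector.
    Every partition sees all training rows, but only its own columns."""
    rng = np.random.default_rng(seed)
    slices = partition_slices(data.shape[1], n_parts, overlap)
    rbms = [SmallRBM(s.stop - s.start, hidden_per_part, rng) for s in slices]
    for _ in range(epochs):
        # The partitions do not share parameters in this sketch, so this
        # inner loop could be distributed across workers.
        for s, rbm in zip(slices, rbms):
            rbm.cd1_step(data[:, s])
    return slices, rbms
```

In this sketch each small RBM holds a weight matrix over only its own slice of the visible units, so the per-partition cost of a CD step shrinks as the number of partitions grows. The loop runs sequentially for simplicity; since no state is shared between partitions here, the inner loop is the natural place to parallelize.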