Missing text reconstruction
Abstract
Missing-text reconstruction (MTR) is a new application of text-oriented pattern recognition. The goal of MTR is to reconstruct documents in which fragments of the original text are missing. Using n-gram models of the document's source language, the MTR algorithm generates sets of hypotheses about the missing text and combines these sets with a probability-combining rule to form the best-supported reconstruction of the missing text. A prototype software system (mitre) was developed as a proof of concept for the MTR techniques discussed.
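
As an illustration of the general idea only, the following minimal sketch reconstructs a single missing character. It is not the mitre implementation: the character-bigram model, the add-one smoothing, the normalised product used as the combining rule, and all function names are assumptions made for this example. One hypothesis set is derived from the character to the left of the gap, a second from the character to the right, and the two are combined into a single ranking.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count character bigrams: counts[a][b] is how often b follows a."""
    counts = defaultdict(Counter)
    for a, b in zip(corpus, corpus[1:]):
        counts[a][b] += 1
    return counts


def forward_hypotheses(counts, left, alphabet):
    """Hypothesis set from the left context: P(c | left), add-one smoothed."""
    total = sum(counts[left].values()) + len(alphabet)
    return {c: (counts[left][c] + 1) / total for c in alphabet}


def backward_hypotheses(counts, right, alphabet):
    """Hypothesis set from the right context: P(right | c), add-one smoothed."""
    return {
        c: (counts[c][right] + 1) / (sum(counts[c].values()) + len(alphabet))
        for c in alphabet
    }


def combine(*hypothesis_sets):
    """Assumed combining rule: multiply the scores, then renormalise."""
    scores = {}
    for c in hypothesis_sets[0]:
        p = 1.0
        for h in hypothesis_sets:
            p *= h[c]
        scores[c] = p
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}


def reconstruct_char(counts, left, right, alphabet):
    """Return the best-supported candidate and the full combined distribution."""
    combined = combine(
        forward_hypotheses(counts, left, alphabet),
        backward_hypotheses(counts, right, alphabet),
    )
    return max(combined, key=combined.get), combined


# Toy usage: fill the gap in "t?e" with a model trained on a tiny corpus.
corpus = "the cat sat on the mat. the hat sat on the cat."
alphabet = sorted(set(corpus))
model = train_bigrams(corpus)
best, distribution = reconstruct_char(model, "t", "e", alphabet)
print(best, round(distribution[best], 3))
```

A realistic system would use higher-order n-grams, handle gaps longer than one character, and might use a different combining rule; the product rule here is only one simple choice.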