eScript is an app for electronically transcribing handwritten and typed documents – a kind of electronic scriptorium.

Its approach is based on machine learning: a neural network model carries out the recognition. A neural network is a computer model loosely inspired by the way the human brain works, but the details are not important unless you are particularly interested.

Essentially, there is a computer model that has to be taught, rather as the human brain has to be taught. The teaching method is to prepare a number of shapes (or 'glyphs') in an image format (such as 'jpg' or 'png') and to associate a meaning with each shape. In our case, the meanings are computer-readable characters (such as 'a', 'f', '5', '&'), each associated with the shape of that character.
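
As a rough illustration of what such teaching material might look like, the sketch below pairs image files with their meanings. It is not eScript's own code or file format: the names GlyphSample, make_sample and IMAGE_SUFFIXES, and the file paths, are assumptions invented for this example.

```python
from dataclasses import dataclass
from pathlib import Path

# Assumed set of accepted image formats (the text mentions 'jpg' and 'png').
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


@dataclass(frozen=True)
class GlyphSample:
    """One teaching example: a shape (image file) and its meaning."""
    image_path: Path
    meaning: str


def make_sample(image_path: str, meaning: str) -> GlyphSample:
    """Build a sample, rejecting non-image files and empty meanings."""
    path = Path(image_path)
    if path.suffix.lower() not in IMAGE_SUFFIXES:
        raise ValueError(f"not an image file: {image_path}")
    if not meaning:
        raise ValueError("a shape must be given a meaning")
    return GlyphSample(path, meaning)


# Hypothetical file names; nothing is read from disk here.
training_set = [
    make_sample("glyphs/a_01.png", "a"),
    make_sample("glyphs/f_01.png", "f"),
    make_sample("glyphs/5_01.jpg", "5"),
    make_sample("glyphs/amp_01.png", "&"),
]
```

Each entry simply records which meaning belongs to which shape; a list like this is the kind of material that would be handed to the model for training.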

Once sufficient shapes and meanings have been prepared, they are fed into the model to 'train' it. Then, when a similar shape is presented to the model in the future (during transcription), the expectation is that the model will recognise the shape and substitute the meaning that has been associated with it. It does this on a probability basis, so an exact match between shapes should not be necessary, just a close similarity. We can thus present a document to the model as a series of shapes and see what results come out at the other end of the process.
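
The idea of matching "on a probability basis" can be shown with a toy example. The sketch below is not a neural network and is not how eScript is implemented: it compares a shape against a few stored templates and turns the similarity scores into probabilities. The 3x3 grids, the TEMPLATES table, and the similarity and recognise functions are all assumptions made up for illustration.

```python
import math

# Toy 3x3 black-and-white "images", flattened row by row (1 = ink).
# These templates are invented for this example.
TEMPLATES = {
    "l": (0, 1, 0,
          0, 1, 0,
          0, 1, 0),
    "-": (0, 0, 0,
          1, 1, 1,
          0, 0, 0),
    "+": (0, 1, 0,
          1, 1, 1,
          0, 1, 0),
}


def similarity(a, b):
    """Fraction of pixels on which two shapes agree (0.0 to 1.0)."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def recognise(shape, templates=TEMPLATES, sharpness=10.0):
    """Return the most probable meaning and the probability of each meaning."""
    scores = {m: similarity(shape, t) for m, t in templates.items()}
    # Softmax: higher similarity gives a higher share of the probability.
    weights = {m: math.exp(sharpness * s) for m, s in scores.items()}
    total = sum(weights.values())
    probabilities = {m: w / total for m, w in weights.items()}
    best = max(probabilities, key=probabilities.get)
    return best, probabilities


# An 'l' with one stray pixel: not an exact match for any template.
noisy_l = (0, 1, 0,
           0, 1, 0,
           0, 1, 1)
best, probabilities = recognise(noisy_l)
```

Here the noisy shape matches no template exactly, yet 'l' still receives the highest probability because it is the closest. A trained neural network learns its own internal notion of similarity rather than comparing pixels directly, but the principle of choosing the most probable meaning is the same.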

Bear in mind that the association between shapes and meanings is straightforward with typed characters but may be more problematic with handwritten characters. In particular, it is probably not possible to provide a universal set of shapes and meanings that will work for all handwriting, because each individual’s handwriting is almost unique. Also note that a shape does not have to be associated with a single character – a shape that represents several characters written close together (or overlapping) may need a multiple-character meaning. This is perfectly valid.
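
A multiple-character meaning needs no special treatment when the transcription is assembled, as the short sketch below suggests. The glyph identifiers, the GLYPH_MEANINGS table and the transcribe function are hypothetical and serve only to illustrate the point.

```python
# Hypothetical glyph identifiers mapped to their meanings.
GLYPH_MEANINGS = {
    "glyph_001": "o",
    "glyph_002": "ff",  # two overlapping letters kept as a single shape
    "glyph_003": "i",
    "glyph_004": "ce",  # two letters written too close together to split
}


def transcribe(glyph_ids, meanings=GLYPH_MEANINGS):
    """Join the meanings of the recognised shapes, in reading order."""
    return "".join(meanings[glyph_id] for glyph_id in glyph_ids)


text = transcribe(["glyph_001", "glyph_002", "glyph_003", "glyph_004"])
```

Four shapes produce the six-character word 'office': the meaning attached to a shape is just a piece of text, whether it is one character long or several.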

