Speech transcription with Sphinx

Transcription is slower than I would like (at least on my machine), but once network latency is taken into account it is roughly on par with Google's transcription service. It could probably be sped up with smaller training sets and specialised configuration parameters, at the cost of some transcription quality, which seems alright at the moment. Finding the right settings will likely take a lot of trial and error.
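
Since tuning like this comes down to trial and error, it helps to measure each attempt the same way. The following is a minimal sketch, not part of the actual setup: it times an arbitrary transcription function and reports the real-time factor (processing time divided by audio length). The names `real_time_factor` and `fake_transcribe` and the file name `sample.wav` are made up for illustration; `fake_transcribe` is only a stand-in and would have to be replaced by whatever call actually runs Sphinx.

```python
import time
from typing import Callable


def real_time_factor(
    transcribe: Callable[[str], str],
    audio_path: str,
    audio_seconds: float,
) -> float:
    """Return processing time divided by audio length (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    start = time.perf_counter()
    transcribe(audio_path)  # the result is discarded; only the duration matters here
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds


def fake_transcribe(audio_path: str) -> str:
    # Hypothetical stand-in for the real recogniser call (assumption):
    # it just sleeps briefly and returns a fixed string.
    time.sleep(0.05)
    return "hello world"


if __name__ == "__main__":
    # Assumes a 10-second recording; the path is never opened by the stand-in.
    rtf = real_time_factor(fake_transcribe, "sample.wav", audio_seconds=10.0)
    print(f"real-time factor: {rtf:.3f}")
```

A value below 1.0 means the transcription finishes faster than the recording plays. Running the same recording through each configuration and comparing the numbers makes the speed side of the trade-off visible; quality still has to be judged separately.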

