asnowfall: 03/04/08

This is about my Audio-dictionary project, whose goal is to convert an English dictionary into an MP3 file. An English dictionary contains far too many words, and frankly I am not interested in all of them. In fact I need just a tiny part of it, so this audio dictionary holds a selected few words.

Simplistic view
~~~~~~~~~~~~~~~
(M-W Dictionary) ----> (Text-To-Speech Engine) ---> (WAV file)

Detailed steps
~~~~~~~~~~~~~~~
0) Installed the AT&T Text-To-Speech engine ("TTS Engine") on my PC
1) Chose the words from the TEXT version of the Merriam Webster Dictionary,
to make files like a.xml, b.xml, etc.
2) Wrote a program to
a. read words from the XML file and feed them into the "TTS Engine"
b. make the "TTS Engine" save the words as a WAV file
3) Convert the WAV file to an MP3 file.
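
Step 2a can be sketched in a few lines of Python. This is an illustration only: the real program was written in C++, and the element names `entry`, `word` and `meaning`, as well as the sample data, are assumptions rather than the actual layout of a.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout of a word file; the real a.xml may differ.
SAMPLE = """<dictionary>
  <entry><word>abase</word><meaning>to lower in rank or esteem</meaning></entry>
  <entry><word>abate</word><meaning>to reduce in amount or intensity</meaning></entry>
</dictionary>"""


def lines_for_tts(xml_text):
    """Return one line of text per entry, ready to hand to a TTS engine."""
    root = ET.fromstring(xml_text)
    lines = []
    for entry in root.iter("entry"):
        word = entry.findtext("word", default="").strip()
        meaning = entry.findtext("meaning", default="").strip()
        if word:  # skip entries that have no headword
            lines.append(f"{word}. {meaning}")
    return lines


if __name__ == "__main__":
    for line in lines_for_tts(SAMPLE):
        print(line)
```

Each returned line would then be passed to the TTS engine, which is the part that depends on the engine's own API and is not shown here.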

Current status
~~~~~~~~~~~~~~
I have prepared the WAV file containing the words starting with the letter A, and have yet to repeat this proven mechanism for the words starting with the other letters.

Why.....?
In my profession of software engineering I use few words for everything, and I feel lacking when I try to write paragraphs; compound sentences in particular are open pits. To do something about this I am trying to improve my grammar and my vocabulary, and this audio dictionary is part of that effort.

Collecting words...
Two years ago, I started thinking about making an audio dictionary and began underlining words in a paper edition of the pocket-sized Oxford English dictionary. The next step was to transfer the words from paper to PC by typing them in manually (note that I had chosen only a few words under each letter). I had two concerns about this method: I was using an abridged dictionary with only a subset of words, and the pocket dictionary did not have examples. I believe that the best way to keep a word in memory is to hear or read it being used.

e-Dictionary instead of paper version..
I started searching again for a different source of words, that is, a different dictionary. This new, slightly more ambitious agenda would have required me to transfer many more words from paper to PC, which looked impractical. That is when I turned to an e-Dictionary, so that I would not need to type the words into the PC at all.

Web-based dictionary or Text-based dictionary....
I started with www.m-w.org, a Web-based dictionary, and thought of programmatically querying the web server for each word's meaning. This strategy did not go anywhere; moreover, www.m-w.org lacked examples, so I gave up the idea of using it. No other online dictionaries, such as www.dictionary.com, were worth pursuing because they had no history of publishing paperback editions.

Cleaning up Gutenberg's project dictionary....
Then, reluctantly, I browsed the Project Gutenberg site and found a text-based version of the 1903 edition of the Merriam Webster dictionary. There were 26 files, one for every letter, and these files were big, with elaborate information on every word, including examples. I also found that these files had a syntax similar to XML. I decided to use Gutenberg's files as my source of words and started deleting the unnecessary details about words, using my C# scripts. In addition, I deleted hundreds of words completely.

Regular Expression
Cleaning the dictionary was all about searching for patterns and replacing them, and regular expressions are made for this kind of work. I used the Visual Studio 2005 editor to search and replace based on patterns.
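
The kind of pattern-based cleanup described above might look like this in Python. It is a sketch, not the expressions actually used in Visual Studio: the tag names (`hw`, `pr`, `ety`, `def`) and the sample entry are assumptions made up for illustration.

```python
import re

# Hypothetical raw entry; tag names and content are illustrative only.
RAW = '<hw>Abase</hw> <pr>(a*bas")</pr> <ety>[F. abaisser]</ety> <def>To lower in rank.</def>'

# Elements to drop entirely, tag and content (assumed names).
UNWANTED = ("pr", "ety")


def strip_unwanted(line, tags=UNWANTED):
    """Delete each unwanted element, then collapse the leftover spaces."""
    for tag in tags:
        # Non-greedy match so each element is removed on its own.
        line = re.sub(rf"<{tag}>.*?</{tag}>", "", line)
    return re.sub(r"\s{2,}", " ", line).strip()


if __name__ == "__main__":
    print(strip_unwanted(RAW))
```

The non-greedy `.*?` matters here: a greedy `.*` would swallow everything between the first opening tag and the last closing tag on the line.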

At present I have done this for the file containing the words starting with "A" and need to repeat it for the other files. So I have A.xml and A.wma.

The right Text-To-Speech (TTS) voice....
Windows XP comes with a synthetic voice that sounds too robotic to comprehend. Then I came across the sweet-sounding AT&T Natural Voices. At first I thought they would be expensive, but they cost just $40, which I think is cheap for such natural-sounding voices, named Mike16 and Crystal16.

Program using SAPI ....
Now I had both the TTS voice and A.xml ready, and just needed a program to take the lines from A.xml and feed them to the TTS engine. I read up on SAPI, Microsoft's COM-based speech library, and wrote an application in C++. SAPI is flexible in the sense that one can set different volumes for words within the same sentence, and it also has an XML syntax for spelling out words; my program is able to use these features. The last surprise was that SAPI can save the voice into a WAV file, which I had not expected. Now I can add SAPI to my resume.
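
To illustrate the markup mentioned above, here is a small Python sketch that builds one line of text using the XML tags of Microsoft's speech API (SAPI 5), with a different volume for the headword and its letters spelled out. The `<volume>` and `<spell>` tags are SAPI 5 TTS markup; the helper name `mark_up` and the volume levels are assumptions, and the real program was written in C++.

```python
from xml.sax.saxutils import escape


def mark_up(word, meaning, word_volume=100, meaning_volume=70):
    """Build SAPI 5 XML text: loud headword, spelled letters, quieter meaning."""
    w = escape(word)      # escape &, < and > so the markup stays valid
    m = escape(meaning)
    return (
        f'<volume level="{word_volume}">{w}</volume> '
        f'<spell>{w}</spell> '
        f'<volume level="{meaning_volume}">{m}</volume>'
    )


if __name__ == "__main__":
    print(mark_up("abate", "to reduce in amount or intensity"))
```

A string like this would be handed to the engine's speak call with XML parsing enabled; that call is Windows-specific and is left out of the sketch.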

Is this all useful?
I have listened to the WAV file several times and, surprisingly, I find the XML file more useful than the WAV file. XSLT made a.xml very readable; here is a snapshot of it.

View of a.xml
