
DLL FUNCTIONS:
  The three TTS DLL functions below must be invoked in sequence within a thread: InitTTS() first, then GenSpeechFile() as many times as needed, and finally StopTTS().
  Because these functions support multi-threaded applications, InitTTS() may be invoked many times in different threads before GenSpeechFile() is invoked.
   
  long InitTTS(long *Hndl, char *InitCondition);
  long GenSpeechFile(long Hndl, char *InputCharString, char *OutputFilePathAndName, long *Prcnt);
  long StopTTS(long Hndl);
   
InitTTS( )  initializes the environment, loads parameters and signal files, and assigns a processing handle. On success it returns zero; otherwise it returns an error code.
  The first parameter, Hndl, carries in a desired handle number between 1 and 255. If you do not care which number is assigned, set Hndl to zero before invoking InitTTS(). On success, the assigned handle number is carried out through this same parameter.
  The second parameter, InitCondition, carries in the desired options. The acceptable options are explained below.
   
GenSpeechFile( )  synthesizes a speech signal file from the given text. On success it returns zero; otherwise it returns an error code.
  The parameter, Hndl, carries in the handle number.
  The parameter, InputCharString, carries in the text string to be synthesized.
  The parameter, OutputFilePathAndName, designates the output path and file name in which the synthesized speech signal is stored.
  The parameter, Prcnt, is currently not used, but it must be set to NULL before invoking GenSpeechFile().
   
StopTTS( )  frees the memory allocated for a handle. The parameter, Hndl, carries in the handle number.
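The calling sequence described above might look like the following in C. This is only a sketch under stated assumptions, not sample code shipped with the DLL: the helper name speak_to_file() is hypothetical, and the three stub functions merely stand in for the DLL so that the example compiles by itself. The stubs' failure code (1) and the assumption that a failed InitTTS() leaves nothing to free are not taken from this document.

```c
#include <assert.h>
#include <stddef.h>

/* --- Stand-ins for the DLL (assumption, for illustration only) ---------
   These stubs mimic only the documented contract (zero means success).
   Remove them and link against the real DLL in an actual program. */
static int open_handles = 0;

long InitTTS(long *Hndl, char *InitCondition)
{
    (void)InitCondition;
    if (*Hndl == 0)
        *Hndl = 1;              /* caller had no preference */
    open_handles++;
    return 0;
}

long GenSpeechFile(long Hndl, char *InputCharString,
                   char *OutputFilePathAndName, long *Prcnt)
{
    (void)Hndl;
    if (InputCharString == NULL || OutputFilePathAndName == NULL || Prcnt != NULL)
        return 1;               /* made-up error code */
    return 0;
}

long StopTTS(long Hndl)
{
    (void)Hndl;
    open_handles--;
    return 0;
}

/* --- Hypothetical helper: init, synthesize one file, stop -------------- */
long speak_to_file(char *options, char *text, char *out_path)
{
    long handle = 0;            /* 0 = let InitTTS() assign a handle */
    long rc, stop_rc;

    assert(options != NULL);

    rc = InitTTS(&handle, options);
    if (rc != 0)
        return rc;              /* assumed: nothing to free on failure */

    /* Prcnt is unused but must be NULL. */
    rc = GenSpeechFile(handle, text, out_path, NULL);

    /* Release the handle even if synthesis failed. */
    stop_rc = StopTTS(handle);
    return rc != 0 ? rc : stop_rc;
}
```

A call such as speak_to_file("sndsrc=0 outhead", "Hello.", "hello.wav") returns zero when every step succeeds. To synthesize several files, keep the handle and call GenSpeechFile() repeatedly between InitTTS() and StopTTS() instead of re-initializing each time.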


OPTIONS:
  The following options can be placed, separated by blanks, in the InitCondition parameter of InitTTS() before GenSpeechFile() is invoked.
"pathpr=...", e.g. "pathpr=c:\gtswin\"
  sets the working directory path. The path name must contain no blanks, because a blank is the delimiter between options. This option is checked before "gts.ini".
   
"gts.ini"  if used, all other options except "pathpr=..." are read from the first line of the file gts.ini.
   
"sndsrc=#", # can be 0, 1, 2, or 3.
  selects a source signal file pair.
"sndsrc=0" selects the pair base.dic (male voice waveform sampled at 11,025 Hz) and basepit.dic (pitch peak position information).
"sndsrc=1" (the default) selects the pair base_.dic (male voice sampled at 22,050 Hz) and basepit_.dic (pitch peak position information).
"sndsrc=2" selects the pair basef.dic (female voice sampled at 22,050 Hz; not provided for download) and basepitf.dic (pitch peak position information).
"sndsrc=3" selects the pair basef8.dic (female voice sampled at 11,025 Hz) and basepitf8.dic (pitch peak position information).
   
"msspkr=Mary" (the default) or "msspkr=Mike"
  selects the English source signal speaker.
   
"timeout=#", # is a floating point number.
  sets the time-out value in seconds. A time-out error code is returned if the elapsed time exceeds the specified value. The default is 7.8 seconds.
   
"noanti"  skips anti-aliasing processing to gain synthesis speed, at the cost of slightly degraded signal quality.
   
"interl"  raises synthesis speed for the female recorded source signal by using pitch periods in an interleaved manner.
   
"fast"  not a new option; it implies both "noanti" and "interl". Combine it with "sndsrc=0" or "sndsrc=3" to obtain fast synthesis speed.
   
"-16"  outputs synthesized signal samples in 16 bits/sample PCM format. The default is 8 bits/sample mu-law format.
   
"outhead"  places a "wav" header record at the start of the synthesized speech file.
   
"outrt=#", # should be a number between 6000 and 24000.
  sets the sampling rate of the synthesized signal. The default is 8000, but at least 11025 is suggested. Also, if outrt=11025 is specified together with sndsrc=0 or sndsrc=3, no down-sampling is required, which results in higher synthesis speed.
   
"adpcm"  stores synthesized signal samples in ADPCM format. This implies "-16" and "outhead", because a utility program is invoked at the end to do the format conversion.

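Because InitCondition is a single string in which a blank separates the options, a small helper can assemble it and reject any value that contains a blank (for example a path with spaces). The sketch below is an illustration only: add_option() is a hypothetical name and is not part of the DLL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append one option to buf, separated from earlier options by one blank.
   buf must already hold a NUL-terminated string (possibly empty).
   Returns 0 on success, -1 if the option is empty, contains a blank,
   or does not fit in buf. */
int add_option(char *buf, size_t buf_size, const char *option)
{
    size_t used, need;

    assert(buf != NULL && option != NULL && buf_size > 0);

    /* A blank is the option delimiter, so none may appear inside one. */
    if (option[0] == '\0' || strchr(option, ' ') != NULL)
        return -1;

    used = strlen(buf);
    need = strlen(option) + (used > 0 ? 1 : 0);   /* +1 for the separator */
    if (used + need + 1 > buf_size)               /* +1 for the final NUL */
        return -1;

    if (used > 0)
        buf[used++] = ' ';
    strcpy(buf + used, option);
    return 0;
}
```

Starting from an empty buffer, adding "sndsrc=0", "outrt=11025", "-16" and "outhead" in turn yields the string "sndsrc=0 outrt=11025 -16 outhead", which can then be passed as InitCondition. That particular combination follows the note above that outrt=11025 with sndsrc=0 or sndsrc=3 avoids down-sampling.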

TEXT-DRIVEN CONTROLS:
@>0,  The comma is required. Resets all parameters to their default values.
@>dxxx  xxx represents three decimal digits in the range 160 to 990. This value controls the speaking rate, i.e. xxx milliseconds in duration per syllable.
@>txxx  xxx represents three decimal digits in the range 060 to 400. This value controls tone height, i.e. xxx Hz on average.
@>vxxx  xxx represents three decimal digits in the range 050 to 200. This value controls vocal-tract shortening/lengthening to produce distinct timbres.
@>cxxx  xxx represents three decimal digits in the range 060 to 400. This is a combined control of tone height and vocal-tract length that produces distinct timbres from a single tone-height value, xxx.
@>Bxxx  xxx represents three decimal digits in the range 050 to 250. This value controls the rate of pitch bending. Values less than 100 mean flattening; values larger than 100 mean sharpening.
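
Since every control except the reset takes exactly three decimal digits within a fixed range, a formatter that zero-pads and range-checks the value can prevent malformed tags. The following is a minimal sketch; format_control() and control_range() are hypothetical names, and how a tag must be positioned within the input text is not covered here.

```c
#include <assert.h>
#include <stdio.h>

/* Look up the valid range for a control letter, as listed above.
   Returns 0 if the letter is known, -1 otherwise. */
static int control_range(char letter, int *lo, int *hi)
{
    switch (letter) {
    case 'd': *lo = 160; *hi = 990; return 0;  /* ms per syllable      */
    case 't': *lo = 60;  *hi = 400; return 0;  /* tone height in Hz    */
    case 'v': *lo = 50;  *hi = 200; return 0;  /* vocal-tract length   */
    case 'c': *lo = 60;  *hi = 400; return 0;  /* combined tone/timbre */
    case 'B': *lo = 50;  *hi = 250; return 0;  /* pitch-bending rate   */
    default:  return -1;
    }
}

/* Write a tag such as "@>t060" into out, which needs at least 7 bytes
   ("@>" + letter + three digits + NUL). Returns 0 on success, -1 for an
   unknown letter, an out-of-range value, or a buffer that is too small. */
int format_control(char *out, size_t out_size, char letter, int value)
{
    int lo, hi;

    assert(out != NULL);

    if (out_size < 7 || control_range(letter, &lo, &hi) != 0)
        return -1;
    if (value < lo || value > hi)
        return -1;

    snprintf(out, out_size, "@>%c%03d", letter, value);
    return 0;
}
```

For instance, format_control(tag, sizeof tag, 't', 60) writes "@>t060", while a speaking-rate value of 159 is rejected because it lies below the documented minimum of 160.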