http://voiceglue.org/wiki/doku.php?id=voiceglue_0.11_installation_instructions
http://www.i6net.com/support/install/
http://www.w3.org/TR/voicexml21/
Libraries to install: Xerces-C++, SpiderMonkey
Xerces-C++ http://xerces.apache.org/xerces-c/index.html
http://xerces.apache.org/xerces-c/build-winunix-2.html#UNIX
SpiderMonkey : https://developer.mozilla.org/En/SpiderMonkey/Build_Documentation
Programming references.
http://docs.voxeo.com/voicexml/2.0/frame.jsp?page=learningvoicexml.htm
The assign element is used to explicitly assign a value to a variable. For ease of development, the Prophecy implementation ignores the requirement that a variable be declared before a value is assigned to it. See the documentation on 'variables' for detailed information on scoping, assignment, and session variables.
The audio element allows you to play an audio sound file in your application, assuming that your file is in the following formats:
If your audio URL cannot be located, you can nest backup text-to-speech within the audio tags to ensure that some message still gets played. See Appendix F in the Voxeo documentation for further information and helpful hints.
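A minimal sketch of the fallback described above. The file name welcome.wav and the prompt wording are assumptions made up for illustration; whether the file plays depends on the formats your platform supports.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="welcome">
    <block>
      <prompt>
        <!-- welcome.wav is a hypothetical file; the nested text is
             spoken by TTS only if the audio cannot be fetched or played -->
        <audio src="welcome.wav">Welcome to the example service.</audio>
      </prompt>
    </block>
  </form>
</vxml>
```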
The block element is simply a form-item container for executable content, which executes if the item's condition evaluates to 'true'.
The break element designates a pause in the TTS output, with the length given either as a user-specified time value in milliseconds or as a predetermined 'size' value.
The catch element is used to intercept application and user-defined errors, conditions, and messages, thereby allowing the developer to assign event handlers on a scoped basis. The content nested within the catch element is executed only when its particular event is thrown.
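As a rough illustration of scoped handlers, the sketch below places one catch at document level and one at form level. The event name com.example.giveup is a made-up user-defined event, and the prompts are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-scoped handler: applies to every dialog in this document -->
  <catch event="error.badfetch">
    <prompt>Sorry, a page could not be loaded.</prompt>
    <exit/>
  </catch>
  <form id="main">
    <!-- Form-scoped handler for a hypothetical user-defined event -->
    <catch event="com.example.giveup">
      <prompt>Goodbye.</prompt>
      <disconnect/>
    </catch>
    <block>
      <!-- Throwing the event runs the form-scoped handler above -->
      <throw event="com.example.giveup"/>
    </block>
  </form>
</vxml>
```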
The choice element is used in conjunction with the menu element to create robust voice menus that allow the caller multiple navigational choices without the need for traditional grammar structures. The menu tag acts as the container element, while the choice element defines the available menu items.
The clear element resets any existing VoiceXML variable or form-item guard variable to an undefined value, such as any user-defined variable set with the var or assign tags. The clear element can also be used to clear any block, subdialog, menu, record, transfer, or field guard variables, thus programmatically allowing a caller to revisit these form items.
The <data> element is a new addition in the VoiceXML 2.1 specification that allows the developer to fetch content from an XML source without having to use any server-side logic, and without having to transition to a new dialog.
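A hedged sketch of fetching XML with <data>. The URL and the shape of the returned document (a single price element) are assumptions for illustration only; the DOM calls used are from the read-only DOM subset that VoiceXML 2.1 exposes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="quote">
    <!-- Hypothetical URL, assumed to return e.g. <quote><price>42</price></quote> -->
    <data name="quoteDoc" src="http://example.com/quote.xml"/>
    <block>
      <!-- Walk the fetched document through the DOM to reach the text node -->
      <var name="price"
           expr="quoteDoc.documentElement.getElementsByTagName('price').item(0).firstChild.data"/>
      <prompt>The price is <value expr="price"/>.</prompt>
    </block>
  </form>
</vxml>
```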
The disconnect element is used to programmatically disconnect the caller from the voice application. It also throws a connection.disconnect event to the interpreter, which allows the developer to submit any existing variables to their web server.
The else element is the final branch in a series of conditional statements. The content following the else element is executed when all preceding if and elseif cond attributes evaluate to 'false'.
The elseif element specifies additional content to execute when the preceding if and elseif conditions evaluate to 'false' and its own cond evaluates to 'true'. When a series of if-elseif-else statements is encountered, the application executes the first branch whose condition evaluates to 'true'.
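The branching described above might look like the following fragment, meant to sit inside a vxml document. The variable name hour and the prompts are illustrative assumptions; note that elseif and else are empty elements placed inside the if, and that < must be escaped inside attribute values.

```xml
<form id="greet">
  <!-- hour is a hypothetical variable holding the current hour (0-23) -->
  <var name="hour" expr="new Date().getHours()"/>
  <block>
    <if cond="hour &lt; 12">
      <prompt>Good morning.</prompt>
    <elseif cond="hour &lt; 18"/>
      <prompt>Good afternoon.</prompt>
    <else/>
      <prompt>Good evening.</prompt>
    </if>
  </block>
</form>
```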
The emphasis element allows the developer to specify the level, or stress, of any specified TTS enclosed within the emphasis element. Do note that SSML tags are ignored by some of the available TTS engines, while some engines use their own non-compliant markup. Check the Text-To-Speech appendices for additional details.
The enumerate element is used to read the list of menu choices to the caller, using either TTS or a user-defined audio file.
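A small sketch tying menu, choice, and enumerate together. The form ids and prompt wording are invented for illustration; dtmf="true" asks the interpreter to assign implicit DTMF keys to the choices.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main" dtmf="true">
    <!-- enumerate reads back the text of each choice below -->
    <prompt>Please choose: <enumerate/></prompt>
    <choice next="#sales">sales</choice>
    <choice next="#support">support</choice>
    <noinput>I did not hear you. <reprompt/></noinput>
  </menu>
  <form id="sales">
    <block><prompt>Connecting you to sales.</prompt></block>
  </form>
  <form id="support">
    <block><prompt>Connecting you to support.</prompt></block>
  </form>
</vxml>
```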
The error element is shorthand for the <catch event="error"> handler, and will catch all events whose name begins with 'error'. See the documentation for 'exceptions' to learn more.
The example element is part of the Speech Recognition Grammar Specification, and is only usable within XML grammar constructs. This element defines an 'example' phrase, useful to the developer, which can be used to create a successful recognition result for the grammar in question.
The exit element is similar to the disconnect tag, in that it will terminate the current dialog and return control to the interpreter; i.e., the browser will release the call. The difference is that the exit element allows the developer to specify both a namelist and an expression that can be sent back to the browser.
The field element facilitates a dialog which allows the interpreter to collect information from the user. Caller utterances are matched against any active grammars, until the element's filled condition is executed.
The filled element allows the developer to specify actions to take when a grammar match has occurred. Once the interpreter has recognized a valid match for the variables named in the filled element's namelist attribute, the filled element executes any code contained within it (such as a conditional if/else statement).
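A minimal field/filled sketch, to be placed inside a vxml document. It assumes the platform supports the built-in digits grammar type; the field name pin and the prompts are illustrative.

```xml
<form id="login">
  <!-- Built-in grammar type: accepts exactly four spoken or keyed digits -->
  <field name="pin" type="digits?length=4">
    <prompt>Please enter your four digit PIN.</prompt>
    <nomatch>That was not four digits. <reprompt/></nomatch>
    <!-- Runs once the field variable has been filled by a grammar match -->
    <filled>
      <prompt>You entered <value expr="pin"/>.</prompt>
    </filled>
  </field>
</form>
```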
The VoiceXML 2.1 specification introduces the new <foreach> element, which gives the developer a built-in browser method for looping through items in an array and outputting them via TTS or via <audio>. This element must specify both the 'array' and the 'item' attributes, else an error.semantic will be thrown.
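A short sketch of <foreach> inside a prompt, to be placed inside a VoiceXML 2.1 document. The array name cities and its contents are made up for illustration.

```xml
<form id="list">
  <!-- cities is a hypothetical ECMAScript array in dialog scope -->
  <script>var cities = ['Boston', 'Chicago', 'Denver'];</script>
  <block>
    <prompt>
      We fly to:
      <!-- item names the loop variable, array names the array to walk -->
      <foreach item="city" array="cities">
        <value expr="city"/> <break time="300ms"/>
      </foreach>
    </prompt>
  </block>
</form>
```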
The form element acts as a container for all input items (such as a field or a subdialog) and for all control items (such as a block or an initial element). The form element is considered the basic component of any VoiceXML dialog, and is part of a balanced breakfast.
The goto tag is used to transition application execution to a specific form within the current document, or to an entirely separate document. Additionally, it can transition execution to a specific form within an entirely separate document.
The grammar element allows the developer to specify a GSL or XML recognition grammar for either voice or DTMF input. Grammars may be defined inline (within the vxml document itself), or as a standalone, external file.
The help element provides a syntactic shorthand for <catch event="help">. It is used as a handler for the help event, which is part of the universal grammars of any VoiceXML application.
The if element (in conjunction with the else/elseif elements) provides conditional logic that lets the developer change the control flow within the application based on user utterances, variable values, or events.
The initial element is used in mixed-initiative dialogs (where the caller's first utterances dictate application flow), allowing the caller to fill in form-wide information with a single utterance and thus skip over the individual field prompts.
The item element defines a valid utterance match within an XML grammar. This element also allows repeats and probability weighting to be added to the utterance in question.
The link tag allows the developer to easily implement a document or application scoped grammar and transition/event handler for a caller's input. This element is usually placed in the application root document at the vxml level in order to implement global grammars/event handlers.
The log element allows the developer to output debug messages to the Voxeo Logger. Generous use of log statements placed within code can greatly assist when tracking variable values and errors that occur in the application.
The mark element inserts markers into output streams for asynchronous notification.
The menu tag is another 'shortcut element' that emulates the field, grammar, and goto elements. In addition, it also allows for both implied DTMF and voice grammars.
The meta element functions in VoiceXML as it does in HTML. It denotes meta information about the document and its HTTP headers. Do note that when <meta> is used in a root document, all leaf documents will inherit the actions of <meta>.
The noinput tag acts as a syntactic shorthand for the expression <catch event="noinput">. It allows the developer to assign event handlers when the application expects voice or DTMF input, but has received none from the caller.
The nomatch element is a syntactic shorthand for the expression <catch event="nomatch">. This element allows the developer to assign handlers when the caller inputs a value that is not recognized by any of the active grammars.
The one-of element allows the grammar to be constructed with a series of alternate phrases or rule expansions, each of which is contained within an item element. This element is the basic building block of any XML grammar that defines more than one valid utterance.
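A sketch of a standalone XML (SRGS) grammar built from rule, one-of, item, and tag. The rule id, the phrases, and the weight are illustrative assumptions, and the tag contents assume a platform that accepts the semantics/1.0 tag format; other platforms may expect a different tag syntax.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="color"
         tag-format="semantics/1.0">
  <rule id="color" scope="public">
    <one-of>
      <!-- Each item is one valid utterance; tag sets the returned value -->
      <item>red <tag>out = "red";</tag></item>
      <!-- weight marks this alternative as more likely than the others -->
      <item weight="2">blue <tag>out = "blue";</tag></item>
      <item>green <tag>out = "green";</tag></item>
    </one-of>
  </rule>
</grammar>
```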
The option element is a convenient grammar shortcut, which allows the developer to list simple choices to the caller rather than constructing a complete grammar with return slots.
Nesting TTS within the paragraph element (or simply, the p element) specifies that the particular block of TTS should be read out in paragraph structure.
The param element is used when submitting data to a subdialog or an object. Note that any param sent to an external subdialog file must be declared as a form-level variable within the target page.
The phoneme element essentially allows the developer to specify a phonetic pronunciation for any TTS nested within. Do note that SSML tags are ignored by some of the available TTS engines, while some engines use their own non-compliant markup. Check the Text-To-Speech appendices for additional details.
The prompt element allows the developer to output synthesized text-to-speech content to the caller.
The property element defines any platform-defined setting which impacts the behavior of the application. A property can be set to encompass the entire application (when set at the vxml level of an application root document), or a property can be defined for a specific form item.
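To illustrate the two scopes mentioned above, this sketch sets the standard timeout property once at the vxml level and overrides it for one field. The values and the field name are arbitrary examples.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-wide default: wait 5 seconds for caller input -->
  <property name="timeout" value="5s"/>
  <form id="survey">
    <field name="answer" type="boolean">
      <!-- Override for this field only: wait up to 10 seconds -->
      <property name="timeout" value="10s"/>
      <prompt>Would you recommend us? Please say yes or no.</prompt>
    </field>
  </form>
</vxml>
```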
The prosody element replaces the non-compliant pros element from VXML 1.0, and allows the developer to change several aspects of the TTS output so that prompts can be tailored to sound more natural to the caller. Do note that SSML tags are ignored by some of the available TTS engines, while some engines use their own non-compliant markup. Check the Text-To-Speech appendices for additional details.
The record element is an input item which records audio from the caller, and stores the resultant audio file in its name variable.
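A hedged sketch of a recording form, to be placed inside a vxml document. The upload URL, the variable name message, and the timing values are assumptions chosen for illustration.

```xml
<form id="voicemail">
  <!-- The recording is stored in the variable named by the name attribute -->
  <record name="message" beep="true" maxtime="30s"
          finalsilence="3s" dtmfterm="true">
    <prompt>Please leave a message after the beep.</prompt>
    <filled>
      <!-- Hypothetical URL; audio is posted as multipart form data -->
      <submit next="http://example.com/upload" method="post"
              enctype="multipart/form-data" namelist="message"/>
    </filled>
  </record>
</form>
```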
The reprompt element, upon execution, will increase the FIA prompt counter by one, and then replay the most recent prompt before listening again for caller input. Note that the reprompt element will have no effect when placed within a filled construct, as the field item variable will already have been filled with a value.
The return element is used to terminate a subdialog and return control to the main dialog.
The rule element defines the named rule expansion of an XML grammar.
The ruleref element is an SRGS addition to VXML 2.0 that allows the developer to specify an existing rule for inclusion within the current grammar.
Sometimes a developer needs to control how the TTS engine interprets text, especially numbers. The say-as element allows you to define whether the text should be interpreted as time, boolean, date, digits, currency, number, or phone. Note that the actual available values may vary, depending on the TTS engine in use.
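A small fragment showing say-as inside a prompt. It assumes the SSML 1.0 attribute name interpret-as and a TTS engine that understands the value digits; as noted above, supported values differ between engines.

```xml
<prompt>
  Your confirmation code is
  <!-- Ask the engine to read "1234" digit by digit rather than as a number -->
  <say-as interpret-as="digits">1234</say-as>.
</prompt>
```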
The script element is used for enclosing ECMAScript code to execute on the client side. Note that unlike the HTML element, the script element in VoiceXML does not specify a type, as it must be ECMAScript only. Also note that either a src or inline content may be specified, but not both.
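A sketch of inline script content, to be placed inside a vxml document. The function addTax and its tax rate are invented for illustration; wrapping the code in CDATA avoids having to escape characters such as < and &.

```xml
<form id="checkout">
  <script>
    <![CDATA[
      // Hypothetical helper: add 8% tax and round to two decimals
      function addTax(amount) {
        return Math.round(amount * 1.08 * 100) / 100;
      }
    ]]>
  </script>
  <block>
    <!-- Functions defined in script can be called from expr attributes -->
    <var name="total" expr="addTax(50)"/>
    <prompt>Your total is <value expr="total"/>.</prompt>
  </block>
</form>
```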
The sentence element (or s element, for those into brevity) is used to format a particular region of TTS to be read out in sentence structure, rather than paragraph structure. Do note that SSML tags are ignored by some of the available TTS engines, while some engines use their own non-compliant markup. Check the Text-To-Speech appendices for additional details.
The text nested within the sub element will be replaced by the value specified within the alias attribute. Do note that SSML tags are ignored by some of the available TTS engines, while some engines use their own non-compliant markup. Check the Text-To-Speech appendices for additional details.
The subdialog element is a method for reusing common dialogs within an independent application context. The VoiceXML content indicated by the src attribute runs in a new execution context with its own state information and declarations; the invoking dialog's variables are not shared, so data is passed in with param and handed back with return.
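A two-document sketch of a subdialog call with param and return. The file names main.vxml and confirm.vxml, the variable names, and the prompts are all hypothetical, and the two documents are shown in one listing only for brevity. It assumes the platform supports the built-in boolean grammar type.

```xml
<!-- main.vxml (hypothetical calling document) -->
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <subdialog name="result" src="confirm.vxml">
      <!-- Passed to the form-level var named question in confirm.vxml -->
      <param name="question" expr="'Do you want to continue?'"/>
      <filled>
        <!-- Returned variables appear as properties of result -->
        <if cond="result.answer">
          <prompt>Continuing.</prompt>
        <else/>
          <prompt>Stopping.</prompt>
        </if>
      </filled>
    </subdialog>
  </form>
</vxml>

<!-- confirm.vxml (hypothetical called document) -->
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="confirm">
    <!-- Must be declared at form level to receive the param -->
    <var name="question"/>
    <field name="answer" type="boolean">
      <prompt><value expr="question"/></prompt>
      <filled>
        <return namelist="answer"/>
      </filled>
    </field>
  </form>
</vxml>
```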
The submit tag transitions control to a new document, via either the GET or POST method. Unlike the goto element, submit allows the developer to submit a space-delimited list of variables to the destination document.
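A sketch of submit with a namelist, to be placed inside a vxml document. The destination URL and the field names are assumptions for illustration, and the built-in number and digits types are assumed to be available on the platform.

```xml
<form id="signup">
  <field name="age" type="number">
    <prompt>How old are you?</prompt>
  </field>
  <field name="zip" type="digits?length=5">
    <prompt>What is your five digit zip code?</prompt>
  </field>
  <!-- Visited after both fields are filled -->
  <block>
    <!-- namelist is a space-delimited list of variables to send -->
    <submit next="http://example.com/signup" method="post"
            namelist="age zip"/>
  </block>
</form>
```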
The tag element is another SRGS addition to VXML 2.0 that allows the developer to specify an arbitrary string that is used for semantic interpretation. Essentially, the tag element defines the value returned from a valid user utterance.
The throw element triggers an event which can be caught and handled by using the catch element. Either platform-defined events, such as 'nomatch', or user-defined events, such as 'MyCoolEvent', may be thrown using this tag.
The token element defines a word or phrase that may be spoken by the caller in an XML grammar. A token may contain whitespace (if properly quoted), and may contain uppercase characters as well.
The transfer element is used to transition the caller to another destination. The destination can be another document, another form, or, more commonly, an outbound call to another phone number. Note: This tutorial requires the use of outbound dialing privileges, which must be provisioned by Voxeo support.
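A hedged sketch of a bridged transfer, to be placed inside a vxml document. The phone number is a placeholder, the timeout is arbitrary, and the outcome values checked are the standard VoiceXML 2.0 transfer results; an actual outbound call still requires the dialing privileges mentioned above.

```xml
<form id="xfer">
  <!-- bridge="true": the interpreter stays on the call and regains
       control when the transfer ends -->
  <transfer name="call" dest="tel:+15555550100"
            bridge="true" connecttimeout="20s">
    <prompt>Please hold while we connect you.</prompt>
    <filled>
      <!-- The transfer variable holds the outcome of the attempt -->
      <if cond="call == 'busy'">
        <prompt>The line is busy.</prompt>
      <elseif cond="call == 'noanswer'"/>
        <prompt>There was no answer.</prompt>
      </if>
    </filled>
  </transfer>
</form>
```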
The value element can be used to insert a variable value for use as a TTS prompt.
The var element is used to declare a VoiceXML variable within the scope specified by its parent element. Additionally, variables declared with the var element are accessible from any JavaScript which resides within the script element, provided that they are within the same scope.
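A short sketch of var, assign, and value working together. The variable name attempts is illustrative; declaring it at the vxml level gives it document scope, so the form below can read and update it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Declared at document scope with an initial value -->
  <var name="attempts" expr="0"/>
  <form id="count">
    <block>
      <!-- assign updates the already-declared variable -->
      <assign name="attempts" expr="attempts + 1"/>
      <!-- value inserts the variable into the TTS prompt -->
      <prompt>This is attempt number <value expr="attempts"/>.</prompt>
    </block>
  </form>
</vxml>
```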
Allows you to control logging, for example turning off logging temporarily when you collect a credit card number. Please remember to use this sparingly, as disabling logging limits Voxeo Support's ability to troubleshoot any application issues.
The <voxeo:recordcall> element is a proprietary extension to the VoiceXML specification that allows the developer to record both sides of a call, capturing the interaction between the human and the application in a WAV file that is stored in the developer's Voxeo File Manager in the 'recordings' subdirectory. On a premise install, these recordings are saved to the voxeo/webapps/www/MRCP/Recordings folder.
The vxml element is the initial declaration that defines a document as a VoiceXML application.