SpeechLion accepts spoken commands and turns them into synthetic keyboard and mouse events on the desktop. Because of this, SpeechLion can in principle perform any action that a user sitting at the keyboard can, without any modification to the applications being controlled.
While SpeechLion's lowest-level keyboard and mouse commands are useful in ad-hoc situations, a usable speech interface needs commands customized for individual applications. For example, to switch to the next window, the user could say "key alt tab" and SpeechLion would generate the alt-tab key sequence, which causes the windowing system to bring the next window to the foreground. However, it is much easier for the user to say "next window" for that action. An even better example is Emacs: saying "key control x-ray, key control foxtrot" to open a file is much more tedious than simply saying "open file".
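The relationship between a customized command and the low-level commands it stands for can be pictured as a lookup table. The sketch below is illustrative only: the `MACROS` table and the `expand` function are hypothetical names, not part of SpeechLion, and the two entries are simply the examples given above.

```python
# Hypothetical macro table: a spoken phrase maps to the low-level
# "key ..." commands it abbreviates. Entries come from the examples above.
MACROS = {
    "next window": ["key alt tab"],
    "open file": ["key control x-ray", "key control foxtrot"],
}


def expand(phrase):
    """Return the low-level commands for a phrase.

    Phrases without a macro entry are assumed to already be low-level
    commands and are passed through unchanged.
    """
    return MACROS.get(phrase, [phrase])


if __name__ == "__main__":
    for spoken in ("next window", "open file", "key alpha"):
        print(spoken, "->", expand(spoken))
```

In SpeechLion itself the customized commands are defined in the grammar files, so this table only models the idea, not the implementation.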
SpeechLion runs from its main directory with the following command:
Place your mouse pointer in your Firefox window and speak the following commands. Be sure to leave a short pause in between to allow them to be recognized and acted upon by your browser:
- "browse google"
- "show help"
- "new tab"
- "browse slashdot"
- "page down"
- "browse yahoo"
- "scroll down"
- "close tab"
Hover your mouse pointer over a link and say "mouse click". Let go of your mouse and say "mouse down". Put your mouse in a text entry box and say "key alpha". Now try "key back space".
The grammar files themselves are the best documentation. They are in JSGF (Java Speech Grammar Format). At any time, you may say "show help" or "display help" and SpeechLion will display a random set of example sentences that are valid in the current mode. It is not an exhaustive list.
There are several modes available. Each mode is entered by saying its name followed by "mode", for example "command mode" or "spelling mode". The exception is off mode, which is entered by saying "microphone off". To return to command mode from off mode, say "microphone on".
- command mode - the workhorse mode; the general commands are available, including browsing commands, but Emacs commands are not.
- emacs mode - similar to command mode, but emacs commands are available instead of browsing commands.
- shell mode - provides some additional commands for use of the Unix shell.
- spelling mode - great for spelling words quickly
- mousing mode - faster multiple mouse commands
- off mode (microphone off) - limited recognition, goes back to command mode when "microphone on" is said
- dictation mode - not implemented yet
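The mode-switching rules above can be summarized as a small state machine. This is a sketch under stated assumptions, not SpeechLion's actual code: the function name `next_mode` is hypothetical, off mode is assumed to ignore everything except "microphone on", and dictation mode is left out because it is not implemented.

```python
# Modes that are entered by saying "<name> mode".
# Dictation mode is omitted because it is not implemented yet.
MODES = {"command", "emacs", "shell", "spelling", "mousing"}


def next_mode(current, utterance):
    """Return the mode that is active after hearing `utterance`."""
    if current == "off":
        # Assumption: in off mode only "microphone on" has any effect.
        return "command" if utterance == "microphone on" else "off"
    if utterance == "microphone off":
        return "off"
    words = utterance.split()
    if len(words) == 2 and words[1] == "mode" and words[0] in MODES:
        return words[0]
    # Anything else is an ordinary command and leaves the mode unchanged.
    return current


if __name__ == "__main__":
    mode = "command"
    for heard in ("spelling mode", "microphone off", "emacs mode", "microphone on"):
        mode = next_mode(mode, heard)
        print(f"{heard!r} -> {mode}")
```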
Displays a representative list of sentences that can be recognized in the current mode.
- show help
- display help
Examples of the browser grammar:
- browse google
- scroll up
- scroll down
- page up
- new tab
- close tab
Examples of the windows grammar:
- next window
- previous window
- minimize window
- close window
Examples of the emacs grammar:
- open file
- save file
- kill word
- kill line
- line up
- line down
- page up
- page down
Examples of the shell grammar:
- long directory
- dot dot
These are examples from the keys grammar:
- key alpha -> a
- key shift alpha -> A
- key up -> up arrow
- key alt tab -> alt-tab
- key control charlie -> ctrl-c
- spell capital bravo oscar x-ray -> Box
For each key command, a single logical keypress is generated which may consist of several modifiers and a normal key. For each spell command, an entire list of alphabetic and whitespace keys may be given, in addition to a "capital" modifier for any of them.
As is apparent, SpeechLion uses the NATO Phonetic Alphabet for specifying alphabet characters. This is much more accurate than trying to recognize the difference between "m" and "n". SpeechLion can also recognize the standard letter names, though this has a price in accuracy since many letters sound similar.
Spelling mode allows the same letters as the "spell" command, but "spell" is not needed. It is convenient for spelling names, etc.
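To make the difference between "key" and "spell" concrete, here is a minimal sketch of how the words following each command could be interpreted. The function names `parse_key` and `spell` are hypothetical, the spoken word "space" for a whitespace key is an assumption, and the real behavior is defined by the JSGF grammar files rather than by code like this.

```python
# Phonetic alphabet words in order, using the spellings shown in the
# examples above ("alpha", "x-ray").
NATO = [
    "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
    "hotel", "india", "juliet", "kilo", "lima", "mike", "november",
    "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
    "victor", "whiskey", "x-ray", "yankee", "zulu",
]
LETTERS = {word: chr(ord("a") + i) for i, word in enumerate(NATO)}
MODIFIERS = {"shift", "control", "alt"}


def parse_key(words):
    """Interpret the words after "key" as one logical keypress.

    Returns (modifiers, key): every word but the last must be a modifier,
    and the last word is a phonetic letter or a named key such as "tab".
    """
    *mods, last = words
    for mod in mods:
        if mod not in MODIFIERS:
            raise ValueError(f"unknown modifier: {mod}")
    return mods, LETTERS.get(last, last)


def spell(words):
    """Interpret the words after "spell" as a string of characters.

    "capital" upper-cases only the next letter. The word "space" for a
    whitespace key is an assumption made for this sketch.
    """
    out = []
    capital = False
    for word in words:
        if word == "capital":
            capital = True
            continue
        ch = " " if word == "space" else LETTERS[word]
        out.append(ch.upper() if capital else ch)
        capital = False
    return "".join(out)


if __name__ == "__main__":
    print(parse_key("control charlie".split()))
    print(spell("capital bravo oscar x-ray".split()))
```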
These are examples from the mouse grammar:
- mouse click -> left button click
- mouse right click -> right button click
- mouse double click -> double click left button
- mouse up -> move pointer up 15 pixels
- mouse north -> move pointer up 15 pixels
- mouse southeast -> move pointer diagonally 15 pixels
- mouse press -> left button press (useful for dragging)
- mouse right release -> right button release (end drag)
- mouse scroll up -> scroll wheel up (away from user)
- mouse scroll down -> scroll wheel down (toward user)
Mousing mode allows a sequence of mouse commands to be given without the "mouse" prefix.
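The pointer-movement commands can be thought of as fixed-size steps in screen coordinates. The sketch below assumes that y grows downward, that a diagonal command moves 15 pixels along each axis, and that direction names other than "up", "north", "down", and "southeast" exist; the source only shows those four. The names `DIRECTIONS` and `move` are hypothetical.

```python
STEP = 15  # pixels moved per command, as in the examples above

# Unit offsets in screen coordinates (x grows right, y grows down).
# Only "up", "north", "down", and "southeast" appear in this document;
# the remaining entries are assumed by analogy.
DIRECTIONS = {
    "up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0),
    "north": (0, -1), "south": (0, 1), "west": (-1, 0), "east": (1, 0),
    "northeast": (1, -1), "northwest": (-1, -1),
    "southeast": (1, 1), "southwest": (-1, 1),
}


def move(pos, command, mousing_mode=False):
    """Return the pointer position after a movement command.

    Outside mousing mode the command must start with "mouse"; in mousing
    mode the prefix is omitted. Unrecognized commands leave `pos` unchanged.
    """
    words = command.split()
    if not mousing_mode:
        if words[:1] != ["mouse"]:
            return pos
        words = words[1:]
    direction = " ".join(words)
    if direction not in DIRECTIONS:
        return pos
    dx, dy = DIRECTIONS[direction]
    return (pos[0] + dx * STEP, pos[1] + dy * STEP)


if __name__ == "__main__":
    print(move((100, 100), "mouse north"))
    print(move((100, 100), "southeast", mousing_mode=True))
```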
Examples of other commands:
- volume up
- volume down
- volume mute
- volume unmute
- ssh server
- ssh theater
Say "repeat n" where n is a number from 2 to 9. Then the next command will be repeated that many times. This is mainly useful for repeated keys or larger mouse movements.
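The effect of "repeat n" on the command that follows it can be sketched as a simple pass over the stream of recognized commands. This is an illustration only: the function name `apply_repeats` is hypothetical, and it assumes the number is spoken as a word ("two" through "nine").

```python
# Spoken numbers accepted after "repeat" (2 to 9, per the text above).
NUMBERS = {
    "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9,
}


def apply_repeats(commands):
    """Expand "repeat n" so that the next command appears n times."""
    out = []
    count = 1
    for command in commands:
        words = command.split()
        if len(words) == 2 and words[0] == "repeat" and words[1] in NUMBERS:
            count = NUMBERS[words[1]]
            continue
        out.extend([command] * count)
        count = 1  # the repeat count applies to one command only
    return out


if __name__ == "__main__":
    print(apply_repeats(["repeat four", "mouse up", "mouse click"]))
```

With the 15-pixel step used by the mouse commands, "repeat four" followed by "mouse up" would move the pointer up 60 pixels.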
TTS Grammar (Acknowledgements)
Say "acknowledge off" to turn off spoken acknowledgements via the FreeTTS text-to-speech synthesizer.
Say "acknowledge on" to turn spoken acknowledgements back on.