USAGE

Overview

SpeechLion recognizes commands spoken by the user and turns them into synthetic keyboard and mouse events on the desktop. In principle, SpeechLion can therefore do anything a user sitting at the keyboard and mouse can do. The application programs being controlled need no modification.

While SpeechLion's lowest-level keyboard and mouse commands are useful for ad-hoc circumstances, a usable speech application needs customized commands for various applications. For example, to switch to the next window, the user could say "key alt tab" and SpeechLion would generate the alt-tab key sequence, which would cause the windowing system to bring the next window to the foreground. However, it is much easier for the user to say "next window" for that action. An even better example is Emacs: saying "key control x-ray, key control foxtrot" to open a file is much more tedious than simply "open file".
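The relationship between customized commands and the low-level "key" commands can be pictured as a simple lookup. The following Python sketch is only an illustration: the ALIASES table and the expand function are hypothetical names, and SpeechLion itself defines its commands in grammar files rather than in code like this.

```python
# Hypothetical alias table: each spoken phrase expands to one or more
# low-level "key ..." commands taken from the examples in the text.
ALIASES = {
    "next window": ["key alt tab"],
    "open file": ["key control x-ray", "key control foxtrot"],
}


def expand(utterance):
    """Return the low-level commands for an utterance.

    Phrases that are not aliases (for example "key alpha") pass through
    unchanged as a one-element list.
    """
    return ALIASES.get(utterance, [utterance])


if __name__ == "__main__":
    for phrase in ("next window", "open file", "key alpha"):
        print(phrase, "->", expand(phrase))
```

The point of the design is that the user speaks the short phrase while the desktop still receives the same key sequences it would get from the longer form.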

Quick Start

SpeechLion runs from its main directory with the following command:

./speechlion

Place your mouse pointer in your Firefox window and speak the following commands. Leave a short pause between commands so that each one can be recognized and acted upon by your browser:

  • "browse google"
  • "show help"
  • "new tab"
  • "browse slashdot"
  • "page down"
  • "browse yahoo"
  • "scroll down"
  • "back"
  • "forward"
  • "close tab"

Hover your mouse pointer over a link and say "mouse click". Let go of your mouse and say "mouse down" to move the pointer by voice. Put the pointer in a text entry box and say "key alpha". Now try "key back space".

Cheat Sheet

The grammar files themselves are the best documentation. They are written in JSGF (Java Speech Grammar Format). At any time, you may say "show help" or "display help" and SpeechLion will display a random set of example sentences that are valid in the current mode. The list is not exhaustive.

Modes

There are several modes available. Each mode is entered by saying its name followed by the word "mode", for example "command mode" or "spelling mode". The exception is off mode, which is entered by saying "microphone off". To return to command mode from off mode, say "microphone on".

  • command mode - the workhorse mode; all commands are available, including browsing commands, except Emacs commands
  • emacs mode - similar to command mode, but Emacs commands are available instead of browsing commands
  • shell mode - provides some additional commands for using the Unix shell
  • spelling mode - great for spelling words quickly
  • mousing mode - faster entry of multiple mouse commands
  • off mode (microphone off) - limited recognition; returns to command mode when "microphone on" is said
  • dictation mode - not implemented yet
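The mode switching described above behaves like a small state machine. The Python sketch below is a hedged illustration, not SpeechLion's code: the next_mode function and the MODES set are hypothetical, and it assumes that off mode ignores everything except "microphone on".

```python
# Modes that can be entered by saying "<name> mode".
# Dictation mode is left out because it is not implemented yet.
MODES = {"command", "emacs", "shell", "spelling", "mousing"}


def next_mode(current, utterance):
    """Return the mode that is active after hearing `utterance`."""
    if current == "off":
        # Assumption: in off mode only "microphone on" has any effect.
        return "command" if utterance == "microphone on" else "off"
    if utterance == "microphone off":
        return "off"
    words = utterance.split()
    if len(words) == 2 and words[1] == "mode" and words[0] in MODES:
        return words[0]
    # Any other utterance is an ordinary command and leaves the mode alone.
    return current


if __name__ == "__main__":
    mode = "command"
    for heard in ("spelling mode", "microphone off", "emacs mode", "microphone on"):
        mode = next_mode(mode, heard)
        print(heard, "->", mode)
```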

Help Grammar

Displays a representative list of sentences that can be recognized in the current mode.

  • show help
  • display help

Browser Grammar

Examples of the browser grammar:

  • browse google
  • forward
  • back
  • reload
  • scroll up
  • scroll down
  • page up
  • new tab
  • close tab
  • bookmark

Windows Grammar

Examples of the windows grammar:

  • next window
  • previous window
  • minimize window
  • close window

Emacs Grammar

Examples of the emacs grammar:

  • open file
  • save file
  • kill word
  • kill line
  • line up
  • line down
  • page up
  • page down
  • undo
  • cancel

Shell Grammar

Examples of the shell grammar:

  • directory
  • long directory
  • cd
  • vi
  • slash
  • dot
  • dot dot
  • home
  • usr
  • bin

Keys Grammar

These are examples from the keys grammar:

  • key alpha -> a
  • key shift alpha -> A
  • key up -> up arrow
  • key alt tab -> alt-tab
  • key control charlie -> ctrl-c
  • spell capital bravo oscar x-ray -> Box

Each key command generates a single logical keypress, which may consist of several modifiers and one normal key. Each spell command accepts a whole sequence of alphabetic and whitespace keys, any of which may be preceded by the "capital" modifier.
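To make the idea of a single logical keypress concrete, the Python sketch below splits a "key" command into its modifiers and its normal key. It is an illustration under assumptions: parse_key, MODIFIERS, and LETTERS are hypothetical names, only the modifiers that appear in the examples above are listed, and the letter table is deliberately partial.

```python
# Modifiers seen in the examples above; the real grammar may define more.
MODIFIERS = {"shift", "control", "alt"}

# Partial, illustrative table of phonetic letter names.
LETTERS = {"alpha": "a", "bravo": "b", "charlie": "c", "foxtrot": "f", "x-ray": "x"}


def parse_key(command):
    """Split a 'key ...' command into (modifiers, key name)."""
    words = command.split()
    if len(words) < 2 or words[0] != "key":
        raise ValueError("not a key command: %r" % command)
    rest = words[1:]
    modifiers = []
    # Leading modifier words are peeled off; at least one word must remain
    # as the key itself.
    while len(rest) > 1 and rest[0] in MODIFIERS:
        modifiers.append(rest.pop(0))
    # Whatever is left names the key, which may be more than one word
    # (for example "back space").
    name = " ".join(rest)
    return modifiers, LETTERS.get(name, name)


if __name__ == "__main__":
    for cmd in ("key alpha", "key shift alpha", "key alt tab", "key control charlie"):
        print(cmd, "->", parse_key(cmd))
```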

As the examples show, SpeechLion uses the NATO phonetic alphabet to specify letters. This is much more accurate than trying to recognize the difference between similar-sounding letter names such as "m" and "n". SpeechLion can also recognize the standard letter names, but at a cost in accuracy, since many letters sound alike.

Spelling mode accepts the same letters as the "spell" command, but the "spell" prefix is not needed. It is convenient for spelling names and similar words.
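The way a spelled sequence turns into text can be sketched in a few lines of Python. This is a hedged illustration: spell_text is a hypothetical function, the exact word tokens (such as "juliet" or "space" for a whitespace key) are assumptions, and the spelling_mode flag simply models dropping the "spell" prefix.

```python
# NATO phonetic alphabet words; each maps to its first letter.
# The exact spellings used by the recognizer are an assumption.
WORDS = ("alpha bravo charlie delta echo foxtrot golf hotel india juliet "
         "kilo lima mike november oscar papa quebec romeo sierra tango "
         "uniform victor whiskey x-ray yankee zulu").split()
NATO = {word: word[0] for word in WORDS}


def spell_text(utterance, spelling_mode=False):
    """Convert a spelled utterance into the text it would type."""
    words = utterance.split()
    if not spelling_mode:
        # Outside spelling mode the "spell" prefix is required.
        if not words or words[0] != "spell":
            raise ValueError("expected a 'spell' command: %r" % utterance)
        words = words[1:]
    out = []
    capital = False
    for word in words:
        if word == "capital":
            # "capital" applies only to the next key.
            capital = True
            continue
        char = " " if word == "space" else NATO[word]
        out.append(char.upper() if capital else char)
        capital = False
    return "".join(out)


if __name__ == "__main__":
    print(spell_text("spell capital bravo oscar x-ray"))
    print(spell_text("capital bravo oscar x-ray", spelling_mode=True))
```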

Mouse Grammar

These are examples from the mouse grammar:

  • mouse click -> left button click
  • mouse right click -> right button click
  • mouse double click -> double click left button
  • mouse up -> move pointer up 15 pixels
  • mouse north -> move pointer up 15 pixels
  • mouse southeast -> move pointer diagonally 15 pixels
  • mouse press -> left button press (begin drag)
  • mouse right release -> right button release (end drag)
  • mouse scroll up -> scroll wheel up (away from user)
  • mouse scroll down -> scroll wheel down (toward user)

Mousing mode allows a sequence of mouse commands to be given without the "mouse" prefix.
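The pointer movement commands can be modeled as fixed steps in screen coordinates. The Python sketch below is illustrative only: move and run_mousing are hypothetical names, direction names other than "up", "down", "north", and "southeast" are assumed, y is assumed to grow downward, and a diagonal move is assumed to shift the pointer 15 pixels on each axis.

```python
STEP = 15  # pixels per movement command, as listed above

# Unit vectors in screen coordinates (x grows right, y grows down).
VECTORS = {
    "north": (0, -1), "south": (0, 1), "east": (1, 0), "west": (-1, 0),
    "northeast": (1, -1), "northwest": (-1, -1),
    "southeast": (1, 1), "southwest": (-1, 1),
    "up": (0, -1), "down": (0, 1),
}


def move(position, direction):
    """Return the pointer position after one movement command."""
    dx, dy = VECTORS[direction]
    x, y = position
    return (x + dx * STEP, y + dy * STEP)


def run_mousing(position, commands):
    """Apply a sequence of movement words given without the "mouse" prefix."""
    for command in commands:
        position = move(position, command)
    return position


if __name__ == "__main__":
    print(move((100, 100), "north"))
    print(run_mousing((0, 0), ["down", "down", "east"]))
```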

Volume Grammar

  • volume up
  • volume down
  • volume mute
  • volume unmute

SSH Grammar

  • ssh server
  • ssh theater

Repeat Grammar

Say "repeat n", where n is a number from 2 to 9, and the next command will be repeated that many times. This is mainly useful for repeated keys or larger mouse movements.
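The effect of "repeat n" on the command that follows it can be sketched as below. This Python example is an assumption-laden illustration: expand_repeats is a hypothetical function, and it assumes the number is heard as a word such as "three".

```python
# Spoken numbers accepted after "repeat" (2 through 9).
NUMBERS = {"two": 2, "three": 3, "four": 4, "five": 5,
           "six": 6, "seven": 7, "eight": 8, "nine": 9}


def expand_repeats(utterances):
    """Return the commands to execute, applying each 'repeat n' to the next one."""
    out = []
    count = 1
    for utterance in utterances:
        words = utterance.split()
        if len(words) == 2 and words[0] == "repeat" and words[1] in NUMBERS:
            # Remember the count; it applies only to the following command.
            count = NUMBERS[words[1]]
            continue
        out.extend([utterance] * count)
        count = 1
    return out


if __name__ == "__main__":
    print(expand_repeats(["repeat three", "mouse up", "key alpha"]))
```

With the 15 pixel step from the mouse grammar, "repeat three" followed by "mouse up" would move the pointer 45 pixels.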

TTS Grammar (Acknowledgements)

Say "acknowledge off" to turn off spoken acknowledgements via the FreeTTS text-to-speech synthesizer.

Say "acknowledge on" to turn spoken acknowledgements back on.