Say 'start listening' to begin capturing your speech, and 'stop listening' to end it.
Recognizing text: