Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | No Continuous Integration |
Released | RELEASED |
Tags | No category tags. |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
ros_speech_recognition
A ROS package for speech-to-text services.
This package uses the Python package SpeechRecognition as a backend.
Tutorials
Normal tutorial
- Install this package and SpeechRecognition
sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
- Launch the speech recognition node
roslaunch ros_speech_recognition speech_recognition.launch
- Echo /speech_to_text
rostopic echo /speech_to_text
# the recognition results are printed here
Parrotry tutorial
"Parrotry" refers to オウム返し, Japanese for repeating back what was just said, like a parrot.
# English
roslaunch ros_speech_recognition parrotry.launch
# Japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
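As a rough illustration of the decision a parrot-style node has to make, the sketch below picks the most confident transcript from a recognition result and repeats it only when the confidence clears a threshold. This is a minimal pure-Python sketch, not the package's implementation: `Candidates` is a hypothetical stand-in that assumes the result carries parallel `transcript` and `confidence` lists, `phrase_to_repeat` is an illustrative name, and the 0.8 default is taken from the `confidence_threshold` launch argument of parrotry.launch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Candidates:
    """Hypothetical stand-in for a recognition result message.

    Assumes parallel lists: confidence[i] belongs to transcript[i].
    """
    transcript: List[str] = field(default_factory=list)
    confidence: List[float] = field(default_factory=list)


def phrase_to_repeat(msg: Candidates, threshold: float = 0.8) -> Optional[str]:
    """Return the most confident transcript, or None if nothing qualifies."""
    if not msg.transcript:
        return None  # empty result: nothing to repeat
    # Transcripts without a reported confidence are treated as 0.0.
    missing = len(msg.transcript) - len(msg.confidence)
    scores = list(msg.confidence) + [0.0] * missing
    best = max(range(len(msg.transcript)), key=lambda i: scores[i])
    if scores[best] < threshold:
        return None  # too uncertain to echo back
    return msg.transcript[best]


if __name__ == "__main__":
    heard = Candidates(
        transcript=["hello robot", "yellow robot"],
        confidence=[0.93, 0.41],
    )
    print(phrase_to_repeat(heard))
```

Returning `None` instead of an empty string keeps "nothing worth repeating" distinct from a recognized but empty phrase, so the caller can simply skip speaking.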
speech_recognition_node.py
Interface
Publishing Topics
- ~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)
  Speech recognition candidates topic name. The topic name is set by the parameter ~voice_topic, and the default value is speech_to_text.
- sound_play (sound_play/SoundRequestAction)
  Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.
Subscribing Topics
- ~audio_topic (audio_common_msgs/AudioData)
  Audio stream data to be recognized. The topic name is set by the parameter ~audio_topic, and the default value is audio.
Advertising Services
- speech_recognition (speech_recognition_msgs/SpeechRecognition)
  Service for speech recognition.
- speech_recognition/start (std_srvs/Empty)
  Start service for speech recognition. This service is available when the parameter ~continuous is True.
- speech_recognition/stop (std_srvs/Empty)
  Stop service for speech recognition. This service is available when the parameter ~continuous is True.
Parameters
- ~voice_topic (String, default: speech_to_text)
  Name of the voice topic to publish.
- ~audio_topic (String, default: audio)
  Name of the audio topic to subscribe to.
- ~enable_sound_effect (Bool, default: True)
  Flag to enable or disable the sound played on recognition.
- ~language (String, default: en-US)
  Language to be recognized.
- ~engine (Enum[String], default: Google)
  Speech-to-text engine (to see the full list of options, use dynamic_reconfigure).
File truncated at 100 lines see the full file
Changelog for package ros_speech_recognition
2.1.31 (2025-05-13)
2.1.30 (2025-05-10)
2.1.29 (2025-01-05)
- [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
- Contributors: Yukina Iwata
2.1.28 (2023-07-24)
2.1.27 (2023-06-24)
- fix package.xml/CMakeLists.txt to suppress catkin_lint errors (#479)
- Contributors: Kei Okada
2.1.26 (2023-06-14)
- add LICENSE files (#476)
- Contributors: Kei Okada
2.1.25 (2023-06-08)
- [ros_speech_recognition] Add vosk engine (#474)
- Pr/use sound themes freedesktop (#472)
- add test to check if ros node is loadable (#463)
- add self.conf_thresh in __init__ function (#457)
- [ros_speech_recognition] add ubuntu-sounds dependency (#453)
- [ros_speech_recognition] Return if result is empty (#443)
- [ros_speech_recognition] Set confidence value of google (#434)
- [ros_speech_recognition] add parrotry.launch (#414)
- [ros_speech_recognition] update default arg for speech_recognition.launch (#412)
- [ros_speech_recognition, respeaker_ros] add confidence field (#411)
- [ros_speech_recognition] add self cancellation for speech recognition (#413)
- [#405 and #410] Fix CI (#415)
- add ROS interface for https://cloud.google.com/natural-language (#304)
- GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
- Explicit python interpreter in catkin_virtualenv (#367)
- .github/workflow: integrate all yaml to one (#338)
- [ros_speech_recognition] Fixed the behavior of launch file (#336)
- [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
- [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
- Enable sound play flag (#315)
- Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura
2.1.24 (2021-07-26)
2.1.23 (2021-07-21)
2.1.22 (2021-06-10)
- enable to change topic name from speech_recognition.launch (#254)
- support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to the args of speech_recognition.launch. Use rosparam for device, not param, because 'device: ' causes the error "load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled"
- more exception message for self.recognize
- Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
- Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa
File truncated at 100 lines see the full file
Wiki Tutorials
Package Dependencies
- catkin_virtualenv
- dynamic_reconfigure
- jsk_data
- speech_recognition_msgs
- catkin
- audio_capture
- audio_common_msgs
- sound_play
- rostest
- roslaunch
System Dependencies
Dependant Packages
- jsk_3rdparty
Launch files
- launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak.
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone.
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone.
  - voice_topic [default: /speech_to_text]: Name of the text topic of recognized speech.
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
  - sample_rate [default: 16000]: Sample rate of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card number and device number with 'arecord -l', then use hw:[card number],[device number].
  - engine [default: Google]: Speech-to-text engine. Options: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM.
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect when continuous is true.
  - self_cancellation [default: true]: Whether to skip recognizing audio while the robot is speaking.
  - tts_tolerance [default: 1.0]: Tolerance in seconds for judging whether the robot is speaking.
  - tts_action_names [default: ['sound_play']]: TTS action names. The output of these servers is ignored by speech recognition.
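Launch arguments such as these are passed on the command line in `name:=value` form, as in the `language:=ja-JP` example in the tutorial. The sketch below shows one way to assemble such a command from a dictionary of overrides. It is only an illustration under stated assumptions: `format_arg` and `roslaunch_command` are hypothetical helper names, not part of this package, and the code builds the argument list without running anything.

```python
import shlex
from typing import Dict, List, Union

ArgValue = Union[bool, int, float, str]


def format_arg(name: str, value: ArgValue) -> str:
    """Render one launch argument in roslaunch's name:=value form."""
    if isinstance(value, bool):
        # The launch file defaults spell booleans in lowercase (true/false).
        value = "true" if value else "false"
    return f"{name}:={value}"


def roslaunch_command(package: str, launch_file: str,
                      args: Dict[str, ArgValue]) -> List[str]:
    """Build the argv list for a roslaunch call; nothing is executed here."""
    overrides = [format_arg(name, value) for name, value in args.items()]
    return ["roslaunch", package, launch_file] + overrides


if __name__ == "__main__":
    cmd = roslaunch_command(
        "ros_speech_recognition",
        "speech_recognition.launch",
        {"language": "ja-JP", "continuous": False, "device": "hw:0,0"},
    )
    # Print a copy-pasteable shell line.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the command as a list rather than a single string means it can be handed to `subprocess.run` without shell quoting concerns, should you choose to execute it.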
Messages
Services
Plugins
Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | No Continuous Integration |
Released | RELEASED |
Tags | No category tags. |
Contributing |
Help Wanted (0)
Good First Issues (0) Pull Requests to Review (0) |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
ros_speech_recognition
A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.
Tutorials
Normal tutorial
- Install this package and SpeechReconition
sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
- Launch speech recognition node
roslaunch ros_speech_recognition speech_recognition.launch
- Echo
/speech_to_text
rostopic echo /speech_to_text
# you can get the recognition result
Parrotry tutorial
Parrotry mean オウム返し in Japanese
# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
speech_recognition_node.py
Interface
Publishing Topics
-
~voice_topic
(speech_recognition_msgs/SpeechRecognitionCandidates
)Speech recognition candidates topic name.
Topic name is set by parameter
~voice_topic
, and default value isspeech_to_text
. -
sound_play
(sound_play/SoundRequestAction
)Action client to play sound on events. If the action server is not available or
~enable_sound_effect
isFalse
, no sound is played.
Subscribing Topics
-
~audio_topic
(audio_common_msgs/AudioData
)Audio stream data to be recognized.
Topis name is set by parameter
~audio_topic
and default value isaudio
.
Advertising Services
-
speech_recognition
(speech_recognition_msgs/SpeechRecognition
)Service for speech recognition
-
speech_recognition/start
(std_srvs/Empty
)Start service for speech recognition
This service is available when parameter
~contiunous
isTrue
. -
speech_recognition/start
(std_srvs/Empty
)Stop service for speech recognition
This service is available when parameter
~contiunous
isTrue
.
Parameters
-
~voice_topic
(String
, default:speech_to_text
)Publishing voice topic name
-
~audio_topic
(String
, default:audio
)Subscribing audio topic name
-
~enable_sound_effect
(Bool
, default:True
)Flag to enable or disable sound to play sound on recognition.
-
~language
(String
, default:en-US
)Language to be recognized
-
~engine
(Enum[String]
, default:Google
)Speech-to-text engine (To see full options use
dynamic_reconfigure
)
File truncated at 100 lines see the full file
Changelog for package ros_speech_recognition
2.1.31 (2025-05-13)
2.1.30 (2025-05-10)
2.1.29 (2025-01-05)
- [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
- Contributors: Yukina Iwata
2.1.28 (2023-07-24)
2.1.27 (2023-06-24)
- fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
- Contributors: Kei Okada
2.1.26 (2023-06-14)
- add LICENSE files (#476)
- Contributors: Kei Okada
2.1.25 (2023-06-08)
- [ros_speech_recognition] Add vosk engine (#474)
- Pr/use sound themes freedesktop (#472)
- add test to check if ros node is loadable (#463)
- add self.conf_thresh in __init_ function (#457)
- [ros_speech_recognition] add ubuntu-sounds dependency (#453)
- [ros_speech_recognition] Return if result is empty (#443)
- [ros_speece_recognition] Set confidence value of google (#434)
- [ros_speech_recognition] add parrotry.launch (#414)
- [ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
- [ros_speech_recogniton, respeaker_ros] add confidence field (#411)
- [ros_speech_recognition] add self cancellation for speech recogntion (#413)
- [#405 and #410] Fix CI (#415)
- add ROS interface for https://cloud.google.com/natural-language (#304)
- GithubAction: add test for aarch64(melodic) / indigo (arm64)
(#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
- Explicit python interpreter in catkin_virtualenv (#367)
- .github/workflow: integrate all yaml to one (#338)
- [ros_speech_recognition] Fixed the behavior of launch file (#336)
- [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
- [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
- Enable sound play flag (#315)
- Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura
2.1.24 (2021-07-26)
2.1.23 (2021-07-21)
2.1.22 (2021-06-10)
- enable to change topic name from speech_recognition.launch (#254)
- support SpeakerDiarization, see
https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative
(#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
- Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
- Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa
File truncated at 100 lines see the full file
Wiki Tutorials
Package Dependencies
Deps | Name |
---|---|
catkin_virtualenv | |
dynamic_reconfigure | |
jsk_data | |
speech_recognition_msgs | |
catkin | |
audio_capture | |
audio_common_msgs | |
sound_play | |
rostest | |
roslaunch |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
jsk_3rdparty |
Launch files
- launch/parrotry.launch
-
- use_google [default: true]
- language [default: en-US]
- confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
-
- launch_sound_play [default: true] — Launch sound_play node to speak
- launch_audio_capture [default: true] — Launch audio_capture node to publish audio topic from microphone
- audio_topic [default: /audio] — Name of audio topic captured from microphone
- voice_topic [default: /speech_to_text] — Name of text topic of recognized speech
- n_channel [default: 1] — Number of channels of audio topic and microphone. '$ pactl list short sinks' to check your hardware
- depth [default: 16] — Bit depth of audio topic and microphone. '$ pactl list short sinks' to check your hardware
- sample_rate [default: 16000] — Frame rate of audio topic and microphone. '$ pactl list short sinks' to check your hardware
- device [default: ] — Card and device number of microphone (e.g. hw:0,0). you can check card number and device number by '$ arecord -l', then uses hw:[card number],[device number]
- engine [default: Google] — Speech to text engine. TTS engine, Google, GoogleCloud, Sphinx, Wit, Bing Houndify, IBM
- language [default: en-US] — Speech to text language. For Japanese, set ja-JP.
- continuous [default: true] — If false, /speech_recognition service is published. If true, /speech_to_text topic is published.
- auto_start [default: true] — Whether speech_recognition starts automatically or not. This parameter works when continuous is true
- self_cancellation [default: true] — Do not recognize the audio when robot is speaking or not.
- tts_tolerance [default: 1.0] — Tolerance second for recognizing whether robot is speaking or not
- tts_action_names [default: ['sound_play']] — tts action name. these servers outputs are ignored by sound_recognition
Messages
Services
Plugins
Recent questions tagged ros_speech_recognition at Robotics Stack Exchange
Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | No Continuous Integration |
Released | RELEASED |
Tags | No category tags. |
Contributing |
Help Wanted (0)
Good First Issues (0) Pull Requests to Review (0) |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
ros_speech_recognition
A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.
Tutorials
Normal tutorial
- Install this package and SpeechReconition
sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
- Launch speech recognition node
roslaunch ros_speech_recognition speech_recognition.launch
- Echo
/speech_to_text
rostopic echo /speech_to_text
# you can get the recognition result
Parrotry tutorial
Parrotry mean オウム返し in Japanese
# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
speech_recognition_node.py
Interface
Publishing Topics
-
~voice_topic
(speech_recognition_msgs/SpeechRecognitionCandidates
)Speech recognition candidates topic name.
Topic name is set by parameter
~voice_topic
, and default value isspeech_to_text
. -
sound_play
(sound_play/SoundRequestAction
)Action client to play sound on events. If the action server is not available or
~enable_sound_effect
isFalse
, no sound is played.
Subscribing Topics
-
~audio_topic
(audio_common_msgs/AudioData
)Audio stream data to be recognized.
Topis name is set by parameter
~audio_topic
and default value isaudio
.
Advertising Services
-
speech_recognition
(speech_recognition_msgs/SpeechRecognition
)Service for speech recognition
-
speech_recognition/start
(std_srvs/Empty
)Start service for speech recognition
This service is available when parameter
~contiunous
isTrue
. -
speech_recognition/start
(std_srvs/Empty
)Stop service for speech recognition
This service is available when parameter
~contiunous
isTrue
.
Parameters
-
~voice_topic
(String
, default:speech_to_text
)Publishing voice topic name
-
~audio_topic
(String
, default:audio
)Subscribing audio topic name
-
~enable_sound_effect
(Bool
, default:True
)Flag to enable or disable sound to play sound on recognition.
-
~language
(String
, default:en-US
)Language to be recognized
-
~engine
(Enum[String]
, default:Google
)Speech-to-text engine (To see full options use
dynamic_reconfigure
)
File truncated at 100 lines see the full file
Changelog for package ros_speech_recognition
2.1.31 (2025-05-13)
2.1.30 (2025-05-10)
2.1.29 (2025-01-05)
- [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
- Contributors: Yukina Iwata
2.1.28 (2023-07-24)
2.1.27 (2023-06-24)
- fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
- Contributors: Kei Okada
2.1.26 (2023-06-14)
- add LICENSE files (#476)
- Contributors: Kei Okada
2.1.25 (2023-06-08)
- [ros_speech_recognition] Add vosk engine (#474)
- Pr/use sound themes freedesktop (#472)
- add test to check if ros node is loadable (#463)
- add self.conf_thresh in __init_ function (#457)
- [ros_speech_recognition] add ubuntu-sounds dependency (#453)
- [ros_speech_recognition] Return if result is empty (#443)
- [ros_speece_recognition] Set confidence value of google (#434)
- [ros_speech_recognition] add parrotry.launch (#414)
- [ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
- [ros_speech_recogniton, respeaker_ros] add confidence field (#411)
- [ros_speech_recognition] add self cancellation for speech recogntion (#413)
- [#405 and #410] Fix CI (#415)
- add ROS interface for https://cloud.google.com/natural-language (#304)
- GithubAction: add test for aarch64(melodic) / indigo (arm64)
(#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
- Explicit python interpreter in catkin_virtualenv (#367)
- .github/workflow: integrate all yaml to one (#338)
- [ros_speech_recognition] Fixed the behavior of launch file (#336)
- [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
- [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
- Enable sound play flag (#315)
- Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura
2.1.24 (2021-07-26)
2.1.23 (2021-07-21)
2.1.22 (2021-06-10)
- enable to change topic name from speech_recognition.launch (#254)
- support SpeakerDiarization, see
https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative
(#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
- Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
- Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa
File truncated at 100 lines see the full file
Wiki Tutorials
Package Dependencies
Deps | Name |
---|---|
catkin_virtualenv | |
dynamic_reconfigure | |
jsk_data | |
speech_recognition_msgs | |
catkin | |
audio_capture | |
audio_common_msgs | |
sound_play | |
rostest | |
roslaunch |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
jsk_3rdparty |
Launch files
- launch/parrotry.launch
-
- use_google [default: true]
- language [default: en-US]
- confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
-
- launch_sound_play [default: true] — Launch sound_play node to speak
- launch_audio_capture [default: true] — Launch audio_capture node to publish audio topic from microphone
- audio_topic [default: /audio] — Name of audio topic captured from microphone
- voice_topic [default: /speech_to_text] — Name of text topic of recognized speech
- n_channel [default: 1] — Number of channels of audio topic and microphone. '$ pactl list short sinks' to check your hardware
- depth [default: 16] — Bit depth of audio topic and microphone. '$ pactl list short sinks' to check your hardware
- sample_rate [default: 16000] — Frame rate of audio topic and microphone. '$ pactl list short sinks' to check your hardware
- device [default: ] — Card and device number of microphone (e.g. hw:0,0). you can check card number and device number by '$ arecord -l', then uses hw:[card number],[device number]
- engine [default: Google] — Speech to text engine. TTS engine, Google, GoogleCloud, Sphinx, Wit, Bing Houndify, IBM
- language [default: en-US] — Speech to text language. For Japanese, set ja-JP.
- continuous [default: true] — If false, /speech_recognition service is published. If true, /speech_to_text topic is published.
- auto_start [default: true] — Whether speech_recognition starts automatically or not. This parameter works when continuous is true
- self_cancellation [default: true] — Do not recognize the audio when robot is speaking or not.
- tts_tolerance [default: 1.0] — Tolerance second for recognizing whether robot is speaking or not
- tts_action_names [default: ['sound_play']] — tts action name. these servers outputs are ignored by sound_recognition
Messages
Services
Plugins
Recent questions tagged ros_speech_recognition at Robotics Stack Exchange
Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | No Continuous Integration |
Released | RELEASED |
Tags | No category tags. |
Contributing |
Help Wanted (0)
Good First Issues (0) Pull Requests to Review (0) |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
ros_speech_recognition
A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.
Tutorials
Normal tutorial
- Install this package and SpeechReconition
sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
- Launch speech recognition node
roslaunch ros_speech_recognition speech_recognition.launch
- Echo
/speech_to_text
rostopic echo /speech_to_text
# you can get the recognition result
Parrotry tutorial
Parrotry mean オウム返し in Japanese
# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
speech_recognition_node.py
Interface
Publishing Topics
-
~voice_topic
(speech_recognition_msgs/SpeechRecognitionCandidates
)Speech recognition candidates topic name.
Topic name is set by parameter
~voice_topic
, and default value isspeech_to_text
. -
sound_play
(sound_play/SoundRequestAction
)Action client to play sound on events. If the action server is not available or
~enable_sound_effect
isFalse
, no sound is played.
Subscribing Topics
-
~audio_topic
(audio_common_msgs/AudioData
)Audio stream data to be recognized.
Topis name is set by parameter
~audio_topic
and default value isaudio
.
Advertising Services
-
speech_recognition
(speech_recognition_msgs/SpeechRecognition
)Service for speech recognition
-
speech_recognition/start
(std_srvs/Empty
)Start service for speech recognition
This service is available when parameter
~contiunous
isTrue
. -
speech_recognition/start
(std_srvs/Empty
)Stop service for speech recognition
This service is available when parameter
~contiunous
isTrue
.
Parameters
-
~voice_topic
(String
, default:speech_to_text
)Publishing voice topic name
-
~audio_topic
(String
, default:audio
)Subscribing audio topic name
-
~enable_sound_effect
(Bool
, default:True
)Flag to enable or disable sound to play sound on recognition.
-
~language
(String
, default:en-US
)Language to be recognized
-
~engine
(Enum[String]
, default:Google
)Speech-to-text engine (To see full options use
dynamic_reconfigure
)
File truncated at 100 lines see the full file
Changelog for package ros_speech_recognition
2.1.31 (2025-05-13)
2.1.30 (2025-05-10)
2.1.29 (2025-01-05)
- [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
- Contributors: Yukina Iwata
2.1.28 (2023-07-24)
2.1.27 (2023-06-24)
- fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
- Contributors: Kei Okada
2.1.26 (2023-06-14)
- add LICENSE files (#476)
- Contributors: Kei Okada
2.1.25 (2023-06-08)
- [ros_speech_recognition] Add vosk engine (#474)
- Pr/use sound themes freedesktop (#472)
- add test to check if ros node is loadable (#463)
- add self.conf_thresh in __init_ function (#457)
- [ros_speech_recognition] add ubuntu-sounds dependency (#453)
- [ros_speech_recognition] Return if result is empty (#443)
- [ros_speece_recognition] Set confidence value of google (#434)
- [ros_speech_recognition] add parrotry.launch (#414)
- [ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
- [ros_speech_recogniton, respeaker_ros] add confidence field (#411)
- [ros_speech_recognition] add self cancellation for speech recogntion (#413)
- [#405 and #410] Fix CI (#415)
- add ROS interface for https://cloud.google.com/natural-language (#304)
- GithubAction: add test for aarch64(melodic) / indigo (arm64)
(#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
- Explicit python interpreter in catkin_virtualenv (#367)
- .github/workflow: integrate all yaml to one (#338)
- [ros_speech_recognition] Fixed the behavior of launch file (#336)
- [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
- [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
- Enable sound play flag (#315)
- Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura
2.1.24 (2021-07-26)
2.1.23 (2021-07-21)
2.1.22 (2021-06-10)
- enable to change topic name from speech_recognition.launch (#254)
- support SpeakerDiarization, see
https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative
(#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
- Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
- Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa
File truncated at 100 lines see the full file
Wiki Tutorials
Package Dependencies
Deps | Name |
---|---|
catkin_virtualenv | |
dynamic_reconfigure | |
jsk_data | |
speech_recognition_msgs | |
catkin | |
audio_capture | |
audio_common_msgs | |
sound_play | |
rostest | |
roslaunch |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
jsk_3rdparty |
Launch files
- launch/parrotry.launch
-
- use_google [default: true]
- language [default: en-US]
- confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
-
- launch_sound_play [default: true] — Launch sound_play node to speak
- launch_audio_capture [default: true] — Launch audio_capture node to publish audio topic from microphone
- audio_topic [default: /audio] — Name of audio topic captured from microphone
- voice_topic [default: /speech_to_text] — Name of text topic of recognized speech
- n_channel [default: 1] — Number of channels of audio topic and microphone. '$ pactl list short sinks' to check your hardware
- depth [default: 16] — Bit depth of audio topic and microphone. '$ pactl list short sinks' to check your hardware
- sample_rate [default: 16000] — Frame rate of audio topic and microphone. '$ pactl list short sinks' to check your hardware
- device [default: ] — Card and device number of microphone (e.g. hw:0,0). you can check card number and device number by '$ arecord -l', then uses hw:[card number],[device number]
- engine [default: Google] — Speech to text engine. TTS engine, Google, GoogleCloud, Sphinx, Wit, Bing Houndify, IBM
- language [default: en-US] — Speech to text language. For Japanese, set ja-JP.
- continuous [default: true] — If false, /speech_recognition service is published. If true, /speech_to_text topic is published.
- auto_start [default: true] — Whether speech_recognition starts automatically or not. This parameter works when continuous is true
- self_cancellation [default: true] — Do not recognize the audio when robot is speaking or not.
- tts_tolerance [default: 1.0] — Tolerance second for recognizing whether robot is speaking or not
- tts_action_names [default: ['sound_play']] — tts action name. these servers outputs are ignored by sound_recognition
Messages
Services
Plugins
Recent questions tagged ros_speech_recognition at Robotics Stack Exchange
Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | No Continuous Integration |
Released | RELEASED |
Tags | No category tags. |
Contributing |
Help Wanted (0)
Good First Issues (0) Pull Requests to Review (0) |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
ros_speech_recognition
A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.
Tutorials
Normal tutorial
- Install this package and SpeechReconition
sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
- Launch speech recognition node
roslaunch ros_speech_recognition speech_recognition.launch
- Echo
/speech_to_text
rostopic echo /speech_to_text
# you can get the recognition result
Parrotry tutorial
Parrotry mean オウム返し in Japanese
# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
speech_recognition_node.py
Interface
Publishing Topics
-
~voice_topic
(speech_recognition_msgs/SpeechRecognitionCandidates
)Speech recognition candidates topic name.
Topic name is set by parameter
~voice_topic
, and default value isspeech_to_text
. -
sound_play
(sound_play/SoundRequestAction
)Action client to play sound on events. If the action server is not available or
~enable_sound_effect
isFalse
, no sound is played.
Subscribing Topics
- ~audio_topic (audio_common_msgs/AudioData)
  Audio stream data to be recognized. The topic name is set by the parameter ~audio_topic; the default value is audio.
Advertising Services
- speech_recognition (speech_recognition_msgs/SpeechRecognition)
  Service for speech recognition.
- speech_recognition/start (std_srvs/Empty)
  Starts speech recognition. This service is available when the parameter ~continuous is True.
- speech_recognition/stop (std_srvs/Empty)
  Stops speech recognition. This service is available when the parameter ~continuous is True.
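Pausing and resuming continuous recognition from another node might be sketched as follows. The `/speech_recognition` namespace prefix and the stop service name `/speech_recognition/stop` are assumptions, and `set_recognition_enabled` and `ros_proxy` are hypothetical helpers written for this example.

```python
START_SERVICE = "/speech_recognition/start"  # assumed fully qualified name
STOP_SERVICE = "/speech_recognition/stop"    # assumed name of the stop service


def set_recognition_enabled(enabled, make_proxy):
    """Call the start or stop service and return the service name used.

    `make_proxy(name)` must return a callable that performs the service call.
    Injecting it keeps this function usable without a ROS installation.
    """
    name = START_SERVICE if enabled else STOP_SERVICE
    make_proxy(name)()
    return name


def ros_proxy(name, timeout=5.0):
    """Create a std_srvs/Empty service proxy (requires a ROS environment)."""
    import rospy
    from std_srvs.srv import Empty

    rospy.wait_for_service(name, timeout=timeout)
    return rospy.ServiceProxy(name, Empty)
```

With the node running in continuous mode, `set_recognition_enabled(False, ros_proxy)` would pause recognition and `set_recognition_enabled(True, ros_proxy)` would resume it.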
Parameters
- ~voice_topic (String, default: speech_to_text)
  Name of the topic on which recognized speech is published.
- ~audio_topic (String, default: audio)
  Name of the audio topic to subscribe to.
- ~enable_sound_effect (Bool, default: True)
  Whether to play a sound effect on recognition.
- ~language (String, default: en-US)
  Language to be recognized.
- ~engine (Enum[String], default: Google)
  Speech-to-text engine. To see the full list of options, use dynamic_reconfigure.
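Since the engine options are exposed through dynamic_reconfigure, the language and engine could plausibly be changed at runtime along the lines below. This is a sketch under assumptions: the reconfigure field names `language` and `engine` and the node name `/speech_recognition` are guesses based on the parameter list above, and both helper functions are hypothetical.

```python
def build_config(language="en-US", engine="Google"):
    """Build the dictionary of parameters to send to the node."""
    if not language:
        raise ValueError("language must be a non-empty string such as 'en-US'")
    if not engine:
        raise ValueError("engine must be a non-empty string such as 'Google'")
    # Field names are assumed to match the ~language and ~engine parameters.
    return {"language": language, "engine": engine}


def apply_config(config, node_name="/speech_recognition", timeout=5.0):
    """Send the configuration via dynamic_reconfigure (requires ROS).

    rospy.init_node() must have been called before this function is used.
    """
    import dynamic_reconfigure.client

    client = dynamic_reconfigure.client.Client(node_name, timeout=timeout)
    return client.update_configuration(config)
```

For example, `apply_config(build_config(language="ja-JP"))` would request Japanese recognition if the assumed names match the running node.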
File truncated at 100 lines see the full file
Changelog for package ros_speech_recognition
2.1.31 (2025-05-13)
2.1.30 (2025-05-10)
2.1.29 (2025-01-05)
- [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
- Contributors: Yukina Iwata
2.1.28 (2023-07-24)
2.1.27 (2023-06-24)
- fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
- Contributors: Kei Okada
2.1.26 (2023-06-14)
- add LICENSE files (#476)
- Contributors: Kei Okada
2.1.25 (2023-06-08)
- [ros_speech_recognition] Add vosk engine (#474)
- Pr/use sound themes freedesktop (#472)
- add test to check if ros node is loadable (#463)
- add self.conf_thresh in __init_ function (#457)
- [ros_speech_recognition] add ubuntu-sounds dependency (#453)
- [ros_speech_recognition] Return if result is empty (#443)
- [ros_speece_recognition] Set confidence value of google (#434)
- [ros_speech_recognition] add parrotry.launch (#414)
- [ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
- [ros_speech_recogniton, respeaker_ros] add confidence field (#411)
- [ros_speech_recognition] add self cancellation for speech recogntion (#413)
- [#405 and #410] Fix CI (#415)
- add ROS interface for https://cloud.google.com/natural-language (#304)
- GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
- Explicit python interpreter in catkin_virtualenv (#367)
- .github/workflow: integrate all yaml to one (#338)
- [ros_speech_recognition] Fixed the behavior of launch file (#336)
- [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
- [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
- Enable sound play flag (#315)
- Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura
2.1.24 (2021-07-26)
2.1.23 (2021-07-21)
2.1.22 (2021-06-10)
- enable to change topic name from speech_recognition.launch (#254)
- support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
- Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
- Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa
File truncated at 100 lines see the full file
Wiki Tutorials
Package Dependencies
Name | Deps |
---|---|
catkin_virtualenv | |
dynamic_reconfigure | |
jsk_data | |
speech_recognition_msgs | |
catkin | |
audio_capture | |
audio_common_msgs | |
sound_play | |
rostest | |
roslaunch |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
jsk_3rdparty |
Launch files
- launch/parrotry.launch
- use_google [default: true]
- language [default: en-US]
- confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
- launch_sound_play [default: true] — Launch sound_play node to speak
- launch_audio_capture [default: true] — Launch audio_capture node to publish audio topic from microphone
- audio_topic [default: /audio] — Name of audio topic captured from microphone
- voice_topic [default: /speech_to_text] — Name of text topic of recognized speech
- n_channel [default: 1] — Number of channels of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
- depth [default: 16] — Bit depth of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
- sample_rate [default: 16000] — Sample rate of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
- device [default: ] — Card and device number of the microphone (e.g. hw:0,0). Check the card number and device number with 'arecord -l', then use hw:[card number],[device number].
- engine [default: Google] — Speech-to-text engine. Options: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM.
- language [default: en-US] — Speech-to-text language. For Japanese, set ja-JP.
- continuous [default: true] — If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
- auto_start [default: true] — Whether speech recognition starts automatically. This argument takes effect only when continuous is true.
- self_cancellation [default: true] — If true, audio is not recognized while the robot is speaking.
- tts_tolerance [default: 1.0] — Tolerance in seconds used when judging whether the robot is speaking.
- tts_action_names [default: ['sound_play']] — Names of the TTS actions. Output from these action servers is ignored by speech recognition.
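The launch arguments above are passed to roslaunch as `name:=value` pairs. A small helper for assembling such a command line might look like this; `build_roslaunch_command` is a hypothetical function written for illustration, and the example values are arbitrary.

```python
import shlex


def build_roslaunch_command(launch_file="speech_recognition.launch", **args):
    """Return the roslaunch command as a list of arguments.

    Booleans are written as the lowercase `true`/`false` used by roslaunch.
    Arguments are sorted by name so the result is deterministic.
    """
    cmd = ["roslaunch", "ros_speech_recognition", launch_file]
    for name, value in sorted(args.items()):
        if isinstance(value, bool):
            value = "true" if value else "false"
        cmd.append("{}:={}".format(name, value))
    return cmd


cmd = build_roslaunch_command(language="ja-JP", continuous=False, device="hw:0,0")
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

In a sourced ROS environment, the returned list can be handed to `subprocess.run(cmd)` to start the launch file.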
Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | No Continuous Integration |
Released | RELEASED |
Tags | No category tags. |
Contributing |
Help Wanted (0)
Good First Issues (0) Pull Requests to Review (0) |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
ros_speech_recognition
A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.
Tutorials
Normal tutorial
- Install this package and SpeechReconition
sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
- Launch speech recognition node
roslaunch ros_speech_recognition speech_recognition.launch
- Echo
/speech_to_text
rostopic echo /speech_to_text
# you can get the recognition result
Parrotry tutorial
Parrotry mean オウム返し in Japanese
# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
speech_recognition_node.py
Interface
Publishing Topics
-
~voice_topic
(speech_recognition_msgs/SpeechRecognitionCandidates
)Speech recognition candidates topic name.
Topic name is set by parameter
~voice_topic
, and default value isspeech_to_text
. -
sound_play
(sound_play/SoundRequestAction
)Action client to play sound on events. If the action server is not available or
~enable_sound_effect
isFalse
, no sound is played.
Subscribing Topics
-
~audio_topic
(audio_common_msgs/AudioData
)Audio stream data to be recognized.
Topis name is set by parameter
~audio_topic
and default value isaudio
.
Advertising Services
-
speech_recognition
(speech_recognition_msgs/SpeechRecognition
)Service for speech recognition
-
speech_recognition/start
(std_srvs/Empty
)Start service for speech recognition
This service is available when parameter
~contiunous
isTrue
. -
speech_recognition/start
(std_srvs/Empty
)Stop service for speech recognition
This service is available when parameter
~contiunous
isTrue
.
Parameters
-
~voice_topic
(String
, default:speech_to_text
)Publishing voice topic name
-
~audio_topic
(String
, default:audio
)Subscribing audio topic name
-
~enable_sound_effect
(Bool
, default:True
)Flag to enable or disable sound to play sound on recognition.
-
~language
(String
, default:en-US
)Language to be recognized
-
~engine
(Enum[String]
, default:Google
)Speech-to-text engine (To see full options use
dynamic_reconfigure
)
File truncated at 100 lines see the full file
Changelog for package ros_speech_recognition
2.1.31 (2025-05-13)
2.1.30 (2025-05-10)
2.1.29 (2025-01-05)
- [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
- Contributors: Yukina Iwata
2.1.28 (2023-07-24)
2.1.27 (2023-06-24)
- fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
- Contributors: Kei Okada
2.1.26 (2023-06-14)
- add LICENSE files (#476)
- Contributors: Kei Okada
2.1.25 (2023-06-08)
- [ros_speech_recognition] Add vosk engine (#474)
- Pr/use sound themes freedesktop (#472)
- add test to check if ros node is loadable (#463)
- add self.conf_thresh in __init_ function (#457)
- [ros_speech_recognition] add ubuntu-sounds dependency (#453)
- [ros_speech_recognition] Return if result is empty (#443)
- [ros_speech_recognition] Set confidence value of google (#434)
- [ros_speech_recognition] add parrotry.launch (#414)
- [ros_speech_recognition] update default arg for speech_recognition.launch (#412)
- [ros_speech_recognition, respeaker_ros] add confidence field (#411)
- [ros_speech_recognition] add self cancellation for speech recognition (#413)
- [#405 and #410] Fix CI (#415)
- add ROS interface for https://cloud.google.com/natural-language (#304)
- GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
- Explicit python interpreter in catkin_virtualenv (#367)
- .github/workflow: integrate all yaml to one (#338)
- [ros_speech_recognition] Fixed the behavior of launch file (#336)
- [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
- [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
- Enable sound play flag (#315)
- Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura
2.1.24 (2021-07-26)
2.1.23 (2021-07-21)
2.1.22 (2021-06-10)
- enable to change topic name from speech_recognition.launch (#254)
- support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to the args of speech_recognition.launch. The device arg must be set with rosparam, not param, because 'device: ' causes the error "load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled"
- more exception message for self.recognize
- Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
- Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa
File truncated at 100 lines see the full file
Wiki Tutorials
Package Dependencies
Name | Deps |
---|---|
catkin_virtualenv | |
dynamic_reconfigure | |
jsk_data | |
speech_recognition_msgs | |
catkin | |
audio_capture | |
audio_common_msgs | |
sound_play | |
rostest | |
roslaunch | |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
jsk_3rdparty | |
Launch files
- launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak.
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone.
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone.
  - voice_topic [default: /speech_to_text]: Name of the text topic for recognized speech.
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
  - sample_rate [default: 16000]: Sample rate of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card and device numbers with 'arecord -l', then use hw:[card number],[device number].
  - engine [default: Google]: Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, or IBM.
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect only when continuous is true.
  - self_cancellation [default: true]: If true, audio is not recognized while the robot is speaking.
  - tts_tolerance [default: 1.0]: Tolerance in seconds used when deciding whether the robot is speaking.
  - tts_action_names [default: ['sound_play']]: TTS action names. Audio output from these action servers is ignored by speech recognition.
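Launch arguments are passed to roslaunch as name:=value pairs, as in the parrotry tutorial (language:=ja-JP). When a launch is started from a script, the command line can be assembled programmatically. The sketch below is one way to do that. The helper build_roslaunch_command is hypothetical and not part of this package, and the argument values shown are only examples.

```python
import shlex


def build_roslaunch_command(package, launch_file, args=None):
    """Build an argv list for roslaunch with name:=value arguments."""
    command = ["roslaunch", package, launch_file]
    for name, value in (args or {}).items():
        if isinstance(value, bool):
            # Launch files expect lowercase true/false, not Python's True/False.
            value = "true" if value else "false"
        command.append("{}:={}".format(name, value))
    return command


if __name__ == "__main__":
    cmd = build_roslaunch_command(
        "ros_speech_recognition",
        "speech_recognition.launch",
        {"language": "ja-JP", "continuous": True, "device": "hw:0,0"},
    )
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be handed to subprocess.Popen as-is on a machine where ROS is installed and sourced. Keeping it as a list avoids shell quoting issues with values such as hw:0,0.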
Launch files
- launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch sound_play node to speak
  - launch_audio_capture [default: true]: Launch audio_capture node to publish audio topic from microphone
  - audio_topic [default: /audio]: Name of audio topic captured from microphone
  - voice_topic [default: /speech_to_text]: Name of text topic of recognized speech
  - n_channel [default: 1]: Number of channels of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - depth [default: 16]: Bit depth of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - sample_rate [default: 16000]: Frame rate of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - device [default: ]: Card and device number of microphone (e.g. hw:0,0). Check the card number and device number with 'arecord -l', then use hw:[card number],[device number]
  - engine [default: Google]: Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect only when continuous is true
  - self_cancellation [default: true]: Whether to skip recognition of audio captured while the robot is speaking
  - tts_tolerance [default: 1.0]: Tolerance in seconds when deciding whether the robot is speaking
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition
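The device argument expects an ALSA name of the form hw:[card number],[device number], read off the output of 'arecord -l'. The following is a minimal sketch of that lookup, assuming the usual "card N: ..., device M: ..." line layout. The sample text and the helper name list_hw_devices are made up for illustration and are not part of this package.

```python
import re

# Illustrative text in the layout printed by `arecord -l`.
# The card and device names here are invented, not taken from a real machine.
SAMPLE_ARECORD_OUTPUT = """\
**** List of CAPTURE Hardware Devices ****
card 0: PCH [Built-in Audio], device 0: Analog [Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: Mic [USB Microphone], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""

# Capture the card number and the device number from each "card ..." line.
CARD_LINE = re.compile(r"^card (\d+): .*?, device (\d+):", re.MULTILINE)


def list_hw_devices(arecord_output):
    """Return ALSA device strings such as 'hw:1,0' (hypothetical helper)."""
    return ["hw:{},{}".format(card, device)
            for card, device in CARD_LINE.findall(arecord_output)]


print(list_hw_devices(SAMPLE_ARECORD_OUTPUT))
```

One of the resulting strings can then be passed on the command line, for example device:=hw:1,0.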
Messages
Services
Plugins
Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | Continuous Integration |
Released | RELEASED |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
ros_speech_recognition
A ROS package for speech-to-text services.
This package uses the Python package SpeechRecognition as a backend.
Tutorials
Normal tutorial
- Install this package and SpeechRecognition
sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
- Launch speech recognition node
roslaunch ros_speech_recognition speech_recognition.launch
- Echo /speech_to_text
rostopic echo /speech_to_text
# you can get the recognition result
Parrotry tutorial
Parrotry means オウム返し in Japanese, i.e. repeating back what was just said.
# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
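Messages echoed from /speech_to_text carry parallel transcript and confidence arrays. A minimal sketch of picking the best candidate from such a message follows, assuming those two field names. The 0.8 threshold mirrors the confidence_threshold default of parrotry.launch, but the helper best_transcript is hypothetical, and whether the node filters in exactly this way is an assumption.

```python
from types import SimpleNamespace


def best_transcript(msg, threshold=0.8):
    """Return the highest-confidence transcript, or None if none qualifies."""
    # If the engine reported no confidences, treat every candidate as 0.0.
    confidences = list(msg.confidence) or [0.0] * len(msg.transcript)
    pairs = list(zip(msg.transcript, confidences))
    if not pairs:
        return None
    text, confidence = max(pairs, key=lambda pair: pair[1])
    return text if confidence >= threshold else None


# Stand-in for a received message, so the sketch runs without ROS installed.
example_msg = SimpleNamespace(
    transcript=["hello world", "hollow word"],
    confidence=[0.92, 0.41],
)
print(best_transcript(example_msg))
```

In a real node the same function could be called from a subscriber callback on the voice topic.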
speech_recognition_node.py
Interface
Publishing Topics
- ~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)
  Speech recognition candidates topic name.
  Topic name is set by parameter ~voice_topic, and default value is speech_to_text.
- sound_play (sound_play/SoundRequestAction)
  Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.
Subscribing Topics
- ~audio_topic (audio_common_msgs/AudioData)
  Audio stream data to be recognized.
  Topic name is set by parameter ~audio_topic, and default value is audio.
Advertising Services
- speech_recognition (speech_recognition_msgs/SpeechRecognition)
  Service for speech recognition.
- speech_recognition/start (std_srvs/Empty)
  Start service for speech recognition.
  This service is available when parameter ~continuous is True.
- speech_recognition/stop (std_srvs/Empty)
  Stop service for speech recognition.
  This service is available when parameter ~continuous is True.
Parameters
- ~voice_topic (String, default: speech_to_text)
  Publishing voice topic name
- ~audio_topic (String, default: audio)
  Subscribing audio topic name
- ~enable_sound_effect (Bool, default: True)
  Flag to enable or disable the sound played on recognition.
- ~language (String, default: en-US)
  Language to be recognized
- ~engine (Enum[String], default: Google)
  Speech-to-text engine (to see the full list of options, use dynamic_reconfigure)
File truncated at 100 lines see the full file
Changelog for package ros_speech_recognition
2.1.31 (2025-05-13)
2.1.30 (2025-05-10)
2.1.29 (2025-01-05)
- [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
- Contributors: Yukina Iwata
2.1.28 (2023-07-24)
2.1.27 (2023-06-24)
- fix package.xml/CMakeLists.txt to suppress catkin_lint errors (#479)
- Contributors: Kei Okada
2.1.26 (2023-06-14)
- add LICENSE files (#476)
- Contributors: Kei Okada
2.1.25 (2023-06-08)
- [ros_speech_recognition] Add vosk engine (#474)
- Pr/use sound themes freedesktop (#472)
- add test to check if ros node is loadable (#463)
- add self.conf_thresh in __init__ function (#457)
- [ros_speech_recognition] add ubuntu-sounds dependency (#453)
- [ros_speech_recognition] Return if result is empty (#443)
- [ros_speech_recognition] Set confidence value of google (#434)
- [ros_speech_recognition] add parrotry.launch (#414)
- [ros_speech_recognition] update default arg for speech_recognition.launch (#412)
- [ros_speech_recognition, respeaker_ros] add confidence field (#411)
- [ros_speech_recognition] add self cancellation for speech recognition (#413)
- [#405 and #410] Fix CI (#415)
- add ROS interface for https://cloud.google.com/natural-language (#304)
- GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
- Explicit python interpreter in catkin_virtualenv (#367)
- .github/workflow: integrate all yaml to one (#338)
- [ros_speech_recognition] Fixed the behavior of launch file (#336)
- [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
- [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
- Enable sound play flag (#315)
- Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura
2.1.24 (2021-07-26)
2.1.23 (2021-07-21)
2.1.22 (2021-06-10)
- enable to change topic name from speech_recognition.launch (#254)
- support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch: add doc to args, and use rosparam for device instead of param, because 'device: ' causes the error "load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled"
- more exception message for self.recognize
- Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
- Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa
File truncated at 100 lines see the full file
Wiki Tutorials
Package Dependencies
Deps | Name |
---|---|
catkin_virtualenv | |
dynamic_reconfigure | |
jsk_data | |
speech_recognition_msgs | |
catkin | |
audio_capture | |
audio_common_msgs | |
sound_play | |
rostest | |
roslaunch |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
jsk_3rdparty |
Launch files
- launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch sound_play node to speak
  - launch_audio_capture [default: true]: Launch audio_capture node to publish audio topic from microphone
  - audio_topic [default: /audio]: Name of audio topic captured from microphone
  - voice_topic [default: /speech_to_text]: Name of text topic of recognized speech
  - n_channel [default: 1]: Number of channels of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - depth [default: 16]: Bit depth of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - sample_rate [default: 16000]: Frame rate of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - device [default: ]: Card and device number of microphone (e.g. hw:0,0). Check the card number and device number with 'arecord -l', then use hw:[card number],[device number]
  - engine [default: Google]: Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect only when continuous is true
  - self_cancellation [default: true]: Whether to skip recognition of audio captured while the robot is speaking
  - tts_tolerance [default: 1.0]: Tolerance in seconds when deciding whether the robot is speaking
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition
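The self_cancellation and tts_tolerance arguments describe ignoring audio while the robot itself is speaking. One way such a gate could work is sketched below. This is an assumption about the intended behavior, not the node's actual implementation, and the class SpeakingGate and its method names are hypothetical.

```python
class SpeakingGate:
    """Decides whether incoming audio should be ignored (hypothetical helper)."""

    def __init__(self, tolerance=1.0):
        # Seconds after the last observed TTS activity during which audio
        # is still treated as the robot's own voice (cf. tts_tolerance).
        self.tolerance = tolerance
        self.last_tts_active = None

    def notify_tts_active(self, now):
        """Record that a TTS action server was active at time `now` (seconds)."""
        self.last_tts_active = now

    def should_ignore(self, now):
        """Return True if audio captured at time `now` should be skipped."""
        if self.last_tts_active is None:
            return False
        return (now - self.last_tts_active) < self.tolerance


gate = SpeakingGate(tolerance=1.0)
gate.notify_tts_active(now=10.0)
print(gate.should_ignore(now=10.5))  # 0.5 s after speech: inside the tolerance
print(gate.should_ignore(now=11.5))  # 1.5 s after speech: outside the tolerance
```

Times are passed in explicitly so the sketch stays independent of any ROS clock.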
Messages
Services
Plugins
Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | Continuous Integration |
Released | RELEASED |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
Wiki Tutorials
Package Dependencies
Deps | Name |
---|---|
catkin_virtualenv | |
dynamic_reconfigure | |
jsk_data | |
speech_recognition_msgs | |
catkin | |
audio_capture | |
audio_common_msgs | |
sound_play | |
rostest | |
roslaunch |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
jsk_3rdparty | |
jsk_nao_startup |
Launch files
- launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch sound_play node to speak
  - launch_audio_capture [default: true]: Launch audio_capture node to publish audio topic from microphone
  - audio_topic [default: /audio]: Name of audio topic captured from microphone
  - voice_topic [default: /speech_to_text]: Name of text topic of recognized speech
  - n_channel [default: 1]: Number of channels of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - depth [default: 16]: Bit depth of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - sample_rate [default: 16000]: Frame rate of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - device [default: ]: Card and device number of microphone (e.g. hw:0,0). Check the card number and device number with 'arecord -l', then use hw:[card number],[device number]
  - engine [default: Google]: Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect only when continuous is true
  - self_cancellation [default: true]: Whether to skip recognition of audio captured while the robot is speaking
  - tts_tolerance [default: 1.0]: Tolerance in seconds when deciding whether the robot is speaking
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition
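Launch arguments such as those listed above are overridden on the command line with the name:=value syntax, as in language:=ja-JP. A small sketch of assembling such a command from Python follows. The helper roslaunch_command is hypothetical and only builds the argument list, it does not start anything.

```python
import shlex


def roslaunch_command(package, launch_file, **launch_args):
    """Build a roslaunch argument list with name:=value overrides."""
    command = ["roslaunch", package, launch_file]
    for name in sorted(launch_args):
        value = launch_args[name]
        # roslaunch expects lowercase true/false for boolean arguments.
        if isinstance(value, bool):
            value = "true" if value else "false"
        command.append("{}:={}".format(name, value))
    return command


command = roslaunch_command(
    "ros_speech_recognition",
    "speech_recognition.launch",
    language="ja-JP",
    continuous=False,
    device="hw:0,0",
)
print(" ".join(shlex.quote(part) for part in command))
```

The returned list can be handed to subprocess.run on a machine where ROS is installed.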
Messages
Services
Plugins
Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | No Continuous Integration |
Released | RELEASED |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
Wiki Tutorials
Package Dependencies
Deps | Name |
---|---|
catkin_virtualenv | |
dynamic_reconfigure | |
jsk_data | |
speech_recognition_msgs | |
catkin | |
audio_capture | |
audio_common_msgs | |
sound_play | |
rostest | |
roslaunch |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
jsk_3rdparty | |
jsk_nao_startup |
Launch files
- launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch sound_play node to speak
  - launch_audio_capture [default: true]: Launch audio_capture node to publish audio topic from microphone
  - audio_topic [default: /audio]: Name of audio topic captured from microphone
  - voice_topic [default: /speech_to_text]: Name of text topic of recognized speech
  - n_channel [default: 1]: Number of channels of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - depth [default: 16]: Bit depth of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - sample_rate [default: 16000]: Frame rate of audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - device [default: ]: Card and device number of microphone (e.g. hw:0,0). Check the card number and device number with 'arecord -l', then use hw:[card number],[device number]
  - engine [default: Google]: Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect only when continuous is true
  - self_cancellation [default: true]: Whether to skip recognition of audio captured while the robot is speaking
  - tts_tolerance [default: 1.0]: Tolerance in seconds when deciding whether the robot is speaking
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition
Messages
Services
Plugins
Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | Continuous Integration |
Released | RELEASED |
Tags | No category tags. |
Contributing |
Help Wanted (0)
Good First Issues (0) Pull Requests to Review (0) |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
ros_speech_recognition
A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.
Tutorials
Normal tutorial
- Install this package and SpeechReconition
sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
- Launch speech recognition node
roslaunch ros_speech_recognition speech_recognition.launch
- Echo
/speech_to_text
rostopic echo /speech_to_text
# you can get the recognition result
Parrotry tutorial
Parrotry mean オウム返し in Japanese
# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
speech_recognition_node.py
Interface
Publishing Topics
-
~voice_topic
(speech_recognition_msgs/SpeechRecognitionCandidates
)Speech recognition candidates topic name.
Topic name is set by parameter
~voice_topic
, and default value isspeech_to_text
. -
sound_play
(sound_play/SoundRequestAction
)Action client to play sound on events. If the action server is not available or
~enable_sound_effect
isFalse
, no sound is played.
Subscribing Topics
-
~audio_topic
(audio_common_msgs/AudioData
)Audio stream data to be recognized.
Topis name is set by parameter
~audio_topic
and default value isaudio
.
Advertising Services
-
speech_recognition
(speech_recognition_msgs/SpeechRecognition
)Service for speech recognition
-
speech_recognition/start
(std_srvs/Empty
)Start service for speech recognition
This service is available when parameter
~contiunous
isTrue
. -
speech_recognition/start
(std_srvs/Empty
)Stop service for speech recognition
This service is available when parameter
~contiunous
isTrue
.
Parameters
-
~voice_topic
(String
, default:speech_to_text
)Publishing voice topic name
-
~audio_topic
(String
, default:audio
)Subscribing audio topic name
-
~enable_sound_effect
(Bool
, default:True
)Flag to enable or disable sound to play sound on recognition.
-
~language
(String
, default:en-US
)Language to be recognized
-
~engine
(Enum[String]
, default:Google
)Speech-to-text engine (To see full options use
dynamic_reconfigure
)
File truncated at 100 lines see the full file
Changelog for package ros_speech_recognition
2.1.31 (2025-05-13)
2.1.30 (2025-05-10)
2.1.29 (2025-01-05)
- [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
- Contributors: Yukina Iwata
2.1.28 (2023-07-24)
2.1.27 (2023-06-24)
- fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
- Contributors: Kei Okada
2.1.26 (2023-06-14)
- add LICENSE files (#476)
- Contributors: Kei Okada
2.1.25 (2023-06-08)
- [ros_speech_recognition] Add vosk engine (#474)
- Pr/use sound themes freedesktop (#472)
- add test to check if ros node is loadable (#463)
- add self.conf_thresh in __init_ function (#457)
- [ros_speech_recognition] add ubuntu-sounds dependency (#453)
- [ros_speech_recognition] Return if result is empty (#443)
- [ros_speece_recognition] Set confidence value of google (#434)
- [ros_speech_recognition] add parrotry.launch (#414)
- [ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
- [ros_speech_recogniton, respeaker_ros] add confidence field (#411)
- [ros_speech_recognition] add self cancellation for speech recogntion (#413)
- [#405 and #410] Fix CI (#415)
- add ROS interface for https://cloud.google.com/natural-language (#304)
- GithubAction: add test for aarch64(melodic) / indigo (arm64)
(#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
- Explicit python interpreter in catkin_virtualenv (#367)
- .github/workflow: integrate all yaml to one (#338)
- [ros_speech_recognition] Fixed the behavior of launch file (#336)
- [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
- [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
- Enable sound play flag (#315)
- Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura
2.1.24 (2021-07-26)
2.1.23 (2021-07-21)
2.1.22 (2021-06-10)
- enable to change topic name from speech_recognition.launch (#254)
- support SpeakerDiarization, see
https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative
(#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
- Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
- Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa
File truncated at 100 lines see the full file
Wiki Tutorials
Package Dependencies
Deps | Name |
---|---|
catkin_virtualenv | |
dynamic_reconfigure | |
jsk_data | |
speech_recognition_msgs | |
catkin | |
audio_capture | |
audio_common_msgs | |
sound_play | |
rostest | |
roslaunch |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
jsk_3rdparty |
Launch files
- launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak.
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone.
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone.
  - voice_topic [default: /speech_to_text]: Name of the text topic for recognized speech.
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - sample_rate [default: 16000]: Frame rate of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card number and device number with '$ arecord -l', then use hw:[card number],[device number].
  - engine [default: Google]: Speech-to-text engine. One of Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM.
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect when continuous is true.
  - self_cancellation [default: true]: Whether to ignore audio while the robot is speaking.
  - tts_tolerance [default: 1.0]: Tolerance in seconds for deciding whether the robot is speaking.
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition.
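The interplay of self_cancellation and tts_tolerance can be pictured with a small sketch. This is not the node's actual implementation; the function name `should_ignore_audio` and its arguments are hypothetical, and the logic only assumes that audio is dropped while a TTS action is active and for tts_tolerance seconds afterwards.

```python
from typing import Optional


def should_ignore_audio(now: float,
                        tts_active: bool,
                        last_tts_end: Optional[float],
                        tts_tolerance: float = 1.0) -> bool:
    """Decide whether incoming audio should be skipped (hypothetical helper).

    now           -- current time in seconds
    tts_active    -- True while any watched TTS action is still running
    last_tts_end  -- time in seconds when the last TTS action finished, or None
    tts_tolerance -- grace period in seconds after the robot stops speaking
    """
    if tts_active:
        # The robot is speaking right now, so the microphone hears the robot.
        return True
    if last_tts_end is None:
        # The robot has not spoken yet, so nothing needs to be cancelled.
        return False
    # Keep ignoring audio until the tolerance window has passed.
    return (now - last_tts_end) < tts_tolerance
```

With the default tolerance of 1.0, audio arriving 0.5 seconds after the robot finished speaking would be ignored, while audio arriving 2 seconds later would be recognized.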
Messages
Services
Plugins
Package Summary
Tags | No category tags. |
Version | 2.1.31 |
License | BSD |
Build type | CATKIN |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/jsk-ros-pkg/jsk_3rdparty.git |
VCS Type | git |
VCS Version | master |
Last Updated | 2025-05-13 |
Dev Status | DEVELOPED |
CI status | Continuous Integration |
Released | RELEASED |
Tags | No category tags. |
Contributing | Help Wanted (0), Good First Issues (0), Pull Requests to Review (0) |
Package Description
Additional Links
Maintainers
- Yuki Furuta
Authors
- Yuki Furuta
ros_speech_recognition
A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.
Tutorials
Normal tutorial
- Install this package and SpeechRecognition
sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
- Launch speech recognition node
roslaunch ros_speech_recognition speech_recognition.launch
- Echo /speech_to_text
rostopic echo /speech_to_text
# you can get the recognition result
Parrotry tutorial
Parrotry means オウム返し (repeating back what was heard, like a parrot) in Japanese
# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
speech_recognition_node.py
Interface
Publishing Topics
- ~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)
  Speech recognition candidates topic name.
  Topic name is set by parameter ~voice_topic, and default value is speech_to_text.
- sound_play (sound_play/SoundRequestAction)
  Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.
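A subscriber to the candidates topic usually wants a single string. The sketch below picks the most confident candidate and drops it when it falls under a threshold. It assumes the message carries parallel lists of transcripts and confidences; the helper name `best_transcript` and the 0.8 default are illustrative choices, not part of this package.

```python
from typing import Optional, Sequence


def best_transcript(transcripts: Sequence[str],
                    confidences: Sequence[float],
                    threshold: float = 0.8) -> Optional[str]:
    """Return the most confident transcript, or None if none passes the threshold."""
    if not transcripts:
        return None
    # Assumption: an engine may report fewer confidences than transcripts;
    # missing values are treated as 0.0 here.
    padded = list(confidences) + [0.0] * (len(transcripts) - len(confidences))
    best_index = max(range(len(transcripts)), key=lambda i: padded[i])
    if padded[best_index] < threshold:
        return None
    return transcripts[best_index]
```

Inside a rospy subscriber callback this could be called as `best_transcript(msg.transcript, msg.confidence)`, assuming those are the field names of the message.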
Subscribing Topics
- ~audio_topic (audio_common_msgs/AudioData)
  Audio stream data to be recognized.
  Topic name is set by parameter ~audio_topic, and default value is audio.
Advertising Services
- speech_recognition (speech_recognition_msgs/SpeechRecognition)
  Service for speech recognition
- speech_recognition/start (std_srvs/Empty)
  Start service for speech recognition
  This service is available when parameter ~continuous is True.
- speech_recognition/stop (std_srvs/Empty)
  Stop service for speech recognition
  This service is available when parameter ~continuous is True.
Parameters
- ~voice_topic (String, default: speech_to_text)
  Publishing voice topic name
- ~audio_topic (String, default: audio)
  Subscribing audio topic name
- ~enable_sound_effect (Bool, default: True)
  Flag to enable or disable the sound played on recognition.
- ~language (String, default: en-US)
  Language to be recognized
- ~engine (Enum[String], default: Google)
  Speech-to-text engine (to see the full options, use dynamic_reconfigure)
File truncated at 100 lines see the full file
Changelog for package ros_speech_recognition
2.1.31 (2025-05-13)
2.1.30 (2025-05-10)
2.1.29 (2025-01-05)
- [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
- Contributors: Yukina Iwata
2.1.28 (2023-07-24)
2.1.27 (2023-06-24)
- fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
- Contributors: Kei Okada
2.1.26 (2023-06-14)
- add LICENSE files (#476)
- Contributors: Kei Okada
2.1.25 (2023-06-08)
- [ros_speech_recognition] Add vosk engine (#474)
- Pr/use sound themes freedesktop (#472)
- add test to check if ros node is loadable (#463)
- add self.conf_thresh in __init_ function (#457)
- [ros_speech_recognition] add ubuntu-sounds dependency (#453)
- [ros_speech_recognition] Return if result is empty (#443)
- [ros_speece_recognition] Set confidence value of google (#434)
- [ros_speech_recognition] add parrotry.launch (#414)
- [ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
- [ros_speech_recogniton, respeaker_ros] add confidence field (#411)
- [ros_speech_recognition] add self cancellation for speech recogntion (#413)
- [#405 and #410] Fix CI (#415)
- add ROS interface for https://cloud.google.com/natural-language (#304)
- GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
- Explicit python interpreter in catkin_virtualenv (#367)
- .github/workflow: integrate all yaml to one (#338)
- [ros_speech_recognition] Fixed the behavior of launch file (#336)
- [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
- [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
- Enable sound play flag (#315)
- Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura
2.1.24 (2021-07-26)
2.1.23 (2021-07-21)
2.1.22 (2021-06-10)
- enable to change topic name from speech_recognition.launch (#254)
- support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch: add doc to args, and use rosparam for device, not param, because 'device: ' causes the error "load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled"
- more exception message for self.recognize
- Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
- Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa
File truncated at 100 lines see the full file
Wiki Tutorials
Package Dependencies
Name |
---|
catkin_virtualenv |
dynamic_reconfigure |
jsk_data |
speech_recognition_msgs |
catkin |
audio_capture |
audio_common_msgs |
sound_play |
rostest |
roslaunch |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
jsk_3rdparty |
Launch files
- launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
- launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak.
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone.
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone.
  - voice_topic [default: /speech_to_text]: Name of the text topic for recognized speech.
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - sample_rate [default: 16000]: Frame rate of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card number and device number with '$ arecord -l', then use hw:[card number],[device number].
  - engine [default: Google]: Speech-to-text engine. One of Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM.
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect when continuous is true.
  - self_cancellation [default: true]: Whether to ignore audio while the robot is speaking.
  - tts_tolerance [default: 1.0]: Tolerance in seconds for deciding whether the robot is speaking.
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition.
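The launch arguments listed above are passed to roslaunch as name:=value pairs. A minimal sketch of assembling such a command line follows; the helper names `alsa_device` and `roslaunch_command` are hypothetical and only format strings, they do not start anything.

```python
from typing import Dict, List, Union

ArgValue = Union[str, int, float, bool]


def alsa_device(card: int, device: int) -> str:
    """Format a card/device pair from 'arecord -l' as hw:[card],[device]."""
    return "hw:{},{}".format(card, device)


def roslaunch_command(package: str,
                      launch_file: str,
                      args: Dict[str, ArgValue]) -> List[str]:
    """Build a roslaunch argument vector with name:=value overrides."""
    command = ["roslaunch", package, launch_file]
    for name, value in args.items():
        if isinstance(value, bool):
            # roslaunch boolean arguments are written in lowercase.
            value = "true" if value else "false"
        command.append("{}:={}".format(name, value))
    return command


if __name__ == "__main__":
    cmd = roslaunch_command(
        "ros_speech_recognition",
        "speech_recognition.launch",
        {"language": "ja-JP", "device": alsa_device(0, 0), "continuous": False},
    )
    print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run` on a machine where ROS is installed, or joined with spaces and pasted into a shell.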
Messages
Services
Plugins