ros_speech_recognition

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.

Package Description

ROS wrapper for Python SpeechRecognition library

Maintainers

Yuki Furuta

Authors

Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

Install this package and SpeechRecognition

  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  

Launch speech recognition node

  roslaunch ros_speech_recognition speech_recognition.launch

Echo /speech_to_text

  rostopic echo /speech_to_text
  # you can get the recognition result
  

Parrotry tutorial

"Parrotry" corresponds to the Japanese オウム返し: repeating back what was just said, like a parrot.

# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
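
The name:=value form used in the commands above applies to any launch argument. As a minimal sketch, the helper below assembles such a command line from a dictionary; the function name `roslaunch_command` is hypothetical and not part of this package, and the argument names are the ones listed for parrotry.launch on this page.

```python
import shlex


def roslaunch_command(package, launch_file, args=None):
    """Return the roslaunch command as a list of argv tokens.

    Hypothetical helper for illustration; not provided by this package.
    """
    argv = ["roslaunch", package, launch_file]
    for name, value in sorted((args or {}).items()):
        # The launch files on this page write boolean defaults as
        # lowercase true/false, so Python booleans are converted to match.
        if isinstance(value, bool):
            value = "true" if value else "false"
        argv.append("{}:={}".format(name, value))
    return argv


cmd = roslaunch_command(
    "ros_speech_recognition",
    "parrotry.launch",
    {"language": "ja-JP", "confidence_threshold": 0.8},
)
# Quote each token so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(token) for token in cmd))
```

The returned list can also be passed directly to `subprocess.run` on a machine where ROS is installed.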

`speech_recognition_node.py` Interface

Publishing Topics

~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

Topic on which speech recognition candidates are published.

The topic name is set by the parameter ~voice_topic; the default is speech_to_text.

sound_play (sound_play/SoundRequestAction)

Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.
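
A subscriber to ~voice_topic usually wants a single string rather than a list of candidates. The sketch below picks the most confident candidate and discards it if it falls under a threshold. It assumes the message exposes parallel `transcript` and `confidence` lists, and it uses a namedtuple as a stand-in so that it runs without ROS. The 0.8 default mirrors the confidence_threshold default of parrotry.launch; whether parrotry applies the threshold in exactly this way is not stated here.

```python
from collections import namedtuple

# Stand-in for speech_recognition_msgs/SpeechRecognitionCandidates so the
# sketch runs without ROS. Only the two fields used here are modelled
# (assumption: parallel `transcript` and `confidence` lists).
Candidates = namedtuple("Candidates", ["transcript", "confidence"])


def best_transcript(msg, threshold=0.8):
    """Return the most confident transcript, or None if none qualifies."""
    if not msg.transcript:
        return None
    scored = []
    for i, text in enumerate(msg.transcript):
        # A missing confidence value is treated as 0.0.
        conf = msg.confidence[i] if i < len(msg.confidence) else 0.0
        scored.append((conf, text))
    conf, text = max(scored, key=lambda pair: pair[0])
    return text if conf >= threshold else None


msg = Candidates(transcript=["hello", "yellow"], confidence=[0.92, 0.4])
print(best_transcript(msg))
```

In a real node the same function could be called from the subscriber callback, since it only reads the two attributes.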

Subscribing Topics

~audio_topic (audio_common_msgs/AudioData)

Audio stream data to be recognized.

The topic name is set by the parameter ~audio_topic; the default is audio.

Advertising Services

speech_recognition (speech_recognition_msgs/SpeechRecognition)

Service for speech recognition.

speech_recognition/start (std_srvs/Empty)

Starts speech recognition.

This service is available when the parameter ~continuous is True.

speech_recognition/stop (std_srvs/Empty)

Stops speech recognition.

This service is available when the parameter ~continuous is True.

Parameters

~voice_topic (String, default: speech_to_text)

Name of the topic on which recognized speech is published.

~audio_topic (String, default: audio)

Name of the audio topic to subscribe to.

~enable_sound_effect (Bool, default: True)

Flag to enable or disable the sound effects played on recognition.

~language (String, default: en-US)

Language to be recognized.

~engine (Enum[String], default: Google)

Speech-to-text engine (use dynamic_reconfigure to see the full list of options).
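
The leading ~ marks a private name, which ROS resolves relative to the node's own name. As a small illustration of that rule, the hypothetical helper below expands a private parameter name into its full form. The node name /speech_recognition is an assumption taken from the parameter path that appears in the changelog, and may differ if the node is renamed or namespaced.

```python
def resolve_private_name(node_name, name):
    """Expand a private name such as '~language' for the given node.

    Hypothetical helper for illustration; names without a leading '~'
    are returned unchanged.
    """
    if not name.startswith("~"):
        return name
    return node_name.rstrip("/") + "/" + name[1:]


# Assumed node name; adjust to match your launch configuration.
node = "/speech_recognition"
for param in ["~voice_topic", "~audio_topic", "~language", "~engine"]:
    print("rosparam get " + resolve_private_name(node, param))
```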

File truncated at 100 lines see the full file

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

[doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
Contributors: Kei Okada

2.1.26 (2023-06-14)

add LICENSE files (#476)
Contributors: Kei Okada

2.1.25 (2023-06-08)

[ros_speech_recognition] Add vosk engine (#474)
Pr/use sound themes freedesktop (#472)
add test to check if ros node is loadable (#463)
add self.conf_thresh in __init_ function (#457)
[ros_speech_recognition] add ubuntu-sounds dependency (#453)
[ros_speech_recognition] Return if result is empty (#443)
[ros_speece_recognition] Set confidence value of google (#434)
[ros_speech_recognition] add parrotry.launch (#414)
[ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
[ros_speech_recogniton, respeaker_ros] add confidence field (#411)
[ros_speech_recognition] add self cancellation for speech recogntion (#413)
[#405 and #410] Fix CI (#415)
add ROS interface for https://cloud.google.com/natural-language (#304)
GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
Explicit python interpreter in catkin_virtualenv (#367)
.github/workflow: integrate all yaml to one (#338)
[ros_speech_recognition] Fixed the behavior of launch file (#336)
[ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
[ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
Enable sound play flag (#315)
Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

enable to change topic name from speech_recognition.launch (#254)
support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

File truncated at 100 lines see the full file

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty

Launch files

launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak.
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone.
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone.
  - voice_topic [default: /speech_to_text]: Name of the text topic for recognized speech.
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
  - sample_rate [default: 16000]: Sample rate of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware.
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card and device numbers with 'arecord -l', then use hw:[card number],[device number].
  - engine [default: Google]: Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM.
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. This argument takes effect only when continuous is true.
  - self_cancellation [default: true]: Whether to skip recognition of audio while the robot is speaking.
  - tts_tolerance [default: 1.0]: Tolerance in seconds when determining whether the robot is speaking.
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition.
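
The device argument expects the hw:[card number],[device number] form derived from the output of 'arecord -l'. The sketch below extracts those strings from that output. The function name `alsa_devices` is hypothetical, and the sample text is illustrative of the usual 'arecord -l' layout rather than captured from a specific machine.

```python
import re

# Matches lines such as:
#   card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
_CARD_LINE = re.compile(r"^card (\d+):.*?, device (\d+):")


def alsa_devices(arecord_output):
    """Return 'hw:<card>,<device>' strings found in `arecord -l` output."""
    devices = []
    for line in arecord_output.splitlines():
        match = _CARD_LINE.match(line.strip())
        if match:
            devices.append("hw:{},{}".format(match.group(1), match.group(2)))
    return devices


# Illustrative sample only; real output depends on the attached hardware.
sample = """**** List of CAPTURE Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALC3246 Analog [ALC3246 Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
"""
print(alsa_devices(sample))
```

One of the returned strings can then be passed as device:=hw:1,0 (for example) when launching speech_recognition.launch.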

Messages

No message files found.

Services

No service files found

Plugins

No plugins found.

Recent questions tagged `ros_speech_recognition` at Robotics Stack Exchange

No version for distro jazzy showing lunar. Known supported distros are highlighted in the buttons above.

ros_speech_recognition package from jsk_3rdparty repo

aques_talk assimp_devel downward ff ffha google_cloud_texttospeech julius libcmt libsiftfast lpg_planner mini_maxwell nlopt osqp slic voice_text voicevox zdepth bayesian_belief_networks chaplus_ros dialogflow_task_executive emotion_analyzer gdrive_ros google_chat_ros influxdb_store jsk_3rdparty collada_urdf_jsk_patch laser_filters_jsk_patch julius_ros nfc_ros opt_camera pgm_learner respeaker_ros ros_google_cloud_language ros_speech_recognition rospatlite rosping rostwitter sesame_ros switchbot_ros webrtcvad_ros zdepth_image_transport

ROS Distro
lunar

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (-) Good First Issues (-) Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

Install this package and SpeechReconition

  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  

Launch speech recognition node

  roslaunch ros_speech_recognition speech_recognition.launch

Echo /speech_to_text

  rostopic echo /speech_to_text
  # you can get the recognition result
  

Parrotry tutorial

Parrotry mean オウム返し in Japanese

# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP

`speech_recognition_node.py` Interface

Publishing Topics

~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

Speech recognition candidates topic name.

Topic name is set by parameter ~voice_topic, and default value is speech_to_text.
sound_play (sound_play/SoundRequestAction)

Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.

Subscribing Topics

~audio_topic (audio_common_msgs/AudioData)

Audio stream data to be recognized.

Topis name is set by parameter ~audio_topic and default value is audio.

Advertising Services

speech_recognition (speech_recognition_msgs/SpeechRecognition)

Service for speech recognition
speech_recognition/start (std_srvs/Empty)

Start service for speech recognition

This service is available when parameter ~contiunous is True.
speech_recognition/start (std_srvs/Empty)

Stop service for speech recognition

This service is available when parameter ~contiunous is True.

Parameters

~voice_topic (String, default: speech_to_text)

Publishing voice topic name
~audio_topic (String, default: audio)

Subscribing audio topic name
~enable_sound_effect (Bool, default: True)

Flag to enable or disable sound to play sound on recognition.
~language (String, default: en-US)

Language to be recognized
~engine (Enum[String], default: Google)

Speech-to-text engine (To see full options use dynamic_reconfigure)

File truncated at 100 lines see the full file

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

[doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
Contributors: Kei Okada

2.1.26 (2023-06-14)

add LICENSE files (#476)
Contributors: Kei Okada

2.1.25 (2023-06-08)

[ros_speech_recognition] Add vosk engine (#474)
Pr/use sound themes freedesktop (#472)
add test to check if ros node is loadable (#463)
add self.conf_thresh in __init_ function (#457)
[ros_speech_recognition] add ubuntu-sounds dependency (#453)
[ros_speech_recognition] Return if result is empty (#443)
[ros_speece_recognition] Set confidence value of google (#434)
[ros_speech_recognition] add parrotry.launch (#414)
[ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
[ros_speech_recogniton, respeaker_ros] add confidence field (#411)
[ros_speech_recognition] add self cancellation for speech recogntion (#413)
[#405 and #410] Fix CI (#415)
add ROS interface for https://cloud.google.com/natural-language (#304)
GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
Explicit python interpreter in catkin_virtualenv (#367)
.github/workflow: integrate all yaml to one (#338)
[ros_speech_recognition] Fixed the behavior of launch file (#336)
[ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
[ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
Enable sound play flag (#315)
Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

enable to change topic name from speech_recognition.launch (#254)
support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

File truncated at 100 lines see the full file

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty

Launch files

launch/parrotry.launch
- - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
launch/speech_recognition.launch
- - launch_sound_play [default: true] — Launch sound_play node to speak
  - launch_audio_capture [default: true] — Launch audio_capture node to publish audio topic from microphone
  - audio_topic [default: /audio] — Name of audio topic captured from microphone
  - voice_topic [default: /speech_to_text] — Name of text topic of recognized speech
  - n_channel [default: 1] — Number of channels of audio topic and microphone. '$ pactl list short sinks' to check your hardware
  - depth [default: 16] — Bit depth of audio topic and microphone. '$ pactl list short sinks' to check your hardware
  - sample_rate [default: 16000] — Frame rate of audio topic and microphone. '$ pactl list short sinks' to check your hardware
  - device [default: ] — Card and device number of microphone (e.g. hw:0,0). you can check card number and device number by '$ arecord -l', then uses hw:[card number],[device number]
  - engine [default: Google] — Speech to text engine. TTS engine, Google, GoogleCloud, Sphinx, Wit, Bing Houndify, IBM
  - language [default: en-US] — Speech to text language. For Japanese, set ja-JP.
  - continuous [default: true] — If false, /speech_recognition service is published. If true, /speech_to_text topic is published.
  - auto_start [default: true] — Whether speech_recognition starts automatically or not. This parameter works when continuous is true
  - self_cancellation [default: true] — Do not recognize the audio when robot is speaking or not.
  - tts_tolerance [default: 1.0] — Tolerance second for recognizing whether robot is speaking or not
  - tts_action_names [default: ['sound_play']] — tts action name. these servers outputs are ignored by sound_recognition

Messages

No message files found.

Services

No service files found

Plugins

No plugins found.

Recent questions tagged `ros_speech_recognition` at Robotics Stack Exchange

No version for distro kilted showing lunar. Known supported distros are highlighted in the buttons above.

ros_speech_recognition package from jsk_3rdparty repo

aques_talk assimp_devel downward ff ffha google_cloud_texttospeech julius libcmt libsiftfast lpg_planner mini_maxwell nlopt osqp slic voice_text voicevox zdepth bayesian_belief_networks chaplus_ros dialogflow_task_executive emotion_analyzer gdrive_ros google_chat_ros influxdb_store jsk_3rdparty collada_urdf_jsk_patch laser_filters_jsk_patch julius_ros nfc_ros opt_camera pgm_learner respeaker_ros ros_google_cloud_language ros_speech_recognition rospatlite rosping rostwitter sesame_ros switchbot_ros webrtcvad_ros zdepth_image_transport

ROS Distro
lunar

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (-) Good First Issues (-) Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

Install this package and SpeechReconition

  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  

Launch speech recognition node

  roslaunch ros_speech_recognition speech_recognition.launch

Echo /speech_to_text

  rostopic echo /speech_to_text
  # you can get the recognition result
  

Parrotry tutorial

Parrotry mean オウム返し in Japanese

# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP

`speech_recognition_node.py` Interface

Publishing Topics

~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

Speech recognition candidates topic name.

Topic name is set by parameter ~voice_topic, and default value is speech_to_text.
sound_play (sound_play/SoundRequestAction)

Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.

Subscribing Topics

~audio_topic (audio_common_msgs/AudioData)

Audio stream data to be recognized.

Topis name is set by parameter ~audio_topic and default value is audio.

Advertising Services

speech_recognition (speech_recognition_msgs/SpeechRecognition)

Service for speech recognition
speech_recognition/start (std_srvs/Empty)

Start service for speech recognition

This service is available when parameter ~contiunous is True.
speech_recognition/start (std_srvs/Empty)

Stop service for speech recognition

This service is available when parameter ~contiunous is True.

Parameters

~voice_topic (String, default: speech_to_text)

Publishing voice topic name
~audio_topic (String, default: audio)

Subscribing audio topic name
~enable_sound_effect (Bool, default: True)

Flag to enable or disable sound to play sound on recognition.
~language (String, default: en-US)

Language to be recognized
~engine (Enum[String], default: Google)

Speech-to-text engine (To see full options use dynamic_reconfigure)

File truncated at 100 lines see the full file

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

[doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
Contributors: Kei Okada

2.1.26 (2023-06-14)

add LICENSE files (#476)
Contributors: Kei Okada

2.1.25 (2023-06-08)

[ros_speech_recognition] Add vosk engine (#474)
Pr/use sound themes freedesktop (#472)
add test to check if ros node is loadable (#463)
add self.conf_thresh in __init_ function (#457)
[ros_speech_recognition] add ubuntu-sounds dependency (#453)
[ros_speech_recognition] Return if result is empty (#443)
[ros_speece_recognition] Set confidence value of google (#434)
[ros_speech_recognition] add parrotry.launch (#414)
[ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
[ros_speech_recogniton, respeaker_ros] add confidence field (#411)
[ros_speech_recognition] add self cancellation for speech recogntion (#413)
[#405 and #410] Fix CI (#415)
add ROS interface for https://cloud.google.com/natural-language (#304)
GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
Explicit python interpreter in catkin_virtualenv (#367)
.github/workflow: integrate all yaml to one (#338)
[ros_speech_recognition] Fixed the behavior of launch file (#336)
[ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
[ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
Enable sound play flag (#315)
Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

enable to change topic name from speech_recognition.launch (#254)
support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

File truncated at 100 lines see the full file

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty

Launch files

launch/parrotry.launch
- - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
launch/speech_recognition.launch
- - launch_sound_play [default: true] — Launch sound_play node to speak
  - launch_audio_capture [default: true] — Launch audio_capture node to publish audio topic from microphone
  - audio_topic [default: /audio] — Name of audio topic captured from microphone
  - voice_topic [default: /speech_to_text] — Name of text topic of recognized speech
  - n_channel [default: 1] — Number of channels of audio topic and microphone. '$ pactl list short sinks' to check your hardware
  - depth [default: 16] — Bit depth of audio topic and microphone. '$ pactl list short sinks' to check your hardware
  - sample_rate [default: 16000] — Frame rate of audio topic and microphone. '$ pactl list short sinks' to check your hardware
  - device [default: ] — Card and device number of microphone (e.g. hw:0,0). you can check card number and device number by '$ arecord -l', then uses hw:[card number],[device number]
  - engine [default: Google] — Speech to text engine. TTS engine, Google, GoogleCloud, Sphinx, Wit, Bing Houndify, IBM
  - language [default: en-US] — Speech to text language. For Japanese, set ja-JP.
  - continuous [default: true] — If false, /speech_recognition service is published. If true, /speech_to_text topic is published.
  - auto_start [default: true] — Whether speech_recognition starts automatically or not. This parameter works when continuous is true
  - self_cancellation [default: true] — Do not recognize the audio when robot is speaking or not.
  - tts_tolerance [default: 1.0] — Tolerance second for recognizing whether robot is speaking or not
  - tts_action_names [default: ['sound_play']] — tts action name. these servers outputs are ignored by sound_recognition

Messages

No message files found.

Services

No service files found

Plugins

No plugins found.

Recent questions tagged `ros_speech_recognition` at Robotics Stack Exchange

No version for distro rolling showing lunar. Known supported distros are highlighted in the buttons above.

ros_speech_recognition package from jsk_3rdparty repo

aques_talk assimp_devel downward ff ffha google_cloud_texttospeech julius libcmt libsiftfast lpg_planner mini_maxwell nlopt osqp slic voice_text voicevox zdepth bayesian_belief_networks chaplus_ros dialogflow_task_executive emotion_analyzer gdrive_ros google_chat_ros influxdb_store jsk_3rdparty collada_urdf_jsk_patch laser_filters_jsk_patch julius_ros nfc_ros opt_camera pgm_learner respeaker_ros ros_google_cloud_language ros_speech_recognition rospatlite rosping rostwitter sesame_ros switchbot_ros webrtcvad_ros zdepth_image_transport

ROS Distro
lunar

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (-) Good First Issues (-) Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

Install this package and SpeechReconition

  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  

Launch speech recognition node

  roslaunch ros_speech_recognition speech_recognition.launch

Echo /speech_to_text

  rostopic echo /speech_to_text
  # you can get the recognition result
  

Parrotry tutorial

Parrotry mean オウム返し in Japanese

# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP

`speech_recognition_node.py` Interface

Publishing Topics

~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

Speech recognition candidates topic name.

Topic name is set by parameter ~voice_topic, and default value is speech_to_text.
sound_play (sound_play/SoundRequestAction)

Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.

Subscribing Topics

~audio_topic (audio_common_msgs/AudioData)

Audio stream data to be recognized.

Topis name is set by parameter ~audio_topic and default value is audio.

Advertising Services

speech_recognition (speech_recognition_msgs/SpeechRecognition)

Service for speech recognition
speech_recognition/start (std_srvs/Empty)

Start service for speech recognition

This service is available when parameter ~contiunous is True.
speech_recognition/start (std_srvs/Empty)

Stop service for speech recognition

This service is available when parameter ~contiunous is True.

Parameters

~voice_topic (String, default: speech_to_text)

Publishing voice topic name
~audio_topic (String, default: audio)

Subscribing audio topic name
~enable_sound_effect (Bool, default: True)

Flag to enable or disable sound to play sound on recognition.
~language (String, default: en-US)

Language to be recognized
~engine (Enum[String], default: Google)

Speech-to-text engine (To see full options use dynamic_reconfigure)

File truncated at 100 lines see the full file

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

[doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
Contributors: Kei Okada

2.1.26 (2023-06-14)

add LICENSE files (#476)
Contributors: Kei Okada

2.1.25 (2023-06-08)

[ros_speech_recognition] Add vosk engine (#474)
Pr/use sound themes freedesktop (#472)
add test to check if ros node is loadable (#463)
add self.conf_thresh in __init_ function (#457)
[ros_speech_recognition] add ubuntu-sounds dependency (#453)
[ros_speech_recognition] Return if result is empty (#443)
[ros_speece_recognition] Set confidence value of google (#434)
[ros_speech_recognition] add parrotry.launch (#414)
[ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
[ros_speech_recogniton, respeaker_ros] add confidence field (#411)
[ros_speech_recognition] add self cancellation for speech recogntion (#413)
[#405 and #410] Fix CI (#415)
add ROS interface for https://cloud.google.com/natural-language (#304)
GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
Explicit python interpreter in catkin_virtualenv (#367)
.github/workflow: integrate all yaml to one (#338)
[ros_speech_recognition] Fixed the behavior of launch file (#336)
[ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
[ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
Enable sound play flag (#315)
Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

enable to change topic name from speech_recognition.launch (#254)
support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

File truncated at 100 lines see the full file

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty

Launch files

launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak.
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone.
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone.
  - voice_topic [default: /speech_to_text]: Name of the text topic for recognized speech.
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - sample_rate [default: 16000]: Frame rate of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card and device numbers with '$ arecord -l', then use hw:[card number],[device number].
  - engine [default: Google]: Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM.
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. Takes effect only when continuous is true.
  - self_cancellation [default: true]: Whether to ignore audio while the robot is speaking.
  - tts_tolerance [default: 1.0]: Tolerance, in seconds, for judging whether the robot is speaking.
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition.
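
The launch arguments listed above are overridden on the roslaunch command line in the form name:=value (for example language:=ja-JP). The following is a minimal sketch of assembling such a command from Python. The helper name build_roslaunch_command is hypothetical and not part of this package; the sketch only builds and prints the command and does not start roslaunch.

```python
import shlex


def build_roslaunch_command(package, launch_file, **launch_args):
    """Return a roslaunch argv list with each override written as name:=value.

    Hypothetical helper for illustration; it is not provided by ros_speech_recognition.
    """
    command = ["roslaunch", package, launch_file]
    for name, value in launch_args.items():
        # The launch files document booleans as lowercase true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        command.append("{}:={}".format(name, value))
    return command


# Example: Japanese recognition from a specific ALSA device, in service mode.
command = build_roslaunch_command(
    "ros_speech_recognition",
    "speech_recognition.launch",
    language="ja-JP",
    device="hw:0,0",
    continuous=False,
)
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list could be passed to subprocess.run on a machine where ROS is installed and sourced. Argument names are not validated here, so a misspelled name would only be reported by roslaunch itself.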

Messages

No message files found.

Services

No service files found.

Plugins

No plugins found.

No version for distro eloquent showing lunar. Known supported distros are highlighted in the buttons above.

ros_speech_recognition package from jsk_3rdparty repo

aques_talk assimp_devel downward ff ffha google_cloud_texttospeech julius libcmt libsiftfast lpg_planner mini_maxwell nlopt osqp slic voice_text voicevox zdepth bayesian_belief_networks chaplus_ros dialogflow_task_executive emotion_analyzer gdrive_ros google_chat_ros influxdb_store jsk_3rdparty collada_urdf_jsk_patch laser_filters_jsk_patch julius_ros nfc_ros opt_camera pgm_learner respeaker_ros ros_google_cloud_language ros_speech_recognition rospatlite rosping rostwitter sesame_ros switchbot_ros webrtcvad_ros zdepth_image_transport

ROS Distro
lunar

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (-) Good First Issues (-) Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

Install this package and SpeechReconition

  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  

Launch speech recognition node

  roslaunch ros_speech_recognition speech_recognition.launch

Echo /speech_to_text

  rostopic echo /speech_to_text
  # you can get the recognition result
  

Parrotry tutorial

Parrotry mean オウム返し in Japanese

# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP

`speech_recognition_node.py` Interface

Publishing Topics

~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

Speech recognition candidates topic name.

Topic name is set by parameter ~voice_topic, and default value is speech_to_text.
sound_play (sound_play/SoundRequestAction)

Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.

Subscribing Topics

~audio_topic (audio_common_msgs/AudioData)

Audio stream data to be recognized.

Topis name is set by parameter ~audio_topic and default value is audio.

Advertising Services

speech_recognition (speech_recognition_msgs/SpeechRecognition)

Service for speech recognition
speech_recognition/start (std_srvs/Empty)

Start service for speech recognition

This service is available when parameter ~contiunous is True.
speech_recognition/start (std_srvs/Empty)

Stop service for speech recognition

This service is available when parameter ~contiunous is True.

Parameters

~voice_topic (String, default: speech_to_text)

Publishing voice topic name
~audio_topic (String, default: audio)

Subscribing audio topic name
~enable_sound_effect (Bool, default: True)

Flag to enable or disable sound to play sound on recognition.
~language (String, default: en-US)

Language to be recognized
~engine (Enum[String], default: Google)

Speech-to-text engine (To see full options use dynamic_reconfigure)

File truncated at 100 lines see the full file

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

[doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
Contributors: Kei Okada

2.1.26 (2023-06-14)

add LICENSE files (#476)
Contributors: Kei Okada

2.1.25 (2023-06-08)

[ros_speech_recognition] Add vosk engine (#474)
Pr/use sound themes freedesktop (#472)
add test to check if ros node is loadable (#463)
add self.conf_thresh in __init_ function (#457)
[ros_speech_recognition] add ubuntu-sounds dependency (#453)
[ros_speech_recognition] Return if result is empty (#443)
[ros_speece_recognition] Set confidence value of google (#434)
[ros_speech_recognition] add parrotry.launch (#414)
[ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
[ros_speech_recogniton, respeaker_ros] add confidence field (#411)
[ros_speech_recognition] add self cancellation for speech recogntion (#413)
[#405 and #410] Fix CI (#415)
add ROS interface for https://cloud.google.com/natural-language (#304)
GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
Explicit python interpreter in catkin_virtualenv (#367)
.github/workflow: integrate all yaml to one (#338)
[ros_speech_recognition] Fixed the behavior of launch file (#336)
[ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
[ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
Enable sound play flag (#315)
Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

enable to change topic name from speech_recognition.launch (#254)
support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

File truncated at 100 lines see the full file

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependent Packages

Name	Deps
jsk_3rdparty

Launch files

launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone
  - voice_topic [default: /speech_to_text]: Name of the text topic for recognized speech
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - sample_rate [default: 16000]: Sample rate (frame rate) of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card and device numbers with 'arecord -l', then use hw:[card number],[device number]
  - engine [default: Google]: Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect only when continuous is true
  - self_cancellation [default: true]: If true, audio is not recognized while the robot is speaking
  - tts_tolerance [default: 1.0]: Tolerance in seconds when determining whether the robot is speaking
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these servers is ignored by speech recognition
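
The self_cancellation and tts_tolerance arguments describe a time-window check: audio is skipped while a TTS action is active or was active very recently. The following is a minimal sketch of that idea in plain Python, not the node's actual implementation. The class name `SelfCancellationGate`, its methods, and the injectable `clock` are hypothetical names introduced only for illustration.

```python
import time


class SelfCancellationGate:
    """Hypothetical helper: decides whether incoming audio should be skipped."""

    def __init__(self, tts_tolerance=1.0, clock=time.monotonic):
        # tts_tolerance mirrors the launch argument (seconds).
        self.tts_tolerance = tts_tolerance
        self._clock = clock
        self._last_tts_time = None

    def notify_tts_active(self):
        # Call whenever any watched TTS action server reports an active goal.
        self._last_tts_time = self._clock()

    def should_ignore_audio(self):
        # Ignore audio if the robot spoke within the last tts_tolerance seconds.
        if self._last_tts_time is None:
            return False
        return (self._clock() - self._last_tts_time) <= self.tts_tolerance
```

In a real node, `notify_tts_active` would presumably be driven by the status of each action listed in tts_action_names; that wiring is left out here because it depends on details not shown on this page.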

Messages

No message files found.

Services

No service files found

Plugins

No plugins found.


ros_speech_recognition package from jsk_3rdparty repo

ROS Distro
lunar

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (-) Good First Issues (-) Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

Install this package and SpeechRecognition

  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  

Launch speech recognition node

  roslaunch ros_speech_recognition speech_recognition.launch
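
Launch arguments are passed to roslaunch as `name:=value` pairs. As a rough sketch, a script that starts the node could assemble the command line as shown below. The helper names `build_roslaunch_command` and `alsa_device` are hypothetical; the argument names (language, continuous, device) are taken from the launch file listing for speech_recognition.launch.

```python
import shlex


def build_roslaunch_command(package, launch_file, **launch_args):
    """Return an argv list for roslaunch with name:=value launch arguments."""
    argv = ["roslaunch", package, launch_file]
    for name, value in sorted(launch_args.items()):
        # The launch file defaults are written as lowercase true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        argv.append("{}:={}".format(name, value))
    return argv


def alsa_device(card, device):
    """Format an ALSA hardware device string such as hw:0,0."""
    return "hw:{},{}".format(card, device)


argv = build_roslaunch_command(
    "ros_speech_recognition",
    "speech_recognition.launch",
    language="ja-JP",
    continuous=True,
    device=alsa_device(0, 0),
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(a) for a in argv))
```

The resulting list can be handed to `subprocess.run(argv)` on a machine where ROS is installed and sourced.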

Echo /speech_to_text

  rostopic echo /speech_to_text
  # you can get the recognition result
  

Parrotry tutorial

Parrotry refers to オウム返し (Japanese for "parroting", i.e. repeating back what was said).

# English
roslaunch ros_speech_recognition parrotry.launch
# Japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
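
parrotry.launch also exposes a confidence_threshold argument (default 0.8). The sketch below shows one way such a threshold could be applied to a recognition result before repeating it. It assumes the message carries parallel `transcript` and `confidence` lists, as in speech_recognition_msgs/SpeechRecognitionCandidates; the function name `best_candidate`, the use of `>=`, and the stand-in message object are illustrative assumptions, not the package's code.

```python
from types import SimpleNamespace


def best_candidate(msg, confidence_threshold=0.8):
    """Return (transcript, confidence) for the top candidate, or None if it is too uncertain."""
    pairs = list(zip(msg.transcript, msg.confidence))
    if not pairs:
        # An empty result carries nothing to repeat.
        return None
    text, conf = max(pairs, key=lambda p: p[1])
    # Assumption: candidates at exactly the threshold are accepted.
    return (text, conf) if conf >= confidence_threshold else None


# Stand-in for a received message; a real node would get this in a subscriber callback.
msg = SimpleNamespace(transcript=["hello robot", "yellow robot"], confidence=[0.92, 0.41])
print(best_candidate(msg))
```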

`speech_recognition_node.py` Interface

Publishing Topics

~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

Speech recognition candidates.

Topic name is set by parameter ~voice_topic; the default value is speech_to_text.

sound_play (sound_play/SoundRequestAction)

Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.

Subscribing Topics

~audio_topic (audio_common_msgs/AudioData)

Audio stream data to be recognized.

Topic name is set by parameter ~audio_topic; the default value is audio.

Advertising Services

speech_recognition (speech_recognition_msgs/SpeechRecognition)

Service for speech recognition

speech_recognition/start (std_srvs/Empty)

Start service for speech recognition

This service is available when parameter ~continuous is True.

speech_recognition/stop (std_srvs/Empty)

Stop service for speech recognition

This service is available when parameter ~continuous is True.
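
Which interfaces exist depends on the continuous setting: the launch file documentation states that the /speech_to_text topic is published when continuous is true and that the /speech_recognition service is offered when it is false. The small sketch below summarizes that as a lookup. The function name `advertised_interfaces` is hypothetical, and the stop service name `speech_recognition/stop` is an assumption made by analogy with the start service.

```python
def advertised_interfaces(continuous, voice_topic="speech_to_text"):
    """Summarize which topics and services to expect for a given mode."""
    if continuous:
        # Continuous mode: results stream on the voice topic,
        # and recognition can be started or stopped via services.
        return {
            "topics": [voice_topic],
            "services": ["speech_recognition/start", "speech_recognition/stop"],
        }
    # One-shot mode: recognition runs only when the service is called.
    return {"topics": [], "services": ["speech_recognition"]}
```

A client can use such a summary to decide whether to subscribe to the topic or to call the service.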

Parameters

~voice_topic (String, default: speech_to_text)

Publishing voice topic name

~audio_topic (String, default: audio)

Subscribing audio topic name

~enable_sound_effect (Bool, default: True)

Flag to enable or disable the sound effects played on recognition events.

~language (String, default: en-US)

Language to be recognized

~engine (Enum[String], default: Google)

Speech-to-text engine (To see full options use dynamic_reconfigure)

File truncated at 100 lines see the full file

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

[doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

fix package.xml/CMakeLists.txt to suppress catkin_lint errors (#479)
Contributors: Kei Okada

2.1.26 (2023-06-14)

add LICENSE files (#476)
Contributors: Kei Okada

2.1.25 (2023-06-08)

[ros_speech_recognition] Add vosk engine (#474)
Pr/use sound themes freedesktop (#472)
add test to check if ros node is loadable (#463)
add self.conf_thresh in __init__ function (#457)
[ros_speech_recognition] add ubuntu-sounds dependency (#453)
[ros_speech_recognition] Return if result is empty (#443)
[ros_speech_recognition] Set confidence value of google (#434)
[ros_speech_recognition] add parrotry.launch (#414)
[ros_speech_recognition] update default arg for speech_recognition.launch (#412)
[ros_speech_recognition, respeaker_ros] add confidence field (#411)
[ros_speech_recognition] add self cancellation for speech recognition (#413)
[#405 and #410] Fix CI (#415)
add ROS interface for https://cloud.google.com/natural-language (#304)
GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
Explicit python interpreter in catkin_virtualenv (#367)
.github/workflow: integrate all yaml to one (#338)
[ros_speech_recognition] Fixed the behavior of launch file (#336)
[ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
[ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
Enable sound play flag (#315)
Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

enable to change topic name from speech_recognition.launch (#254)
support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to the args of speech_recognition.launch. Use rosparam, not param, for device, because 'device: ' causes the error: load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled
- more exception message for self.recognize
Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty

Launch files

launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone
  - voice_topic [default: /speech_to_text]: Name of the text topic for recognized speech
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - sample_rate [default: 16000]: Sample rate of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card and device numbers with 'arecord -l', then use hw:[card number],[device number]
  - engine [default: Google]: Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, or IBM
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect only when continuous is true
  - self_cancellation [default: true]: Whether to skip recognizing audio while the robot is speaking
  - tts_tolerance [default: 1.0]: Tolerance in seconds used when judging whether the robot is speaking
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition
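
The launch arguments listed above are passed to roslaunch as name:=value pairs. The sketch below shows one way to assemble such a command line from Python. The helper name `roslaunch_command` is hypothetical, and the example values (ja-JP, hw:0,0) are taken from the argument descriptions above. The helper only builds the argument list and does not run roslaunch.

```python
import shlex


def roslaunch_command(launch_file, **args):
    """Build an argv list: roslaunch ros_speech_recognition <launch_file> name:=value ..."""
    argv = ["roslaunch", "ros_speech_recognition", launch_file]
    for name, value in sorted(args.items()):
        # The launch file documents its boolean defaults as lowercase true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        argv.append("%s:=%s" % (name, value))
    return argv


cmd = roslaunch_command(
    "speech_recognition.launch",
    language="ja-JP",
    continuous=True,
    device="hw:0,0",
)
# Print a shell-ready command line; pass `cmd` to subprocess.run to execute it.
print(" ".join(shlex.quote(a) for a in cmd))
```

Sorting the arguments keeps the generated command stable between runs, which makes it easier to compare or log.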

Messages

No message files found.

Services

No service files found

Plugins

No plugins found.

ROS Distro
indigo

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (-) Good First Issues (-) Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses the Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

Install this package and SpeechRecognition

  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  

Launch speech recognition node

  roslaunch ros_speech_recognition speech_recognition.launch

Echo /speech_to_text

  rostopic echo /speech_to_text
  # you can get the recognition result
  

Parrotry tutorial

Parrotry means repeating back what was heard, like a parrot (オウム返し in Japanese).

# English
roslaunch ros_speech_recognition parrotry.launch
# Japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP

`speech_recognition_node.py` Interface

Publishing Topics

~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

Topic on which speech recognition candidates are published.

The topic name is set by the parameter ~voice_topic; the default is speech_to_text.
sound_play (sound_play/SoundRequestAction)

Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.
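
A consumer of the candidates topic will often want to keep only confident results. The sketch below filters candidates by a threshold using plain Python lists as stand-ins for a received message. It assumes the message carries parallel transcript and confidence arrays (the changelog mentions an added confidence field). The function name `filter_candidates` is hypothetical, the 0.8 threshold mirrors the confidence_threshold default of parrotry.launch, and the inclusive comparison is an assumption.

```python
def filter_candidates(transcripts, confidences, threshold=0.8):
    """Return (transcript, confidence) pairs whose confidence meets the threshold."""
    return [
        (text, conf)
        for text, conf in zip(transcripts, confidences)
        if conf >= threshold  # assumption: the threshold is inclusive
    ]


# Stand-in values for the transcript and confidence arrays of one message.
transcripts = ["hello robot", "hollow robot"]
confidences = [0.93, 0.41]
print(filter_candidates(transcripts, confidences))
```

In a real subscriber callback, the two lists would come from the received message instead of literals.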

Subscribing Topics

~audio_topic (audio_common_msgs/AudioData)

Audio stream data to be recognized.

The topic name is set by the parameter ~audio_topic; the default is audio.

Advertising Services

speech_recognition (speech_recognition_msgs/SpeechRecognition)

Service for speech recognition
speech_recognition/start (std_srvs/Empty)

Starts speech recognition.

This service is available only when the parameter ~continuous is True.
speech_recognition/stop (std_srvs/Empty)

Stops speech recognition.

This service is available only when the parameter ~continuous is True.

Parameters

~voice_topic (String, default: speech_to_text)

Name of the topic on which recognition results are published.
~audio_topic (String, default: audio)

Name of the audio topic to subscribe to.
~enable_sound_effect (Bool, default: True)

Whether to play a sound effect on recognition events.
~language (String, default: en-US)

Language to be recognized.
~engine (Enum[String], default: Google)

Speech-to-text engine (use dynamic_reconfigure to see the full list of options).

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

[doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

fix package.xml/CMakeLists.txt to suppress catkin_lint errors (#479)
Contributors: Kei Okada

2.1.26 (2023-06-14)

add LICENSE files (#476)
Contributors: Kei Okada

2.1.25 (2023-06-08)

[ros_speech_recognition] Add vosk engine (#474)
Pr/use sound themes freedesktop (#472)
add test to check if ros node is loadable (#463)
add self.conf_thresh in __init__ function (#457)
[ros_speech_recognition] add ubuntu-sounds dependency (#453)
[ros_speech_recognition] Return if result is empty (#443)
[ros_speech_recognition] Set confidence value of google (#434)
[ros_speech_recognition] add parrotry.launch (#414)
[ros_speech_recognition] update default arg for speech_recognition.launch (#412)
[ros_speech_recognition, respeaker_ros] add confidence field (#411)
[ros_speech_recognition] add self cancellation for speech recognition (#413)
[#405 and #410] Fix CI (#415)
add ROS interface for https://cloud.google.com/natural-language (#304)
GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
Explicit python interpreter in catkin_virtualenv (#367)
.github/workflow: integrate all yaml to one (#338)
[ros_speech_recognition] Fixed the behavior of launch file (#336)
[ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
[ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
Enable sound play flag (#315)
Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

enable to change topic name from speech_recognition.launch (#254)
support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to the args of speech_recognition.launch. Use rosparam, not param, for device, because 'device: ' causes the error: load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled
- more exception message for self.recognize
Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty
jsk_nao_startup

Launch files

launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone
  - voice_topic [default: /speech_to_text]: Name of the text topic for recognized speech
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - sample_rate [default: 16000]: Sample rate of the audio topic and microphone. Run 'pactl list short sinks' to check your hardware
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card and device numbers with 'arecord -l', then use hw:[card number],[device number]
  - engine [default: Google]: Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, or IBM
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect only when continuous is true
  - self_cancellation [default: true]: Whether to skip recognizing audio while the robot is speaking
  - tts_tolerance [default: 1.0]: Tolerance in seconds used when judging whether the robot is speaking
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition
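
The self_cancellation and tts_tolerance arguments above describe ignoring audio captured while the robot itself is speaking. The sketch below is a simplified model of that idea, not the node's code: the function name `should_ignore_audio` and the representation of speech as (start, end) time intervals are assumptions made for illustration. The actual node is described as watching the action servers named in tts_action_names.

```python
def should_ignore_audio(audio_time, tts_intervals, tts_tolerance=1.0):
    """Return True if audio_time lies inside any TTS interval widened by tts_tolerance seconds."""
    for start, end in tts_intervals:
        if start - tts_tolerance <= audio_time <= end + tts_tolerance:
            return True
    return False


# Hypothetical timeline: the robot speaks from t=10.0 s to t=12.0 s.
speaking = [(10.0, 12.0)]
print(should_ignore_audio(12.5, speaking))  # within the tolerance after speech ended
print(should_ignore_audio(14.0, speaking))  # the robot has been silent long enough
```

A larger tolerance reduces the chance that the robot recognizes its own voice, at the cost of missing speech that follows immediately after it finishes talking.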

Messages

No message files found.

Services

No service files found

Plugins

No plugins found.

ROS Distro
hydro

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (-) Good First Issues (-) Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses the Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

Install this package and SpeechRecognition

  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  

Launch speech recognition node

  roslaunch ros_speech_recognition speech_recognition.launch

Echo /speech_to_text

  rostopic echo /speech_to_text
  # you can get the recognition result
  

Parrotry tutorial

Parrotry means repeating back what was heard, like a parrot (オウム返し in Japanese).

# English
roslaunch ros_speech_recognition parrotry.launch
# Japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP

`speech_recognition_node.py` Interface

Publishing Topics

~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

Topic on which speech recognition candidates are published.

The topic name is set by the parameter ~voice_topic; the default is speech_to_text.
sound_play (sound_play/SoundRequestAction)

Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.

Subscribing Topics

~audio_topic (audio_common_msgs/AudioData)

Audio stream data to be recognized.

The topic name is set by the parameter ~audio_topic; the default is audio.

Advertising Services

speech_recognition (speech_recognition_msgs/SpeechRecognition)

Service for speech recognition
speech_recognition/start (std_srvs/Empty)

Starts speech recognition.

This service is available only when the parameter ~continuous is True.
speech_recognition/stop (std_srvs/Empty)

Stops speech recognition.

This service is available only when the parameter ~continuous is True.

Parameters

~voice_topic (String, default: speech_to_text)

Name of the topic on which recognition results are published.
~audio_topic (String, default: audio)

Name of the audio topic to subscribe to.
~enable_sound_effect (Bool, default: True)

Whether to play a sound effect on recognition events.
~language (String, default: en-US)

Language to be recognized.
~engine (Enum[String], default: Google)

Speech-to-text engine (use dynamic_reconfigure to see the full list of options).

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

[doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
Contributors: Kei Okada

2.1.26 (2023-06-14)

add LICENSE files (#476)
Contributors: Kei Okada

2.1.25 (2023-06-08)

[ros_speech_recognition] Add vosk engine (#474)
Pr/use sound themes freedesktop (#472)
add test to check if ros node is loadable (#463)
add self.conf_thresh in __init_ function (#457)
[ros_speech_recognition] add ubuntu-sounds dependency (#453)
[ros_speech_recognition] Return if result is empty (#443)
[ros_speece_recognition] Set confidence value of google (#434)
[ros_speech_recognition] add parrotry.launch (#414)
[ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
[ros_speech_recogniton, respeaker_ros] add confidence field (#411)
[ros_speech_recognition] add self cancellation for speech recogntion (#413)
[#405 and #410] Fix CI (#415)
add ROS interface for https://cloud.google.com/natural-language (#304)
GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
Explicit python interpreter in catkin_virtualenv (#367)
.github/workflow: integrate all yaml to one (#338)
[ros_speech_recognition] Fixed the behavior of launch file (#336)
[ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
[ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
Enable sound play flag (#315)
Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

enable to change topic name from speech_recognition.launch (#254)
support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

File truncated at 100 lines see the full file

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty
jsk_nao_startup

Launch files

launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak.
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone.
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone.
  - voice_topic [default: /speech_to_text]: Name of the text topic of recognized speech.
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - sample_rate [default: 16000]: Frame rate of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card and device numbers with '$ arecord -l', then use hw:[card number],[device number].
  - engine [default: Google]: Speech-to-text engine. Options: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM.
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect when continuous is true.
  - self_cancellation [default: true]: Whether to ignore audio while the robot is speaking.
  - tts_tolerance [default: 1.0]: Tolerance in seconds for deciding whether the robot is speaking.
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition.
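
The `device` argument expects a string of the form hw:[card number],[device number] taken from `arecord -l`. The sketch below shows one way to derive such strings from that command's output. The helper `alsa_devices` and the sample text are illustrative assumptions; the real output depends on your hardware and locale.

```python
import re

# Matches lines such as:
#   card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
_CARD_LINE = re.compile(r"^card (\d+): .*?, device (\d+):")


def alsa_devices(arecord_output):
    """Return hw:[card],[device] strings found in `arecord -l` output."""
    devices = []
    for line in arecord_output.splitlines():
        match = _CARD_LINE.match(line.strip())
        if match:
            devices.append("hw:{},{}".format(match.group(1), match.group(2)))
    return devices


# Illustrative sample only; not captured from a real machine.
SAMPLE = """**** List of CAPTURE Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALC3246 Analog [ALC3246 Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
"""

print(alsa_devices(SAMPLE))
```

A value chosen from the result can then be passed on the command line, for example `device:=hw:1,0`.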

Messages

No message files found.

Services

No service files found

Plugins

No plugins found.


ROS Distro
kinetic

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (-) Good First Issues (-) Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty

ROS Distro
melodic

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (-) Good First Issues (-) Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty

ROS Distro
noetic

Package Summary

Tags	No category tags.
Version	2.1.31
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-05-13
Dev Status	DEVELOPED
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (-) Good First Issues (-) Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty
