Package Summary

Tags No category tags.
Version 2.1.20
License BSD
Build type CATKIN
Use RECOMMENDED

Repository Summary

Checkout URI https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type git
VCS Version master
Last Updated 2020-08-07
Dev Status DEVELOPED
CI status Continuous Integration
Released RELEASED
Package Tags No category tags.

Package Description

ROS wrapper for Python SpeechRecognition library

Maintainers

  • Yuki Furuta

Authors

  • Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses the Python package SpeechRecognition as its backend.

Tutorials

  1. Install this package and SpeechRecognition
  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition

  2. Launch the speech recognition node
  roslaunch ros_speech_recognition speech_recognition.launch

  3. Use from Python
  import rospy
  from ros_speech_recognition import SpeechRecognitionClient

  rospy.init_node("client")
  client = SpeechRecognitionClient()
  result = client.recognize()  # Say 'Hello, world!' into the microphone
  print(result)  # => 'Hello, world!'

Interface

Publishing Topics

  • sound_play (sound_play/SoundRequestAction)

Action client to play sound on events. If the action server is not available, no sound is played.

Subscribing Topics

  • audio (audio_common_msgs/AudioData)

Audio stream data to be recognized.
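
The audio topic carries raw bytes, so a consumer has to know the sample depth and channel count to interpret them. The sketch below shows one way a single channel could be pulled out of such a buffer. It assumes interleaved, signed PCM in native byte order, which this page does not state, and the helper names typecode_for_depth and extract_channel are hypothetical, not part of this package.

```python
import array


def typecode_for_depth(depth):
    """Pick an array typecode whose item size matches the bit depth.

    Item sizes of 'i' and 'l' vary by platform, so the size is checked
    at runtime instead of being hard-coded.
    """
    width = depth // 8
    for code in "bhilq":  # signed integer typecodes, smallest first
        if array.array(code).itemsize == width:
            return code
    raise ValueError("unsupported bit depth: {}".format(depth))


def extract_channel(raw, n_channel=1, depth=16, channel=0):
    """Return the samples of one channel from interleaved PCM bytes."""
    if not 0 <= channel < n_channel:
        raise ValueError("channel index out of range")
    samples = array.array(typecode_for_depth(depth))
    samples.frombytes(raw)  # raises ValueError on a partial sample
    # Interleaved layout: ch0, ch1, ..., ch0, ch1, ...
    return samples[channel::n_channel]


if __name__ == "__main__":
    stereo = array.array("h", [1, -1, 2, -2, 3, -3]).tobytes()
    print(list(extract_channel(stereo, n_channel=2, depth=16, channel=0)))
```

The defaults mirror the documented ~n_channel and ~depth defaults (1 and 16).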

Advertising Services

  • speech_recognition (speech_recognition_msgs/SpeechRecognition)

Service for speech recognition

Parameters

  • ~language (String, default: en-US)

Language to be recognized

  • ~engine (Enum[String], default: Google)

Speech-to-text engine (use dynamic_reconfigure to see the full list of options)

  • ~energy_threshold (Double, default: 300)

Energy threshold for voice activity detection (VAD)

  • ~dynamic_energy_threshold (Bool, default: True)

Whether to adaptively estimate energy_threshold

  • ~dynamic_energy_adjustment_damping (Double, default: 0.15)

Damping factor for dynamic VAD threshold adjustment

  • ~dynamic_energy_ratio (Double, default: 1.5)

Energy ratio for dynamic VAD

  • ~pause_threshold (Double, default: 0.8)

Seconds of non-speaking audio before a phrase is considered complete

  • ~operation_timeout (Double, default: 0.0)

Seconds after an internal operation (e.g., an API request) starts before it times out

  • ~listen_timeout (Double, default: 0.0)

Maximum number of seconds to wait for a phrase to start before giving up

  • ~phrase_time_limit (Double, default: 10.0)

Maximum number of seconds a phrase may continue before recording stops and the part of the phrase captured before the limit is returned

  • ~phrase_threshold (Double, default: 0.3)

Minimum seconds of speaking audio before we consider the speaking audio a phrase

  • ~non_speaking_duration (Double, default: 0.5)

Seconds of non-speaking audio to keep on both sides of the recording

  • ~duration (Double, default: 10.0)

Seconds of waiting for speech

  • ~audio_topic (String, default: audio)

Topic name of input audio data

  • ~depth (Int, default: 16)

Bit depth of audio signal (bits per sample)

  • ~n_channel (Int, default: 1)

Total number of channels in audio data (e.g. 1: mono, 2: stereo)

  • ~sample_rate (Int, default: 16000)

Sample rate of audio signal (Hz)

  • ~buffer_size (Int, default: 10240)

Maximum buffer size to store audio data for speech recognition

  • ~start_signal (String, default: /usr/share/sounds/ubuntu/stereo/bell.ogg)

Path to sound file for bell on the start of audio capture

  • ~recognized_signal (String, default: /usr/share/sounds/ubuntu/stereo/button-toggle-on.ogg)

Path to sound file for bell on the end of audio capture

  • ~success_signal (String, default: /usr/share/sounds/ubuntu/stereo/message-new-instant.ogg)

Path to sound file for bell on getting successful recognition result

  • ~timeout_signal (String, default: /usr/share/sounds/ubuntu/stereo/window-slide.ogg)

Path to sound file for bell on timeout for recognition

  • ~continuous (Bool, default: False)

Selects whether to use a topic (True) or a service (False). By default, the service is used.

  • ~google_key (String, default: None)

Auth key for the Google API. If None, a public key is used (with no guarantee that it will not be blocked).
This is valid only if ~engine is Google.

  • ~google_cloud_credentials_json (String, default: None)

Path to credential JSON file. This is valid only if ~engine is GoogleCloud.

  • ~google_cloud_preferred_phrases ([String], default: None)

Preferred phrases parameters. This is valid only if ~engine is GoogleCloud.

  • ~bing_key (String, default: None)

Auth key for the Bing API.
This is valid only if ~engine is Bing.
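
The ~energy_threshold, ~dynamic_energy_threshold, ~dynamic_energy_adjustment_damping and ~dynamic_energy_ratio parameters listed above interact, which a small numeric sketch may make easier to see. The code below is not taken from this package. It is modeled on how the SpeechRecognition backend is understood to adapt its threshold on buffers judged to be non-speech, and the function names rms, update_threshold and is_speech are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square energy of a sequence of PCM sample values."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / float(len(samples)))


def update_threshold(threshold, energy, seconds_per_buffer,
                     damping=0.15, ratio=1.5):
    """Move the threshold toward energy * ratio.

    damping is raised to the buffer duration so the adaptation speed
    does not depend on the buffer size (assumed behaviour).
    """
    d = damping ** seconds_per_buffer
    target = energy * ratio
    return threshold * d + target * (1.0 - d)


def is_speech(energy, threshold):
    """A buffer counts as speech when its energy exceeds the threshold."""
    return energy > threshold


if __name__ == "__main__":
    threshold = 300.0  # documented default of ~energy_threshold
    ambient = rms([100, -100, 100, -100])  # quiet background buffer
    if not is_speech(ambient, threshold):
        threshold = update_threshold(threshold, ambient, seconds_per_buffer=1.0)
    print(threshold)
```

Under these assumptions, a damping value closer to 1 makes the threshold follow ambient noise more slowly, and a larger ratio demands louder speech relative to the background.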

Author

Yuki Furuta <furushchev@jsk.imi.i.u-tokyo.ac.jp>

CHANGELOG

Changelog for package ros_speech_recognition

2.1.20 (2020-08-07)

2.1.19 (2020-07-21)

2.1.18 (2020-07-20)

  • Fix for noetic (#200)
    • fix 2to3, with print, raise, exception
  • [ros_speech_recognition] Enable multi channel audio recognition (#198)
    • adjust type code to the CPU platform
    • replace rosparam name: channels -> n_channel
    • add rosparam description to README
    • enable multi channel audio recognition
  • Add args to ros_speech_recognition (#197)
    • Add flac as run_depend for SpeechRecognition pip package
    • Use catkin_virtualenv to use SpeechRecognition pip package
    • Add arguments and params to pass rostest
    • Add test for ros_speech_recognition
    • add args to launch
    • add pip install to tutorials
    • add param description to README
  • Contributors: Kei Okada, Naoya Yamaguchi

2.1.17 (2020-04-16)

2.1.16 (2020-04-16)

2.1.15 (2019-12-12)

2.1.14 (2019-11-21)

  • set SoundRequest.volume for kinetic (#173)
  • Contributors: Kei Okada

2.1.13 (2019-07-10)

2.1.12 (2019-05-25)

  • fixes GoogleCloud auth (#158)
  • Contributors: jonasius

2.1.11 (2018-08-29)

2.1.10 (2018-04-25)

2.1.9 (2018-04-24)

2.1.8 (2018-04-17)

2.1.7 (2018-04-09)

2.1.6 (2017-11-21)

2.1.5 (2017-11-20)

  • ros_speech_recognition: add continuous mode (#127)
  • ros_speech_recognition: add README (#123)
  • add ros_speech_recognition package (#121)
  • Contributors: Yuki Furuta

2.1.4 (2017-07-16)

2.1.3 (2017-07-07)

2.1.2 (2017-07-06)

2.1.1 (2017-07-05)

2.1.0 (2017-07-02)

2.0.20 (2017-05-09)

2.0.19 (2017-02-22)

2.0.18 (2016-10-28)

2.0.17 (2016-10-22)

2.0.16 (2016-10-17)

2.0.15 (2016-10-16)

2.0.14 (2016-03-20)

2.0.13 (2015-12-15)

2.0.12 (2015-11-26)

2.0.11 (2015-10-07 14:16)

2.0.10 (2015-10-07 12:47)

2.0.9 (2015-09-26)

2.0.8 (2015-09-15)

2.0.7 (2015-09-14)

2.0.6 (2015-09-08)

2.0.5 (2015-08-23)

2.0.4 (2015-08-18)

2.0.3 (2015-08-01)

2.0.2 (2015-06-29)

2.0.1 (2015-06-19 21:21)

2.0.0 (2015-06-19 10:41)

1.0.71 (2015-05-17)

1.0.70 (2015-05-08)

1.0.69 (2015-05-05 12:28)

1.0.68 (2015-05-05 09:49)

1.0.67 (2015-05-03)

1.0.66 (2015-04-03)

1.0.65 (2015-04-02)

1.0.64 (2015-03-29)

1.0.63 (2015-02-19)

1.0.62 (2015-02-17)

1.0.61 (2015-02-11)

1.0.60 (2015-02-03 10:12)

1.0.59 (2015-02-03 04:05)

1.0.58 (2015-01-07)

1.0.57 (2014-12-23)

1.0.56 (2014-12-17)

1.0.55 (2014-12-09)

1.0.54 (2014-11-15)

1.0.53 (2014-11-01)

1.0.52 (2014-10-23)

1.0.51 (2014-10-20 16:01)

1.0.50 (2014-10-20 01:50)

1.0.49 (2014-10-13)

1.0.48 (2014-10-12)

1.0.47 (2014-10-08)

1.0.46 (2014-10-03)

1.0.45 (2014-09-29)

1.0.44 (2014-09-26 09:17)

1.0.43 (2014-09-26 01:08)

1.0.42 (2014-09-25)

1.0.41 (2014-09-23)

1.0.40 (2014-09-19)

1.0.39 (2014-09-17)

1.0.38 (2014-09-13)

1.0.37 (2014-09-08)

1.0.36 (2014-09-01)

1.0.35 (2014-08-16)

1.0.34 (2014-08-14)

1.0.33 (2014-07-28)

1.0.32 (2014-07-26)

1.0.31 (2014-07-23)

1.0.30 (2014-07-15)

1.0.29 (2014-07-02)

1.0.28 (2014-06-24)

1.0.27 (2014-06-10)

1.0.26 (2014-05-30)

1.0.25 (2014-05-26)

1.0.24 (2014-05-24)

1.0.23 (2014-05-23)

1.0.22 (2014-05-22)

1.0.21 (2014-05-20)

1.0.20 (2014-05-09)

1.0.19 (2014-05-06)

1.0.18 (2014-05-04)

1.0.17 (2014-04-20)

1.0.16 (2014-04-19 23:29)

1.0.15 (2014-04-19 20:19)

1.0.14 (2014-04-19 12:52)

1.0.13 (2014-04-19 11:06)

1.0.12 (2014-04-18 16:58)

1.0.11 (2014-04-18 08:18)

1.0.10 (2014-04-17)

1.0.9 (2014-04-12)

1.0.8 (2014-04-11)

1.0.7 (2014-04-10)

1.0.6 (2014-04-07)

1.0.5 (2014-03-31)

1.0.4 (2014-03-29)

1.0.3 (2014-03-19)

1.0.2 (2014-03-12)

1.0.1 (2014-03-07)

1.0.0 (2014-03-05)

Launch files

  • launch/speech_recognition.launch
      • launch_sound_play [default: true]
      • launch_audio_capture [default: true]
      • audio_topic [default: /audio]
      • n_channel [default: 1]
      • depth [default: 16]
      • sample_rate [default: 16000]
      • device [default: ]
      • engine [default: Google]
      • language [default: en-US]
      • continuous [default: false]
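
The launch arguments listed above can be overridden on the command line with roslaunch's name:=value syntax. As a minimal sketch, the hypothetical helper below (not part of this package) assembles such a command as an argument list. The values used in the example, such as ja-JP, are illustrative only.

```python
def roslaunch_command(**launch_args):
    """Build an argv list for speech_recognition.launch with overridden args."""
    command = ["roslaunch", "ros_speech_recognition", "speech_recognition.launch"]
    for name, value in sorted(launch_args.items()):
        if isinstance(value, bool):
            # Launch files conventionally spell booleans in lowercase.
            value = "true" if value else "false"
        command.append("{}:={}".format(name, value))
    return command


if __name__ == "__main__":
    print(" ".join(roslaunch_command(language="ja-JP", continuous=True, n_channel=2)))
```

The resulting list could be passed to subprocess.call on a machine where ROS is installed.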

Messages

No message files found.

Services

No service files found.

Plugins

No plugins found.



1.0.62 (2015-02-17)

1.0.61 (2015-02-11)

1.0.60 (2015-02-03 10:12)

1.0.59 (2015-02-03 04:05)

1.0.58 (2015-01-07)

1.0.57 (2014-12-23)

1.0.56 (2014-12-17)

1.0.55 (2014-12-09)

1.0.54 (2014-11-15)

1.0.53 (2014-11-01)

1.0.52 (2014-10-23)

1.0.51 (2014-10-20 16:01)

1.0.50 (2014-10-20 01:50)

1.0.49 (2014-10-13)

1.0.48 (2014-10-12)

1.0.47 (2014-10-08)

1.0.46 (2014-10-03)

1.0.45 (2014-09-29)

1.0.44 (2014-09-26 09:17)

1.0.43 (2014-09-26 01:08)

1.0.42 (2014-09-25)

1.0.41 (2014-09-23)

1.0.40 (2014-09-19)

1.0.39 (2014-09-17)

1.0.38 (2014-09-13)

1.0.37 (2014-09-08)

1.0.36 (2014-09-01)

1.0.35 (2014-08-16)

1.0.34 (2014-08-14)

1.0.33 (2014-07-28)

1.0.32 (2014-07-26)

1.0.31 (2014-07-23)

1.0.30 (2014-07-15)

1.0.29 (2014-07-02)

1.0.28 (2014-06-24)

1.0.27 (2014-06-10)

1.0.26 (2014-05-30)

1.0.25 (2014-05-26)

1.0.24 (2014-05-24)

1.0.23 (2014-05-23)

1.0.22 (2014-05-22)

1.0.21 (2014-05-20)

1.0.20 (2014-05-09)

1.0.19 (2014-05-06)

1.0.18 (2014-05-04)

1.0.17 (2014-04-20)

1.0.16 (2014-04-19 23:29)

1.0.15 (2014-04-19 20:19)

1.0.14 (2014-04-19 12:52)

1.0.13 (2014-04-19 11:06)

1.0.12 (2014-04-18 16:58)

1.0.11 (2014-04-18 08:18)

1.0.10 (2014-04-17)

1.0.9 (2014-04-12)

1.0.8 (2014-04-11)

1.0.7 (2014-04-10)

1.0.6 (2014-04-07)

1.0.5 (2014-03-31)

1.0.4 (2014-03-29)

1.0.3 (2014-03-19)

1.0.2 (2014-03-12)

1.0.1 (2014-03-07)

1.0.0 (2014-03-05)

Wiki Tutorials

See ROS Wiki Tutorials for more details.

Source Tutorials

Not currently indexed.

Launch files

  • launch/speech_recognition.launch
      • launch_sound_play [default: true]
      • launch_audio_capture [default: true]
      • audio_topic [default: /audio]
      • n_channel [default: 1]
      • depth [default: 16]
      • sample_rate [default: 16000]
      • device [default: ]
      • engine [default: Google]
      • language [default: en-US]
      • continuous [default: false]
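
The launch arguments listed above can be overridden on the command line with roslaunch's name:=value syntax. As a rough sketch, the Python helper below assembles such a command line from a dictionary of overrides. The function is hypothetical and not part of this package; the argument names are the ones listed above, and the example values are only illustrative.

```python
import shlex

# Argument names as listed for launch/speech_recognition.launch above.
KNOWN_ARGS = {
    "launch_sound_play", "launch_audio_capture", "audio_topic",
    "n_channel", "depth", "sample_rate", "device",
    "engine", "language", "continuous",
}


def build_roslaunch_command(overrides):
    """Return the roslaunch argv list with name:=value overrides appended."""
    unknown = sorted(set(overrides) - KNOWN_ARGS)
    if unknown:
        raise ValueError("unknown launch argument(s): " + ", ".join(unknown))
    cmd = ["roslaunch", "ros_speech_recognition", "speech_recognition.launch"]
    for name in sorted(overrides):
        value = overrides[name]
        # The listing above writes booleans in lowercase (true/false).
        if isinstance(value, bool):
            value = "true" if value else "false"
        cmd.append("{}:={}".format(name, value))
    return cmd


if __name__ == "__main__":
    # Illustrative values: continuous mode with a non-default language code.
    cmd = build_roslaunch_command({"language": "ja-JP", "continuous": True})
    print(" ".join(shlex.quote(part) for part in cmd))
```

Rejecting unknown names early is a design choice of this sketch, so that a misspelled argument is reported before roslaunch is invoked.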

Messages

No message files found.

Services

No service files found.

Plugins

No plugins found.

