automatika_embodied_agents package from the automatika_embodied_agents repo
Package Summary

| Field | Value |
|---|---|
| Version | 0.4.2 |
| License | MIT |
| Build type | AMENT_CMAKE |
| Use | RECOMMENDED |
Repository Summary

| Field | Value |
|---|---|
| Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
| VCS Type | git |
| VCS Version | main |
| Last Updated | 2025-09-16 |
| Dev Status | DEVELOPED |
| Released | RELEASED |
| Help Wanted | - |
| Good First Issues | - |
| Pull Requests to Review | - |
Maintainers
- Automatika Robotics

Package Description
🇨🇳 简体中文 | 🇯🇵 日本語
EmbodiedAgents is a fully-loaded, ROS2-based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production-Ready Physical Agents: Designed for autonomous robot systems that operate in real-world, dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: A simple, Pythonic API for using local or cloud-based ML models (specifically multimodal LLMs and other transformer-based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-Referential and Event-Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change its planning model based on its location on the map or on input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines (see the sketch after this list).
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow, with no need for bloated "GenAI" frameworks on your robot.
- Made in ROS2: Uses ROS2 as the underlying distributed communications backbone. In principle, any device that speaks ROS2 can feed data to the ML models, with callbacks implemented for the most commonly used data types and room for infinite extensibility.
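To make the event-driven idea above concrete, here is a minimal, framework-agnostic sketch of a component that swaps its planning model when a location event fires. This is plain Python for illustration only, not the EmbodiedAgents API; the class and event format are invented for the example.

```python
# Illustrative pattern only: EmbodiedAgents wires this up via its own
# components and events; the class below just shows the control flow.
from typing import Callable, Dict


class SelfReconfiguringPlanner:
    """Swaps the active planning model when an external event arrives."""

    def __init__(self, models: Dict[str, Callable[[str], str]], default: str):
        self.models = models
        self.active = default

    def on_event(self, event: str):
        # e.g. event = "entered:warehouse" -> switch to the warehouse planner
        zone = event.split(":", 1)[1]
        if zone in self.models:
            self.active = zone

    def plan(self, observation: str) -> str:
        return self.models[self.active](observation)


planner = SelfReconfiguringPlanner(
    models={
        "indoor": lambda obs: f"indoor plan for {obs}",
        "warehouse": lambda obs: f"warehouse plan for {obs}",
    },
    default="indoor",
)
planner.on_event("entered:warehouse")
print(planner.plan("shelf at 3m"))  # -> warehouse plan for shelf at 3m
```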
Check out the Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI-compatible API (e.g. vLLM, lmdeploy, etc.). Please install any of these by following the instructions provided by the respective projects. Support for new platforms is continuously being added; if you would like a particular platform supported, please open an issue/PR.
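For illustration, a minimal sketch of pairing a model definition with a serving-platform client, using only the names that appear in the quickstart below. The OpenAI-compatible GenericHTTPClient mentioned in this package's changelog is referenced in a comment only, since its constructor signature is not shown on this page and the endpoint details are assumptions:

```python
# Sketch: pairing a model definition with a serving-platform client.
from agents.clients.ollama import OllamaClient
from agents.models import OllamaModel

# A multimodal model served locally by Ollama (default endpoint assumed).
llava = OllamaModel(name="llava", checkpoint="llava:latest")
llava_client = OllamaClient(llava)

# For vLLM, lmdeploy or a cloud provider, the changelog notes a
# GenericHTTPClient for OpenAI-compatible APIs; consult the project docs
# for its exact parameters (host, port, API key, etc. are not shown here).
```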
Install EmbodiedAgents (Ubuntu)
For ROS versions >= Humble, you can install EmbodiedAgents with your package manager. For example, on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatika-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
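To confirm which version is actually importable in your environment, a quick plain-Python check (independent of this package):

```python
# Print the installed attrs version; it should be 23.2.0 or newer.
import attrs

print(attrs.__version__)
```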
Install EmbodiedAgents from source
Get Dependencies
Install Python dependencies:

```bash
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
```

Download Sugarcoat🍬:

```bash
git clone https://github.com/automatika-robotics/sugarcoat
```

Install EmbodiedAgents

Clone the repository alongside Sugarcoat, then build and source the workspace:

```bash
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
```
Quick Start 🚀
Unlike other ROS packages, EmbodiedAgents provides a purely Pythonic way of describing the node graph, using Sugarcoat🍬. Copy the following code into a Python script and run it.
```python
from agents.clients.ollama import OllamaClient
from agents.components import MLLM
from agents.models import OllamaModel
from agents.ros import Topic, Launcher

# Define input and output topics (pay attention to msg_type)
text0 = Topic(name="text0", msg_type="String")
image0 = Topic(name="image_raw", msg_type="Image")
text1 = Topic(name="text1", msg_type="String")

# Define a model client (working with Ollama in this case)
llava = OllamaModel(name="llava", checkpoint="llava:latest")
llava_client = OllamaClient(llava)

# Define an MLLM component (a component represents a node with a particular functionality)
mllm = MLLM(
    inputs=[text0, image0],
    outputs=[text1],
    model_client=llava_client,
    trigger=[text0],
    component_name="vqa",
)

# Additional prompt settings
mllm.set_topic_prompt(text0, template="""You are an amazing and funny robot.
```

(File truncated at 100 lines; see the full file for the rest of the recipe.)
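Once the full recipe is up and running, you can exercise it from a second terminal with plain rclpy. This is a hypothetical smoke test: it assumes the component exposes its Topic(name="text0") input as a std_msgs/String topic named text0 and publishes answers on text1, which is inferred from the quickstart above rather than confirmed here.

```python
import rclpy
from rclpy.node import Node
from std_msgs.msg import String


class VQAProbe(Node):
    """Publishes a question on `text0` and logs replies from `text1`."""

    def __init__(self):
        super().__init__("vqa_probe")
        self.pub = self.create_publisher(String, "text0", 10)
        self.sub = self.create_subscription(String, "text1", self.on_answer, 10)
        # Delay publishing slightly so topic discovery can complete.
        self.timer = self.create_timer(1.0, self.ask_once)

    def ask_once(self):
        msg = String()
        msg.data = "What do you see?"
        self.pub.publish(msg)
        self.timer.cancel()

    def on_answer(self, msg: String):
        self.get_logger().info(f"Agent replied: {msg.data}")


def main():
    rclpy.init()
    rclpy.spin(VQAProbe())


if __name__ == "__main__":
    main()
```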
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds UDP streaming to IP:PORT as an option to the TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names: ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependency for text to speech component
- (feature) Adds local classification model for Vision component. Default model: DEIM (DETR with Improved Matching for Fast Convergence, Huang et al.)
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddings
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config. The same model can now be called with different system prompts by different components
- (fix) Fixes typing bugs for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds VAD, audio features and wakeword classification

(File truncated at 100 lines; see the full file.)
Package Dependencies
- ament_cmake
- ament_cmake_python
- rosidl_default_generators
- rosidl_default_runtime
- builtin_interfaces
- std_msgs
- sensor_msgs
- automatika_ros_sugar
System Dependencies
Dependent Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds a ChromaDB HTTP client with Ollama embeddings
- (feature) Adds streaming with the websocket client in the LLM component
- (fix) Fixes the error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (RealSense style)
- (feature) Adds an async websocket client for RoboML
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to the LLM component config for breaking streaming output into chunks for publishing
- (feature) Adds streaming to the RoboML HTTP client for text data
- (feature) Adds streaming output handling to the Ollama client
- (refactor) Adds set_system_prompt to components and removes it from the model config; the same model can now be called with different system prompts by different components (see the sketch after this list)
- (fix) Fixes typing bugs for Python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
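The set_system_prompt refactor above means one model definition and one client can back several components, each with its own system prompt. Below is a minimal sketch reusing the API shown in the Quick Start; the topic names, prompt strings, and the assumption that set_system_prompt takes a single string are illustrative, not confirmed.

```python
# Hedged sketch: one OllamaModel/OllamaClient shared by two MLLM components,
# each setting its own system prompt (per the 0.4.0 refactor entry above).
from agents.clients.ollama import OllamaClient
from agents.components import MLLM
from agents.models import OllamaModel
from agents.ros import Topic

text_in = Topic(name="text0", msg_type="String")
image_in = Topic(name="image_raw", msg_type="Image")
vqa_out = Topic(name="vqa_out", msg_type="String")
caption_out = Topic(name="caption_out", msg_type="String")

# One model definition and one client...
llava = OllamaModel(name="llava", checkpoint="llava:latest")
client = OllamaClient(llava)

# ...shared by two components with different system prompts.
vqa = MLLM(inputs=[text_in, image_in], outputs=[vqa_out],
           model_client=client, trigger=[text_in], component_name="vqa")
vqa.set_system_prompt("Answer questions about what the robot sees, briefly.")  # assumed signature

captioner = MLLM(inputs=[text_in, image_in], outputs=[caption_out],
                 model_client=client, trigger=[image_in], component_name="captioner")
captioner.set_system_prompt("Describe the current camera image in one sentence.")
```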
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from the package manifest until package names are merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds VAD, audio features, and wakeword classification (a hedged configuration sketch follows the changelog)
File truncated at 100 lines; see the full file.
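Since SpeechToTextConfig is named in the 0.3.2 entries, a hedged configuration sketch follows. Only the class name comes from the changelog; the import path and every field name below are assumptions for illustration.

```python
# Hedged sketch: the import path and all field names are assumed, chosen to
# mirror the 0.3.2 entry (VAD, audio features, wakeword classification).
from agents.config import SpeechToTextConfig  # assumed import path

stt_config = SpeechToTextConfig(
    enable_vad=True,        # assumed field: voice activity detection
    enable_wakeword=True,   # assumed field: wakeword classification
    wakeword="hey robot",   # assumed field: the wakeword phrase
)
```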
Package Dependencies
- ament_cmake
- ament_cmake_python
- rosidl_default_generators
- rosidl_default_runtime
- builtin_interfaces
- std_msgs
- sensor_msgs
- automatika_ros_sugar
System Dependencies
Dependent Packages
Launch files
Messages
Services
Plugins
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component. Default model: DEIM (DETR with Improved Matching for Fast Convergence) by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddings
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config, so the same model can be called with different system prompts by different components (see the sketch after this changelog entry)
- (fix) Fixes typing bugs for Python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
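To illustrate the set_system_prompt refactor noted above, here is a hedged sketch reusing the `mllm` component, topics, and `llava_client` from the Quick Start. The exact signature of `set_system_prompt` is an assumption based on the changelog entry, not verified API:

```python
# Hypothetical usage of the 0.4.0 set_system_prompt refactor; the exact
# signature is assumed from the changelog entry, not confirmed.
mllm.set_system_prompt("You are a terse visual question answering robot.")

# Because the system prompt now lives on the component rather than in the
# model config, a second component can reuse the same llava_client with a
# different persona.
storyteller = MLLM(
    inputs=[text0, image0],
    outputs=[text1],
    model_client=llava_client,
    trigger=[text0],
    component_name="storyteller",
)
storyteller.set_system_prompt("You are a playful robot that narrates what it sees.")
```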
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds VAD, audio features and wakeword classification
File truncated at 100 lines; see the full file in the repository.
Package Dependencies
- ament_cmake
- ament_cmake_python
- rosidl_default_generators
- rosidl_default_runtime
- builtin_interfaces
- std_msgs
- sensor_msgs
- automatika_ros_sugar
System Dependencies
Dependent Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names .. ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speeech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependancy for text to speech component
- (feature) Adds local classification model for Vision component Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddigs
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio feautres and wakeword classification
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake | |
ament_cmake_python | |
rosidl_default_generators | |
rosidl_default_runtime | |
builtin_interfaces | |
std_msgs | |
sensor_msgs | |
automatika_ros_sugar |
System Dependencies
Dependant Packages
Launch files
Messages
Services
Plugins
Recent questions tagged automatika_embodied_agents at Robotics Stack Exchange
No questions yet, you can ask one on Robotics Stack Exchange.
Failed to get question list, you can ticket an issue on the github issue tracker.
![]() |
automatika_embodied_agents package from automatika_embodied_agents repoautomatika_embodied_agents |
ROS Distro
|
Package Summary
Version | 0.4.2 |
License | MIT |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Checkout URI | https://github.com/automatika-robotics/ros-agents.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-09-16 |
Dev Status | DEVELOPED |
Released | RELEASED |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Automatika Robotics
Authors

🇨🇳 简体ä¸ć–‡ | 🇯🇵 日本語 |
EmbodiedAgents is a fully-loaded ROS2 based framework for creating interactive physical agents that can understand, remember, and act upon contextual information from their environment.
- Production Ready Physical Agents: Designed to be used with autonomous robot systems that operate in real world dynamic environments. EmbodiedAgents makes it simple to create systems that make use of Physical AI.
- Intuitive API: Simple pythonic API to utilize local or cloud based ML models (specifically Multimodal LLMs and other transformer based architectures) on robots, with all the benefits of component lifecycle management, health monitoring and fallback mechanisms to make your agents robust.
- Self-referential and Event Driven: An agent created with EmbodiedAgents can start, stop or reconfigure its own components based on internal and external events. For example, an agent can change the ML model for planning based on its location on the map or input from the vision model. EmbodiedAgents makes it simple to create agents that are self-referential Gödel machines.
- Semantic Memory: Integrates vector databases, semantic routing and other supporting components to quickly build arbitrarily complex graphs for agentic information flow. No need to utilize bloated “GenAI” frameworks on your robot.
- Made in ROS2: Utilizes ROS2 as the underlying distributed communications backbone. Theoretically, all devices that provide a ROS2 package can be utilized to send data to ML models, with callbacks implemented for most commonly used data types and infinite extensibility.
Checkout Installation Instructions 🛠️
Get started with the Quickstart Guide 🚀
Get familiar with Basic Concepts 📚
Dive right in with Example Recipes ✨
Installation 🛠️
Install a model serving platform
The core of EmbodiedAgents is agnostic to model serving platforms. It currently supports Ollama, RoboML and any platform or cloud provider with an OpenAI compatible API (e.g. vLLM, lmdeploy etc.). Please install either of these by following the instructions provided by respective projects. Support for new platforms is being continuously added. If you would like to support a particular platform, please open an issue/PR.
Install EmbodiedAgents (Ubuntu)
For ROS versions >= humble, you can install EmbodiedAgents with your package manager. For example on Ubuntu:
sudo apt install ros-$ROS_DISTRO-automatika-embodied-agents
Alternatively, grab your favorite deb package from the release page and install it as follows:
sudo dpkg -i ros-$ROS_DISTRO-automatica-embodied-agents_$version$DISTRO_$ARCHITECTURE.deb
If the attrs version from your package manager is < 23.2, install it using pip as follows:
pip install 'attrs>=23.2.0'
Install EmbodiedAgents from source
Get Dependencies
Install python dependencies
pip install numpy opencv-python-headless 'attrs>=23.2.0' jinja2 httpx setproctitle msgpack msgpack-numpy platformdirs tqdm websockets
Download Sugarcoat🍬
git clone https://github.com/automatika-robotics/sugarcoat
Install EmbodiedAgents
git clone https://github.com/automatika-robotics/embodied-agents.git
cd ..
colcon build
source install/setup.bash
python your_script.py
Quick Start 🚀
Unlike other ROS package, EmbodiedAgents provides a pure pythonic way of describing the node graph using Sugarcoat🍬. Copy the following code in a python script and run it.
```python from agents.clients.ollama import OllamaClient from agents.components import MLLM from agents.models import OllamaModel from agents.ros import Topic, Launcher
Define input and output topics (pay attention to msg_type)
text0 = Topic(name=”text0”, msg_type=”String”) image0 = Topic(name=”image_raw”, msg_type=”Image”) text1 = Topic(name=”text1”, msg_type=”String”)
Define a model client (working with Ollama in this case)
llava = OllamaModel(name=”llava”, checkpoint=”llava:latest”) llava_client = OllamaClient(llava)
Define an MLLM component (A component represents a node with a particular functionality)
mllm = MLLM( inputs=[text0, image0], outputs=[text1], model_client=llava_client, trigger=[text0], component_name=”vqa” )
Additional prompt settings
mllm.set_topic_prompt(text0, template=”"”You are an amazing and funny robot.
File truncated at 100 lines see the full file
Changelog for package automatika_embodied_agents
0.4.2 (2025-09-03)
- (feature) Adds udp streaming to IP:PORT as an option to TextToStream component when play_on_device is enabled
- (docs) Updates docs to use new web based client
- (feature) Adds processing of audio messages in web client
- (chore) Removes chainlit based client
- (feature) Adds a custom webclient to replace chainlit
- (feature) Adds persistent ros node in web client for async stream handling
- (feature) Adds warning when not using streaming string msg_type with streaming enabled in components
- (feature) Adds streaming string msg for managing streams in external clients
- (docs) Adds recipe for vision guided point navigation
- (fix) Fixes empty image input for Detection2D msg publication
- (fix) Fixes websocket receiving in text to speech
- (fix) Fixes keyword argument in detection and tracking publishing
- (feature) Adds publishing a singular detection or tracking message from the vision component
- Contributors: ahr, mkabtoul
0.4.1 (2025-07-10)
- (docs) Updates docs for using planning based MLLMs
- (feature) Adds options to get RGBD array from rgbd message callback
- (refactor) Breaks complex functions and fixes warmup result logging
- (feature) Adds support for planning mllm models, starting with robobrain2.0
- (docs) Adds streaming to conversational agent example
- Contributors: ahr, mkabtoul
0.4.0 (2025-06-18)
- (docs) Adds international readme files
- (feature) Adds better connection error messages in clients, adds installation instructions
- (chore) Adds debian packaging workflow
- (docs) Updates installation instructions
- (chore) Updates package names: ROS Agents -> EmbodiedAgents
- (feature) Adds a GenericHTTPClient for using llm and mllm models served on any OpenAI compatible API
- (feature) Adds ollama specific inference options to OllamaModel and client
- (feature) Adds MeloTTS model to model definitions
- (feature) Adds say text method to text to speech for invoking with events
- (feature) Adds streaming playback for streaming input in speech to text component
- (fix) Fixes clearing old output in the vision component when getting subscription data in a timed manner
- (feature) Adds tensorrt as an onnx provider option for local models
- (refactor) Removes sounddevice as a dependency for text to speech component
- (feature) Adds local classification model for Vision component. Default model: DEIM: DETR with Improved Matching for Fast Convergence by Huang et al.
- (feature) Adds warnings if device for local models is set to GPU and runtime is not available
- (feature) Adds hypothesis buffer for publishing confirmed transcripts when using streaming
- (feature) Adds asynchronous receiving for streaming websockets client in speech to text component
- (refactor) Adds getting inference params just once during node configuration
- (fix) Fixes handling of model init params and sending np arrays during inference
- (feature) Adds asynchronous publishing of response in LLM component when streaming with websocket client
- (feature) Adds local embeddings option using sentence-transformers to ChromaDB client
- (feature) Adds ChromaDB http client with ollama embeddings
- (feature) Adds streaming with websocket client in llm component
- (fix) Fixes error message for required topics when they can be either/or
- (feature) Adds support for RGBD messages (in realsense style)
- (feature) Adds async websocket client for roboml
- (refactor) Marks child threads as daemons for smoother termination
- (feature) Adds break_character to llm component config to handle breaking streaming output into chunks for publishing
- (feature) Adds streaming to roboml http client for text data
- (feature) Adds streaming output handling to ollama client
- (refactor) Adds set_system_prompt to components and removes it from model config. The same model can be called with various system prompts by different components
- (fix) Fixes typing bugs for python 3.8 compatibility
- Contributors: ahr, aleph-ra, mkabtoul
0.3.3 (2025-01-28)
- (fix) Removes python dependencies from package manifest until package names merged in rosdistro
- Contributors: ahr
0.3.2 (2025-01-28)
- (docs) Updates docs for conversational agent and SpeechToTextConfig
- (feature) Adds vad, audio features and wakeword classification
(File truncated at 100 lines; see the full file.)
Package Dependencies
- ament_cmake
- ament_cmake_python
- rosidl_default_generators
- rosidl_default_runtime
- builtin_interfaces
- std_msgs
- sensor_msgs
- automatika_ros_sugar