ROS Voice Assistant
Overview
ros_voice_assistant is a ROS 2 package that enables natural language interaction with a robot using advanced AI models. It integrates Google Text-to-Speech (TTS), OpenAI's Whisper for speech recognition (via Groq), and LLaMA 70B (via Groq) for natural language understanding. A Flask-based web interface allows users to record and send voice commands to the robot in real-time.
Features
- Natural language voice interaction using large language models
- Speech-to-text via Whisper
- Text-to-speech via Google TTS
- Flask web interface for command input
- Inference powered by Groq LPU accelerators
- Location-aware commands using ROS maps
Installation
- Install ROS dependencies
Update your package list and install the required packages, as sketched below:
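The exact dependency list is not spelled out here, so the following is only a sketch of a typical setup on Ubuntu with ROS 2; the package names and the use of rosdep are assumptions, not requirements stated by this repository:

```bash
# Refresh the package index
sudo apt update

# Common build tooling for a ROS 2 / colcon workspace
sudo apt install -y python3-pip python3-colcon-common-extensions

# After cloning the repository (next step), rosdep can resolve any
# ROS dependencies declared in the package manifests:
#   rosdep install --from-paths src --ignore-src -r -y
```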
- Git Clone
Download this repository into your workspace and build it:
```bash
cd src
git clone https://gitea.locker98.com/locker98/ros-voice-assistant.git
cd ros-voice-assistant
pip install -r requirements.txt
cd ../../
colcon build
```
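After the build finishes, source the workspace overlay so the newly built packages are visible to ROS 2 (standard colcon workflow; run it from the workspace root):

```bash
# Re-run after each build, or add it to your ~/.bashrc
source install/setup.bash
```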
- Export Groq API Key
You'll need an API key from Groq:
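A minimal sketch, assuming the package reads the key from the GROQ_API_KEY environment variable (this is the Groq Python SDK's default variable name; check the package's configuration if it expects something different):

```bash
# Replace the placeholder with your key; add this line to ~/.bashrc to persist it
export GROQ_API_KEY="<your-groq-api-key>"
```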
- Setting Up Arm
If you want to run the Voice Assistant with a robotic arm from Universal Robots, you have to install the `locker98_tools` package from https://gitea.locker98.com/locker98/locker98_tools_ws.git.
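A sketch of one way to do that, assuming the repository is cloned and built alongside this package in the same colcon workspace (follow that repository's own instructions if they differ):

```bash
# From the root of your colcon workspace
cd src
git clone https://gitea.locker98.com/locker98/locker98_tools_ws.git
cd ..
colcon build
source install/setup.bash
```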
Map File Structure
The `maps` directory should contain the following files, all sharing the same base name:

```
maps/
├── rhodes_map.pgm   # Map image used by SLAM and navigation
├── rhodes_map.yaml  # Metadata file with scale, origin, and image path
└── rhodes_map.json  # Location data (generated by map_node) used by voice assistant
```
- The `.pgm` and `.yaml` files are standard output from SLAM Toolbox or map_server tools.
- The `.json` file is generated by the `map_node` tool and is required for `voice_assistant.launch.py` to load predefined location data with descriptions.
Running the Package
1. Launch the Voice Assistant
To start the voice assistant, use the following command with your custom namespace and map file:
```bash
ros2 launch voice_assistant_bringup voice_assistant.launch.py namespace:=a200_0000 location_file:=$HOME/husky_configs/maps/rhodes2_full.json
```
Note: Standard SLAM maps typically include a `.pgm` image file and a `.yaml` metadata file. The `.yaml` file contains information such as the origin and resolution (scale) of the map image. To keep naming consistent, the voice assistant expects a `.json` file that contains known location data. This file should share the same base name as the `.pgm` and `.yaml` files (e.g., `rhodes2_full.pgm`, `rhodes2_full.yaml`, and `rhodes2_full.json`).

Note: Browsers will not let you use a microphone on a non-HTTPS website, so you might need to follow these instructions to fix it.
2. Manage Locations via Flask Interface
To interactively add or remove locations on the map, run the `map_node`. This uses the `.yaml` file from SLAM Toolbox as input:

```bash
ros2 run voice_assistant map_node --ros-args -p location_file:=$HOME/husky_configs/maps/rhodes2_full.yaml
```
Note: The `.yaml` file provides the necessary information about the map's origin and resolution, and points to the corresponding `.pgm` file.

The `map_node` tool will generate a `.json` file that contains location data and is compatible with the voice assistant. This `.json` file will have the same base name as your existing `.pgm` and `.yaml` files.
3. Launch the Voice Assistant Without Arm
To start the voice assistant without the robotic arm, use the following command with your custom namespace and map file:
```bash
ros2 launch voice_assistant_bringup voice_assistant.launch.py namespace:=a200_0000 location_file:=$HOME/husky_configs/maps/rhodes2_full.json enable_arm:=False
```
Note: This allows you to run the Voice Assistant without installing the `locker98_tools` package.