David Gonzalez-Aguirre is a principal engineer at Intel Labs, specializing in robot perception. Co-authors include Javier Felip Leon, a senior research scientist specializing in robotics; Javier Felix Rendon, a research scientist specializing in mechatronics; Roderico Garcia Leal, a research scientist specializing in haptics; and Julio Zamora, a principal engineer specializing in robotics.
Highlights
- Using programming-by-demonstration for collaborative robots (cobots), Intel Labs researchers employ a novel mixed reality framework for local and remote teleoperation and task automation.
- By adding advanced form factors for sensing and haptic feedback, researchers can establish a tangible connection with the physical world during the creation of robot action primitives.
- A task description framework based on modular, adaptable robot action primitives enables the synthesis of complex tasks.
Researchers at Intel Labs are addressing a key challenge in modern manufacturing: making collaborative robot programming more intuitive and accessible, especially for complex tasks involving fragile objects in ultra-clean environments such as semiconductor manufacturing. They have developed a novel framework that integrates mixed reality interfaces, tactile sensing and haptic feedback, and a modular robot action primitive system. Presented at the 2025 IEEE International Conference on Robotics and Automation, this approach enables programming cobots through intuitive human demonstrations, allowing users to define tasks such as grasping and placement by physically showing the robot what to do, rather than writing code. The result is a system that achieves high precision, empowers non-robotics experts to create dependable automation recipes efficiently, and offers enhanced handling of delicate materials, further reducing risks compared to human operators in demanding settings. This research holds significant promise for democratizing robot programmability across industries handling sensitive items, such as the semiconductor, pharmaceutical, and chemical sectors.
Programming collaborative robots for complex tasks like precise manipulation of fragile objects in demanding environments such as semiconductor manufacturing facilities presents significant challenges. Traditional programming methods often require specialized robotics expertise and can be costly to implement and scale, especially for high-mix, low-volume scenarios involving diverse product sizes and materials. Furthermore, handling delicate items demands highly sensitive tactile perception and precise control that current sensor technologies and end-effectors often struggle to provide consistently and without contamination. This gap limits the broader adoption of cobots for sensitive automation tasks and highlights the need for more intuitive and accessible programming paradigms that can empower a wider range of users.
Figure 1. Teaching robots with touch and sight. Cobot programming-by-demonstration through cost-effective and easy-to-deploy mixed reality interfaces is grounded on novel form factors for tactile sensing and haptic feedback interfaces. This approach empowers non-experts to rapidly and intuitively create robot action primitives, composing dependable cobot task-flows for automation during inspection and manufacturing.
To overcome these limitations and democratize cobot programmability, the researchers have developed a novel framework that provides a more natural way for humans to teach robots. At its core is the use of mixed reality interfaces for programming-by-demonstration and teleoperation. This allows users to define robot actions by simply showing them in a shared digital-physical space, removing restrictions related to physical scale, vantage points, and even the need for the human and robot to be in the same location. Crucially, this framework incorporates advanced tactile sensing and haptic feedback through novel devices. These provide rich, localized tactile sensations to the user's hand, giving a tangible connection to the robot's interactions with the physical world during programming and teleoperation, which is vital for handling fragile items with care. These demonstrations are structured into a simple modular task description framework based on adaptable robot action primitives, allowing complex workflows to be built from fundamental actions such as grasping and placing.
Figure 2. The real-time visualization of 3D assets within a unified spatio-temporal kinematic frame enables the overlay of reconstructed surfaces from RGBD data, creating a mixed-reality environment for cobot teleoperation. a) The kinematic registration reference frame of the head-mounted display. b) Handheld controls facilitate visualization of feasible actions. c) The 3D surface reconstruction combining depth and color streams blends reality with registered CAD/CAM models, showing kinematic frames for grasp primitives in relation to the tray, labeled as d) pre-grasp, e) grasp and f) post-grasp. g) Finally, the user’s hand in mixed reality demonstrates the system’s low latency.
Inside the System: How Intuitive Cobot Programming Comes to Life
To achieve highly precise yet easy-to-understand programming for collaborative robots, especially for sensitive tasks, this research integrates several cutting-edge technologies. Think of them as the core building blocks that allow humans to teach robots by showing, feeling, and directing, rather than writing complex code. This approach makes robot automation more accessible to people without specialized robotics knowledge.
At the heart of the system is a mixed reality (MR) interface used for programming-by-demonstration and real-time teleoperation (see Figure 2). Using a head-mounted display, users step into a blended physical and digital world where they can see the real robot and workspace overlaid with virtual elements such as robot paths, target locations, and even a digital twin of the environment. This provides an intuitive visual environment for demonstrating tasks directly to the robot. The MR interface rests on three key elements:
- Perceptual immersiveness, allowing users to view the scene from any vantage point and at any scale.
- Spatio-temporal consistency, which uses 3D vision and object recognition (such as AprilTags) to register objects and the robot in a common reference frame, ensuring that visualization and feedback remain accurate regardless of position or scale (see the sketch below).
- Tangible actionability, where handheld controllers with integrated haptic feedback let users send commands and mimic human actions in the shared space.
Using these controllers, users can guide the robot's movements remotely and precisely, translating human intentions into robot actions within the shared mixed reality space. Combined with a digital twin, the system removes limitations of physical scale and user vantage point, and even the need for the human and robot to be in the same physical location (see Figure 3).
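To make the spatio-temporal consistency idea concrete, here is a minimal sketch (not the authors' implementation) of how an object pose detected via an AprilTag in the camera frame can be chained into the robot's base frame, so that MR overlays and robot commands share one reference. The calibration values, tag pose, and grasp offset below are placeholders.

```python
# Minimal sketch: expressing an AprilTag-detected object pose in the robot's base
# frame so MR overlays and robot commands share one reference frame.
# All numeric values are placeholders, not real calibration data.
import numpy as np

def make_transform(rotation: np.ndarray, translation: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Hypothetical calibration: camera pose expressed in the robot base frame.
T_base_camera = make_transform(np.eye(3), np.array([0.40, 0.00, 0.55]))

# Hypothetical AprilTag detection: tag (attached to a tray) pose in the camera frame.
T_camera_tag = make_transform(np.eye(3), np.array([0.05, -0.10, 0.60]))

# Grasp primitive frame defined relative to the tag/tray (e.g., from a demonstration).
T_tag_grasp = make_transform(np.eye(3), np.array([0.00, 0.02, 0.01]))

# Chaining the transforms yields the grasp target in the common base frame, which is
# what both the MR overlay and the robot controller can consume.
T_base_grasp = T_base_camera @ T_camera_tag @ T_tag_grasp
print("Grasp position in robot base frame:", T_base_grasp[:3, 3])
```

In practice, the same composition is what keeps virtual overlays registered to the physical scene as the user changes vantage point or scale.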
Figure 3. Task representation via action primitives. A sequence of modular primitives is combined to execute complex tasks autonomously. Each primitive is grounded on perceived objects, allowing for flexible and adaptable task execution. The zoomed-in views display the JSON files generated, describing the task sequence and the parameters for each primitive.
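As a purely illustrative sketch of the kind of task file shown in Figure 3, the snippet below encodes a short sequence of action primitives and serializes it to JSON. The schema, field names, and parameter values are assumptions for illustration, not the system's actual file format.

```python
# Illustrative sketch only: a task encoded as a sequence of action primitives and
# serialized to JSON, in the spirit of Figure 3. The schema is hypothetical.
import json

task = {
    "task_name": "move_sample_to_tray",
    "primitives": [
        {"type": "pre_grasp",  "object": "sample_01", "offset_m": [0.0, 0.0, 0.05]},
        {"type": "grasp",      "object": "sample_01", "max_force_n": 1.5},
        {"type": "post_grasp", "object": "sample_01", "offset_m": [0.0, 0.0, 0.08]},
        {"type": "place",      "target": "tray_slot_3", "approach_speed_mps": 0.05},
    ],
}

# Each primitive is grounded on perceived objects at execution time, so the same
# recipe can adapt to new object poses without re-demonstration.
print(json.dumps(task, indent=2))
```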
The system also incorporates advanced tactile sensing specifically designed to overcome the limitations of existing solutions for handling small, fragile objects in demanding environments such as semiconductor manufacturing. Traditional sensors often lack the sensitivity, dynamic range, and cleanliness that these applications require. The novel sensor-actor units in this framework are designed to achieve human-like precision in delicate object handling. They feature integrated controllable illumination and a replaceable transducer, capturing contact pressure at high frequency (1 kHz) without using materials that could cause contamination. This allows for automated material handling that reduces risks associated with cleanliness, damage, and human error, dynamically optimizing pickup position, orientation, and lighting for sample sizes ranging from 5-7 mm at tip contact up to 200 mm with the gripper fully opened (see Figure 4, top image).
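The following is a minimal sketch, under assumed hardware interfaces, of how a 1 kHz tactile pressure stream could gate gripper closure so that a fragile sample is grasped at the first sign of gentle contact. The simulated sensor and gripper stubs stand in for real devices; the threshold and step sizes are illustrative.

```python
# Minimal sketch, not the authors' driver code: consuming 1 kHz tactile pressure
# samples and halting gripper closure once a gentle-contact threshold is crossed.
# The SimulatedSensor/SimulatedGripper stubs below stand in for real hardware.
import time

CONTACT_THRESHOLD_KPA = 2.0   # assumed threshold for "gentle contact" on fragile parts
SAMPLE_PERIOD_S = 0.001       # 1 kHz tactile sampling rate, as reported for the sensor

class SimulatedSensor:
    """Stand-in for the tactile sensor; pressure rises as the gripper closes."""
    def __init__(self):
        self.pressure_kpa = 0.0
    def read_pressure_kpa(self) -> float:
        return self.pressure_kpa

class SimulatedGripper:
    """Stand-in for the gripper; closing past the sample raises contact pressure."""
    def __init__(self, sensor: SimulatedSensor):
        self.sensor = sensor
        self.width_m = 0.200            # 200 mm fully open, as in the article
    def step_close(self, step_m: float) -> None:
        self.width_m = max(0.0, self.width_m - step_m)
        if self.width_m < 0.006:        # pretend contact begins near a 6 mm sample
            self.sensor.pressure_kpa += 0.5
    def stop(self) -> None:
        pass

def close_until_contact(sensor, gripper, timeout_s: float = 5.0) -> bool:
    """Close the gripper in small steps until the tactile sensor reports contact."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if sensor.read_pressure_kpa() >= CONTACT_THRESHOLD_KPA:
            gripper.stop()              # hold the current aperture at gentle contact
            return True
        gripper.step_close(step_m=0.0005)   # close by 0.5 mm per control tick
        time.sleep(SAMPLE_PERIOD_S)
    gripper.stop()
    return False

sensor = SimulatedSensor()
print("contact:", close_until_contact(sensor, SimulatedGripper(sensor)))
```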
Figure 4. Enabling robots to feel and share the sensation. a) The integrated active illumination and b) replaceable transducer in the tactile sensor capture contact pressure at high frequency (1 kHz) without introducing contaminating materials. The haptic handheld feedback device allows the user to sense c) contacts and forces applied to and by the robot with d) low latency (approximately 2 ms) and low cognitive load, e) opening new channels of effective communication between the human and robot in mixed reality.
Coupled with the tactile sensing is a novel haptic feedback device integrated into the handheld controllers. Current mixed reality controllers typically provide only limited vibrotactile feedback, which is not detailed enough for complex manipulation tasks. This prototype uses a flexible membrane with strategically placed linear resonant actuators (LRAs), positioned to align with the areas of the human hand most sensitive to touch, such as the fingertips, stimulating mechanoreceptors to provide localized and varied tactile sensations. This arrangement of 16 stimulation points allows the device to convey multi-dimensional haptic cues, such as the direction and intensity of contact forces perceived by the robot's end-effector. The design ensures low latency by using embedded haptic patterns and a soft mechanism to isolate vibrations, supporting low-cognitive-load interaction and improving the user's situational awareness and spatial coordination during teleoperation. This tangible connection through touch is crucial for creating robot action primitives and achieving precise interactions with the physical world (see Figure 4, bottom image).
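As an illustration of how multi-dimensional haptic cues might be rendered, the sketch below maps a contact force vector sensed at the end-effector to drive levels for 16 LRAs: the force direction selects which actuators fire, and its magnitude sets how strongly. The actuator layout and the cosine-weighting rule are assumptions, not the device's actual control law.

```python
# Illustrative sketch under assumed geometry: mapping a contact force vector sensed at
# the robot's end-effector to drive levels for 16 linear resonant actuators (LRAs).
import numpy as np

# Hypothetical layout: unit vectors pointing from the palm center toward each of the
# 16 stimulation points (e.g., fingertips and palm sites), here spread on a circle.
angles = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
actuator_dirs = np.stack([np.cos(angles), np.sin(angles), np.zeros(16)], axis=1)

def force_to_lra_amplitudes(force_n: np.ndarray, max_force_n: float = 5.0) -> np.ndarray:
    """Return 16 drive amplitudes in [0, 1]: the force direction sets which LRAs
    fire, and the force magnitude sets how strongly they fire."""
    magnitude = np.linalg.norm(force_n)
    if magnitude < 1e-6:
        return np.zeros(len(actuator_dirs))
    direction = force_n / magnitude
    # Cosine similarity selects the actuators facing the contact direction.
    alignment = np.clip(actuator_dirs @ direction, 0.0, 1.0)
    intensity = min(magnitude / max_force_n, 1.0)
    return alignment * intensity

# Example: a 2 N lateral contact force sensed by the end-effector.
print(np.round(force_to_lra_amplitudes(np.array([2.0, 0.5, 0.0])), 2))
```

Keeping this mapping on the device side, with precomputed haptic patterns, is one way to hold end-to-end latency low, in line with the approximately 2 ms feedback latency reported for the prototype.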
This research successfully demonstrates a unique framework integrating mixed reality programming-by-demonstration, advanced tactile sensing, and haptic feedback to address critical challenges in collaborative robot automation for sensitive industries. By enabling high precision, providing an intuitive immersive user interface, and allowing for the efficient creation of robot tasks through real-world demonstrations, this approach significantly enhances the capabilities of cobots. The ultimate goal is to democratize robot programmability, making advanced automation accessible to a wider range of users and applications across industries handling delicate items, such as semiconductor manufacturing.
Looking forward, future work will continue to refine these integrated technologies and explore how more advanced AI models can leverage the rich data from multi-cue human demonstrations to further improve robot autonomy and adaptability, pushing toward even more capable and easily programmable robots.