
Title: Person-tracking and gesture-driven interaction with a mobile robot
Language: English
File Size: 3.2 MB
Total Pages: 91

Document Text Contents
Page 1

Faculty of Engineering

Master's Degree in Artificial Intelligence and Robotics

Person-tracking and gesture-driven interaction with a mobile robot using the Kinect sensor

Supervisor: Prof. Luca Iocchi
Candidate: Taigo Maria Bonanni

Academic Year 2010/2011

Page 2

To this journey, which reached the end.
To all those adventures that have yet to come.

Page 45

ptz proxy: provides control for three hobby-type servos, for example to command the actuators of a pan-tilt-zoom camera.

Compared to the other frameworks presented further on, each chosen for its strengths with respect to competing products, Player is the obvious choice when one wants direct and simple interaction with a robot.
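As an illustration, the following is a minimal client-side sketch of how such a proxy might be driven through libplayerc++; the host, port, proxy index and angle values are placeholders rather than values taken from the thesis code.

#include <libplayerc++/playerc++.h>
#include <iostream>

int main()
{
    using namespace PlayerCc;
    try
    {
        // Connect to a Player server (the default port 6665 is an assumption).
        PlayerClient robot("localhost", 6665);

        // Client-side proxy for the ptz interface, e.g. backed by the
        // hobby-servo pan-tilt unit described above.
        PtzProxy ptz(&robot, 0);

        // Command pan and tilt (in radians); zoom is left untouched.
        ptz.SetCam(0.3, -0.1, 0.0);

        // Let the client process the data coming back from the server.
        robot.Read();
    }
    catch (PlayerError &e)
    {
        std::cerr << "Player error: " << e.GetErrorStr() << std::endl;
        return 1;
    }
    return 0;
}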

Figure 3.9: Two examples of connection with two different laser sensors: (a) Player interfaced with a Hokuyo URG Laser; (b) Player interfaced with a SICK Laser. In both diagrams the application communicates with the Player client through client proxies, the client with the Player server through a TCP connection, and the server with the sensor through its driver. In this case too, Player provides the same client-side interface for both sensors.

Page 46

The other possible approach is the implementation of the drivers for all the devices installed in the robot itself; clearly, this approach is extremely time-consuming and is feasible only in highly critical scenarios, where it is preferable to design ad-hoc software rather than rely on third-party frameworks. Moreover, using Player we can always test our application in different scenarios, such as rescue robotics, simply by changing the robot, without having to modify our own software.
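The portability argument can be sketched as follows, assuming the standard libplayerc++ laser proxy: the client only names an abstract interface, while the binding to a Hokuyo or SICK driver lives in the server's configuration file.

#include <libplayerc++/playerc++.h>
#include <iostream>

int main()
{
    using namespace PlayerCc;

    // The client refers only to "laser:0"; whether the server maps it to
    // the Hokuyo URG or to the SICK driver is decided in its .cfg file,
    // so this code runs unchanged on either robot setup.
    PlayerClient robot("localhost", 6665);
    LaserProxy laser(&robot, 0);

    robot.Read();                          // fetch a fresh scan
    for (unsigned int i = 0; i < laser.GetCount(); ++i)
        std::cout << laser.GetRange(i) << ' ';
    std::cout << std::endl;

    return 0;
}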

3.3.2 OpenNI

As explained in Section 2.2, both HRI and human-computer interaction are moving towards a novel interaction paradigm, based on communication means that are natural and intuitive for humans, the so-called Natural Interaction. This is the main purpose of OpenNI [2], where NI stands for Natural Interaction: a cross-platform framework developed by PrimeSense which provides APIs for implementing applications, mostly based on speech/gesture recognition and body tracking.

OpenNI enables two-directional communication with, on the one hand:

• Video and audio sensors, which perceive the environment (and must be compliant with the standards of the framework)

• Middleware components which, once they have acquired data from the aforementioned sensors, return meaningful information, for example about the motion of a target

On the other hand (see Figure 3.10), OpenNI communicates with applications which, through OpenNI and the middleware, extract data from the sensors and use them for their own purposes. OpenNI offers programmers portability of the applications written with its libraries: a sensor used to perform video acquisition can easily be substituted without the need to modify the code.
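A minimal sketch of this usage pattern with the OpenNI 1.x C++ wrapper is given below; it is not the thesis's code, it assumes a user generator is provided by the installed middleware, node configuration details (e.g. an XML setup file) are omitted, and the loop length is arbitrary.

#include <XnCppWrapper.h>
#include <cstdio>

int main()
{
    xn::Context context;
    if (context.Init() != XN_STATUS_OK)
        return 1;

    // Sensor-side production node: depth data from the Kinect.
    xn::DepthGenerator depth;
    depth.Create(context);

    // Middleware-side production node: detects and tracks users.
    xn::UserGenerator users;
    users.Create(context);

    context.StartGeneratingAll();

    for (int frame = 0; frame < 100; ++frame)
    {
        context.WaitAndUpdateAll();

        XnUserID ids[8];
        XnUInt16 count = 8;
        users.GetUsers(ids, count);

        for (XnUInt16 i = 0; i < count; ++i)
        {
            XnPoint3D com;
            users.GetCoM(ids[i], com);   // centre of mass, real-world mm
            std::printf("user %u at (%.0f, %.0f, %.0f)\n",
                        ids[i], com.X, com.Y, com.Z);
        }
    }

    context.Shutdown();
    return 0;
}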

Following the breakthrough of the Kinect, a broad variety of frameworks arose beyond OpenNI, enabling communication with the device, such as OpenK-

[2] http://www.openni.org/


Page 90

List of Figures

2.1  Illustration of the main steps of an object-tracking algorithm . . . 15
2.2  Different target representations. (a) Centroid, (b) Set of points, (c) Rectangular model, (d) Elliptical model, (e) Complex model, (f) Skeleton, (g) Points-based contour, (h) Complete contour, (i) Silhouette. [Courtesy of Alper Yilmaz] . . . 16
2.3  HMM for gesture recognition composed of five states . . . 23
3.1  Complete schema of the application . . . 30
3.2  A view of the system architecture composed of Erratic, Kinect and a Pan-Tilt Unit . . . 31
3.3  A view of the ERA equipped with a Hokuyo URG Laser . . . 33
3.4  A view of the Kinect . . . 34
3.5  The infrared ray projection on the scene, recognizable by the bright dots, which also identifies the field of view of the Kinect . . . 35
3.6  View of the projection pattern of the laser transmitter . . . 36
3.7  Pan-Tilt system mounted on our ERA . . . 37
3.8  Two examples of possible connection with two different robots. It is worth noting that, client-side, the interface provided is the same . . . 39
3.9  Two examples of connection with two different laser sensors. In this case too, Player provides the same client-side interface for both sensors . . . 40
3.10 Abstract view of the layers of OpenNI communication . . . 42
3.11 Layered view of NITE Middleware, focusing on its integration with OpenNI . . . 43


Page 91


4.1  Main steps of the person-tracking subsystem . . . 46
4.2  Reference frame of the Kinect . . . 48
4.3  ∆Pan computation: CN represents the position offset of the target between the previous and the current frame, OC is the depth of the target in the current frame. The angle is derived by computing the arctangent of CN over OC. [∆Tilt is computed analogously, with respect to the Y and Z axes; a minimal sketch of this computation follows the list] . . . 50
4.4  Result of the user's detection and the computation of his center of mass, labeled by 1, using OpenNI . . . 51
4.5  Depth information of the scene acquired by the Kinect . . . 54
4.6  Background elimination performed by the algorithm . . . 55
4.7  Approximation to a rectangle/square of the most promising blob returned by the blob expansion algorithm . . . 58
5.1  Illustration of the steady gesture . . . 62
5.2  The execution of the swipes along the horizontal plane . . . 63
5.3  The execution of the swipes along the vertical plane . . . 64
5.4  Illustration of the steady gesture . . . 65
5.5  Main steps of the gesture-driven interaction subsystem . . . 66
6.1  Illustration of the tracking experiment design . . . 73
6.2  Map of the lab basement where we performed the joint experiment, highlighting the path to cover . . . 76
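As a worked example of the ∆Pan/∆Tilt computation summarized in the caption of Figure 4.3, a minimal sketch follows; it assumes the offset and the depth are expressed in the same metric units, and the function and variable names are illustrative rather than taken from the thesis.

#include <cmath>
#include <cstdio>

// Pan correction: arctangent of the target's offset between the previous
// and the current frame (CN) over its depth in the current frame (OC).
double deltaPan(double offsetX, double depthZ)
{
    return std::atan2(offsetX, depthZ);
}

// Tilt correction is computed analogously, on the Y and Z axes.
double deltaTilt(double offsetY, double depthZ)
{
    return std::atan2(offsetY, depthZ);
}

int main()
{
    // Example: a 0.2 m lateral shift observed at 2.0 m depth.
    std::printf("deltaPan = %.3f rad\n", deltaPan(0.2, 2.0));
    return 0;
}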

