Human Motion Capture and Identification for Assistive Systems Design in Rehabilitation. Pubudu N. Pathirana
human eyes directly.
Furthermore, Kinect uses two other techniques to process this information further and generate depth maps: depth from focus and depth from stereo [121]. The former relies on the principle that the further away an object is, the more blurred it appears [125], while the latter uses parallax to estimate depth.
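The parallax principle behind depth from stereo can be illustrated with the textbook pinhole triangulation relation Z = f·B/d, where f is the focal length in pixels, B is the baseline between the two viewpoints and d is the disparity in pixels. The following is a minimal sketch under that assumption, not a description of Kinect's internal processing; the function name and all numerical values are hypothetical.

```python
def stereo_depth(focal_length_px, baseline_m, disparity_px):
    """Depth (m) from disparity using the pinhole relation Z = f * B / d.

    Assumption: an idealised, rectified two-view geometry. The numbers used
    below are illustrative and are not Kinect calibration values.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values: 580 px focal length, 7.5 cm baseline.
near = stereo_depth(580.0, 0.075, 43.5)   # larger disparity, about 1 m
far = stereo_depth(580.0, 0.075, 21.75)   # smaller disparity, about 2 m
print(near, far)
```

Because depth is inversely proportional to disparity, a fixed disparity error of one pixel produces a larger depth error for distant objects than for near ones.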
Unlike the first version of Kinect, the second version (refer to Figure 1.6) measures depth with the time‐of‐flight (ToF) technique [189]: the distance to a target can be computed from the speed of light and the time the light takes to travel from the active emitter to the target and back to the sensor. According to Lachat et al. [189], this version of Kinect uses indirect time‐of‐flight, which measures the “phase shift between the emitted and received signal”. The depth is computed as
$$d = \frac{c}{4\pi f}\,\Delta\phi \qquad (1.1)$$
where f is the modulation frequency, c is the speed of light and Δϕ is the measured phase shift.
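As a worked illustration, assume that Eq. (1.1) takes the standard indirect time‐of‐flight form d = cΔϕ/(4πf). The sketch below evaluates that relation and the corresponding unambiguous range c/(2f), the depth at which the phase shift wraps around at 2π. The function names and the 16 MHz modulation frequency are hypothetical choices for the example, not values taken from the Kinect specification.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def tof_depth(phase_shift_rad, modulation_freq_hz):
    """Depth (m) from phase shift, assuming d = c * dphi / (4 * pi * f)."""
    return SPEED_OF_LIGHT * phase_shift_rad / (4.0 * math.pi * modulation_freq_hz)


def unambiguous_range(modulation_freq_hz):
    """Largest depth measurable before the phase wraps (dphi = 2 * pi)."""
    return SPEED_OF_LIGHT / (2.0 * modulation_freq_hz)


# Hypothetical example: 16 MHz modulation and a quarter-cycle phase shift.
f_mod = 16e6
depth = tof_depth(math.pi / 2.0, f_mod)  # roughly 2.34 m
print(f"depth = {depth:.3f} m, unambiguous range = {unambiguous_range(f_mod):.3f} m")
```

Under this relation, a lower modulation frequency extends the unambiguous range, while the same phase error then maps to a larger depth error.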
Figure 1.5 An example of the projected pattern of bright spots on an object [328]. Source: Shpunt and Zalevsky [328].
Figure 1.6 Appearance of Kinect version 2. Source: Evan‐Amos, Image taken from https://commons.wikimedia.org/wiki/File:Xbox-One-Kinect.jpg.
Several studies have examined the accuracy of the joint positions tracked by the first version of Kinect, and they have reached different conclusions. For instance, Webster et al. [376] reported that the accuracy was around 0.0275 m after removing the offset by resetting the alignment of the average points for each record set, and therefore concluded that Kinect version 1 was sufficient for clinical and in‐home use. Obdrzalek et al. [263] evaluated the accuracy of the first version of Kinect, PhaseSpace Recap and Autodesk MotionBuilder in the context of coaching elderly people. They reported that the error of the skeleton built by both Kinect and MotionBuilder was around 5 cm. In general postures, however, the accuracy was about 10 cm due to unavoidable factors such as occlusions. They therefore suggested that the current skeletonisation approach enabled Kinect to measure general trends of movement, while an improved skeletonisation algorithm should be investigated if Kinect were to be used for quantitative estimation. Furthermore, Xu et al. [388] evaluated the accuracy of both the first and second versions of Kinect for static postures. They concluded that the accuracy of the first version of Kinect varies from posture to posture. For instance, the error was only 26 mm for the shoulder centre in an upright standing posture, while it was 452 mm for the right foot joint in a sitting posture with the right leg on top of the left one. By comparison, for the second version of Kinect, the error of the left elbow was only 26 mm when the right foot was raised, while it was 418 mm when the right leg was on the left one. As a result, it was concluded that although the resolution of the second version of Kinect had improved significantly, its tracking accuracy for joint centres had not.
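Accuracy figures of this kind are typically Euclidean distances between a joint position reported by Kinect and the position given by a reference system. The sketch below shows one plausible way to compute such per-frame errors, with an optional step that subtracts the mean offset between the two recordings. This is only an assumed reading of the offset removal described by Webster et al. [376], not their published procedure; the function names and sample data are hypothetical.

```python
import math


def mean_point(points):
    """Component-wise mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def joint_errors(measured, reference, remove_offset=True):
    """Per-frame Euclidean error (m) for one joint.

    measured, reference: equal-length lists of (x, y, z) positions in metres.
    remove_offset: if True, subtract the mean measured-minus-reference offset
    first (an assumed interpretation of "removing offset").
    """
    if not measured or len(measured) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    if remove_offset:
        m_mean, r_mean = mean_point(measured), mean_point(reference)
        offset = tuple(m - r for m, r in zip(m_mean, r_mean))
    else:
        offset = (0.0, 0.0, 0.0)
    return [
        math.dist(tuple(m[i] - offset[i] for i in range(3)), r)
        for m, r in zip(measured, reference)
    ]


# Hypothetical data: the measured track is the reference shifted by 5 cm in x.
reference = [(0.0, 1.0, 2.0), (0.1, 1.0, 2.0), (0.2, 1.0, 2.0)]
measured = [(x + 0.05, y, z) for x, y, z in reference]
print(joint_errors(measured, reference, remove_offset=False))  # about 0.05 m each
print(joint_errors(measured, reference, remove_offset=True))   # close to zero
```

A constant offset between coordinate frames inflates the raw error, which may be one reason why studies that realign the data report smaller figures than those that do not.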
Table 1.3 compares the specifications of the two versions of Kinect. The comparison shows that the second version provides a larger viewing angle, a higher resolution in both depth and colour images, and more tracked joints.
Though Kinect was initially developed for gaming, it is widely applied in telerehabilitation as a non‐invasive and affordable motion capture device. A telerehabilitation system (KiReS) using Kinect as the motion capture device has been proposed. On the patient side, two avatars were displayed, one representing the motion recorded by the therapist (the reference motion) and the other the motion performed by the patient. The patient was therefore able to see the differences between his/her motion and the reference, so that incorrect movements could be corrected over time. On the therapist side, new motions could be created to suit the patient's condition by composing various existing movements or recording completely new ones. Luna‐Oliva et al. [217] used Kinect Sports I™, Joy Ride™ and Disneyland Adventures™ to provide telerehabilitation services to children with cerebral palsy in their school. Their experimental results showed that it is feasible to use Kinect as a therapeutic tool for children with cerebral palsy and that improvements in global motor function could result from using this tool. Ortiz‐Gutiérrez et al. [268] applied Kinect to provide telerehabilitation services to patients with postural control disorders. The experimental results showed an improvement in general balance in both groups. In the experimental group, the significant differences resulted from visual preference and the contribution of vestibular information.
1.3.2 RGB camera and microphone
Apart from Kinect, conventional RGB cameras and microphones have also been widely used, especially in the early stages of telerehabilitation, when virtual reality devices had not yet been well developed or widely adopted. One likely reason is that they are easy to install, cost‐effective and well developed.
In the early stages of the development of telerehabilitation, the plain old telephone system (POTS) was widely used as the infrastructure for videoconferencing, which was sufficient to provide teleconsultation. Delaplain et al. [93] conducted a pioneering trial between two islands, carrying out 59 medical tele‐consultations by videoconference. In this trial, diagnostic and therapeutic decisions were made in a number of specialities, including physical therapy. This is deemed to be one of the first examples of applying videoconferencing with cameras and microphones in telerehabilitation (although, at that time, the word “telerehabilitation” had not been invented). Its success illustrates the feasibility of using videoconferencing in telerehabilitation. Later, in 2002, Clark et al. [72] successfully managed a teletherapy case for 17 months. In this case, a POTS link was set up between the therapist's site, equipped with a desktop videophone, and Mrs M's home, equipped with a traditional telephone and a television, to provide post‐stroke telerehabilitation services in the form of a two‐way interactive videoconference. However, the lesson learnt from the case is that this novel approach places a number of requirements on the patient, as well as the caregiver, which may be a potential issue in developing telerehabilitation systems in the future. They concluded that although telerehabilitation cannot totally replace the conventional way of delivering rehabilitation services, it does contribute to traditional therapy. Furthermore, Savard et al. [316] reported two cases of using videoconferencing to provide a teleconsultation service for neurological diagnoses. Unlike the previous two studies, where the videoconference systems were used by patients at home, the systems in Savard et al. [316] were installed in clinics. Therefore, patients had to visit the clinics in order to use these facilities.
Although the time taken for a tele‐consultation was found to be similar to that of an in‐person consultation, the former was more efficient, as multiple parties were able to participate in the consultation simultaneously. However, the ability of clinicians to complete tests remotely would be an important factor in giving telerehabilitation an advantage.
Table 1.3 Comparison of basic technical specifications between two versions of Kinects. Sources: Based on Sempena et al. [322];