On the Benefits of Speech and Touch Interaction with Communication Services for Mobility Impaired Users

Although technology for communication has evolved tremendously over the past decades, mobility impaired individuals still face many difficulties interacting with communication services, either due to HCI issues or intrinsic design problems with the services. In this paper we present the results of a usability study, conducted with a group of five mobility impaired users, comprising paraplegic and quadriplegic individuals. The study participants carried out a set of tasks with a multimodal (speech, touch, gesture, keyboard and mouse) and multiplatform (mobile, desktop) prototype, offering integrated access to communication and entertainment services, such as email, agenda, conferencing and social media. The prototype was designed to take into account the requirements captured from these users, with the objective of evaluating whether the use of multimodal interfaces for communication and social media services could improve the interaction with such services. Our study revealed that a multimodal prototype system, offering natural interaction modalities, especially speech and touch, can in fact improve access to the presented services, contributing to better social inclusion of mobility impaired individuals.


Introduction
Over the past decades, computer-mediated means of communication have evolved from simple text-based services, like Bulletin Board Systems (BBSs), Internet Relay Chat (IRC) and e-mail, to more powerful, multimedia-enabled services such as audio-video conferencing, instant messaging (IM) and, more recently, social media services (SMSs). This communications revolution has made it possible to reach virtually anyone, anywhere, anytime, with great ease and, in most cases, at reduced cost when compared with traditional means of communication. Such evolution has led to increased interaction between people in general, and opens up opportunities for those with mobility impairments to improve their social inclusion, thus reducing the impact of real-world barriers. However, mobility-impaired individuals still face several usability issues with current communication technologies [9]. This paper presents a prototype of multimodal applications, for both the desktop and the mobile platform, that seamlessly integrates access to e-mail, agenda, audio-video conference and social media services with a unified user interface, specifically designed based on the requirements of people with mobility impairments (Pires et al., 2010). The prototype applications are capable of accepting user input and producing output through several modalities concurrently, thus adapting to mobility impaired individuals' different requirements. This paper also presents a user study that we conducted to evaluate our prototype, with the aim of uncovering whether its multimodal user interface offers benefits for mobility impaired individuals.
The methods applied in our usability evaluation study include open and semi-open questionnaires, interviews and naturalistic observation [10], while the users performed a set of pre-defined tasks, using several services and hardware devices, enabling different HCI modalities. We considered several types of computer-mediated communication services, namely email, audio-video conference and social media services.
Our research reveals that multimodal interfaces, in particular the speech modality, are capable of significantly improving the perceived usability of an application by mobility impaired individuals. We have concluded that the multimodal interaction approach helps bypass recurring problems that were observed experimentally, such as difficulties in typing key combinations on the keyboard, or in hitting small buttons on mobile user interfaces. The multimodal approach also helps reduce the impact of an application's learning curve, effectively allowing users to choose which modality better suits their needs and limitations, depending on the use context.
The remainder of this paper is organized as follows. Section 2 presents some background and related work in the area of inclusive technologies for mobility impaired users. Section 3 gives a brief overview of the architecture and user interface of the prototype multimodal application, including its requirements and constraints. Section 4 describes the usability evaluation study, presenting details about the participants, the tasks that were performed and the analysis methods. Section 5 presents and discusses the study results. Finally, Section 6 presents the conclusions and draws some lines of future work.

Background and Related Work
Mobility impaired individuals have physical limitations that influence the way in which they interact with computers and other devices. Reduced ability to handle input devices and corresponding HCI modalities, such as keyboards, mice, tactile devices and gesture-based devices, restricts mobility impaired individuals' access to ICTs. The interaction and technological barriers, coupled with mobility difficulties in real-world environments, can severely limit these individuals' independence as well as lead to social isolation, which may in turn lead to a depressive state [12], [2].
To address some of these issues, several electronic inclusion initiatives have been launched by the European Union (E.U.), especially since the year 2000, focusing on aspects such as universal broadband access, accessibility enhancements and Ambient Assisted Living, with the aim of enabling a better quality of life for these individuals [4], [3].
Although interaction with computing devices has evolved from simple keyboard and mouse to more natural interaction modalities such as speech coupled with gesture or touch [7], [1], in some situations adopting more natural means of interaction does not necessarily lead to better user interaction [9]. One way to overcome these issues, as well as to reduce the impact of some of the limitations disabled people face, is by using multimodal interfaces [11], [8]. This approach allows users to interact through a specific mixture of one or more HCI modalities, chosen to suit the user's interaction environment, the semantic context of the dialog between the user and the system, and his/her personal preferences or disabilities. Thus, with multimodal user interfaces, should users be unable to speak, they could instead use a gesture interface or, in situations where they could not properly coordinate their arms or hands, a speech interface could be used instead. The advantage of multimodal interfaces is not only the ability to enrich the usability experience by allowing multiple means of interaction, but also the ability to use them in a seamless way, without explicitly requiring users to specify upfront the type of interface to use. Despite the advantages of this interaction paradigm and of natural interfaces, designers should also take into account universal design guidelines [5], to ensure that they are not excluding any user groups.

Prototype Design and Development
We developed a multimodal and multi-platform prototype that integrates access to email, agenda, audio-video conference and social media services, with a unified user interface, specifically designed based on the requirements of mobility impaired individuals. Such requirements were gathered through a prior user study [9], which involved a group of paraplegic and quadriplegic individuals. This section summarizes the guidelines and constraints that guided the prototype design and presents details about the application development and its user interface.

Design Guidelines and Constraints
As general requirements, we specified that the prototype should be available on mobile and desktop platforms, providing email, agenda, audio-video conference and social media services. We also took into account the target users of the prototype, and thus followed the guidelines and constraints regarding interface design for mobility impaired individuals that were identified in a prior user study [9]. In that study, participants initially answered a short questionnaire focusing on their current ICT use patterns and difficulties, and subsequently performed a small set of scripted tasks to evaluate their current use of ICTs and interaction modalities. They were asked to use services such as e-mail, agenda, audio-video conference and social media applications, as well as interaction devices enabling different HCI modalities, like traditional keyboards and mice, touch screens, gestures with smartphones equipped with accelerometers, and speech recognition systems. From this study followed a set of design recommendations:
(1) Large graphic icons: the icons in the mobile and desktop interfaces should be large enough to be correctly used by both target groups, and should not require precise movements and actions;
(2) UI readability at some distance: the interfaces should be readable at some distance, using large and clear text, allowing operation from fixed locations at some distance from the user, such as quadriplegics' wheelchair arms;
(3) Multi-touch interaction: interaction through multi-touch should be carefully implemented so as not to become a usability barrier for quadriplegic users who are unable to perform these gestures with ease. Users should be able to perform the same tasks offered through multi-touch with other available modalities;
(4) 3D gesture interaction: special care should be taken in the design of the 3D gesture interaction on the mobile platform, as most quadriplegic users are not able to perform these gestures with ease or at all. It should be possible to disable 3D gesture interaction without compromising offered functionalities, by offering access to these functionalities through other modalities;
(5) Key combinations: the desktop platform interface should avoid key combinations, as most quadriplegic individuals have many difficulties using them;
(6) Unified multi-platform interface: the mobile UI should allow easy usage, offering a feature set as close as possible to the desktop UI, in order to increase users' mobility;
(7) Use of HCI modalities: all interaction modalities should be concurrent on both platforms, allowing users to resort to their preferred means of interaction. The most traditional interfaces, such as keyboard and mouse, should be supported as alternative means of interaction.
We also produced a set of recommendations regarding communication services' HCI:
(1) Email interfaces should be similar to existing ones, but simpler, with just essential features (subject, text, attach option and recipients);
(2) Conference interfaces must be simple, with clearly separated audio call and video call buttons (audio and video conference is an important service for mobility impaired users, since it is easier and more convenient to use than instant messaging);
(3) Social media services interfaces must be simple to use and not resort to service-specific jargon;
(4) Social media services interfaces should be carefully designed to have a low volume of information on each page/window, to reduce the user's learning curve.
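Recommendations (3), (4) and (7) above amount to a coverage rule: switching off a modality must never leave an action unreachable. The following Python sketch illustrates one way such a rule could be checked at design time. It is not taken from the prototype; the `Action` type, the `unreachable_actions` helper and the action names are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field

# Modality names follow the paper; everything else here is hypothetical.


@dataclass
class Action:
    name: str
    modalities: set = field(default_factory=set)  # modalities that can trigger it


def unreachable_actions(actions, disabled):
    """Return names of actions left with no modality once `disabled` is switched off."""
    return [a.name for a in actions if not (a.modalities - set(disabled))]


actions = [
    Action("open_inbox", {"touch", "speech", "keyboard_mouse"}),
    Action("scroll_list", {"gesture_3d", "touch", "speech"}),
    # Deliberately violates recommendation (3): multi-touch is the only way in.
    Action("zoom_photo", {"multi_touch"}),
]

# Disabling 3D gestures alone leaves every action reachable.
print(unreachable_actions(actions, {"gesture_3d"}))
# Disabling multi-touch as well exposes the action that lacks an alternative.
print(unreachable_actions(actions, {"gesture_3d", "multi_touch"}))
```

A check of this kind could run whenever an action is added to the interface, so that a feature reachable only through multi-touch or 3D gestures is flagged before it reaches users who cannot perform those gestures.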

Application Development
The need for mobility expressed by the participants, coupled with the requirement for a portable system that can be easily installed in different kinds of systems with minimal setup overhead, led to two possible architectural choices: a centralized architecture, where every component in the UI, logic, and data storage layers would have to be duplicated over all desired platforms, or a distributed architecture, including several server-side and client-side elements. The latter was deemed more favorable than the former due to a lower overhead in development, deployment and client setup, as only a server-side component would be required to process speech (both synthesis and recognition), modalities and some of the core logic and data storage components such as service access, publishing and session maintenance between devices. This architecture can be seen in greater detail in Figure 1. The prototype is divided in two main regions: (1) home, which represents the devices, equipment and HCI the user will have at home; and (2) backend, where backend servers and services are located. At home we have a mobile device (smartphone), a desktop device (touch screen computer) and a server (PLA server) that works as a mediator between mobile, desktop and backend. On the backend we have Exchange Server 2010 for agenda and email services and Office Communications Server 2007 R2 for conference services. Backend services also resorted to Microsoft's SQL Server 2008 as a means to store social media service credentials and configuration, as well as to several libraries that interact with service Application Programming Interfaces (APIs) (in the current prototype version, we support Twitter's, TwitVid's, TwitPic's, Bit.Ly's and YouTube's APIs).
Technology-wise, the prototype's server-side components were built on Windows Server 2008. The Unified Communications API (UCMA) was used to establish a voice channel between the desktop device and the speech server running on the PLA server, enabled with Microsoft European Portuguese (pt-pt) speech recognition and synthesis systems. Also, the Exchange Web Service Managed API was used by the PLA server in order to interact with the Exchange Server. The PLA server offers a unified web service API to all device clients. Client-wise, the desktop application was developed using the Windows Presentation Foundation (WPF) and the .NET framework. The mobile application was developed for Windows Mobile devices, and also resorts to the .NET framework.
Taking into consideration the design recommendations in the previous section, we have developed two prototype applications, one for each platform. Being multimodal in nature, all interfaces for these platforms offer speech, touch and hardware input capabilities. Mobile interfaces also have 3D gestures available, through an accelerometer sensor. Figures 2 and 3 show the desktop application's home screen and social media service message authoring GUI, respectively. Here we can see the use of large icons, text and user controls, as per the design recommendations from [9]. These recommendations were also followed in the design of the mobile application, as can be seen in Figures 4 and 5, which respectively represent the application's home screen and email authoring GUI on the mobile platform.
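The idea of concurrent modalities feeding one application can be pictured as a dispatcher that maps inputs from any modality onto the same set of commands. The sketch below is written in Python purely for illustration (the prototype itself was built on .NET), and the `CommandDispatcher` class, the command name `email.new` and the input strings are all hypothetical; it should not be read as the prototype's actual design.

```python
from typing import Callable, Dict, List, Optional, Tuple


class CommandDispatcher:
    """Routes inputs from any modality to one shared set of application commands."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], str]] = {}
        self._bindings: Dict[Tuple[str, str], str] = {}
        # (modality, command) pairs in the order they were handled.
        self.log: List[Tuple[str, str]] = []

    def register(self, command: str, handler: Callable[[], str]) -> None:
        self._handlers[command] = handler

    def bind(self, modality: str, raw_input: str, command: str) -> None:
        if command not in self._handlers:
            raise KeyError(f"unknown command: {command}")
        self._bindings[(modality, raw_input)] = command

    def handle(self, modality: str, raw_input: str) -> Optional[str]:
        """Run the bound command, or return None if the input is not bound."""
        command = self._bindings.get((modality, raw_input))
        if command is None:
            return None
        self.log.append((modality, command))
        return self._handlers[command]()


dispatcher = CommandDispatcher()
dispatcher.register("email.new", lambda: "new email window")

# The same command is reachable through every modality at the same time.
dispatcher.bind("speech", "new email", "email.new")
dispatcher.bind("touch", "btn_new_email", "email.new")
dispatcher.bind("keyboard_mouse", "click:btn_new_email", "email.new")

print(dispatcher.handle("speech", "new email"))
```

Because every modality resolves to the same command, the user never has to declare a modality upfront, and the log of (modality, command) pairs is the kind of record from which modality usage could later be counted.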

Usability Evaluation Study
To evaluate the usability of the prototype described in the previous section, and its adequacy for mobility impaired individuals, we carried out a small user study comprising five individual user sessions. The study participants were asked to complete a set of tasks with the prototype, focusing on the usage of email, agenda, audio-video conference and social media services. The following HCI modalities were always available, in both desktop and mobile environments: (1) keyboard and mouse (hardware devices); (2) speech (in certain application contexts, command and control was enabled, whereas in others, simple dictation was possible); (3) touch. Additionally, 3D gestures could be issued by manipulating the mobile device, in certain application contexts. This section presents details about our user study methodology and results.

Participants
Five participants with mobility impairments, as well as a non-impaired control participant, took part in the user study. All participants were recruited from a panel of associate members of Associação Salvador (http://www.associacaosalvador.com/), a non-profit social solidarity organization dedicated to supporting the interests and rights of people with reduced mobility in Portugal, which is a partner in this research.
Detailed information about the study participants is provided below (see Table 1).
Their profiles in terms of gender, age, professional activity and computer skills, were diverse.

Analysis Methods
During each session, video and audio were recorded for further analysis. From the structured tasks (email and agenda), quantitative results were extracted. We considered the following: (1) time to complete a task (in minutes), from the time each participant was instructed to do a task until the task was completed; (2) number of aids: number of times participants asked for an aid or were helped; (3) modality count: number of times a modality was used to accomplish a single action (e.g., selecting a text box or a button). A modality was counted only when all three modalities (keyboard + mouse; touch; speech) were available for the same action. The three modalities were treated as mutually exclusive, and each was counted separately. In case of failure of one modality, only the first chosen modality was counted. Quantitative results should not be interpreted strictly, as the sample size is not statistically relevant; they are considered mere guidelines towards future evaluation.
As qualitative results we considered: (1) result: level of task completion; (2) observations: our point of view of participants' performance in doing the task; (3) participants' opinion: opinions given by participants while performing the tasks.
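The counting rules for the quantitative measures can be made concrete with a short Python sketch. It assumes a hypothetical session log (`TaskLog`, `ActionRecord`) annotated from the recordings; the field names and the example values are invented for illustration and are not study data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import FrozenSet, List

# The three modalities considered for the modality count.
ALL_MODALITIES = frozenset({"keyboard_mouse", "touch", "speech"})


@dataclass
class ActionRecord:
    available: FrozenSet[str]  # modalities offered for this action
    attempts: List[str]        # modalities tried, in order; later entries are retries


@dataclass
class TaskLog:
    instructed_at_s: float     # when the participant was instructed to do the task
    completed_at_s: float      # when the task was completed
    aids: int                  # times the participant asked for or received help
    actions: List[ActionRecord]


def time_to_complete_min(log: TaskLog) -> float:
    """Time from instruction to completion, in minutes."""
    return (log.completed_at_s - log.instructed_at_s) / 60.0


def modality_count(log: TaskLog) -> Counter:
    """Count the first chosen modality, only for actions offering all three."""
    counts: Counter = Counter()
    for action in log.actions:
        if action.available >= ALL_MODALITIES and action.attempts:
            counts[action.attempts[0]] += 1
    return counts


# Invented example values, for illustration only.
log = TaskLog(
    instructed_at_s=0.0,
    completed_at_s=150.0,
    aids=1,
    actions=[
        ActionRecord(ALL_MODALITIES, ["speech", "touch"]),  # speech failed, touch retry
        ActionRecord(ALL_MODALITIES, ["touch"]),
        ActionRecord(frozenset({"touch", "speech"}), ["touch"]),  # not all three offered
    ],
)

print(time_to_complete_min(log), log.aids, dict(modality_count(log)))
```

In this example the retry with touch after a failed speech attempt is not counted, and the third action is skipped because keyboard and mouse were not available for it.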

Study Tasks
Participants were asked to perform a set of pre-defined tasks, in order to properly use the prototype functionalities. Since the prototype was available on mobile and desktop, participants were invited to use both platforms. Also, participants were free to choose which modality to use at any time, allowing us to better understand why they chose a specific set of modalities in each use context. All tasks were performed in no specific order, so as to minimize possible task sequence bias in the interpretation of the results across participants. Prior to performing the study tasks, each participant was given a brief overview (~5 minutes) of which features and modalities the prototype offered. Participants were also invited to experiment with modalities they had not previously used, in order to familiarize themselves with the prototype. The user tasks were performed in individual sessions with each participant and in a controlled environment. The user tasks are described below.
In the email task, participants were asked to check their email inbox, create a new email and send it to an existing contact and to an email address that was not available in the contact listing. In the agenda task, participants had to create a new appointment and then delete it. They were also asked to navigate through their agenda. Both these tasks were designed to be conducted on the desktop platform and the script was the same for all participants, so that the execution times could be compared.
In the conference task, participants were asked to make an audio call and a video call, using either the desktop or the mobile platform (see section 4 for details). This task was designed with the purpose of testing whether there are any advantages in using mobile devices as extenders, that is, remote controllers for the desktop application.
In the mobile environment task, participants were invited to freely check their mailbox, send an email to a contact, and create a new appointment on the agenda. As the name implies, this task was conducted on the mobile platform, with the purpose of giving participants the possibility to explore the prototype's mobile version.
In the Social Media Services task, participants were asked to perform a set of scripted activities from supported services like Twitter or YouTube, and derivatives like TwitPic or TwitVid. The tasks focused on message viewing and publishing, as well as content viewing, search and publishing, namely of photos and videos, both in on-line services and in a local gallery.
After executing these tasks, participants were asked to fill in a small questionnaire to gather information on how easy and pleasurable it was to use each HCI modality and associated hardware, as well as each service.

Email Task (Desktop)
Since email is a well-known application, most participants did not encounter major problems when dealing with the interface. As shown in Figure 6, mean execution time differences between paraplegics and quadriplegics are quite low (i.e., forty-five seconds). Also, differences between the control subject (non-impaired individual) and impaired participants are low (i.e., twenty-four seconds).
These results suggest that by giving mobility-impaired individuals a simpler and multimodal-enabled email interface, we have managed to improve their interaction with this kind of application. Furthermore, participants used speech and touch more often than traditional hardware (keyboard + mouse) interfaces, especially quadriplegics (see Table 2). Regarding typing, there are some issues to consider with quadriplegics. When the physical limitation level is higher and participants cannot use their hands and arms (as was the case of subject 1), they resort to speech. When the quadriplegia level is lower (e.g., subject 4), they can resort to speech or touch, not only to write, but also to select items. Also, subject 4 said that it was easier for him to type with a virtual keyboard than with dictation. Other participants, however, were more interested in speech dictation than in other modalities. So, in these cases, multimodality gives users the freedom of choice and the possibility of alternating between modalities, as they see fit.

Agenda Task (Desktop)
Regarding the agenda interface, participants considered it easier to interact with than the email interface and than the applications they usually deal with. Similarly to the email task, in this task participants used alternative modalities (speech, touch) more often than the traditional ones (see Table 3).
Execution time differences between quadriplegics and paraplegics (eleven seconds) and between impaired and control participants (eight seconds) are considered negligible. Taking into account that subject 4 has an advanced quadriplegia state, and did not have problems dealing with the interface, we can consider that participants managed to overcome their current limitations by using alternative modalities. Due to the nature of this task, in which participants had to select items, touch and speech seemed to be the best suited modalities. As opposed to the previous task, we've

Conference Task
For this task we posed the following question: "if speech is not available on the desktop, can a remote controller be used as a replacement?". As observed in other tasks, individuals with advanced quadriplegia can only use speech or gaze-based devices. As such, what would be preferable if these modalities were not available? In order to test this, the conference interface was only available on the desktop without speech input, but it could be controlled using a smartphone. Quadriplegics considered that this approach could have many advantages, namely when they are away from their computer or simply cannot use speech. On the other hand, paraplegics did not feel that this approach was very useful. Nonetheless, all participants considered the interface and the overall interaction simple and easy.

Mobile Environment Task
In the mobile task, participants were invited to try email and agenda on the mobile device, by doing common tasks like reading received email or creating a new appointment, according to a previously elaborated script. Both email and agenda offered similar functionalities to those offered on the desktop, but with reduced functionality in some cases.
During this task we noticed that quadriplegics experienced difficulties using touch on mobile devices with resistive displays and also using 3D gestures. 3D gestures were available as a non-exclusive way of selecting items on lists. We observed that quadriplegics unintentionally triggered 3D gesture events when handling the smartphone.
Due to technical limitations, speech was only available through a "push-to-talk" (PTT) feature, making the application a little harder and, in some ways, confusing to use. Also, dictation mode was not available.
Despite these issues, all participants considered the interfaces simple and easier to interact with when compared with existing commercial applications that allow access to communication services, both in terms of interaction (taking into account requirements such as button size) and interface. Also, having a set of functionalities that are important to mobility-impaired individuals available on a mobile device was considered to be very important.

Social Media Services Task
In this task, participants overall enjoyed the application, finding it easy to use, with a low learning curve and clear, large and easy to use UI controls. Participants preferred to use modalities they were more used to, such as the keyboard, be it the virtual on-screen variant or the physical keyboard. They felt, however, that touch interaction was more natural than the mouse, a tendency that was verified in all participants, as the mouse was not used at all during the proposed tasks.
With regard to speech interaction, several factors influenced the overall experience, such as the participant's tone, volume and speech rate, among other factors. Voice interaction in command-and-control mode proved very effective, provided the participant was able to project his or her voice with enough volume to be captured by the device's microphone. Depending on the subtask, dictation mode either worked as expected or did not produce adequate results.
Participants, however, felt that the combination of different input and output modalities can significantly improve their daily interaction with SMSs, attributing advantages to this interaction approach such as the ability to change to more suited modalities based on privacy needs, to perform tasks that they don't know exactly how to perform with a specific modality, to allow them to multi-task on their daily lives, or to improve mobility, be it to interact with a mobile device while driving or while traveling.

Questionnaire
At the end of each individual session, we asked each participant to fill in a questionnaire. First, we asked participants to rate how difficult it was for them to use each evaluated modality, according to a 6-point Likert scale [6], ranging from Impossible (1) to Very Easy (6). As can be seen in Figure 7, participants overall found the desktop modalities easier to use than the mobile ones, mainly due to the higher definition of the desktop's peripherals (microphone and screen), when compared to the mobile device.
The second question focused on how much participants enjoyed using each modality. This question also resorted to a 6-point Likert scale, ranging from Hated (1) to Loved It (6). What we found here was that, overall, participants' opinions matched the results from the previous question for all modalities except speech interaction, leading us to believe that, due to certain limitations with speech dictation, participants were underwhelmed by how functional speech interaction could be for them. These results can be seen in Figure 8.
Subsequently, we asked whether participants considered that the prototype application would improve their daily tasks. The overall consensus was that it would. However, some quadriplegic participants noted that they would find the mobile prototype more useful to them, given their on-the-go communication needs. Question 4 focused on the prototype UI's ease of use. In this case, participants also answered that they found the UIs easy to use.
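The comparison between the ease-of-use and enjoyment questions can be expressed as a per-modality gap between mean ratings on the two 6-point scales. The Python sketch below shows one way to compute it. The ratings are invented placeholders, not the study's questionnaire responses, and the helper names (`mean_by_modality`, `ease_enjoyment_gap`) are hypothetical.

```python
from statistics import mean
from typing import Dict, List

SCALE_MIN, SCALE_MAX = 1, 6  # both questions use a 6-point Likert scale


def mean_by_modality(ratings: Dict[str, List[int]]) -> Dict[str, float]:
    """Mean rating per modality, rejecting values outside the 1..6 scale."""
    for modality, values in ratings.items():
        for value in values:
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(f"{modality}: rating {value} is off the scale")
    return {modality: mean(values) for modality, values in ratings.items()}


def ease_enjoyment_gap(ease, enjoyment) -> Dict[str, float]:
    """Mean ease minus mean enjoyment; a large positive gap flags a mismatch."""
    ease_means = mean_by_modality(ease)
    enjoyment_means = mean_by_modality(enjoyment)
    return {m: ease_means[m] - enjoyment_means[m] for m in ease_means}


# Placeholder ratings, NOT the study's data.
ease = {"touch": [5, 6, 5], "speech": [5, 4, 6]}
enjoyment = {"touch": [5, 6, 5], "speech": [3, 4, 3]}

gaps = ease_enjoyment_gap(ease, enjoyment)
print({modality: round(gap, 2) for modality, gap in gaps.items()})
```

With so few participants, such means are only descriptive; they help locate a mismatch of the kind reported for speech rather than establish a statistical difference.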
Finally, when asked which additional features they would like the prototype to offer, the majority of participants said they would like the prototype to offer access to additional social media services, like Facebook, as well as to other types of services, like daily newspapers, weather, movies, but also to other types of media content like on-line and off-line music playback.

Conclusions and Future Work
In this paper we presented the results of a usability study conducted with a sample of mobility-impaired users who were asked to carry out simple tasks using a set of multi-platform (desktop, mobile) and multimodal (keyboard + mouse, touch, speech, gesture) prototype applications. The prototype followed design guidelines regarding HCI identified in previous work [9].
As anticipated based on the results of previous work, we observed that the preferred modalities for mobility-impaired individuals are speech and touch. It was very interesting to observe that participants opted to use alternative HCI modalities rather than the traditional ones, such as keyboard and mouse. Regarding 3D gestures and 2D gestures, quadriplegics had some issues when using them, and so, as recommended in the design guidelines, those modalities had less obtrusive alternatives that offered the same feature set in the prototype.
When dealing with email, agenda, conference and social media services, participants felt these interfaces were simpler and easier to interact with than current ones. Regarding the mobile device, we note that enhanced speech interaction, namely through the support of a better dictation experience, can potentially improve the perceived usability of this platform, and as such, this platform could have an important role with quadriplegic individuals, by increasing their mobility.
As future work, besides improving the current prototype to overcome some issues with the mobile platform, feature set extensions should be taken into account, mainly to support other kinds of services that were considered important by the study participants. Further work should also focus on long-term studies, evaluating how the prototype works in participants' daily lives, as well as determining whether the results reported in this paper can be extrapolated to a larger user audience and to other languages and cultures.
current pt-pt speech engine works well for item selection and command and control, rather than for dictation.

Table 1. Participants of the User Study panel.

Table 2. Email task modality count.