TEXT-BASED ENVIRONMENT FOR BLIND PERSONS: CONCEPTION AND OPERATING SYSTEM DESIGN

Research article
Issue: № 1 (8), 2013
Published:
2013/02/08
PDF

Pozhidaev M. S.

Post-graduate academic degree in technical sciences, ALT Linux Company, Tomsk State University

TEXT-BASED ENVIRONMENT FOR BLIND PERSONS: CONCEPTION AND OPERATING SYSTEM DESIGN

Abstract

The paper covers conclusions of the attempt to create a prototype of text-based user environment without graphical objects and operating system based on it. Acquired results prove conception of handy device for wide range of blind users. Technical details of further research are suggested.

Keywords: accessibility, blind persons, GNU/Linux, Java, operating systems.

Introduction

Various information technologies became an essential part of social and professional life but level of their accessibility for blind and visual impaired persons remains still insufficient. Some results and experience in this area are acquired so far, nevertheless, blind and visual impaired users are not able to fill themselves completely free using PCs and mobile devices. That fact can be treated as one of the reasons tackling high integration of disabled people into social life.

Currently the popular way is to install some screen reading software such as Jaws for Windows by Freedom Scientific [11], VoiceOver by Apple Inc. [2]as well as Orca [15] — the favorite solution of GNU/Linux users. This software takes the information on screen and transforms it into speech form, describing any action user does. Today there is the ubiquitous Graphic User Interface (GUI) almost on every computer and it is the most important problem we should take into account here. This type of user interface (UI) was designed to be controlled mostly by mouse as well as other pointing devices. This input method is rather convenient for sighted users but for blind persons it is not the case. They are enforced to navigate over graphical objects on a screen such as window widgets, menus and dialogs by keyboard only and this work takes a lot of extra time. The mentioned way can be considered as appropriate for work at home or at office but in crowded and noisy environment such as at international airport, show or conference it is turned out completely unacceptable. In addition running OS with full GUI takes more hardware resources than it is needed for accessible solution itself. This paper offers an accessible OS of a new design based on GNU/Linux and covers the experience already got by creating preliminary prototypes.

  1. First prototypes

The work on the initial prototype was launched in 2008 as an attempt to create light accessible distribution for blind users [21, 22, 23, 24]. It uses GNU Emacs [8] as a main user environment with added special extension called Emacspeak [6] developed by T. V. Raman from Google Inc. The system is based on the Linux kernel and is applicable for installation without any external sighted help except boot device selecting in BIOS setup utility which does not use any features from external software and evidently is not compatible with any speech based accessibility. The installation process is performed by copying live-CD environment to a hard drive as it is with several minor fixes. We will not describe this technique in details here since it is widely popular and well-known approach. The only important thing is that cloning process doesn't require any UI and can be done by single command launch with provided set of necessary parameters.

The environment was enhanced with some additional software developed by the authors as part of work on the system being described. This software includes tools for text books listening with text-to-speech (TTS) engines, main menu plug-in for quick applications launch, media player control service (will be described later), the set of auxiliary scripts and so called speech server. Speech server is needed to manage stream of speech commands in real-time mode and to prevent simultaneous speech signals overlapping. It doesn't synthesize speech by itself and uses external TTS engines, in particular RHVOice [16].

Although this distribution failed as instrument for wide range of users it brings important experience proving the general conception is right. As known, GNU Emacs aims to be a flexible text editor but during developing process list of its features went out of editor purposes. Now it contains file manager, mail and news reader, calendar, FTP client and very restricted web browser. The crucial advantage allowing consider it as an accessible environment is an ability to bring to user every working object in text form. For example, mail message can be easily constructed as text file with recipient address on the first line, a subject on the second and with message content on all other lines. Nearly all needed operations could be reorganized this way if they don’t imply graphic material. GNU Emacs also shows in text form web-pages aligning content as one text document and excluding pictures. While user navigates over prepared text object the speech extension listens all cursor movements and forwards stream of commands to speech server for further translation into audio form for user. If user moves from one text line to another he hears line text under new position and hears new character in case of moving left-right on one line.

This system became a working platform for its developers for several years and it makes possible to get some conclusions about its advantages and disadvantages. The main advantage is a very high speed of working process. With some user training speed approaches to speed of usual sighted work. It can be explained by user interface simplicity since it doesn't have any GUI elements any more. A user doesn't take care about what objects there are on a screen and how they are organized. Even more, any association between objects and their graphic representation is not required at all. At same time light environment yields low system resources consumption and it makes possible using of cheap mobile devices. One of the installations was performed on ASUS EEE PC 1025C netbook ($330 in Amazon shop) and it was successfully used in travel and during conferences participating. That experience implies tasks successfully done during flights and work inside of crowded and noisy airports including three of busiest airports in Europe (London Heathrow, Paris Charles de Gaulle and Frankfurt-am-Main) and three international Moscow airports. The set of performed tasks covers maintaining wifi-connection for mail and news reading, handy notes, mp3 and text books reading and rarely voice recording.

Actually, a wide range of accessible tasks can be easily explained by large variety of software available among GNU/Linux distributions but some additional possibilities should be mentioned especially detailed. There are two graphic software packages: Latex [12] and Lilypond [13], capacitated for taking input material in form of text files. Latex is a publishing system for physics and mathematics books or papers and Lilypond is a music score typesetter. Both of them gives a very high quality of an output and remains accessible for blind users. Known experience proves they are a real way to create materials even at scale of dissertation. Final sighted checking of an issue is still required but work is done mostly by blind persons on their own. That software is a very special case due to high required user skills but we would like to mention it here. Latex can produce also presentations in pdf-format (proved by known experience of blind users), can include graphic figures constructed by text form commands and embed music objects from Lilypond.

We took a look what this system offers, but more important is the set of discovered disadvantages making prepared system inappropriate to be used widely. The developers team is about to give up any active work on GNU Emacs approach due to following reasons:

  1. Weak protection against improper user actions. Even a very easy inaccurate step breaks proper system behaviour. GNU Emacs completely relies on user awareness what he does.
  2. The platform is not suitable for developing complex user applications. The GNU Emacs is a Lisp run-time environment purposed for managing a set of text areas. All of such areas called "buffers" linked weakly with each other and any attempt to create an application with more than one buffer gives generally unstable product with a lot of glitches. The system must be capable for quick extensions creating because user needs various add-ons for social networks instruments, blogs reading and other web-services access.
  3. Web-browser without Java Script support. Web-browsers became something grater than applications for HTML-pages viewing and may be described in turms of a platforms for launching web-applications. GNU Emacs has a text-mode browser that can be used for HTML-pages parsing but nothing else.
  4. User unfriendly interface. The environment of GNU Emacs can be considered suitable only for professional users with high level of experience in GNU/Linux. No national language support is available.
  5. No support for some closed but freely available application such as Skype and maybe some others. Skype has an accessibility support but it cannot be covered by any GNU Emacs utilities.
  6. No support for popular office documents formats.

Since a lot of free and open source software is available publicly it should be involved in new accessible environment creation. In two sections below we will describe new UI approach and general design of accessible OS free from disadvantages listed above as far as it possible.

  1. New user environment

New user environment should consist of following items on a screen:

  1. A set of tiled working areas.
  2. Easy access command line for quick operations launch.
  3. A popups support for dialogs and interactions raised depending on applications functions.
  4. Main and context menus.

Nearly all of these things came from GNU Emacs as they are but they should be managed by general application mechanism which GNU Emacs doesn't have. Every text area should be strongly associated with some application and must be closed on application shutdown. All text areas are displayed in tiled mode on the screen and currently active application decides which of them and how must be shown. User should be allowed to switch from one application to another quickly (for example, by Alt+Tab key combination) with consequential switching of all corresponding visible areas on a screen. One more important feature is a quick text search through text in any visible area without application support.

The term "text area" we used here implies rectangular area on the screen filled with some text shown by monospaced font. Text color and size should be available for changing easily by system-wide commands without any application support. On screen content can be used by users with low vision (not totally blind). For low vision users very useful can also be a feature of highlighting with easy grey background row and column where cursor is placed.

A command line support plays highly important role in suggested approach. In contrast to widely popular GNU/Linux shell expressions command line we are describing should accept short words suitable for quick typing. Inside of noisy room it is easier to press Alt+x on keyboard to invoke command line, type "news", "mail", "message or whatever else user wants to launch new application than to look for required item in menus listening speech output. The keyboard combination Alt+x is taken from GNU Emacs and no doubt it is really convenient way. We consider command line here as particular case of "popup" areas. Popup areas are text areas shown on screen at bottom for a short time and purposed for additional user interactions and conversations. Various types of menus can also be implemented as popup areas.

Java SE [10] is chosen as the main programming language for developing accessible environment itself as well as for all user applications. There are a lot of well-known Java libraries available publicly providing various functions needed for user applications. We mean here JavaMail [9] for mail reading, Rome [18] for RSS parsing, Apache POI [1] for office documents filters, etc. A user interface layout for each particular application is not described here since in most cases it is quite obvious. For example, mail reader should consist of three areas: mail groups, a list of mail messages in a group and selected message text. File manager looks like usual twin-panel manager with additional area at a bottom of screen for list of active tasks (files copying, moving, etc). General user navigation over text objects and whole system can be taken from GNU Emacs and left almost without any changes. It is a very good part of GNU Emacs legacy except of one thing: spoken text must be constructed by application and it may be different than text on screen. In GNU Emacs user hears always a text on screen.

  1. The system core

We should take a close look at system core working behind accessible UI. It consists of several components listed below:

  1. Event queue launched inside of Java virtual machine in separate thread.
  2. LDAP interface for flexible storing of user data like address book, calendar, etc.
  3. D-Bus [7] interface to a number of system services.
  4. Auxiliary service for media player control.
  5. Speech server.
  6. Speech-enabled window manager for X.org.
  7. Small screen reader based on AT-SPI [3] services.

Event queue stands here for central dispatcher of events and the most important of them, of course, are user input commands. That queue should be also accessible for events exchanging between several applications being running in the environment. It is very important thing because this feature is the most proper way for multi-threaded applications synchronization.

Now there is a general moving in GNU/Linux world toward D-Bus system services and that fact makes a lot of things easier for system developing. Using D-Bus which has good support in Java we can get a easy control over key system services like Udisks [19] for removable drives, Network Manager [14] for network connections and others. Systemd service [17] should be mentioned here as an especially hopeful suggestion, however it needs more time to clarify whether Systemd is really able to become a reliable low-level component or not.

Media player service must take care of any user commands to listen music file or speaking book. It should be controlled also through D-Bus and invoke real media player (very likely VLC [20]). Media player can not be launched directly since blind user needs some features for speaking books (literature recorded by narrator as audio files) including bookmarks managing, etc. In addition there is a special format of books for blind persons called Daisy [5] and it also should be supported at the level of media player control service.

  1. Web-browser and AT-SPI applications

There is a couple of special cases when user has to deal with tradition GUI anyway:

  1. Web-browser.
  2. Closed popular applications (Skype).

Without these applications support our system can not be considered completed. In case of web-browser its nature doesn't allow express its functionality in any text only form. Skype due to its license restrictions should be taken as a whole binary application. So we have to implement some support making these application available for blind users although both of them are exceptions in proposed conception of text only interface. Fortunately GNU/Linux GUI has additional service called AT-SPI helping disabled persons operate with GUI applications and it remains still available for us. If we choose Firefox as suitable web-browser we can use it and Skype through AT-SPI functions. Regarding web-browser we should notice to be fair there is one more solution — ChromeVox add-on [4] for Chrome/Chromium browser, but its less comfortable because it generates speech by itself and doesn't expose any data outside of browser. That means there is no way to pick up browser data from external tools, for example, to be copied into custom buffers like clipboard. Since we rejected any existing GUI we can not use prepared screen readers used for transforming AT-SPI data into speech and have to implement its replacement.

The last thing we need to get our design full is a tiny window manager. Main Java virtual machine we have described in previous sections and additional applications like Firefox or Skype are independent processes with their own windows on a screen controlled by X.org server. A special speech enabled window manager should be created to link all of them into one whole suite. Java environment could be considered as main application playing predominant role and other applications became available easily with simple way to switch between them with informative speech notifications. No need to develop such window manager from scratch since there are a lot of simple suitable examples.

Conclusions

The design described here is in active development currently and authors intent to create a prototype suited for illustration what such system could be used for (the project is designated as "Luwrain"). Meantime a lot of required components are already created since almost all of the tasks are the widely known problems solved by large variety of FOSS projects. The only thing should be written completely from scratch is the Java user environment and its conception is the main research goal.

A couple of related questions stays undiscussed with these paper. First of them is the braille display support. These theme doesn't need any research since Os just has to support corresponding devices. The second is a using of tablet computers. This is a big problem because this sort of devices can be used only by tracking graphical objects position on a screen. Any way this area needs a lot of deep research respecting experience of the projects providing accessibility for multi-touch devices. It is especially important due to threat to low-cost laptops by tablets in current market situation. As a matter of fact some companies are about to prefer tablets production rather low-cost laptops.

However, a lot of cheap embedded computers based on ARM architecture are coming and they can be considered as real companions for wide range of disabled persons. GUI doesn't play valuable role in their using and we may expect demand of any text-based solutions.

Литература

  1. Apache POI - the Java API for Microsoft Documents // http://poi.apache.org/
  2. Apple - Accessibility - VoiceOver - In Depth // http://www.apple.com/accessibility/voiceover/
  3. Assistive Technology Service Provider Interface // http://www.linuxfoundation.org/collaborate/workgroups/accessibility/atk/at-spi
  4. ChromeVox // http://www.chromevox.com/
  5. DAISY Consortium // http://www.daisy.org/
  6. Emacspeak — The Complete Audio Desktop // http://emacspeak.sourceforge.net/
  7. freedesktop.org - Software/dbus // http://www.freedesktop.org/wiki/Software/dbus
  8. GNU Emacs // http://www.gnu.org/software/emacs/
  9. JavaMail API // http://www.oracle.com/technetwork/java/javamail/index.html
  10. Java Standard Edition // http://www.oracle.com/technetwork/java/javase/index.html
  11. JAWS Screen Reading Software by Freedom Scientific // http://www.freedomscientific.com/products/fs/jaws-product-page.asp
  12. LaTeX — A document preparation system // http://www.latex-project.org/
  13. Lilypond // http://www.lilypond.org
  14. Network Manager // http://projects.gnome.org/NetworkManager/
  15. Orca // https://live.gnome.org/Orca
  16. RHVoice // https://github.com/Olga-Yakovleva/RHVoice
  17. Systemd // http://freedesktop.org/wiki/Software/systemd/
  18. The Rome projects // http://rometools.org/
  19. Udisks // http://www.freedesktop.org/wiki/Software/udisks
  20. VideoLAN - Official page for VLC media player // http://www.videolan.org/vlc/
  21. Пожидаев М. С. Описание работы по созданию специализированного дистрибутивного решения для незрячих пользователей на основе дистрибутивов ALT Linux// Информационные технологии для инвалидов по зрению в современном мире. Проблемы и перспективы: Материалы 7-ой научно-практической конференции (4 декабря 2008). - М.: 2008. - С. 82-86
  22. Пожидаев М. С., Камынин А. Н. Развитие окружения ALT Linux Homeros лиц с ограничениями по зрению//Седьмая конференция разработчиков свободных программ на Трубеже. Переславль, 26-27 июля 2010 г.: Тезисы докладов. - М. : Институт логики, 2010. - С. 78-80
  23. Пожидаев М. С. Дистрибутив со вспомогательными технологиями ALT Linux Homeros Friend: первый опыт // Восьмая конференция разработчиков свободных программ: Тезисы докладов / Обнинск 25-26 июля 2011 г. М.: Альт Линукс, 2011. С. 9-11
  24. Пожидаев М. С. Разработка перспективы развития дистрибутива со вспомогательными технологиями ALT Linux Homeros // Открытые технологии : Сб. материалов Восьмой Международной конференции разработчиков и пользователей программного обеспечения Linux Vacation / Eastern Europe 2012, Гродно, 07-10 июня 2012 г. / под общ. ред. Д. А. Костюка. - Брест : Альтернатива. - с. 23-25

References