A Journal Header Reader program for the blind

Access to scientific journal article headers


Last Update: Monday, 29-May-2006 06:13:13 PDT




Authors:
Thomas Kahlisch
Gunthild Vogel
Address:
Dresden University of Technology
Department of Computer Science
Institute of Information Systems
Mommsenstr. 13
Germany 01062 Dresden
Phone: +49-351-4575-410
E-Mail: journal@iis350.inf.tu-dresden.de

1. Motivation

The advantage of structured markup in SGML (Standard Generalized Markup Language) [1] has recently become clear [2]. This technology is being used to automatically convert documents into accessible forms for blind people.

In Germany, one of the first sets of documents available in SGML consists of the scientific journal article headers from the "Springer Verlag Journal Preview Service" [3]. This article describes the "Journal Header Reader" application, which we developed to make scientific documents accessible to blind people in several formats. The following section gives an overview of the SGML facilities used in our project.

2. SGML facilities: ICADD and MAJOUR-header

2.1. ICADD: guidelines for making SGML documents accessible to blind people

The "International Committee for Accessible Document Design" ("ICADD") has issued guidelines to transfer SGML data into the format of the ICADD22 Document Type Definition (DTD) [4].

Note: Further information about the International Committee 
      for Accessible Document Design can be obtained from the 
      Chairman, Tom Wesley, University of Bradford 
      Bradford, BD7 1DP, United Kingdom, 
      Phone:  +44 1274 383902, 
      Email: t.a.b.wesley@bradford.ac.uk.

This DTD defines a flat, general element structure for a book or an article, as needed to produce documents in Braille, large print or synthetic speech.

The transformation of document instances into the ICADD22 format is based on the ICADD facilities defined in the international standard "NISO / ANSI / ISO 12083" [5]. These facilities describe a mechanism that adds SDA ("SGML Document Access") attributes to the source DTD. The attributes contain the information needed for the translation into Braille or speech.

Because these attributes are added to the DTD, no part of the document instances has to be changed. The transformation from the original document into the various special formats can therefore be done automatically.
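The mechanism can be illustrated with a minimal Python sketch. It assumes a strongly simplified model in which each source element is associated with two SDA-style values: a target ICADD element (in the spirit of the SDAFORM attribute) and a generated text prefix (in the spirit of SDAPREF). The table SDA_MAP, the source element names and the function to_icadd are hypothetical and are not taken from the MAJOUR-header DTD.

```python
# Hypothetical SDA table: source element -> (ICADD target element, generated prefix).
# In a real DTD these values would be fixed attributes on the element declarations.
SDA_MAP = {
    "abstract": ("para", "Abstract: "),
    "journal": ("para", "Journal: "),
    "keyword": ("litem", ""),
}


def to_icadd(element, text):
    """Rewrite one source element as ICADD markup.

    The source instance is only read; all mapping knowledge lives in SDA_MAP.
    """
    target, prefix = SDA_MAP[element]
    return f"<{target}>{prefix}{text}</{target}>"


if __name__ == "__main__":
    print(to_icadd("abstract", "SGML for accessible documents"))
```

Since the mapping is looked up by element name only, the same table serves every document instance of the source DTD.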

One example of this mechanism is the "HTML to ICADD TRANSFORMATION SERVICE" developed by Jeff Suttor at the University of California, Los Angeles [6]. Through an HTML form, the service translates a document instance that is valid according to HTML 2.0 into Braille.

2.2. The MAJOUR-header DTD

The MAJOUR-header DTD was designed by the European Workgroup on SGML ("EWS") [7]. The DTD defines the structure of a scientific journal article header. Its complex structure describes subsets of elements for headers of both general and jurisprudence articles.

A general article header includes the following information:

All the items mentioned above are marked up as small elements in the DTD structure. This makes it easier to convert the documents into a database system for later retrieval.

One MAJOUR-header application is the "Springer Verlag Journal Preview Service" [3] of the German publisher Springer Verlag Berlin Heidelberg. This service provides a preview of the article headers for more than 100 scientific journals through the Internet.

Most of these documents are written in English. However, many of the article headers are bilingual. For example, a header may have the title, the abstract (summary) and the keywords in English and in German.

Each language uses a different Braille notation, and not every reader is familiar with all international notations. These notations should therefore be taken into account when bilingual documents are made accessible to blind users.

3. Access to scientific journal article headers for blind people

3.1. The output formats

In our "Journal Header Reader" project, we convert the MAJOUR-header instances into the ICADD22 format by using the ICADD guidelines. After that, the documents are converted into three different formats: ASCII, Braille grade I and Braille grade III.

The ASCII format is needed to produce speech output. This format can also be displayed on the computer screen and on a Braille display that can handle 8-dot computer Braille.

The grade I and III Braille formats can be used on a Braille display or Braille printer. Grade I, II and III Braille are notations with 6-dot Braille cells. These formats are the usual notations used by Braille readers.

Braille grade I is a format for blind people who are familiar with reading uncontracted Braille. This format differs from the ASCII version in the following points:

Braille grade III is a format for experienced readers. It provides a set of more than 300 contractions or abbreviations of the German language. This format is only used for the German parts of the documents.

All of the English parts of the documents are in American grade I Braille. Because most German Braille readers are not familiar with American or English contracted Braille, that notation is not used in this project.
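The rule for choosing a notation can be summarised in a short Python sketch. The function name, the language codes and the returned labels are hypothetical. The sketch also assumes that the German parts of the grade I output use German uncontracted Braille, which the text above does not state explicitly.

```python
def braille_notation(language, preferred_grade):
    """Return the Braille notation for one text part.

    language:        "DE" or "EN" (language of the text part)
    preferred_grade: "I" or "III" (the output format being produced)
    """
    if preferred_grade not in ("I", "III"):
        raise ValueError(f"unsupported grade: {preferred_grade}")
    if language == "EN":
        # English parts are never contracted, whatever format is produced.
        return "American grade I"
    if language == "DE":
        # Assumption: German parts follow the requested grade.
        return f"German grade {preferred_grade}"
    raise ValueError(f"unsupported language: {language}")


if __name__ == "__main__":
    for lang in ("DE", "EN"):
        print(lang, "->", braille_notation(lang, "III"))
```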

3.2. The transformation process

The transformation process is implemented as a batch processor. The translation language OmniMark is used for the SGML conversion. The file handling and database facilities are written in C++.

The batch processor has the following steps:

STEP 1: MAJOUR-header validation

The document instances from the publisher are not 100% valid according to the MAJOUR-header DTD. The first step is therefore to convert the documents into valid instances of that DTD.

A second task of the first processing step is the creation of entries for the protocol and the content databases. These databases are needed for the further processing of the document instances and can be used in further applications. For each article a new entry in these databases is created.

The protocol database is used by the later steps of the transformation process. One entry of the database contains the following information:

The content database stores the information that is needed for the search facility in the reading program "Journal Header Reader". The following information is saved for each article:

By using this database in the reading program, the user is able to search the entire set of available documents.

STEP 2: MAJOUR-header to ICADD

By using the SDA attributes, the complex structure of the MAJOUR-header document instances is converted into the flat ICADD22 DTD format. To prevent the loss of information, additional explanations are generated in the destination document.

In doing this, most of the small source elements are translated into list items with brief explanations. Explanations are strings of characters which describe the content of the whole list or the particular items.

Because of the rigid structure of the ICADD22 DTD, two particular problems had to be solved during this transformation step.

The content of the elements TITLE and AUTHOR of a MAJOUR-header document instance is more complex than the same elements in the ICADD format.

The MAJOUR-header TITLE element may have titles in several languages, footnotes and further references.

An AUTHOR element of the MAJOUR-header structure may consist of several names of authors, footnotes and further references.

To produce document instances that are valid according to the ICADD22 DTD without losing information, the ICADD output for these elements is treated specially.

Handling titles in several languages:

The content of the MAJOUR-header element TITLE is translated into a LANGUAGE and HEADING 1 element for each of the given titles.

Exp. 1.: (title part of an ICADD article header)
        ...
        <ti>several titles of one article</ti>
        <lang lang="DE"></lang>
        <h1>Der Deutsche Titel des Artikels</h1>
        <lang lang="EN"></lang>
        <h1>The English title of the article</h1>
        ...

The example Exp. 1. shows that the <ti> and the <lang> elements of the ICADD instance are used as occurrence flags; they carry no content that belongs to the article header.
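The title handling can be sketched in a few lines of Python. This is an illustration only, not the OmniMark code used in the project. The function name titles_to_icadd and the input format (a list of language code and title pairs) are assumptions.

```python
def titles_to_icadd(titles):
    """Build the title part of an ICADD article header.

    titles: list of (language code, title text) pairs in source order.
    """
    # The <ti> element only flags that a group of titles follows.
    lines = ["<ti>several titles of one article</ti>"]
    for lang, text in titles:
        # Empty <lang> element: marks the language of the next heading.
        lines.append(f'<lang lang="{lang}"></lang>')
        lines.append(f"<h1>{text}</h1>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(titles_to_icadd([
        ("DE", "Der Deutsche Titel des Artikels"),
        ("EN", "The English title of the article"),
    ]))
```

Emitting the language flag directly before each heading lets a later Braille step switch notation title by title.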

Handling lists of authors:

The names of the authors appear in the ICADD document instances as a list of items. A LISTHEADING element explains the content of the list. If the source document contains further information about an author, a sub list is generated for each author.

Exp. 2.: (list of authors with further information)
        ...
        <list>
        <lhead>List of authors</lhead>
        <litem>
                <list>
                <litem>first author <xref id="a1">
                </litem>
                <litem><fn id="a1">footnote first author</fn>
                </litem>
                </list>
        </litem>
        <litem>
                <list>
                ...     -- second author --  
                </list>
        </litem>
        </list>
        ...

The example Exp. 2. shows a list of authors of one article header. The optional element <au> of the ICADD DTD is never used in this application.
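A minimal Python sketch of the author handling follows. It rests on one reading of the rule above: as soon as any author carries a footnote, every author gets a sub list; otherwise the names are plain list items. The function name, the input format and the generated identifiers a1, a2, ... are hypothetical.

```python
def authors_to_icadd(authors):
    """Build the ICADD author list.

    authors: list of (name, footnote text or None) pairs.
    """
    # Assumption: sub lists are used for all authors once one has a footnote.
    nested = any(footnote is not None for _, footnote in authors)
    out = ["<list>", "<lhead>List of authors</lhead>"]
    for number, (name, footnote) in enumerate(authors, start=1):
        if not nested:
            out.append(f"<litem>{name}</litem>")
            continue
        ref = f"a{number}"
        out += ["<litem>", "<list>"]
        if footnote is None:
            out.append(f"<litem>{name}</litem>")
        else:
            # The cross reference and the footnote share one identifier.
            out.append(f'<litem>{name} <xref id="{ref}"></litem>')
            out.append(f'<litem><fn id="{ref}">{footnote}</fn></litem>')
        out += ["</list>", "</litem>"]
    out.append("</list>")
    return "\n".join(out)


if __name__ == "__main__":
    print(authors_to_icadd([("first author", "footnote first author"),
                            ("second author", None)]))
```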

STEP 3: ICADD to ASCII & Braille

The third step converts the ICADD22 document instances into the three different formats:

For the transformation into the different Braille formats the HBS [8] software from the German Fernuniversität Hagen is used. This program provides the capability to translate bilingual documents into several Braille formats.

Each format is stored as a separate file.

After the three processing steps have been executed, the article marked up according to the MAJOUR-header DTD is available in the following output versions:

The content database and the three documents accessible to the blind user are available to the reading program "Journal Header Reader".
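The overall data flow of the batch processor can be sketched in Python as follows. The sketch only shows how the three steps are chained and how each output format ends up in its own file. The function run_batch, the format keys and the file naming are hypothetical, the converters are passed in as plain callables, and the database entries created in step 1 are left out.

```python
from pathlib import Path

# Hypothetical keys for the three output formats.
FORMATS = ("ascii", "braille1", "braille3")


def run_batch(articles, validate, to_icadd, render, out_dir):
    """Run the three steps for every article.

    articles: dict mapping an article id to its MAJOUR-header source text.
    validate: step 1, callable text -> text.
    to_icadd: step 2, callable text -> text.
    render:   step 3, callable (icadd text, format key) -> text.
    Returns the names of the files that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for article_id, source in articles.items():
        icadd = to_icadd(validate(source))  # steps 1 and 2
        for fmt in FORMATS:  # step 3: one separate file per format
            path = out_dir / f"{article_id}.{fmt}"
            path.write_text(render(icadd, fmt), encoding="utf-8")
            written.append(path.name)
    return written
```

Passing the converters in as arguments keeps the sketch independent of the actual tools, which in the project are OmniMark, C++ and HBS.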

3.3. The Journal Header Reader program

The "Journal Header Reader" is a program that gives access to the documents generated from the MAJOUR-header document instances by the transformation process. The reader program can be used by blind and sighted users.

3.3.1. Hardware and software requirements

The program runs on MS-DOS and can be used in a network. The system does not have a graphical user interface and cannot be used with MS Windows.

The program output is available on the computer screen, the Braille display and via a German speech synthesiser.

A character based output is used for the computer screen.

Any Braille display that works with MS-DOS computers can be used.

The speech converter is the Sound Blaster based system TALKINGBLASTER. The system cannot handle English text, but it provides high-quality German speech output.

3.3.2. The User interface

The reader program provides a main menu with five items, which appear on the top line of the screen. A pull-down menu is available for each of these items.

The user can move through the menus by using the cursor keys. The main features of the program are also available through hot keys.

When the user activates a new menu item, the information is immediately shown on the Braille display and spoken by the speech converter.

Presenting the current information on these special output devices is an important feature for making menu-driven software accessible to blind computer users.
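The behaviour can be sketched in Python as a menu that forwards every newly selected item to all registered output devices. The class Menu, its method move and the wrap-around at the ends of the menu are assumptions. Only the four menu names that appear in this article are used in the example.

```python
class Menu:
    """A menu that announces each newly selected item on every output device."""

    def __init__(self, items, outputs):
        self.items = list(items)
        # Each output is a callable that takes the text to present,
        # e.g. a Braille display driver or a speech synthesiser.
        self.outputs = list(outputs)
        self.index = 0

    def move(self, step):
        """Move the selection by step items (assumption: wraps around)."""
        self.index = (self.index + step) % len(self.items)
        current = self.items[self.index]
        for present in self.outputs:
            present(current)
        return current


if __name__ == "__main__":
    braille_log, speech_log = [], []
    menu = Menu(["HELP", "SEARCH", "VIEW", "OPTIONS"],
                [braille_log.append, speech_log.append])
    menu.move(1)
    print(braille_log, speech_log)
```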

The main menu provides the following items and sub functions:

The HELP functions explain how to use the chosen topic. The information is available on the screen, the Braille display and via the speech converter.

The SEARCH function lets the user choose an article. To find the information, the program uses the content database created in step 1 of the transformation process.

To read an article header, the user has several choices in the VIEW menu.

The text and speech viewer provides the information as ASCII text on the computer screen, the Braille display and via the speech converter.

The text viewer presents the article header without using speech.

Depending on the chosen Braille format (grade I or III), the Braille viewer displays the text on the Braille display and the screen.

The OPTIONS menu allows the user to change the appearance of the menus, the speech parameters (volume, voice, speed) and the preferred Braille format (grade I or grade III).

4. Conclusion

The concept of the transformation process and the reader program can be adapted and improved in further projects. Integrating an English speech synthesiser into the reading program would improve the handling of bilingual documents.

The capability of handling different formats is an important advantage of the system. Different blind people may have different backgrounds and requirements.

The publisher Springer Verlag is going to publish entire articles in SGML, and it will be an interesting task to extend the developed program into a full journal reader.

For this task, it is necessary to extend the ICADD guidelines with mechanisms to handle tables and mathematics.

Another topic for further development is to establish strategies for navigation and hypertext capabilities in reading systems accessible to blind people.

We are very interested in exchanging our experiences with people who are involved in similar projects.


Bibliography

[1] ISO
ISO 8879:1986 Information Processing - Text and Office
Systems - Standard Generalized Markup Language (SGML)
Geneva, 15 October 1986.
[2] B. Bauwens, J. Engelen, F. Evenepoel, C. Tobin, T. Wesley
Structuring Documents: the Key to Increasing Access to
Information for the Print Disabled
in: 4th International Conference ICCHP '94
Proceedings
Vienna, Austria, September 14-16, 1994
Springer-Verlag Berlin Heidelberg, 1994.
[3] Springer Verlag
Springer Verlag Journals Preview Service
ftp and gopher: trick.ntp.springer.de
E-Mail: svjps@vax.ntp.springer.de
[4] ISO
ANSI / NISO / ISO 12083
Electronic Manuscript Preparation and Markup
Annex 8: Facilities for Braille, large print and
computer voice
1994.
[5] ISO
ANSI / NISO / ISO 12083
Electronic Manuscript Preparation and Markup
1994.
[6] HTML to ICADD TRANSFORMATION SERVICE
Jeff Suttor
University of California - Los Angeles
URL: "http://www.ucla.edu/ICADD/html2icadd.html"
[7] European Workgroup on SGML
MAJOUR - Modular Application for Journals
DTD for Article Headers
1991
[8] HBS - Hagener Brailleschrift System
Fernuniversität Hagen - Gesamthochschule
Zentrum für Fernstudienentwicklung

