UMA: A System for Universal Mathematics Accessibility


A.I. Karshmer, University of South Florida, [email protected]
G. Gupta, University of Texas at Dallas, [email protected]
E. Pontelli, New Mexico State University, [email protected]
K. Miesenberger, Universität Linz, [email protected]
N. Ammalai and D. Gopal, Logical Software Solutions, narayan|[email protected]
M. Batusic and B. Stöger, Universität Linz, [email protected] [email protected]
B. Palmer, New Mexico State University, [email protected]
H-F. Guo, University of Nebraska Omaha, [email protected]

ABSTRACT We describe the UMA system, a system developed under a multi-institution collaboration for making mathematics universally accessible. The UMA system includes translators that freely inter-convert mathematical documents transcribed in formats used by unsighted individuals (Nemeth, Marburg) to those used by sighted individuals (LaTeX, MathML, OpenMath) and vice versa. The UMA system also includes notation-independent tools for aural navigation of mathematics. In this paper, we give an overview of the UMA system and the techniques used for realizing it.

Categories and Subject Descriptors I.7.2 [Computing Methodologies]: Document and Text Processing—Document Preparation

General Terms Human Factors, Design

Keywords Visually impaired, math accessibility

1. INTRODUCTION

The Internet has changed the whole model of education, and the advent of wireless technology has made the Internet a pervasive presence. Education now includes access to a wealth of information previously locked up in libraries, and materials can be tailored for individual or group needs by professors. Professors now act as guides through the collective knowledge of our generation. Showing students how to access this vast source of knowledge and harness it to their needs is now part of the educational experience. This implies a greater need to sustain effective communication between students and educators, crossing barriers of diversity in abilities and backgrounds. In this project we concentrate on promoting universal accessibility of mathematics, with particular focus on the needs of individuals with visual impairments.

Digital representation of mathematics, in the sighted world, commonly relies on markup notations. These introduce a well-defined class of symbols used to capture either the typographical (e.g., as in LaTeX) or the structural (e.g., as in MathML) aspects of mathematical expressions. The marked-up expressions are machine understandable and can be used, e.g., to produce visually appealing and intuitive presentations. Unfortunately, these representation media are inherently visual and rather inaccessible.

To make mathematics accessible to the visually impaired, a number of notations have been designed that extend Braille so that mathematics can be encoded. Some of these notations pre-date the computer revolution. Of these, the Nemeth math notation [14] and the Marburg notation [7] are the most prominent and most widely used. However, these notations use highly context-sensitive formats, primarily because they aim to keep the visually impaired reader constantly informed of the context in which the current item occurs; as such, they are not easily amenable to automated processing. Thus, blind and sighted mathematicians “talk different languages”. We address this problem as follows:

(1) Develop technology capable of seamlessly inter-converting between different representation formats for mathematics, including visual markups (e.g., LaTeX) and Braille formats (e.g., Nemeth).
(2) Provide visually impaired mathematicians with tools to uniformly access and navigate mathematical expressions, independently of their original encoding.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ASSETS’04, October 18–20, 2004, Atlanta, Georgia, USA. Copyright 2004 ACM 1-58113-911-X/04/0010 ...$5.00.

[Figure 1: System Organization. The Common Interchange Format links MathML, OpenMath, LaTeX, Nemeth, and Marburg.]
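The organization in Figure 1 routes every conversion through the common interchange format, so each notation needs only one reader and one writer rather than a dedicated translator for every other notation. The following is a minimal sketch of that hub arrangement, not the UMA implementation: the names `register` and `convert`, the tuple used as a stand-in for an OpenMath-style tree, and the two toy notations ("infix" and "prefix") are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

# Stand-in for a common interchange tree, e.g. ("plus", "x", "1").
Cif = Tuple[str, ...]

READERS: Dict[str, Callable[[str], Cif]] = {}   # notation -> interchange tree
WRITERS: Dict[str, Callable[[Cif], str]] = {}   # interchange tree -> notation


def register(name: str, reader: Callable[[str], Cif],
             writer: Callable[[Cif], str]) -> None:
    """Add one notation to the hub by supplying its reader and writer."""
    READERS[name] = reader
    WRITERS[name] = writer


def convert(doc: str, source: str, target: str) -> str:
    """Translate between any two registered notations via the hub."""
    return WRITERS[target](READERS[source](doc))


# Two toy notations that can only express a single binary sum.
register("infix",
         lambda s: ("plus", *s.split("+")),
         lambda t: f"{t[1]}+{t[2]}")
register("prefix",
         lambda s: tuple(s.strip("()").split()),
         lambda t: "(" + " ".join(t) + ")")

print(convert("x+1", "infix", "prefix"))
print(convert("(plus x 1)", "prefix", "infix"))
```

With N notations this design needs 2N translators, whereas direct pairwise conversion would need N(N-1).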

2. THE UMA SYSTEM

The development of the UMA system is a multi-institution collaboration; the organization of the system is shown in Figure 1. The UMA system has two subsystems. The first subsystem, the Interconversion Platform (IP), handles all aspects of the interoperation between different formats for mathematical representation. The IP provides seamless transitions among digital formats for mathematics and Braille-based formats. Interoperability rests on a common interchange format (CIF), which bridges any pair of formats. Thus, a mathematical document written in any of the formats shown in Figure 1 can be translated to any of the other formats. The CIF is based on the OpenMath [3] standard.

The second subsystem, the Navigation Platform, provides interactive visual and aural navigation of mathematical entities; the navigation relies on the representation of the expressions in the common interchange format. Navigation is applied directly to the OpenMath format, which lets us take advantage of the rich structure provided by this notation (e.g., explicit definitions of operators).

3. HANDLING NEMETH CODE

3.1 Introduction

In this project we decided to approach the problem of communicating Nemeth code by providing mutual inter-conversion with the LaTeX typographical format. This choice is without loss of generality, thanks to the ability to freely interconvert between LaTeX and OpenMath (described later). It turns out that converting LaTeX to Nemeth code is much easier than the other way round. LaTeX to Nemeth translators have indeed been built and marketed, for example, as part of the Scientific Notebook system. However, the problem of translating Nemeth Math to LaTeX was regarded as unsolvable until Gupta, Karshmer, and Guo developed a semantics-based approach based on logic programming [16]. Their system still has a few limitations: (i) spatially arranged expressions, such as determinants and matrices, have to be entered in a linear form, and spatial arithmetic/algebra is not allowed; (ii) only stand-alone expressions can be translated, and embedded text is not allowed. The aim of our current effort is to build a comprehensive, usable system that overcomes these limitations.

3.2 System Architecture

Mathematical expressions can involve complex fractions, lengthy derivations, or spatial arrangements like matrices or determinants. A Braille-based mathematics document consists of both literary Braille and Nemeth Math Braille code. Our system first separates literary Braille text, spatial arrangements, and the rest of the Nemeth Braille code with the help of special begin and end markers that the user is required to insert, converts each of them separately to LaTeX, and finally puts the pieces together to create the final output (see Figure 2).

[Figure 2: Nemeth to LaTeX. The input mathematical document in Braille form passes through an input manager, which sends the input line by line, and a type classifier. Nemeth input goes to the Prolog module (converts Nemeth Braille to LaTeX); literary ASCII Braille goes to the C++ module (converts literary ASCII Braille to LaTeX); spatial matrices and determinants are converted to linear form, and spatial arithmetic/algebraic operations are converted to LaTeX. An output manager combines the LaTeX output into the final LaTeX document.]

3.3 Processing Literary Braille

Literary Braille code (Level 2 ASCII Braille) is first translated to normal text and later refined to fit in the LaTeX document. The transcription algorithm strictly follows the American Literary Braille usage rules [2]. Note that converting ASCII Braille to text automatically is a non-trivial task. The high-level algorithm used by the Insight system is as follows:

1. Hash tables are created for characters, numerals, punctuation marks, contractions, single-cell words, and short-form words, with the ASCII Braille symbol as key and the equivalent text as value. E.g., the short-form word hash table holds 'brl' as key and 'Braille' as its value. The hash tables are used to translate, one word at a time, the ASCII Braille part of a document containing math and text.
2. The words are pre-processed to find their type, viz., capitalized words (with leading comma symbol), words starting with a capital letter (with leading comma symbol), or numbers (with leading # symbol).
3. If the word is determined to be a number, each character in the word is matched with the keys in the numeric hash table and the corresponding value is sent to the output log.
4. Otherwise, steps (i) through (iii) are attempted in order. At any point, if a hit occurs (either a character or the word matches a key in one of the hash tables), the corresponding value is sent to the output log.
(i) The whole word (say of length n, characters 0 to n-1) is matched against the keys of the short-form words table. Prefixes of the word are then tried in decreasing order of length, down to a 2-character prefix.
(ii) The first character of the word is matched with the keys in the character table (a certain hit, as the character table contains all characters as keys). The rest of the word goes through steps (i) and (ii) until all characters are converted to text equivalents.
(iii) However, when characters are matched alone, a number of rules apply. Let c denote the character under consideration. If c equals " or ^, the next letter has to be analyzed to match the special sequences, e.g., ^w for 'word', "w for 'work', etc. If c is the last character of the word, it is matched with the punctuation table first and, if this fails, with the character table. For example, the character 4 in ASCII Braille has two mappings depending on the context: it maps either to 'dis' or to '.'. Thus, 'disk4' translates to 'disk.' in text (4 is equated to '.'), while '4s]t,n' translates to 'dissertation' (4 is equated to 'dis'). Certain characters thus denote punctuation marks when they appear as the last character of a word.
5. When the Braille character does not yield a match with any of the hash table keys, the character is ignored.

3.4 Issues in Translating Braille Code

The Nemeth Braille Math coded part of a document is required to be enclosed within $$–$$ or $$$–$$$. If the $$–$$ enclosure is used, the mathematical text is displayed inline with the text of the LaTeX document. If $$$–$$$ is used, the mathematical text is displayed centred on a new line. Math expressions in Nemeth Braille form that are enclosed within $$s or $$$s are converted to LaTeX using a module implemented in Prolog [16]. The module uses definite clause grammars [16] and denotational semantics to map Nemeth Math to LaTeX [11]. However, this module requires spatial mathematics to be input in linear form.

The first step in implementing such a translator [10] is to develop a parser for Nemeth Braille [9]. Nemeth code follows syntax conventions that put it in the class of context-sensitive (non-context-free) formal languages. Additionally, Nemeth code is space sensitive, ambiguous, and permits spatially arranged input (matrices, determinants, arithmetic sums). These features make the grammar for Nemeth code very complex, so that specifying its syntax is quite difficult and was once thought to be impossible. An example of context sensitivity in Nemeth can be seen in how subscripts and superscripts are represented [10]. However, definite clause grammars [11] allow parsers for context-sensitive languages to be built with ease, while semantics-based techniques allow the generated parse trees to be mapped to LaTeX expressions with ease. Details can be found in [11, 10] and are omitted due to lack of space. The Prolog module, however, cannot handle spatially arranged mathematics, which has to be processed separately.

Spatial arrangements of two kinds are included in Nemeth Braille code: (i) mathematical expressions such as matrices and determinants, and (ii) spatial arrangements for arithmetic/algebra operations like addition, subtraction, multiplication and division (for example, grade school sums and long division). In both cases, the numbers and expressions have to be horizontally and vertically aligned. Such aligned structures are hard to handle through a grammar. With respect to matrices and determinants, the Nemeth Braille to LaTeX translator written in Prolog accepts the input in linear form (i.e., as a list of lists where each inner list corresponds to a row). Thus, if spatially arranged matrices and determinants are translated into linear form, they can be processed using the Prolog module. The UMA system does precisely that. The document writer has to enclose spatial matrices and determinants within special symbols, namely @@–@@. The UMA system has a module written in C++ which identifies such parts, extracts the various expressions in the matrices/determinants, transforms them into linear format, and passes the result to the Prolog module for conversion to LaTeX. Spatial arithmetic/algebra is handled in a similar manner, except that the user has to enclose such parts within @@@–@@@ symbols. We believe that it is quite reasonable to require users to enclose different parts of the document within special symbols, because the goal of the UMA system is not to convert legacy mathematical documents but rather to enable students and scholars of mathematics to communicate with their sighted counterparts.
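The marker conventions of Section 3.4 can be illustrated with a short sketch. It splits a document into literary text, inline math ($$), display math ($$$), spatial matrices (@@), and spatial arithmetic (@@@), and then linearizes a spatial matrix into a list of rows. This is only an illustration in Python; the UMA module itself is written in C++ and its internals are not described here. The names `segment` and `linearize` are hypothetical, and the sketch assumes that matrix rows sit on separate lines with whitespace-separated entries.

```python
import re

# Longer markers are listed first so "$$$" is not mistaken for "$$".
MARKED = re.compile(
    r"\$\$\$(?P<display>.*?)\$\$\$"   # display math
    r"|\$\$(?P<inline>.*?)\$\$"       # inline math
    r"|@@@(?P<arith>.*?)@@@"          # spatial arithmetic/algebra
    r"|@@(?P<matrix>.*?)@@",          # spatial matrix or determinant
    re.DOTALL,
)


def segment(doc):
    """Split a document into (kind, content) pieces in document order."""
    parts, pos = [], 0
    for m in MARKED.finditer(doc):
        if m.start() > pos:
            parts.append(("text", doc[pos:m.start()]))
        # Exactly one named group takes part in each match.
        parts.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    if pos < len(doc):
        parts.append(("text", doc[pos:]))
    return parts


def linearize(spatial):
    """Turn a spatial matrix into a list of lists, one inner list per row.

    Assumes one row per line and whitespace-separated entries.
    """
    return [line.split() for line in spatial.splitlines() if line.strip()]


for kind, content in segment("see $$x$$ and @@1 2\n3 4@@ end"):
    print(kind, repr(content))
print(linearize("1 2\n3 4"))
```

Each piece can then be handed to the translator for its kind, and the outputs concatenated in the original order.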

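The hash-table lookup described for literary Braille in Section 3.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's implementation. The tables hold only a handful of entries taken from the examples in the text, plus the standard ',n' contraction for 'ation' and the letters a to j as digits. The function names are hypothetical, and the double comma for a fully capitalized word is an assumption based on standard literary Braille usage.

```python
# Tiny illustrative tables (a hypothetical subset, not the full Grade 2 tables).
SHORT_FORMS = {"brl": "Braille"}
CONTRACTIONS = {",n": "ation"}
CHARACTERS = {c: c for c in "abcdefghijklmnopqrstuvwxyz"}
CHARACTERS.update({"4": "dis", "]": "er"})
PUNCTUATION = {"4": "."}
NUMERALS = dict(zip("abcdefghij", "1234567890"))
MULTI_CELL = {**SHORT_FORMS, **CONTRACTIONS}


def translate_body(word: str) -> str:
    out, i = [], 0
    while i < len(word):
        rest = word[i:]
        # Step (i): whole remainder first, then shorter prefixes down to 2 cells.
        for n in range(len(rest), 1, -1):
            if rest[:n] in MULTI_CELL:
                out.append(MULTI_CELL[rest[:n]])
                i += n
                break
        else:
            # Steps (ii)-(iii): single character; a final character is tried
            # as punctuation before the character table.
            c = rest[0]
            if i == len(word) - 1 and c in PUNCTUATION:
                out.append(PUNCTUATION[c])
            else:
                out.append(CHARACTERS.get(c, ""))  # step 5: no match, ignore
            i += 1
    return "".join(out)


def translate_word(word: str) -> str:
    if word.startswith("#"):        # number indicator (step 3)
        return "".join(NUMERALS.get(c, "") for c in word[1:])
    if word.startswith(",,"):       # assumed: whole word capitalized
        return translate_body(word[2:]).upper()
    if word.startswith(","):        # initial capital
        text = translate_body(word[1:])
        return text[:1].upper() + text[1:]
    return translate_body(word)


for w in ["disk4", "4s]t,n", ",brl", "#aj"]:
    print(w, "->", translate_word(w))
```

The two readings of the character 4 fall out of the position check alone: at the end of 'disk4' it is looked up as punctuation, while at the start of '4s]t,n' it is looked up in the character table.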

3.5 Sample Conversion

We next illustrate the operation of the UMA-Nemeth system. The example Braille file is an extract from a typical mathematical document, a problem set prepared by a blind tutor to be given to sighted students. The document is in Braille form, consisting of both Nemeth Braille (to represent mathematical expressions) and ASCII Literary Braille (to represent English text). This document is fed as input to our system to obtain a LaTeX document as output. A sample conversion done by our system is shown in Table 1 (the problems are taken from an analytical geometry textbook).

4. HANDLING MARBURG CODE

4.1 Marburg Notation

The Marburg Braille mathematical notation is a traditional 6-dot Braille code made by blind academicians at the Marburg Institute for the Blind in Germany [7]. Marburg notation is a very compact and intuitive approach for reading and writing maths in the traditional way (on paper). It is optimised for reading, and especially for doing, mathematics by blind persons. This quality rests on a high context sensitivity of expressions, which causes several problems for automatic or semi-automatic processing.

4.1.1 Fundamentals

The Marburg notation is a mixture of presentational and content mathematical mark-up. Most of the time it is possible to map the 2D presentation of a formula directly into Braille code, but one also has some semantic elements at one's disposal, such as the power indicator in addition to the superscript one. Like any other specialised traditional Braille code, the Marburg notation has its greatest difficulty in the limitation of its symbol set to 64 Braille glyphs. To overcome this shortage, Marburg notation relies on two extension concepts:
1. Augmentation of the symbol set by the use of prefix modifiers or indicators.
2. Assigning different semantic meanings to a Braille glyph according to its position in the expression or to the presence or absence of a blank space before or after it.
An extra group of indicators and delimiter marks has the task of building 2D mathematical constructs such as subscripts, superscripts and fractions. The concept of rendering any non-linear material in Marburg notation is named 'projection technique', and all non-linear elements (visual material that is not horizontally aligned along the base line) are called 'projectives'. This technique is a significant improvement over linear letter-by-letter reading, giving the reader a preview of the next formula segment.

4.1.2 Some Symbols and Rules of Marburg Notation

Letters and Numbers: Following the tradition of 6-dot Braille, Marburg notation uses glyphs a–j to represent the digits 1–0, respectively. Various indicators determine the meaning of the basic symbols. There are indicators for numbers, for Latin, Greek and Gothic letters (capital and small for each alphabet) and additionally for the font type (such as bold, upright). A speciality of the Marburg notation is the so-called 'lowered digits', made by lowering a number by one dot. This is possible because all glyphs from 'a' to 'j' occupy only the four upper dots of a Braille cell (dots 1, 2, 4 and 5). Lowered numbers can have two meanings:
1. Prefixed by a number indicator, they designate an ordinal number.
2. Written without the number indicator directly after a projective mark, they designate a normal number at an index or exponent position.

Table 2: Modifier Symbols Alter the Meaning

           Lat. cap.   Lat. small   Greek cap.   Greek small   Card. num.   Ord. num.
Print      A           a            A            α             1            1
Braille    $a          'a           a            ha            #a           #,

Operators: Binary and relational operators consist of one or more Braille glyphs. Almost all of them obey the rule that they must be preceded by a blank space and that the cell immediately after them may not be blank.

Table 3: Operators

Print      A = r^2 π        α ≥ 0
Braille    $a = r0;h p      h a o = # j
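The 'lowered digits' rule under Letters and Numbers can be made concrete with dot numbers. The sketch below assumes the standard dot patterns for the letters a to j and the Unicode Braille Patterns block, in which dot n corresponds to bit n-1 above U+2800. The helper names are hypothetical, and Marburg indicators are not modelled.

```python
# Standard dot patterns of the letters a-j; only dots 1, 2, 4, 5 are used.
UPPER = {
    "a": {1}, "b": {1, 2}, "c": {1, 4}, "d": {1, 4, 5}, "e": {1, 5},
    "f": {1, 2, 4}, "g": {1, 2, 4, 5}, "h": {1, 2, 5}, "i": {2, 4},
    "j": {2, 4, 5},
}
DIGIT_OF = dict(zip("abcdefghij", "1234567890"))

# Lowering moves every dot one row down within its column.
LOWER_DOT = {1: 2, 2: 3, 4: 5, 5: 6}


def lower(dots):
    """Shift a cell that uses only the upper four dots down by one row."""
    return {LOWER_DOT[d] for d in dots}


def to_unicode(dots):
    """Render a set of dot numbers as a Unicode Braille pattern character."""
    return chr(0x2800 + sum(1 << (d - 1) for d in dots))


# Reverse table: lowered cell -> digit it stands for.
LOWERED_TO_DIGIT = {
    frozenset(lower(dots)): DIGIT_OF[glyph] for glyph, dots in UPPER.items()
}

for glyph, dots in UPPER.items():
    print(DIGIT_OF[glyph], to_unicode(dots), to_unicode(lower(dots)))
```

Because the ten upper patterns never touch dots 3 or 6, the shift is always defined and the ten lowered cells stay distinct, which is why a lowered cell can be decoded back to its digit without ambiguity.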


Function Names and Textual Elements: There are various options in the Marburg notation for coding in upright font type.
• For coding ordinary function names, many predefined shortcuts are available, each consisting of one of the shortcut indicators followed by one or more Braille glyphs.
• All function names and measures can be coded by using the special 'word constructor' 7 followed by the text.
• Yet another indicator or delimiter pair is available for writing the text of a comment or, more generally, a segment of text mixed with maths: ’ . TEXT ! , MATH

Parentheses and other delimiters: Parentheses, brackets, curly braces, vertical lines and some other delimiters in Marburg notation imitate, as far as possible, the shape of their visual counterparts.

Table 4: Delimiters

Print      ( ··· )    { ··· }     [ ··· ]    | ··· |    || ··· ||
Braille    2...‘      00 9...o    {...}      00...%     ...%\

Projectives (Mathematical 2D Elements): The most complicated feature of Marburg notation is the representation of 2D formula parts such as superscripts/power exponents, subscripts, roots, etc. Fractions play a special role among the projectives; the framework of rules and exceptions for fractions is too large to be described here. A change of 2D level is marked either by a projective symbol, which indicates the beginning of a higher level, or by a projective-end symbol, which indicates the return to the former level. The concept itself is easy to understand; the complexity lies in the powerful machinery worked out in the latest revision of Marburg notation in 1986 [7].
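The level changes described for projectives behave like nested brackets: a projective symbol opens a level and a projective-end symbol returns to the former one. The following sketch tracks levels with a stack. The token names "UP" and "END" are placeholders, not Marburg symbols, and the sketch ignores the special rules for fractions and any implicit closing rules of the notation.

```python
def nest(tokens, open_tok="UP", close_tok="END"):
    """Group a flat token stream into nested lists, one list per 2D level."""
    root = []
    stack = [root]                      # stack[-1] is the current level
    for tok in tokens:
        if tok == open_tok:             # projective: enter a new level
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif tok == close_tok:          # projective-end: return to former level
            if len(stack) == 1:
                raise ValueError("projective-end without an open projective")
            stack.pop()
        else:
            stack[-1].append(tok)
    return root


print(nest(["x", "UP", "2", "END", "+", "1"]))
```

A structure of this kind is what a translator would walk to emit, for instance, a superscript or subscript group in the target notation.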

Table 1: Sample input Braille document and output LaTeX document

Columns: Input Braille Mathematical Document; Output LaTeX File

#aj4 ,%[t|two /rai