
NLA - Implementation considerations

Tagging Data
Data Type Key
Sorting Text
Properties of an Environment
Hierarchy of Defaults
Adaptability
Modular Keyboard
Typewriter Area of Keyboard

This section contains examples of suggested implementations for various requirements.

This paper calls for the rich functionality of a full implementation of the required NLA. Neither software developers nor end-users can wait until this final implementation is available. Hence there is a need for a staged approach. As an example, the following steps for implementing multilingual support can be envisioned:

  1. all national languages enabled
  2. different national languages on different systems in the network
  3. different national languages on different work stations in the network
  4. different national languages in different windows of same work station
  5. more than one national language within one window of work station.

Similar remarks apply to the coding issue. Also for this area of the NLA a stepwise implementation must be considered:

Tagging Data

Textual data need a multitude of attributes (tags):

Most of these may be specified globally (for an entire installation), but all of them must also be applicable to specific data items (except the sorting scheme). This is especially relevant in data base applications. It may be necessary to change the code page for personal names. Multilingual operation always must be assumed:

As stated elsewhere in this paper, an implementation of the NLA should not introduce new coding schemes. This also applies to the tags of data. Existing language and country codes (e.g. ISO) must be used where possible.

Joining data base tables, especially for distributed, multilingual data bases requires that all elements be tagged, at least temporarily. It is desirable that

Hence it is necessary to indicate whether or not child tags exist.

There is also a need to identify undisplayable information like

Such data items must be prevented from any translation.
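The tagging requirements of this section can be sketched as a small data structure. This is only an illustration under assumptions: the class name TaggedItem, its field names and the helper translate are hypothetical, and the attribute set is limited to what the text mentions (language and country codes, code page, child tags, protection from translation).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TaggedItem:
    """Hypothetical tagged text item; None means 'use the global default'."""
    text: str
    language: Optional[str] = None    # existing code, e.g. ISO 639 ("de")
    country: Optional[str] = None     # existing code, e.g. ISO 3166 ("CH")
    code_page: Optional[str] = None   # may differ per item, e.g. for personal names
    translatable: bool = True         # False protects the item from translation
    children: List["TaggedItem"] = field(default_factory=list)

    @property
    def has_child_tags(self) -> bool:
        # Indicates whether child tags exist, without inspecting them.
        return bool(self.children)


def translate(item: TaggedItem, translator: Callable[[str], str]) -> str:
    """Apply a translator, but never to items marked as untranslatable."""
    if not item.translatable:
        return item.text
    return translator(item.text)


city = TaggedItem("Zürich", language="de", country="CH", translatable=False)
label = TaggedItem("Name: ", language="en", children=[city])
```

Here str.upper can stand in for a real translator: translate(label, str.upper) changes the label, whereas translate(city, str.upper) returns the protected text unchanged.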

Data Type Key

For some text operating functions (like search, sort) detailed information about the data (such as accents, case, special symbols) is needed at different stages of the process. All this information is present in the fully formed characters, like an upper case A with grave accent (À), and can be associated with the code of that character. However, for processes like sort it is more convenient to have the information separated into «keys».

Let us call this the «processable» form of the data, whereas the fully formed characters represent the «external» form of the file. For some applications, like data base applications, it may be desirable to have the data already in processable form. For other applications it may be more economical to «convert on the fly». Whether the translation between these forms is applied to data files directly (as in this description) or exists only «on the fly» during further processing of the text is left to the implementor.

Many functions like sort or search in an NLA can only be performed correctly, if text is stored in its «richest possible form». [LaBonté-3] proposes sort keys for text data which define information about:

  1. lower case form of text
  2. diacritics (position, nature)
  3. case of each character
  4. special symbols (position, nature)

These keys separate the information spread across the characters of the original string. The created strings (keys) contain only one type of information. These keys are independent of the code page used to encode text data.

This method is comparable to numeric data types like Floating Point Real, which consist of:

  1. mantissa
  2. exponent
  3. sign

Assume for example the French word Côté. Then the elements of the proposed data type become:

Element              Description             Example
PK: primary key      base characters         cote
SK: secondary key    accents                 «17»«16»«19»«16»
TK: tertiary key     case                    «09»«08»«08»«08»
QK: quaternary key   non-alphabetic symbols  «00»

The coding of these data elements is demonstrated here with numeric values in angle quotes («»). The values are chosen not to conflict among themselves.

The secondary key (SK) may use a different sequence with respect to the language (left to right or right to left correspondence to the base characters in PK). For this key numeric values below those for characters are assigned to the accents. The values depend on the language. For French appropriate values are:

«16» unaccented
«17» acute accent
«18» grave accent
«19» circumflex accent
«20» diaeresis
«21» cedilla

Values for the tertiary order key (TK) are

«08» lower case
«09» upper case

The quaternary key (QK) starts with a delimiter «00» to be distinguished from the values before, followed by a sequence of «position»«value» - pairs. This can be best demonstrated with another example, the English word vice-president:

PK vicepresident
SK 13 * «16»
TK 13 * «08»
QK «00»«04»-
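The derivation of the four keys can be sketched in Python. This is a minimal illustration, not the method of [LaBonté-3]: the function name make_keys is hypothetical, «19» is assumed to denote the circumflex (as the Côté example implies), the accent key is reversed to model the right-to-left correspondence used for French, and every non-alphabetic character is simply moved to the quaternary key.

```python
import unicodedata

# Accent values from the French table above; 19 is assumed to be the circumflex.
ACCENT_CODES = {
    "\u0301": 17,  # combining acute
    "\u0300": 18,  # combining grave
    "\u0302": 19,  # combining circumflex
    "\u0308": 20,  # combining diaeresis
    "\u0327": 21,  # combining cedilla
}
UNACCENTED, LOWER, UPPER, QK_DELIMITER = 16, 8, 9, 0


def make_keys(text, accents_right_to_left=True):
    """Split text into (PK, SK, TK, QK) as described in this section."""
    pk, sk, tk, qk = [], [], [], [QK_DELIMITER]
    for ch in unicodedata.normalize("NFC", text):
        decomposed = unicodedata.normalize("NFD", ch)
        base = decomposed[0]
        if not base.isalpha():
            # Non-alphabetic symbol: record its position among the base
            # characters and the symbol itself as a «position»«value» pair.
            qk.extend([len(pk), ch])
            continue
        # Marks missing from the table are treated as unaccented here.
        marks = [ACCENT_CODES[m] for m in decomposed[1:] if m in ACCENT_CODES]
        pk.append(base.lower())
        sk.append(marks[0] if marks else UNACCENTED)
        tk.append(UPPER if base.isupper() else LOWER)
    if accents_right_to_left:
        sk.reverse()
    return "".join(pk), sk, tk, qk


print(make_keys("C\u00f4t\u00e9"))
# ('cote', [17, 16, 19, 16], [9, 8, 8, 8], [0])
print(make_keys("vice-president")[3])
# [0, 4, '-']
```

Both results agree with the two worked examples above. A real implementation would take the accent values and the direction of the secondary key from the language definition rather than from constants.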

Sorting Text

Current sort methods in both the ASCII and the EBCDIC environment do not create the sequences users expect, because they compare the code points (binary values) of the characters. However, many applications depend on this behaviour. So when sort methods are implemented according to NLA rules, the default processes must retain the old (wrong) results.
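The difference between the two behaviours, and the requirement that the old one stays the default, can be illustrated with a short sketch. The names layered_key, sort_text and the flag use_nla are hypothetical, and the layered key is a simplification (base letters, then accents, then case), not a complete collation scheme.

```python
import unicodedata


def layered_key(text):
    """Simplified sort key: base letters first, then accents, then case."""
    primary, secondary, tertiary = [], [], []
    for ch in unicodedata.normalize("NFC", text):
        decomposed = unicodedata.normalize("NFD", ch)
        primary.append(decomposed[0].lower())
        secondary.append(decomposed[1:])          # "" if unaccented
        tertiary.append(decomposed[0].isupper())  # lower case sorts first
    return "".join(primary), secondary, tertiary


def sort_text(items, use_nla=False):
    # The default keeps the traditional code point order, because existing
    # applications rely on it; the new order must be requested explicitly.
    return sorted(items, key=layered_key) if use_nla else sorted(items)


words = ["zebra", "c\u00f4te", "Zoo", "cote", "Cote"]
print(sort_text(words))
# ['Cote', 'Zoo', 'cote', 'côte', 'zebra']
print(sort_text(words, use_nla=True))
# ['cote', 'Cote', 'côte', 'zebra', 'Zoo']
```

The code point order places all upper case words before all lower case words, while the layered order keeps words with the same base letters together.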

Properties of an Environment

The following properties must be independent of each other. However, they may be grouped (for example by language and country):

Language
  • character set
  • character attributes
  • collating sequence (sort)
Country, culture
  • numeric punctuation
  • monetary punctuation
  • date representation
  • time representation
Other
  • coding
  • messages

The list of properties forming an environment must be extensible, for example for user defined categories.

These properties must be kept in a form suitable for modification with any common text editor, although this information is intended to be modified only by system administrators or the like. The model given by the utility setlocale() and its input-definitions [POSIX-2, IBM SC09-1264] demonstrates the needed functionality.
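As a rough sketch of such a text-editable, extensible definition, the following Python fragment reads an environment from an INI-style text with the standard configparser module. This is not the setlocale() input format; the section names, keys and values are all hypothetical and only mirror the grouping shown above.

```python
import configparser

# Hypothetical environment definition, editable with any text editor.
ENVIRONMENT_TEXT = """
[language]
character_set = ISO 8859-1
collating_sequence = french

[country]
numeric_punctuation = 1'234.56
date_representation = DD.MM.YYYY

[other]
coding = ISO 8859-1
messages = de

[user:billing]
invoice_prefix = RE
"""


def load_environment(text):
    """Return {category: {property: value}}; unknown categories are kept."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


env = load_environment(ENVIRONMENT_TEXT)
print(env["country"]["date_representation"])   # DD.MM.YYYY
print(sorted(env))
```

Because the loader does not fix the set of sections, a user-defined category such as the hypothetical user:billing needs no change to the code.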

[Properties of the user environment]

Hierarchy of Defaults

It is obvious that within a particular application or even installation most of the attributes of the text-data can be considered constant. Nevertheless the National Language Architecture must provide full flexibility in the various stages of execution and user interaction.

There need to be «nested» environments, with their own sets of attributes. This is to transfer the defaults from global to local scope. The layers needed are:

Implementation of an NLA assumes that all data items (at least textual data) are tagged. Where no specific tags exist for an item, the global definition is assumed.

Hence with the total absence of any specific tags only defaults are active. This works also in non-NLA-installations, where no tags exist, but some new applications may use NLA functions.
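The fallback from specific tags to global defaults can be sketched with Python's collections.ChainMap. The three layers used here (installation, application, data item) and all attribute values are assumptions for illustration only.

```python
from collections import ChainMap

# Hypothetical layers, from global to local scope.
installation = {"language": "de", "country": "CH", "code_page": "850"}
application = {"language": "fr"}
item_tags = {"code_page": "437"}


def effective_attributes(*layers):
    """Most specific layer first; later layers supply the defaults."""
    return ChainMap(*layers)


attrs = effective_attributes(item_tags, application, installation)
print(attrs["language"], attrs["country"], attrs["code_page"])
# fr CH 437

# With no specific tags at all, only the global defaults are active.
untagged = effective_attributes({}, {}, installation)
print(untagged["language"])
# de

# Flattening the chain yields a fully tagged item for export.
exported = dict(attrs)
```

Flattening before export matters because the receiving installation may have a different set of defaults.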

Because the set of defaults may not be consistent among installations or systems, exported data must be tagged.

Some scenarios illustrate the needed flexibility of independent attribute settings:

The sophisticated set-up needed for these scenarios cannot be expected in the first implementation of the NLA. However, any announcement must clearly show the direction in which the NLA will expand over time.

Adaptability

Where flexibility and adaptability are recommended (e.g. in translation processes), a layered exit approach should be developed. That is, IBM should provide a standard exit where a user program can execute before an IBM option or function is called.

[Adapt the standard process - variant one]

This method creates n (in the example above: 3) pieces of special code for the user.

The following method is the better approach, but calls for a certain «granularity» of functions. It needs only one piece of user code, wherein all the specials are combined.

[Adapt the standard process - variant 2]
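The second variant can be sketched as follows, assuming a process made of small named steps and a single user exit that is called before each of them. The step names trim and fold_case, the exit routine and run_process are hypothetical and do not describe any IBM interface.

```python
def trim(text):
    """Standard step: remove surrounding white space."""
    return text.strip()


def fold_case(text):
    """Standard step: convert to lower case."""
    return text.lower()


def user_exit(step_name, text):
    """Single piece of user code in which all the specials are combined."""
    if step_name == "fold_case":
        # Local special: write the sharp s as "ss" before case folding.
        return text.replace("\u00df", "ss")
    return text


def run_process(text, steps, exit_routine=None):
    for step in steps:
        if exit_routine is not None:
            # The exit runs before the standard function is called.
            text = exit_routine(step.__name__, text)
        text = step(text)
    return text


print(run_process(" Stra\u00dfe ", [trim, fold_case]))             # straße
print(run_process(" Stra\u00dfe ", [trim, fold_case], user_exit))  # strasse
```

The finer the granularity of the standard steps, the more precisely the single exit can choose where to intervene.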

Tags to identify objects and their properties (attributes) are needed at various levels:

Modular Keyboard

Many of the keyboard requirements presented in earlier sections could be achieved by a modular keyboard. Such a modular design would reduce the number of keyboard variants, because the customer could choose the components he needs from a set like the following:

A modular design could specify the following elements (see also the figures later in this section):

  1. Basic «typewriter area» with sufficient number of keys to serve all national variants. In particular an «international version» of such a keyboard must exist.
  2. Alternatives for numeric input:
    • Numeric cluster with calculator functions
    • Numeric cluster with some calculator keys
  3. Alternatives for program function keys:
    • 3 x 4 block of Programmed Function Keys (PFK) with some keys for «host functions». A shift mechanism provides access to 12, 24 or even 48 PFKs.
    • Row of Programmed Function Keys, grouped in 3 x 4 keys. A shift mechanism provides access to 12, 24 or even 48 PFKs.
    • Row of function keys, grouped in 3 x 4 keys, for local functions like set up, colour selection, word wrap on input.
    • Block of function keys arranged as on the PCs (2 x 5) which could also be used in emulations for local terminal functions (for example clear, erase eof).
    • Block of function keys arranged as on some PCs (3 x 4 in a row) which could also be used in emulations for local terminal functions (for example clear, erase eof).
  4. Alternatives for cursor control:
    • Cursor control keys arranged as a cross (+) with home in the centre, plus some local editing keys (character delete, character insert etc.). For both these functions and the cursor functions, a shift mechanism would be helpful for functions like word delete or sentence delete as well as top or bottom.
    • Track ball (inverted mouse) with two keys to «click» the cursor and hold it for «dragging»
    • A mini tablet as pointing device
  5. A keyboard help function must graphically show the key assignments with shifts of each key on the keyboard.

Keyboard labelling in general must use internationally defined symbols (see ISO 9995-6) rather than cryptic abbreviations like STRNG for the German word Steuerung (English equivalent: control).

Basic Keyboard building Blocks

Main areas of a keyboard

Alternatives of Program Function Keys

[Arrangements for function keys]

Alternatives for Cursor Control

[Arrangement for cursor and other keys; location of a track ball]

It is by no means sensible to preserve the apparent problems of the original PC keyboard. The dual use of the right-hand key cluster for numeric input and cursor control should be abandoned in favour of separate keys for these functions.

Typewriter Area of Keyboard

The «typewriter area» of keyboards must become more ergonomic. This is especially relevant to NLA issues, because national keyboard layouts tend to enlarge the number of keys. This will happen in particular, if more characters are to be supported on one keyboard. Despite the fact that all keyboard layouts based on the «current» ones are not ergonomic in the pure sense, the following rules must be observed:

An installation should be able to use one keyboard that satisfies both a secretary's requirement for a national keyboard and a programmer's requirement for a US-English keyboard.

 

 URL:  Created: 1996-12-28  Updated:
© Docu+Design Daube, Zürich    