BRAILLE AUTHORITY OF THE UNITED KINGDOM

Proposed alterations by Committee II to the contraction system of grade 2

BA/6/94

From: Bill Poole (BAUK Chair);

To: All members of the Literary Code Committee of BAUK, Stephen Phippen (UK representative on Committee II of the Unified Braille Code Project), and such other people as the author of the paper or the Literary Code Committee may decide;

Subject: Proposed alterations by Committee II to the contraction system of grade 2;

Date: 24th June 1994.

I. Background

1. PROCEDURE. At the UBC Project Committee meeting held in Sacramento in March 1993 it was agreed, as part of a general committee restructuring, to set up a committee (designated as Committee III) to deal with legislation relating to the contractions, and I was appointed to chair it. Other international members have since been appointed, and there is still a vacancy for an additional UK member. It was recognized at Sacramento that there was a need for such a committee to take an overview on all issues relating to contractions, and not simply those arising out of the need to conform with the symbol structure rules of the project, but it was made clear that, in accordance with the revised guidelines of the project, there must be no major change to the contraction system of grade 2 (guideline B). It was further agreed, in view of the major work previously done in the UK on this subject, that I should employ the Literary Code Committee of BAUK as a consultative group, whose views I would communicate to the members of Committee III itself, and report back their reactions. There would also be interaction between Committees III and II.

2. DOCUMENTS. In compiling this paper I have used the following documents: "A Study of Braille Contractions" (published 1980); the report of Committee II to the Project Committee and the BANA board (latest version 23 November 1992); my response to the original version (12 October 1992) of this report (completed 5 February 1993, but largely written before the BAUK meeting of 16 December 1992: the main paragraphs referred to are 13, 14, 22, 35, 36); an interim report of Committee II (31 March 1994); a report written for me by Stephen Phippen of the decisions made at the meeting of Committee II held in Los Angeles, 26 April 1994 (I have not yet seen the official minutes); and a memorandum written to me by Joe Sullivan, Chair of Committee II (14 June 1994).

3. VALUE OF CONTRACTIONS. For each contraction or group of contractions discussed I have given a numerical value in brackets. This is the number of braille cells, averaged over four counts, which would be lost in a million words of text if the contraction or group of contractions concerned were to be eliminated from grade 2, as given by Lorimer in table 16 of Book Two of "A Study of Braille Contractions". I have had to make one major correction to enable these figures to be aggregated. (As Lorimer was aware, this cannot be done with table 16 as it stands, for reasons which I explained in great detail in a statistical paper I wrote in 1985.) However, I have not thought it necessary for this paper to make the minor adjustments needed to take account of the existence of overlapping contractions. This results in only one significant inaccuracy, discussed in its place in due course. Lorimer calculated the total number of cells saved by contractions and sequences in a million words of grade 2 over grade 1 as being 1419632, and the total number of cells occupied by a million words of grade 1 as approximately 5650000.

4. SYMBOL CONSTRUCTION. One of the primary objects of Committee II is to facilitate computer translation from braille to print. This has two important consequences for code design. Firstly, it is necessary for the boundaries between braille symbols to be unambiguously defined irrespective of their meanings. (Sullivan uses the term "symbol" to mean roughly what I call a "sign"; but, as we shall see, there are differences, so I have preserved his usage in this discussion.) This in turn means that rules for symbol construction have to be very precisely formulated. For this purpose the 63 positive braille characters are divided into two categories: "root" characters, of which there are 55; and "prefix" characters, of which there are 8 (the 7 right-hand characters and the numeral sign). The latter are in turn divided into 6 "general prefix" characters, and 2 "special prefix" characters (dots 5-6 and dot 6). The following rules for combining characters into valid symbols need to be known: A symbol may contain or consist of only one root character, but no character may follow a root character as part of the same symbol; any number of general prefixes, whether by themselves or followed by a root, may constitute a valid symbol, but in the former case they must be followed by a space (which cannot in any circumstances combine with positive characters to form a valid symbol); the rules relating to special prefixes are more restrictive and more complex, but do not, I think, need to be detailed here. Secondly, because the meaning of some symbols is determined by their position, that is, by whether they occur in a character string initially, medially, terminally, or unattached, it is necessary to have very precise definitions for these positions, or indeed for any other determining conditions of use, in order to be able to distinguish unambiguously between two symbols having the same form but different meanings.
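
The general-prefix rules stated above can be illustrated by a short Python sketch. It is only an illustration under stated assumptions, not Committee II's own formulation: cells are written as strings of dot numbers (so "46" means dots 4-6), a blank cell is written as " ", and the function name split_symbols is hypothetical. The special prefixes (dots 5-6 and dot 6) are deliberately left unhandled, since their rules are not detailed in this paper, and treating a prefix-only symbol at the very end of the input as an error is likewise an assumption of the sketch.

```python
# Illustrative sketch only. Cells are dot-number strings, e.g. "46" = dots 4-6.
# " " stands for a blank cell. All names here are hypothetical.

# Six general prefixes: five of the seven right-hand characters plus the
# numeral sign (dots 3-4-5-6). The other two right-hand characters are special.
GENERAL_PREFIXES = {"4", "5", "45", "46", "456", "3456"}
SPECIAL_PREFIXES = {"56", "6"}  # rules not given in the paper, so not handled


def split_symbols(cells):
    """Split a list of cells into symbols using only the general-prefix rules."""
    symbols = []   # completed symbols, each a list of cells
    pending = []   # general prefixes collected for the current symbol
    for cell in cells:
        if cell in SPECIAL_PREFIXES:
            raise NotImplementedError("special-prefix rules are outside this sketch")
        if cell == " ":
            # A space never joins a symbol; it closes a prefix-only symbol.
            if pending:
                symbols.append(pending)
                pending = []
        elif cell in GENERAL_PREFIXES:
            pending.append(cell)
        else:
            # Any other cell is a root: it ends the symbol, nothing may follow it.
            symbols.append(pending + [cell])
            pending = []
    if pending:
        # Assumption: end of input does not count as the required space.
        raise ValueError("a prefix-only symbol must be followed by a space")
    return symbols


# Dots 4-6 followed by the root n (dots 1-3-4-5) is one symbol; b (dots 1-2) is another.
print(split_symbols(["46", "1345", " ", "12"]))  # [['46', '1345'], ['12']]
```

The point of the sketch is that the boundaries are found from the category of each character alone, without reference to what any symbol means.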

II. Present Proposals

5. GENERAL. All the proposals so far made by Committee II with regard to contractions or the rules governing their use have been motivated, with perhaps one exception, by one or other of the two considerations outlined in the previous paragraph. None of them is final. The committee has also classified its recommendations as either strong or ordinary, according to the effect they are believed to have on the logical coherence of the whole package, but in this paper I have ignored that distinction. The committee has changed its mind on some issues, but I have always tried to report the most up-to-date state of its thinking, and I have been very much helped in this by Joe Sullivan's most recent memorandum to me, for which I am extremely grateful. In all cases, after stating what the proposal is, I have appended brief comments of my own on its merits or difficulties.

6. BLE. It is proposed to abolish the contraction for BLE (9053) either entirely, or at least in all cases where it is not followed by a blank cell. This is because the numeral sign can occur medially, and there is therefore a potential for ambiguity. The value in terms of space loss assigned to this contraction is particularly unreliable because there is a significant number of occasions when it is followed by letters d or r, so that its deletion would only result in the loss of one cell, not two. Moreover, from figures provided by Gill elsewhere in "A Study of Braille Contractions" it is possible to calculate that about 78.5% of the occurrences of this contraction come at the end of a word. This might seem to favour the weaker form of the proposal, but that would have the unwelcome consequence that the contraction could only be used at the end of a word provided it was not followed by punctuation.

7. COM. It is proposed to abolish the contraction for COM (9902), to avoid the perceived need to insert the grade 1 mode indicator (dots 5-6) before a hyphen occurring between punctuation and following letters in the same character string, since it could otherwise be theoretically misinterpreted as COM. Such situations are acknowledged to be of infrequent occurrence. The grade 1 indicator is required when it is necessary to show that the following character is a letter or punctuation, etc, and not a number or contraction. Of course COM can only be contracted in initial position, and it would seem to me that this difficulty could be eliminated by a more rigorous definition of what this means. However, Committee II does not mention that it is also possible for the hyphen to occur initially in such phrases as "forty-one or -two"; but I would take the view that this situation is rare enough for the grade 1 indicator to be acceptable here.

8. DD. It is proposed to abolish the contraction for DD (1741), because the abbreviation point often occurs medially, and a grade 1 indicator before all such instances is regarded as oppressive. However, the adoption of this proposal creates an asymmetry, in that it would no longer be the case that the first five alphabetic consonants could be doubled by being lowered. There is a proposal under consideration, on which Committee II has not yet voted, that the full stop and apostrophe signs should be interchanged. I am personally opposed to this on other grounds, but it would not alter the position with regard to DD, because apostrophes can also occur medially.

9. LOWER SEQUENCED WORDSIGNS. It is proposed to abolish the contractions for the words TO (50959), INTO (4064), and BY (8662: aggregate 63685), because these contractions can only be used in sequence with the next word (see paragraph 10 below). The idea of retaining the contraction for TO, but spacing it from the next word and thereby halving the space loss which would result from its abolition, was considered but rejected. At my request, Stephen Phippen put to Committee II the point that it was undesirable for two words of opposite meaning, such as "from" and "to", to have the same configuration (Lorimer's term to describe two signs which would be indistinguishable if written on a blank piece of paper). It should be noted that TO is the most valuable contraction proposed for deletion, ranking third in table 16.

10. OTHER SEQUENCES. It is proposed to abolish the rule permitting the words "and", "for", "of", "the", "with", "a", to be written in sequence (18481: aggregate for sequences 82166), for a number of reasons, which also apply to the sequences discussed in paragraph 9 above. Words are not normally sequenced in print, so braille sequences violate what I call the principle of conservation of space, which can be stated as requiring that any space within text in print must be represented by at least one blank cell in braille. This principle, though not unproblematic, is explicitly accepted by Committee II. Abolishing all braille sequences would end the controversy over the natural pause rule (now abrogated in both British and American braille, though against the wishes of some readers). Moreover, there are cases where two braille characters can equally well represent a word or a sequence: "fora" can be the plural of "forum"; the word "forth", as printed up to about the middle of the 17th century, could have a final e added (though current British Braille rules only allow the TH to be contracted); and there are some troublesome proper names. In addition it was pointed out at the Louisville project meeting in November 1992 that words are in fact sometimes run together in print (and therefore in braille), either in computer programs (where grade 1 would be expected), or to simulate a particular style of speaking: but in such cases without editorial intervention difficulties are created for the human reader, not because traditional braille contains sequences which print lacks, but because it is necessary to prevent braille contractions from crossing word boundaries if the effect sought by print is to be achieved without creating decipherment problems that are unique to braille. There are grounds for believing that the abolition of all braille sequences would be unpopular both in America (as suggested by the Louisville debate) and here. 
In reply to the 1986 BAUK questionnaire, 33% of respondents would have welcomed more sequences in a reformed code, while 54% thought the number about right. They account for nearly 6% of the space saved by grade 2.

11. DOTS 5-6 SERIES. It was originally proposed to abolish the seven final-letter contractions beginning with dots 5-6 (27255), but Committee II now takes a neutral stance on this issue. See paragraph 13 below. However, there is a price to be paid for their retention. Since dots 5-6 is a special prefix character, and one, moreover, to which a root character cannot be added to form a valid symbol, it is necessary to devise rather complex usage rules to make it unambiguously clear whether, for example, dots 5-6 n is to be read as an alphabetic character, or as the letters TION.

12. DOT 6 SERIES. It is proposed that the two final-letter contractions beginning with dot 6 (23203) should be retained, but that, to take account of the status of dot 6 as a special prefix character, some additional usage rules should be devised to meet some possible, if rather unlikely, situations. See also paragraph 13 below.

13. DOTS 4-6 SERIES. It is proposed that the five final-letter contractions beginning with dots 4-6 (14911: final-letter aggregate 65369) should be retained pending a study of all fourteen final-letter contractions, with a view to their ultimate removal if anecdotal evidence alleging that they are the most difficult contractions to learn to recognize is confirmed. In fact, the studies conducted by John Lorimer in the UK and by Marjorie Troughton in Canada show some of these contractions scoring rather poorly on some of their tests, but the same could be said of quite a lot of others. So I see no good reason why these contractions should be singled out in any new study of the educational impact of contractions that may be commissioned.

14. SHORTFORMS. It is proposed that the 76 existing shortforms should be incapable of having letters added to them, as is the case with simple wordsigns, or alternatively that any valid extensions that are permitted should have to be listed as distinct shortforms. There is a supplementary proposal that any word which constitutes or contains a character sequence that could be misread as a shortform should have to be prefixed by the grade one word indicator dots 5-6 5-6 and written without contractions. (The report actually says that only one dots 5-6 character is necessary, but that appears to be a mistake, according to my understanding of how mode indicators function.) There is a real problem here, but the remedy is far from obvious. In my terminology shortforms of course constitute one class of "signs", but in UBC terminology a shortform does not constitute a single "symbol", because (as pointed out in paragraph 4 above) a symbol cannot contain more than one root character. (The contraction for "into", which it is proposed to delete, and the dots 5-6 series (for a different reason) also belong in this category.) It must therefore be regarded as an aggregation of symbols which has to be interpreted as a single entity. The question therefore arises, how, for example in automatic braille to print translation, does one distinguish unambiguously between such an aggregation and a sequence of symbols which are to be interpreted separately? In the case of unfamiliar words (especially proper names or acronyms) there can also be problems of interpretation for human readers. It is impossible to quantify the space loss involved in the main proposal without knowing exactly how it would be implemented, but that of course is not the main issue. 
To adopt the more radical form of the proposal would appear to be simplest, but people would then have to remember, for instance, that they could contract "afterward" but not "afterwards"; "beside" but not "besides"; "could" but not "couldn't"; "great" but not "greater"; "immediate" but not "immediately"; "letter" but not "letters"; "necessary" but not "unnecessary"; and this might not be popular. So we are led to the weaker version, which would mean that every extended shortform would have to be listed and learned as a separate contraction in addition to the 189 we already have. It is suggested that the list could be quite short, and/or that specific prefixes and/or suffixes could be routinely allowed. Joe Sullivan has offered to send me a comprehensive list of words which contain shortforms embedded within them, irrespective of whether it is permissible to use these contractions under current SEB rules. When I have studied this, I shall be in a position to circulate notes on how such a list of extensions to shortforms should best be constructed. But it will not be easy to draw the line between valid and invalid extensions in such a way as to safeguard the interests of both human readers and machines. Of course the fewer valid extensions we admit, the less need there will be to invoke the supplementary proposal.

15. HOMOGRAPHS. It is proposed that words which have the same written form in print (including proper names and acronyms) should have the same written form in braille, irrespective of meaning. It will, however, continue to be permissible for the braille form of a word to vary according to the braille characters with which it is in contact, as is the case with lower wordsigns. Committee II was originally unwilling to allow this, but perceptual arguments seem to have won out. The obvious reason for the proposal is that distinctions based entirely on meaning cannot be automated; but it is also argued that it is wrong to put the braille reader in a different position from the print reader by imposing on him distinctions which print does not make. However, as I have often pointed out before, the very existence of contractions, and whether or not they are used, sets up expectations about word structure in the mind of the braille reader which make human word processing significantly different from machine word processing. So far as homographs with identical or very similar pronunciations are concerned, I think this proposal would be generally accepted. Difficulties begin to arise with simple wordsigns when, for example, "do" can mean the musical note, SO (printed in full capitals) can stand for Symphony Orchestra, or US for the United States; or when contractions can be used in an acronym even when it is pronounced as its constituent letters and not as a whole word. The increasing frequency with which one comes across such things reinforces my conviction that they are a barrier to fluent reading, and makes me ask once again whether we should not consider openly adopting a dual standard of contractions for braille, according to whether it is produced manually or automatically.
One particular application of Committee II's proposal which is worth noting is that the small letters a, i, and o when standing alone would be brailled without a letter sign just like the single letter words with which they are identical in form, since there is no simple wordsign with which they could be confused. This introduces an unfortunate asymmetry into the representation of single letters, which in certain contexts would become very marked. It will still be necessary, of course, to insert a letter sign before the letters a-j (but only those), when immediately preceded by an arabic number, since this does remove an ambiguity. Conversely, a single capital letter, even when followed by a stop, must be preceded by a letter sign as well as a capital sign to distinguish it from a simple wordsign with an initial capital letter. This must be right, but it should alert people to the fact that rules affecting the sufficiency of letter and capital signs will have to be tightened up, and this is likely to mean more rather than fewer of them coming together.

16. STANDING ALONE. It was proposed that what it means for a sign to be "standing alone" should be more rigorously defined, and this has apparently been done at Los Angeles, though I have not yet seen the refined definition. This is clearly an issue on which Committee III should be entitled to express a view. I would put the matter more generally (see paragraph 4 above) by saying that we need to make sure that we know unambiguously what characters a sign may adjoin without affecting its position, whether initial, medial, terminal or unattached.

17. REJECTED POSSIBILITIES. As mentioned in paragraph 15 above, it was originally proposed to abolish the rules restricting the use of lower signs, but the perceptual problems which this would cause have now been acknowledged, and the proposal has been rescinded. It was originally proposed to change the question mark to dots 1-5-6, thereby making it impossible to use the WH contraction terminally; but this has also been rescinded, on the ground that there is merit in keeping all the major punctuation marks as lower signs. This has no practical effect on existing contractions, since the letters WH do not naturally occur at the end of English words, but it leaves open the possibility of using the character terminally to represent some new contraction, if this is thought desirable. In view of the proposal to change the method of showing italics, consideration was given to allowing dots 4-6 s to represent the word "less", but this was rejected to avoid creating an asymmetry in the rules governing the use of final-letter contractions. The question of whether to abolish the contraction for DIS (as well as DD) was discussed at Los Angeles, but it was decided not to do this, since the stop is not normally an initial character. Committee II does not appear to subscribe to the view, advocated most powerfully by Peter Duran in his paper for the New York workshop of 1976, that it should be possible without editorial intervention to braille any sequence of print characters in any order regardless of meaning or likelihood of occurrence. None of these rejected possibilities is very likely to be revived, but I thought it desirable to update people who may be acquainted with earlier stages of Committee II thinking.

18. NEW POSSIBILITIES. There are three other contractions which I perceive as being under possible threat, perhaps unjustifiably. The proposal to change the signs for brackets, made at Los Angeles, creates the possibility that a new literary or technical use of lower g will be found which conflicts with its use as the contraction for GG. More importantly, the dot locator is used in Committee II documents, and is therefore presumably intended to form part of any unified braille code, so the use of the contraction for FOR (except as a word: about 25% of all instances) appears to be under threat, in view of the need for automatic braille to print translation to be as free from editorial intervention as possible. Finally, the line sign is not used in BANA countries, and Committee II has failed to notice that something needs to be done to prevent possible clashes with the terminal use of the AR contraction (about 12% of all instances). This could be achieved either by abandoning the line sign altogether (which I would not support), or by changing the rules governing its use (eg by requiring that it should be spaced from all preceding punctuation), or by abolishing the terminal use of the AR contraction (leading to an unwelcome return of the collocation EA contraction plus r in some but not all cases). There may be some other items of this kind which I have overlooked, but there will be a thorough search for outstanding symbol clashes when the project is nearing completion.

III. The Way Forward

19. IMPACT OF PROPOSED CHANGES. On the basis of the figures I have provided, the total number of cells lost in a million words of grade 2 text by implementing all the proposals contained in paragraphs 6 to 13 above would be 168231. This is about 12% of the total space saved by grade 2 over grade 1, and would add about 4% to the cell count of grade 2 text. Taken in conjunction with possible adoption of full capitalization, other changes not yet quantified (including those not affecting contractions) which UBC can be expected to bring, and the increasing prevalence of single sided braille production, British braille will become significantly bulkier than it was 30 years ago or less. On the basis of work done by Tobin at Birmingham University, I would expect this to result in a small but noticeable adverse effect on reading speed and fluency. The choice of contractions proposed for deletion, being unrelated to space saving considerations, merits closer scrutiny. In addition to the upper word sequences there are 20 contraction types, amounting to about 11% of the total number, though it has to be said that some of the final-letter contractions, and especially the dots 4-6 series, are in my view not under serious threat. The mean value of all grade 2 contractions is 7511, whereas for those proposed for deletion (treating upper word sequences as if they were an extra contraction type) it is not much higher at 8011, though this figure would rise sharply if the final-letter contractions were retained. But because of the large number of uneconomic contractions in the system, the median value (using my amended version of table 16) is as low as 1820, and only four of the contractions listed by Committee II fall below this figure, all but one of them final-letter contractions. None of the lowest ranking contractions is listed, though the third highest is.
Other criteria of value are just as important as space saving - some would say more so - and I only spend less time on these because I have less hard information to give. But it seems to me that the complexity and probably also the number of rules will be increased in UBC as a whole, though Committee II is entitled to point out that some rules do not need to be known by readers (but what about writers?) at an early stage. No attempt has so far been made to target for deletion contractions (like "character") which are known to be hard to recognize, or those (like some shortforms) which are known to be hard to remember; and the issue of contractions which tend to be confused with one another has not yet been addressed as such. Nor has the question of liberalizing (with a view to simplifying) the rules governing bridging contractions, diphthongs and the like, on which Les Pye has written so persuasively. But all this, it will be pointed out, is now within the remit of Committee III. And it cannot be too highly stressed that all the members of Committee II, as well as other people actively involved in the UBC project, are strongly in favour of study assignments being given to the Braille Research Centre in Louisville, and of the results being acted upon. In view of all this, I expressed the private opinion, both at Louisville and Sacramento, that if deletions of the kind indicated above were to be implemented, a substantial amount of the space lost should be made good by the addition of a small number of new contractions, and that we should not finally accept any major deletions until we had a reasonably clear idea of where the compensatory contractions are coming from. For there are undoubtedly difficulties associated with my view. 
New contractions would have to be very carefully chosen so as to harmonize with the rest of the system; it must, for example, be recognized and accepted that the prevailing climate of opinion on both sides of the Atlantic - I do not know about the southern hemisphere - would be extremely hostile to the idea of substitution. In addition, the scope for finding sensible signs for new contractions is not unlimited: Joe Sullivan in his recent memorandum has pointed out to me, for example, that combinations of dot 5 with non-alphabetic characters are being rapidly used up on technical assignments. No one is yet in a position to calculate the deviancy of the UBC literary code with respect to grade 2, but my estimate, assuming all the proposed deletions, would be 10%. If this seems an undesirably high figure, it could be pointed out that the deletion of contractions results in deviant forms that are not totally unfamiliar; but of course it remains true that the figure will be made worse by the addition of "any" new contraction.
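
As a check on the arithmetic, the aggregates quoted in this paragraph can be reproduced from the values given in paragraphs 3, 6 to 13 and 14. The following Python sketch is illustrative only: the variable names are not taken from any of the source documents, and the figures are simply those quoted in this paper, without the minor adjustments for overlapping contractions mentioned in paragraph 3.

```python
# Cells lost per million words if each item were deleted (paragraphs 6 to 13).
values = {
    "BLE": 9053,
    "COM": 9902,
    "DD": 1741,
    "TO": 50959,
    "INTO": 4064,
    "BY": 8662,
    "upper word sequences": 18481,
    "dots 5-6 series (7 contractions)": 27255,
    "dot 6 series (2 contractions)": 23203,
    "dots 4-6 series (5 contractions)": 14911,
}

GRADE2_SAVING = 1419632  # cells saved by grade 2 over grade 1 (paragraph 3)
GRADE1_CELLS = 5650000   # approximate cells in a million words of grade 1
CONTRACTIONS = 189       # number of contractions in the system (paragraph 14)

total_lost = sum(values.values())
sequences = sum(values[k] for k in ("TO", "INTO", "BY", "upper word sequences"))
grade2_cells = GRADE1_CELLS - GRADE2_SAVING

types_deleted = 3 + 3 + 7 + 2 + 5  # BLE, COM, DD; TO, INTO, BY; final-letter signs
mean_all = GRADE2_SAVING / CONTRACTIONS
# Upper word sequences are counted as one extra contraction type.
mean_deleted = total_lost / (types_deleted + 1)

print("total cells lost:", total_lost)
print("share of grade 2 saving: %.1f%%" % (100 * total_lost / GRADE2_SAVING))
print("growth in grade 2 cell count: %.1f%%" % (100 * total_lost / grade2_cells))
print("sequences as share of saving: %.1f%%" % (100 * sequences / GRADE2_SAVING))
print("contraction types deleted:", types_deleted)
print("mean value, all contractions: %.0f" % mean_all)
print("mean value, proposed deletions: %.0f" % mean_deleted)
```

The totals agree with the figures quoted: 168231 cells lost, about 12% of the saving, about 4% added to grade 2, and means of 7511 and 8011.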

20. MATTERS REQUIRING DECISIONS. We are of course free to defer a decision on any matter, but the following fall to be decided on 11 July: whether to endorse or reject any or all of the proposals contained in paragraphs 6 to 16 above; whether or not to accept in principle the view that significant space loss should be substantially compensated for by the creation of any new contractions; whether at this time to suggest any contractions for consideration; whether to delimit at this time the amount of change to the contraction system that we would consider acceptable; whether to lay down any principles that should guide the direction that change should take in this area; whether to propose any subjects for study by the Braille Research Centre; whether to recommend that certain changes, or indeed all changes, to the contraction system should only be made after their educational impact has been properly investigated by the Braille Research Centre; whether to make any recommendations with regard to liberalization of the rules governing the use of contractions; whether to recommend to the Annual General Meeting of BAUK on 20 July the appointment of a specific additional UK representative onto Committee III; whether to make any other recommendations not covered by the nine items listed above. Any decisions we do make will be transmitted to the other members of Committee III as soon as possible.
