BA/9/94
From: Bill Poole (Chairman, UBC Committee IV);
To: All members of Committee IV, all members of the Languages Committee of BAUK, all officers of BAUK, Joe Sullivan (Chairman, UBC Committee II), and such other people as the author of the paper, UBC Committee IV, or the Languages Committee of BAUK may decide;
Subject: The interface between foreign language and linguistics codes and English braille in the context of the Unified Braille Code Project;
Date: 15 August 1994.
1. PURPOSE OF PAPER. On 19 May 1994 there was a meeting of the Languages Committee of BAUK, at which the main topic of discussion was the interface between special braille codes for foreign languages or linguistics and English braille (particularly the literary code) in the context of UBC generally, and certain specific proposals contained in the report of Committee II dated 23 November 1992. Four subject areas, contained in paragraphs 3.11 to 3.14 of the report, were identified as requiring our special attention, and a fifth emerged in discussion as a result of more recent deliberations by Committee II. The purpose of this paper is to acquaint the members of Committee IV with the conclusions we reached on 19 May, to seek their reactions to these, and to invite them to comment on any other matters not covered at all, or not adequately, in this paper, which they regard as meriting the attention of the committee. I will report on the progress of our discussions to Committee II at or before its project meeting in London in January 1995. I shall also append to this paper a brief outline of other activities of the Languages Committee of BAUK, which may be of interest to recipients of the paper.
2. ACCENTS. In paragraph 3.11 of its report Committee II proposes the following assignments for accents: acute, dots 4-5 3-4; cedilla, dots 4-5 1-2-3-4-6; circle over letter, dots 4-5 1-2-4-6; circumflex, dots 4-5 1-4-6; diaeresis, dots 4-5 2-5; grave, dots 4-5 1-6; tilde, dots 4-5 1-2-4-5-6. It is proposed that these seven signs should be used in foreign language words which appear as part of English text, and that they should follow the letters to which they apply, irrespective of their position in print. It is proposed that the general accent sign, dot 4, should be abolished. Several issues arise.
(a) How do we distinguish between occasional foreign words occurring in English context, in which these signs are operative, and foreign language passages, in which they are not? Are there any other contextual categories which we need to identify? I shall address these questions in paragraph 3 below.
(b) Are these the best assignments for the seven accents listed? At our meeting on 19 May they were not seriously contested, at least within the context of a series of two-cell signs with a common first character. Convenient mnemonics are provided by the shapes of some, such as acute and grave, and by the special foreign code assignments of others, such as cedilla and tilde. However, if members of Committee IV have significant improvements to suggest, or significant disadvantages to point out, now is the time to do so.
(c) Are there any other accents for which assignments should be made at this time? We did not discuss this, but our focus has been almost exclusively on western European languages, and in my view there is a good case to be made for legislating now for at least some of the following: the hacek (a tiny v above the letter, as in Czech); an oblique stroke (top right to bottom left) through a letter (as in Danish o); a horizontal stroke through a letter (as in Polish crossed l); a dot above or below a letter (as in Turkish, and in the transliteration of some non-Roman scripts). The list could be continued, since the range of accents on letters in occasional foreign words occurring in English texts for the educated layman has been increasing markedly. In my view the best solution would be to have about three transcriber-defined accent signs in addition to the specific assignments which we eventually agree to. In some cases an accent is an integral part of the letter rather than a separate mark attached to it; and in some languages accented letters count as separate dictionary entries, while in others they do not. But I do not on the whole think that these considerations should affect the ways in which accents are treated in UBC braille.
(d) Should the general accent sign be abolished? For the ordinary reader it provides an uncomplicated way of indicating the presence of an accent of some kind in the text, but clearly there is merit in being able to distinguish the type of accent in all situations, and for automatic braille to print translation this would appear to be essential. However, the view was forcibly expressed that it is undesirable, and may be a positive deterrent, to burden new adult learners of braille at an early stage with the need to get to know a substantial number of two-cell signs for little reward. In certain situations British Braille provides for the special foreign code accent signs to be used in a word, so long as it is prefixed by the letter sign. Would a system of this kind be preferable to the new proposed two-cell signs, or would that simply make ordinary English words which happen to contain an accented letter even more unfamiliar to the general reader?
(e) Should the proposed new accent signs follow the letters to which they apply? We were very much opposed to this, though we recognized that, whether they precede or follow the letters, a logically coherent system results. However, in traditional English literary braille (though not necessarily in technical codes) modifiers tend to precede the characters they modify, and the placement of the accent sign is only one particular instance of this; moreover, in my experience of keyboarding, accents usually (though apparently not always) have to be registered before the letters they apply to, to obviate the need for backspacing. What positive benefit is to be gained from changing traditional English braille practice in this respect? Given that UBC and traditional braille would be likely to exist in parallel for a considerable time after the projected implementation of UBC, do we really want a situation in which ordinary readers are uncertain, even if only momentarily, as to which letter in an unfamiliar word the accent covers?
(f) We are in the process of building up a substantial series of signs for accents based on dots 4-5, and this erodes the scope for creating new contractions based on this character. Does this matter? This is more a question for Committee III, to which I shall be reporting on it in greater detail, but at present the answer appears to be no. Evidence from other countries has yet to be collected, but there is no inclination within BAUK to compensate by the addition of new contractions to this series for the loss of space that would result from the acceptance of some of Committee II's other proposals. In any case it should be noted that all the accent signs so far proposed use non-alphabetic characters, which would probably be regarded as less of a threat to any contraction development that might later be thought desirable.
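By way of illustration only, the seven assignments listed at the head of paragraph 2 can be tabulated mechanically, and the two placements debated in (e) compared side by side. The sketch below forms no part of Committee II's report. It assumes the Unicode braille pattern encoding (U+2800 plus one bit per dot) as a convenient way of writing cells, and the names cell, accent_sign and accented_letter are hypothetical.

```python
# Assumption: cells are written as Unicode braille patterns, where
# dot n corresponds to bit (n - 1) added to the base code point U+2800.
BRAILLE_BASE = 0x2800

def cell(dots: str) -> str:
    """Convert a dot list such as "4-5" into one Unicode braille cell."""
    bits = 0
    for d in dots.split("-"):
        bits |= 1 << (int(d) - 1)
    return chr(BRAILLE_BASE + bits)

# Common first cell of the proposed two-cell accent signs.
ACCENT_PREFIX = "4-5"

# Second cells as listed in paragraph 2 (from paragraph 3.11 of the report).
ACCENT_ROOTS = {
    "acute": "3-4",
    "cedilla": "1-2-3-4-6",
    "circle": "1-2-4-6",
    "circumflex": "1-4-6",
    "diaeresis": "2-5",
    "grave": "1-6",
    "tilde": "1-2-4-5-6",
}

def accent_sign(name: str) -> str:
    """Return the two-cell sign for the named accent."""
    return cell(ACCENT_PREFIX) + cell(ACCENT_ROOTS[name])

def accented_letter(letter_dots: str, accent: str, accent_first: bool) -> str:
    """Place the accent sign before or after the letter cell.

    accent_first=False follows the Committee II proposal (sign after letter);
    accent_first=True follows traditional English literary practice.
    """
    letter = cell(letter_dots)
    sign = accent_sign(accent)
    return sign + letter if accent_first else letter + sign

if __name__ == "__main__":
    for name in ACCENT_ROOTS:
        print(f"{name:11s} {accent_sign(name)}")
    # The letter e is dots 1-5; show e acute under both placements.
    print("sign after letter: ", accented_letter("1-5", "acute", accent_first=False))
    print("sign before letter:", accented_letter("1-5", "acute", accent_first=True))
```

Either placement yields the same three cells for an accented letter, which is consistent with the remark in (e) that both orders give a logically coherent system; the difference lies only in which letter the reader takes the sign to cover.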
3. GREEK LETTERS. In paragraph 3.12 of its report Committee II proposes that small Greek letters should be prefixed by dots 4-6, and capitals by dot 6 plus dots 4-6. It is also proposed to use the character set which has been employed internationally for well over a hundred years in classical Greek transcriptions, rather than the one favoured in Greece itself, which alone is listed in the current edition of "World Braille Usage". Two issues therefore arise.
(a) To take the simplest question first, was it right to opt for the classical rather than the modern Greek character set? Speaking as a classical scholar, my answer would be an unequivocal yes, and this was supported by our Languages Committee on 19 May. These signs are used not only in classical Greek texts, but also for Greek letters occurring in all mathematics notations of which I have detailed knowledge, and it was a mistake by the compilers of the latest "World Braille Usage" not to include both sets. Modern Greek comes well down the list of languages with which English has to interface, though of course I am not saying that it never happens. It would be easy to develop a "politically correct" argument on the basis that braille scripts should be the property of their indigenous language users. I will reply to this if it is seriously put, but at the moment it seems sufficient to say that I am not seeking to deny to any language group the ultimate right to braille their language as they think best.
(b) What is the scope of operation of the Greek letter prefixes? As will appear, this conceals two interrelated questions. The characters to be used for these prefixes were not themselves contested on 19 May, but it emerged that there was uncertainty as to whether these prefix characters would need to be repeated before every Greek letter in a sequence or not. It was recognized that it would be intolerable to do this for small Greek letters, let alone capitals, and the matter was referred to Joe Sullivan, Chairman of Committee II, for clarification. In a memorandum dated 18 June to his committee, and copied to me, he has dealt with the issues involved, and my analysis of the situation is based on the information which he very helpfully supplied. He begins by pointing out correctly that the Greek letter indicators are only prefixes, not complete symbols, since under UBC symbol construction rules a prefix character (other than dots 5-6), or a sequence of such characters, cannot constitute a valid symbol unless followed by a space. Therefore in representing the Greek alphabet it is the prefix (or prefixes) plus root character which alone constitute a complete symbol. This is important because it means that under present rules the effect of the prefix is terminated after the first root character, since a valid symbol can never contain more than one root character. He then goes on to quote the report of his committee to show that the use of the Greek letter symbols does not apply "where actual Greek language passages ... are to be transcribed", but only "where Greek letters are used in essentially English or technical context". Here he is making the distinction which I drew in paragraph 2 (a) above, but the problem is to know how to apply it in practice. His examples tend to be uncontroversial, and though he develops his argument in a way which shows his awareness of problem areas, he comes in my view to no very clear conclusion. So let me try. 
We would all recognize that a quotation from Xenophon or Pindar occurring in English context is an "actual Greek language passage" not requiring the Greek letter symbols, but covered by paragraph 3.13 of the Committee II report (to be discussed in my paragraph 4 below). We would all, I assume, equally agree that there would be nothing wrong with using the Greek letter symbols, singly or in small groups, in occasional mathematical expressions occurring in English context, or for the initials of American fraternities and sororities, or in other situations where the number of consecutive Greek letters remains small. But what about the odd word, or small group of words, occurring in English context, or groups of Greek letters not constituting words, and not easily recognized as such, perhaps, by the text processor? More important, what about bilingual texts, such as grammars, where English may be interspersed not only with the occasional Greek word or words, but with word fragments (such as prefixes, suffixes and inflections)? I will confine myself to Greek here, so as to avoid straying too far into the territory of the next paragraph; but it is easy to imagine situations where Greek script does not establish itself for long enough to be recognized by the reader without the aid of some kind of indicator, but where the profusion of Greek letter symbols as presently proposed would be clumsy and inappropriate. British literary braille has a solution which would be unacceptable within UBC, and which is in any case not unproblematic, whereby individual Greek letters are prefixed by dot 2, and individual Greek words (up to three) by dots 5-6, with a double letter sign before a group of four or more words and a single letter sign before the final word. Although this cannot be recommended within UBC, it suggests a solution which could.
There should be a Greek mode, similar to some of those already defined by Committee II, providing indicators for Greek letters, words and passages. I suggest no assignments at this stage, though others may wish to. Simple boundary definitions could in my view quite easily be devised, which would ensure that the vast majority of situations were covered satisfactorily. Greek is still probably the most frequent non-Roman script to occur in English context, though obviously there are English books in which Arabic or Cyrillic or some other is the dominant non-Roman script. This solution might also be of assistance in mathematical or other technical contexts where there may be topics in which collocations of Greek letters may be frequent and/or large, but that is a matter for Committee II to decide. Joe Sullivan is doubtful about the need for a Greek mode, but I would strongly recommend its acceptance to Committee IV.
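The objection in paragraph 3 (b) to repeating the prefixes before every letter can be put in figures. The sketch below counts cells only. The per-letter prefixes are those proposed by Committee II (dots 4-6 for small letters, dot 6 plus dots 4-6 for capitals). The Greek mode indicator, by contrast, has no assignment as yet, so its length of two cells, and the absence of a terminator, are purely hypothetical assumptions made for the sake of the comparison.

```python
# Prefixes proposed in paragraph 3.12 of the Committee II report.
SMALL_PREFIX = ["4-6"]          # small Greek letter
CAPITAL_PREFIX = ["6", "4-6"]   # capital Greek letter: dot 6 plus dots 4-6

def cells_with_prefixes(n_letters: int, capitals: bool = False) -> int:
    """Cells needed when every Greek letter carries its own prefix.

    Under the symbol construction rules the prefix lapses after one root
    character, so each letter costs its prefix cells plus one root cell.
    """
    prefix = CAPITAL_PREFIX if capitals else SMALL_PREFIX
    return n_letters * (len(prefix) + 1)

def cells_with_mode(n_letters: int, indicator_cells: int = 2) -> int:
    """Cells needed for small letters under a single mode indicator.

    indicator_cells is a hypothetical figure; no assignment has been proposed.
    """
    return indicator_cells + n_letters

if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        print(n, cells_with_prefixes(n), cells_with_prefixes(n, capitals=True),
              cells_with_mode(n))
```

On these assumptions the per-letter prefixes are no worse for a single letter or a pair, which is the case Committee II had chiefly in mind, but the cost doubles (or trebles, for capitals) the length of any longer sequence, whereas a mode indicator adds a fixed overhead however long the sequence runs.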
4. FOREIGN LANGUAGE INDICATORS. The question of foreign languages generally now arises. In paragraph 3.13 of its report Committee II proposes that the present treatment of foreign languages in English context should continue unless a specialist committee (now appointed as Committee IV) decides otherwise. Some of the ground has already been covered in the previous paragraph. But some new issues arise.
(a) Should we draw a distinction between the treatment of foreign languages written in Roman script and those written in other scripts? There is in my view a good case to be made for creating, say, two modes for transcriber defined non-Roman scripts, similar to Greek mode. They could be used in bilingual text as well as in foreign words or letters occurring in English context. The occurrence of such non-Roman script material would in any case have to be recognized before transcription from print to braille could be automatically effected, or at any rate completed. The modal treatment of non-Roman scripts, and the need to differentiate them from one another, can be advocated on the ground that their use affects more radically the way in which the 63 positive braille characters are to be interpreted than does the mere use of another language employing Roman script. It is, however, seriously questionable whether such mode indicators, if agreed to, should be regarded as part of basic braille, because of the undesirability, already referred to, of proliferating complex signs which adult learners would have to become acquainted with at an early stage. This would mean that ordinary readers would sometimes come across character collocations which they could neither read nor understand; but is this significantly different from the situation in which print readers find themselves when confronted with the occasional word in an alien script or language? Many of our decisions will have to be taken against the background of the increasing use of foreign words from a growing number of languages which occur in quite ordinary English books nowadays. Of course special modes would not have to be used when words from such languages as Arabic or Russian were transliterated into Roman script.
(b) Should there be switch signs to move from English to foreign languages employing Roman script and back again? Such switches would not be required for books wholly in a foreign language, or for occasional foreign words occurring in English context (which are still generally distinguished by italic type, unless they are being regarded as thoroughly anglicized); how, after all, would such words be recognized by humans, let alone machines? It is unlikely that there is one standard dictionary whose arbitration would be accepted throughout the English speaking world. Problems, however, might arise in relation to books with significant amounts of continuous foreign text, or with bilingual books, such as grammars, readers and parallel text editions. Nevertheless, the BAUK Languages Committee was strongly opposed to a switch sign in these circumstances also - the issue of non-Roman scripts was not clearly identified - using arguments some of which had already appeared in Committee II's report. A switch sign would be clumsy because it would have to occupy two cells (one member even raised the spectre of a separate switch for each Roman script language). Print does not use, or apparently need, such a switch, so print to braille translation (not, I think, braille to print, as stated by Committee II) would require human intervention. However, there are times when readers are temporarily confused by not immediately recognizing what language a word belongs to; this situation also occurs in print, but it is in my view exacerbated by the use of English contractions in foreign words. Imaginative formatting can ease the difficulties in grammars and readers. I would therefore recommend to Committee IV that we should not support the use of any special explicit braille indicator to mark the boundary between two languages which both employ Roman script.
(c) How should punctuation, composition, and other ancillary signs be treated in foreign language text occurring in books produced in English speaking countries? Besides the obvious punctuation and composition signs, what is at issue includes such signs as the asterisk and dagger, simple mathematical signs that are liable to crop up in ordinary context, and the method of representing unit abbreviations. I will refer to all these collectively as ancillary signs. The strongly held view of the BAUK Languages Committee is that foreign language text produced in the UK (which may of course contain substantial amounts of English intermixed), should, so far as possible, use the same ancillary signs as would be used in the country where the language concerned is indigenous. British teachers of foreign languages in particular stress the need for the books we produce to provide an easy stepping stone for students to become familiar with the conventions which apply in the countries where the languages are spoken, since many of the books they need will have to be obtained from there. This raises problems, such as the need to research some of the details of indigenous braille practice, and the fact that ancillary signs can vary slightly between countries belonging to the same language group. However, it is the problems raised within a UBC context that I want to focus on. If British practice with regard to ancillary signs is imported into UBC, the extent of its application will need to be carefully defined, and it will have a substantial impact on automatic translation, especially when taken in conjunction with the recommendation in subsection (b) above, since books with significant amounts of foreign language text have not yet been excluded from the scope of UBC (nor am I arguing here that they should be). The practice with regard to ancillary signs in other UBC countries needs to be known and compared.
Some of them (such as those used in association with numbers) fall clearly within the jurisdiction of Committee II. Some others (such as the Spanish initial question mark and exclamation mark) might cause confusion in mixed language contexts. There was once an international consensus with regard to the basic punctuation and composition signs. The British began to break away from this in 1905 when we abandoned the dots 2-6 question mark in favour of the use of this character for the EN contraction. We diverged further in 1932 when as part of a joint agreement with the US we changed the capital sign from dots 4-6 to dot 6. Other non-English speaking countries have also been gradually breaking ranks in other respects, motivated by considerations of very varying cogency. If we want to differentiate between opening and closing round brackets (as I do), we simply have to change our bracket signs. There is a proposal currently before Committee II but not yet voted on, to interchange the full stop and apostrophe signs. This would in my view be perceived by British braille users as a major and disruptive change, and I expressed opposition to it on 19 May; but no consensus was reached within our Languages Committee on the issue: after all, it was pointed out, the Germans have gone over to the use of dot 3 for the full stop without apparent hardship (and before 1932 we used it for the abbreviation point, which we distinguished from the full stop ending a sentence). In my view international agreement should be sought for a few of the basic ancillary signs, which are not peculiar to the character sets of only a few scripts or languages, to have as nearly as possible universal braille assignments. It is probably too idealistic to expect much progress on this in the near future, but it should, I think, be referred to the WBU Literacy Committee.
(d) Should foreign language contractions be used at all in foreign language text produced in English speaking countries? In the UK we use a few of the most frequent ones, and expect those of our students who intend to do a substantial amount of foreign language reading to learn the fully contracted codes, where these exist. We are currently preparing books in English, in both braille and print, setting out the French and German codes, including obsolete signs which can still be encountered in books held in library stocks. My belief (which is open to correction) is that other UBC countries ignore foreign contracted codes as far as possible. But I have flagged up the issue because a move towards British practice within UBC would complicate the automatic production of foreign language braille.
(e) Are there any assignments that we should discourage Committee II from making, so as to avoid possible conflicts with what Connie Aucamp calls sibling codes, ie codes for languages used alongside English in UBC countries, such as French in Canada, the indigenous languages of Africa, Chinese in Hong Kong, and the Asian languages of the Indian subcontinent? We need to be better informed than I at any rate am about some of the possible pitfalls in this area.
5. PHONETIC SCRIPT. In paragraph 3.14 of its report Committee II proposes that the treatment of phonetic script within UBC should be considered by a specialist committee (now appointed as Committee IV). We should also in my view consider whether there are any other topics in linguistics for which we may in due course wish to legislate. Around 1925, with the assistance of Professor Daniel Jones, RNIB developed a braille representation of the symbols of the international phonetic alphabet, which has been made available throughout the world. Other countries have, however, devised divergent systems, and the UK code will in any case need to be updated to bring it into line with changes in the IPA itself. Around 1969, shortly before becoming a member of what is now BAUK, I was responsible, initially only within the jurisdiction of RNIB, for changing the sign (dots 5-6 lower g) which according to the codebook should enclose phonetic symbols to two signs: dots 4-5 lower g for phonetic brackets (square brackets in print), and dot 4 lower g for phonemic brackets (oblique stroke in print); as with the original symbols of enclosure, I used the same sign for the opening and closing of these brackets. Clearly this will have to be changed again in order to conform with the UBC philosophy of differentiating between opening and closing brackets. I do not generally (and did not then) approve of unilaterally changing a sign in what was intended to be an international standard braille code; but I felt constrained to act as I did, because in 1966 the British Mathematics Notation Committee (of which I was then a member) decided to use dots 5-6 lower g for the equals sign so as to make it acceptable within literary braille, and no one else seemed much interested in the conflict which I indicated would result. In the light of this experience I very much welcome Committee II's intention before the completion of the project to check all symbols for possible conflicts of this kind. 
However, at its meeting in Los Angeles in April it was suggested that Committee II itself should make assignments for the brackets which enclose phonetic symbols, leaving Committee IV free to design the actual code. In my view Committee IV should make these assignments in consultation with the Chairman of Committee II, who will of course have an overall responsibility to identify and eliminate any symbol conflicts that may arise before the completion of the project, as described above. The reason why in my view Committee IV has to make these assignments is that the choice of characters used will have an effect on what is available for use in the phonetic script itself, since its character set, including modifiers, is large and complex, and there is pressure to maximize the equivalence between characters which represent specific phonetic sounds and characters which represent the letters which make those sounds in particular language codes. There is also another problem. The standard UBC signs for oblique stroke and square brackets, whatever they turn out to be, cannot be used to enclose braille phonetic script without generating an ambiguity in automatic braille to print translation, not to mention confusion for the human reader, since these signs would not be specific indicators, but occur frequently in other literary and technical contexts. On the other hand, if we use different braille signs, as I believe we should, we will be violating an important principle of UBC philosophy, since the braille representation of print oblique strokes and square brackets will depend on their meaning. As regards the symbols of braille phonetic script itself, there is in my view no urgency for us to determine these, but I would wish to see the Daniel Jones system followed as closely as possible, since it has been in wide use for a long time. However, members of the committee who are familiar with other systems may have different views. There are two other general points to be made. 
Committee II has rightly pointed out in its report that the Daniel Jones system violates the UBC rules for symbol construction (eg by having signs which consist of more than one root character); it will have to be decided how far, if at all, this is tolerable in a code delimited by special indicators. Finally, in this system modifiers follow the characters they modify (contrary to what is recommended in paragraph 2 (e) above); if we support that recommendation, we shall have to decide whether or not we want this anomaly to continue.
6. LOS ANGELES DECISIONS. There are two decisions made by Committee II at its Los Angeles meeting last April which I need to draw to Committee IV's attention. Committee II took the view that diacritics fall within its jurisdiction, and it intends to legislate about these. I use this term to refer to any marks attached to a letter such as accents or breathings, which serve to distinguish this form of the letter from an unmodified or differently modified form of the same letter. Committee II extends it to cover stress and scansion marks, which is fair enough. In my view all these matters come within the scope of linguistics, and should primarily be under the jurisdiction of Committee IV, with appropriate collaboration of course. In addition Committee II has passed a resolution "that to the extent possible, Committee II will choose symbols that do not increase the amount of international diversity". This is the kind of resolution to which people can appeal to support or attack almost any specific decision that falls within its scope. But the resolution was motivated by a desire to recognize the fact that English is not isolated, but has interfaces with many other languages, and we should certainly welcome and support the spirit behind it. I would also welcome Committee IV's views on the desirability of retaining extended grade 1 modes, since our attitude to this impinges on other recommendations we might make. I am personally in favour of this, but Committee II seems now to have become very much exercised over the question of whether grade 1 word mode and passage mode are really more trouble than they are worth.
7. CONCLUSION. I have dealt in detail with the four sections of Committee II's report which cover the territory of Committee IV, and in the course of doing this I have managed to incorporate all the extra material which I promised in paragraph 1 above, so I have nothing further to add. This is of course a discussion document, not a set of cut and dried proposals awaiting approval; and although I have made some recommendations for the committee to consider, I have left many important questions unresolved. I very much look forward to receiving the contributions of other committee members to the debate.