G L O S O R   -= DICTIONARY-PRACTICE =-
(C) Copyright 1996-99 Joakim Ek
Version 4.02 1999-09-27

This is an enhancement of the previous version, 3.2x, that now supports
other, non-Cyrillic languages, like Spanish. I have rewritten a lot of the
internal file handling, but I have tried to keep it fully
backward-compatible with old DAT-files and switches, so it should be
possible to replace the old GLOSOR.EXE with this one. The conversion
functions are not updated to handle other languages; that will come in a
later version...

The program shows a word or a phrase in one language and you answer in the
other. The language in which it asks, and which word, is selected randomly
from all the words in the current DAT-file. This behaviour can be changed
with switches though, see below. There is a prompt saying which way the
translation is supposed to go, like "BUL->ENG", which means the displayed
word is BUL (Bulgarian) and the answer is wanted in ENG (English). It also
keeps each language on its own side, so the input field will switch between
the left side (usually the known language) and the right side (usually the
language you are learning).

Your answer must match exactly the answer in the database, either just the
normal+highlighted text, or the complete word(s) including the dim text. It
will be considered wrong if it is misspelled, if words are swapped, or if
only a part of the dimmed text is typed in. It is not case-sensitive
though, but when using Cyrillic it supports only upper-case letters. The
answer can be given in any of the available to-languages, or as the
pronunciation or latinised version of the Cyrillic, whatever is available.

It now supports both my Cyrillic codepages jek850 and jek852, but also
KOI-8, called /CP:888, and CP866/Alternative Russian, called /CP:866. Each
of these codepages can be used both in the source files and on the screen,
and it converts appropriately. Note though that it currently supports only
uppercase letters for Cyrillic.
If no Cyrillic is available on the screen it converts to Latin letters
instead. (My drivers KBJEK and JEKFONT can be used to get Cyrillic support
in DOS, and they also have a codepage/setting that works with the /CAPS
function.)

The latinization of words is made phonetic in correspondence with Swedish
pronunciation, which differs slightly from the usual transliterations,
especially because the letters Å and Ö are used, which are not present in
the English alphabet. The following table shows each Cyrillic letter and
the corresponding Latin letter. The key used to type it when using jek850
is also shown:

CYR 850:      a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
CYR 852:      а   б   ж   д   е   ф   г   х   и   й   к   л   м   н   у
SOUND:        A   B   ZH  D   E   F   G   CH  I   J   K   L   M   N   O
KEY:          A   B   C   D   E   F   G   H   I   J   K   L   M   N   O

CYR 850:      p   q   r   s   t   u   v   w   x   y   z   å   ä   ö   ü
CYR 852:      п   щ   р   с   т   ю   в   ш   ц   ч   з   о   я   ъ   ь
SOUND:        P   SHT R   S   T   JO  V   SH  TS  TJ  Z   Å   JA  Ö
KEY:          P   Q   R   S   T   U   V   W   X   Y   Z   Å   Ä   Ö
ENGLISH KEY:  (A-Z same as Swedish)                       [   ;   '   ]

On the line between the questions there is a ( ) with numbers inside. The
first (or only) number is the number of remaining words in the database.
The second number, if present, is the number of words that have not been
correctly answered. On the right side of the line there are numbers showing
the number of questions asked, the number of right answers and the right
answers in %. These numbers are not zeroed even when the database restarts
with all words. To begin at 0 again you should exit and restart the
program.

When using old DAT-files, made for GLOSOR 3.xx (they do NOT begin with
***), it will default to Bulgarian spelling+pronunciation to the left, and
Swedish and English to the right. With new DAT-files there can be different
languages in a file, and you can select which to use where with switches.
If no switches are used, the first two languages in the file will be to the
left and the next two to the right. If the file does not have 4 languages
it will use some other rules to select them; just try, or specify with
switches.
When a question is answered correctly a * will mark that question. You may
also set an option that turns correctly answered questions green and wrong
answers red. If you specify a DAT-file with the /FIL: switch it will
default to having the /FELFÄRG option turned on; this can be forced off by
using /-FELFÄRG.

If you think an answer is right, even though the computer thinks you are
wrong (like omitting one of two options), you may override it by beginning
the next answer with "+" or ",". The previous question will then be marked
with + (instead of *) and, if enabled, the color changed to green. The
score will also be increased by 1.

Normally the vowel in the stressed syllable is marked with highlighted
text, but this depends on what is put in the DAT-file and may vary. There
are also often some dim letters/words. These are used either to display
information like SING for singular, V for verb and others, or to make some
words optional in the answer if there are several words corresponding to
one word in the other language. The way the dim words are used might differ
between DAT-files, but that should be clear after using it a while. That
one of two words in an answer is dim does not necessarily mean that it is a
less used version of the word, but usually the most common one is in normal
white.

Usually words with the same meaning are separated with , and words of
different meaning are separated with ;. Usually XXX is used as a
placeholder where any word can be written. At the end of a phrase ... is
often used instead.

When all words in the database are done, it will restart. Depending on the
option it will ask all words again, or just the ones you answered wrong.
You may also specify whether the ones corrected using + should be asked
again. The score counter is not reset; you will have to exit the program if
you want to begin at 0 again. If all words were answered correctly it will
start over with all words again.
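
The restart rules above can be sketched in a few lines of Python. This is
only an illustration, not the code GLOSOR uses (the program is written in
QB). The function name, the mark values and the way the mode is passed are
assumptions; the mode names are borrowed from the /KANINTE switches
described under the commandline options.

```python
def words_for_next_round(words, marks, mode="-KANINTE"):
    """Pick the words to ask again when a round is finished (hypothetical helper).

    words: list of all words in the DAT-file
    marks: dict mapping a word to "*" (right), "+" (corrected with +)
           or "" (wrong); a missing word counts as wrong
    mode:  "-KANINTE" = always all words
           "KANINTE"  = words answered wrong, including + corrected ones
           "+KANINTE" = words answered wrong, excluding + corrected ones
    """
    if mode == "-KANINTE":
        return list(words)
    if mode == "KANINTE":
        again = [w for w in words if marks.get(w, "") != "*"]
    elif mode == "+KANINTE":
        again = [w for w in words if marks.get(w, "") not in ("*", "+")]
    else:
        raise ValueError("unknown mode: " + mode)
    # If nothing is left to repeat, start over with all words.
    return again if again else list(words)


words = ["VODA", "NOSHT", "DEN"]
marks = {"VODA": "*", "NOSHT": "+", "DEN": ""}
print(words_for_next_round(words, marks, "KANINTE"))
print(words_for_next_round(words, marks, "+KANINTE"))
```

The score counter is deliberately left out of the sketch, since the program
never resets it between rounds.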
On the line between the questions "OMSTART" indicates a restart, and
"OMSTART, ALLA ORD" a restart with all words.

To exit the program answer with just Q or 0, or press Ctrl+Q.

It uses DOS to read keystrokes, because no internal function in QB allowed
ASCII 224 to be entered (used for Cyrillic A in my jek850). This means that
some keys don't work normally. For example, the F-keys send strange
letters; you will just have to avoid hitting them! It also made me add the
option to hit the ESC key to close the DOS key-read, to be able to call up
RELEX (an excellent TSR dictionary, not by me, not for free) by hitting
ALT+1. You cannot type any letters before activating the normal key routine
again by hitting ESC once more. A little tune indicates that ESC has been
hit, on or off. While in off-mode you can move the cursor though. Use the
arrows to move one step, PgUp/PgDown to move one question up/down, or
Home/End to move to the first column of Swedish/English or Bulgarian. This
is used to take advantage of the fact that RELEX can search for the word
under the cursor when hitting ALT+1.

If you do not have a Bulgarian keyboard driver, it is possible to use a
switch to remap the keyboard into a Bulgarian one. It is made to work on a
Swedish keyboard, so some letters won't work otherwise.

If you type Spanish or another language with accented letters you may enter
them using the Ctrl key together with a letter. The special letters
currently supported are ñ á é í ó and ú. It does not distinguish between
accented and normal letters while checking the spelling though, so "Año"
could be answered with "Ano" and it would be considered right by the
program.

To get a short help on the switches type
   GLOSOR /?
This help is displayed in Swedish.

You may make your own database of words. The program uses a special format
internally, but it includes a conversion that converts a text file in one
of a few formats to its own format.
   GLOSOR /KONV?
will display some help about this, in Swedish. Here is a description in
English:

Note though that this conversion has not been updated to use other
languages; it is still the same routines used in 3.2x, which was just for
Bulgarian. To use your own words in GLOSOR 4.XX you should edit the
DAT-files as explained in the next section.

If no ##-line (described later) is present the conversion defaults to a
file with the words in the order Bulgarian, English, Swedish. The Bulgarian
should be in the jek852 codepage (Cyrillic on lowercase Latin, as described
in the pronunciation table above, the KEY line) and the fields should be
separated with a comma (,). Each word/question is on one line, which means
CR/LF separates each word/question. Several words with the same meaning
should be separated with . (which is then converted to the , that is used
in the program).

The words can be entered in 'TON' format, that is [] around characters that
should be highlighted (stressed syllables) and {} around letters/words that
should be dimmed. There are some special things to keep in mind when using
[] and {}:

If using a [] inside a {} you should type it like:
   NORMAL{,D[I]}{MMED}
That is, a closing } should always come directly after a closing ]; then
you can put another opening { if there is more dim text to come. If this is
not done the last half of the word will be displayed in normal, not dimmed.

Also note that when dimming a word you should also dim the appropriate
spaces to make it possible to answer without the dim text. Like:
   ONE{ TWO} THREE
This will make one space optional, and "ONE THREE" will be ok. If you type
   ONE {TWO} THREE
you will have to answer "ONE TWO THREE" or "ONE  THREE", with two spaces,
to get it right. (This is currently not done correctly in the 1600-word
data file.)
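
The effect of [] and {} on which answers are accepted can be illustrated
with a small Python sketch. It is not GLOSOR's own parser (the program is
written in QB and stores the words in its internal format); the function
names are made up for the example, and it assumes, as described earlier,
that an answer is right if it matches either the complete text or the text
with all dim parts left out, ignoring case. Separators like , and ; are not
handled.

```python
def parse_ton(text):
    """Split a TON-formatted string into (char, highlighted, dim) tuples."""
    chars = []
    highlighted = dim = False
    for ch in text:
        if ch == "[":
            highlighted = True
        elif ch == "]":
            highlighted = False
        elif ch == "{":
            dim = True
        elif ch == "}":
            dim = False
        else:
            chars.append((ch, highlighted, dim))
    return chars


def accepted_answers(text):
    """Return the two spellings that count as right: with and without dim."""
    chars = parse_ton(text)
    full = "".join(ch for ch, _, _ in chars)
    without_dim = "".join(ch for ch, _, is_dim in chars if not is_dim)
    return {full.upper(), without_dim.upper()}


def is_correct(answer, text):
    """Compare an answer with a TON-formatted word, ignoring case."""
    return answer.upper() in accepted_answers(text)


print(accepted_answers("ONE{ TWO} THREE"))
print(is_correct("ONE THREE", "ONE {TWO} THREE"))
```

With the space inside the braces the short answer "ONE THREE" is accepted;
with the space outside, only the two-space variant matches, which is the
pitfall described above.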
Example of a source file:
   v[å]da,WATER,VATTEN
   nåq {F},NIGHT

Convert it using
   GLOSOR /KONV
(If just this is specified, without a filename, it will look for GLOSOR.TON
in the format described above.)
You may specify a filename like
   GLOSOR /KONV:MINFIL.EXT
and it will convert this file to GLOSOR.DAT (the internally used file).
Note that it will empty the remaining records of a previous GLOSOR.DAT but
not truncate the file. Therefore it is normally best to delete GLOSOR.DAT
before making a conversion.

If you start the first line in the source file with ## this indicates that
a parameter line is present. This line tells the conversion how to handle
the file. If any other line in the file begins with # that line will be
ignored (except of course the first one). This can only be done if a ##
line is present first.
If a line in either file format does not contain a Bulgarian (spelled)
word, or contains neither Swedish nor English, that line will be skipped.

Description of the ## parameter line:

##852TON.;,BULENGSVE
^ ^  ^  ^  ^
| |  |  |  |
| |  |  |  2, 3 or 4 fields in the order they appear in the source file.
| |  |  |  BUL=Bulgarian (spelling), ENG=English, SVE=Swedish,
| |  |  |  UTL=Bulgarian pronunciation. (If UTL is not present the
| |  |  |  pronunciation is calculated using grammar rules that normally
| |  |  |  give the right result.)
| |  |  Separators. First the one used between words of the same meaning
| |  |  (, in the program), then the one used between words of different
| |  |  meaning (;), and third the one used between the language fields.
| |  |  If # is entered as the language-field separator the file is
| |  |  considered to be fixed-spaced instead of character-delimited,
| |  |  see below. (If you don't use word separators enter the default
| |  |  , and ; )
| |  Formatting type. 'TON' is [] around highlighted letters, 'U22' is
| |  ASCII 22 around highlighted letters. Both use {} around dim
| |  letters/words. 'TXT' means no formatting (highlight or dim) is
| |  present in the source.
| CodePage (EK.CPI). Only '850' or '852' is supported (jek850 and
| jek852). This is for the BUL and UTL fields. ENG and SVE should be
| uppercase ASCII.
## indicates that this is a parameter line.

The above is for character-delimited files. For fixed-space files enter #
as the field-separator character, and put the language fields, with # in
front, at the start position of each field, like this:

##852TON,;#BUL      #ENG      #SVE
voda                WATER     VATTEN

The first field, #BUL in the example, begins at position 1 but should be
entered directly after the options on the parameter line as shown here.

You must have a Bulgarian (BUL) field, even if you do not intend to use
Cyrillic. You may just enter anything in it though, like a . , and then
enter a pronunciation in the UTL field if you use only latinized text when
practising.

If you have my xxx.G files and G.BAT you can see some examples of
conversion. (These files are either supplied with the program or in a
special archive, probably called G.ZIP, from the same Internet site.)


CUSTOM DAT-FILES IN GLOSOR 4.XX
-------------------------------
The conversion is currently not updated for other languages; instead you
should edit the DAT-files directly. A new feature is that in version 4.XX
the DAT-file can be specified with the /FIL: switch, so you do not have to
make a new conversion each time you want to use a different set of words.

All lines of the DAT-file, including the first, should be of a fixed
length, ended with CR-LF (a newline). The first line specifies the sizes
and languages of the fields. It has the following format:

***050 ENG437 SPA437 SWE437 BUL888 ******

It always starts with *** to show that it is a version 4 file. Next follows
the field width as a three-digit number; all fields (columns) should be of
the same width. Then there is a space (or any single character) followed by
a description of each field. The description is a three-letter code telling
what language to use, followed by a 3-digit number specifying the codepage.
After that there is a space (or other single character) before the next
column specifier. When all columns have been specified there should be
****** to show that there are no more fields in the file. The rest of the
line should be filled with spaces until the CR-LF. In this case there are
four fields, with 50 characters in each, so the total length of each line
should always be 200 characters, plus the 2 characters for the CR-LF
newline.

For Latin languages you use 437 as the codepage; for Cyrillic languages you
use 850 or 852 (my special codepages), 888 (KOI-8) or 866
(CP866/Alternative) to tell what codepage is used in the file. Note that
only uppercase letters are supported in Cyrillic. If you specify 000 or UTL
as the codepage, that means it is a pronunciation field for the language.
Normally there is always a normal version of the same language in the file
if a pronunciation field is specified, e.g.
"***050 BUL866 BULUTL ENG437 ******". If there is a word present in the UTL
field it will be used, otherwise the pronunciation will be calculated using
rules. The pronunciation is displayed in Swedish when it is calculated.

The names of the languages can be whatever you want, but SPA and BUL have a
special meaning, to know what pronunciation rules to use, and so do SWE and
ENG, since they will never show a pronunciation field. Any language other
than these will show the same word in a pronunciation field as in the
normal field if you select a pronunciation field.

It is possible to have any number of languages and fields in a file; the
options on the command line specify which of them to use.


COMMANDLINE-OPTIONS
-------------------
There are several options that can be used to modify the way GLOSOR works.
First there are the conversion options, which should be used alone, and
which do not start the actual practice program.
One is explained above:

/KONV[:fil]       Convert text file (default GLOSOR.TON) to GLOSOR.DAT
                  (database)

There is one more conversion available:

/SPAB             Export words to the modified SPAB Windows program

This will make files for a specially modified version of a Windows program
that was made for German words. Since this is not my program, and therefore
not available from me, I won't bother to explain more about this option.

Then there are all the options and toggles that modify the behaviour of the
practice program. They can be entered in any order:

/FIL:glosfile.dat Specifies which file to use, GLOSOR.DAT is the default.
                  When this switch is used the default values also change
                  for some options: /FELFÄRG will be turned on, /CAPS
                  turned off, /CP without a codepage will default to 850
                  instead of 852, and /+KANINTE will be selected. The old
                  3.2X defaults will be used if /FIL: is not specified.
/CP[:NNN]         CodePage NNN for Cyrillic is loaded and should be used.
                  If /CP is not specified only latinized Cyrillic will be
                  used. If /CP is used without a number it will default to
                  852 (or 850 if /FIL: is specified). You may specify 850
                  (jek850), 852 (jek852), 888 (KOI-8), 866
                  (CP866/Alternative Russian) or 437 (only Latin). 0 is the
                  same as 437. You will have to load the Cyrillic driver
                  before starting the program. If /FIL is not specified,
                  and /CP is, it will turn on the /CAPS option
                  automatically. Otherwise it defaults to off.
/CGA              Use CGA graphics mode (monochrome). For some Cyrillic
                  drivers.
/TILL             Always ask the question in the known (left) language.
/FRN              Always ask the question in the unknown (right) language.
/SVE              Same as /TILL, used for 3.XX compatibility. (Ask in
                  Swedish.)
/BUL              Same as /FRN, used for 3.XX compatibility. (Ask in
                  Bulgarian.)
/-LAT             Do not show the latinized translation of the Cyrillic
                  words.
/EJLAT            Same as /-LAT, used for 3.XX compatibility.
/LAT              Turn on Latin words, but this is already the default.
/BULGKEY          (Or /BK) Translate the Swedish keyboard to the standard
                  Bulgarian layout (if no Bulgarian keyboard driver is
                  present).
/FÄRG             Show different colors depending on the language of the
                  question.
/-FÄRG            Turn this off, but it already defaults to off.
/FELFÄRG          Change the color depending on whether the answer was
                  correct or not.
/-FELFÄRG         Turn this off. The default depends on the /FIL switch.
/MONO             Use only black and white, for monochrome displays.
/KANINTE          Only reuse words that were incorrectly answered on the
                  next round.
/+KANINTE         As above, but do not reuse words that were + corrected
                  either.
/-KANINTE         Always restart with all words. The default depends on the
                  /FIL switch.
/PIP              Make a different sound depending on whether an answer was
                  right or wrong.
/-PIP             Turn this off, but it is off by default.
/1:SVE            Specify what languages to use. You may specify /1: /2:
                  /3: and /4:. 1 and 2 are the known languages to the left,
                  and 3 and 4 are the languages you are learning, to the
                  right. After the : you write the 3-letter language code,
                  or UTL for pronunciation. If UTL is used it will select
                  a UTL field corresponding to the other language in the
                  pair 1/2 or 3/4, e.g. /3:BUL /4:UTL will make 4 use
                  BULUTL. You may also use /1:1 to specify the first
                  column/field in the file for language 1. If a language
                  specified on the command line isn't present in the file
                  it will show a pronunciation of the other language in the
                  pair there instead, if that is available.


This program was not primarily made to be released; I made it for my own
use. But if you like you can make suggestions. I won't say I'll do anything
you ask for, but maybe. My e-mail address is probably:
joakim THE-EMAIL-THINGY ek.to

This program, and KBJEK and JEKFONT, can most likely be downloaded from:
http://www.ek.to/files

THIS PROGRAM IS DISTRIBUTED AS FREEWARE 'AS-IS' WITHOUT ANY WARRANTY
WHATSOEVER REGARDING FUNCTIONALITY OR ANYTHING ELSE.
I DO NOT TAKE RESPONSIBILITY FOR ANYTHING THAT CAN HAPPEN WHILE USING THIS
PROGRAM, NOR CAN I GUARANTEE THAT THE CONTENTS OF THE DATA-FILES ARE
CORRECT.

Regards jek 99-09-28