G L O S O R

                           -= DICTIONARY-PRACTICE =-
                        (C) Copyright 1996-99 Joakim Ek
                            Version 4.02 1999-09-27

This is an enhancement of the previous version, 3.2x, that now also supports
non-cyrillic languages, such as Spanish. I have rewritten a lot of the
internal file-handling, but I've tried to keep it fully backward-compatible
with old DAT-files and switches, so it should be possible to replace the old
GLOSOR.EXE with this one. The conversion functions have not been updated to
handle other languages; that will come in a later version...

The program shows a word or a phrase in one language and one answers in the
other. The language in which it asks, and which word, is selected randomly
from all the words in the current DAT-file. This behaviour can be changed
with switches though, see below.

There is a prompt saying which way the translation is supposed to go, like
"BUL->ENG", which means the displayed word is BUL (Bulgarian) and the answer
is wanted in ENG (English). It also keeps each language on its own side, so
the input field will alternate between the left side (usually the known
language) and the right side (usually the language you are learning).

Your answer must exactly match the answer in the database, either just the
normal+highlighted part, or the complete word(s) including the dim part. It
will be considered wrong if it is misspelled, if words are swapped, or if only
a part of the dimmed text is typed in. It is not case-sensitive, but when
using cyrillic it supports only upper-case letters. The answer can be given in
any of the available to-languages, or as the pronunciation or latinised
version of the cyrillic, whatever is available.

It now supports both my cyrillic codepages jek850 and jek852, but also KOI-8,
called /CP:888, and CP866/Alternative Russian, called /CP:866. Each of these
codepages can be used both in the source-files and on the screen, and it
converts appropriately. Note though that it currently supports only uppercase
letters for cyrillic. If no cyrillic is available on the screen it converts
to latin letters instead. (My drivers KBJEK and JEKFONT can be used to get
cyrillic support in DOS, and they also have a codepage/setting that works
with the /CAPS-function)

The latinization of words is phonetic and follows Swedish pronunciation,
which differs slightly from the transliterations normally used, especially
because the letters Å and Ö, which are not present in the English alphabet,
are used. The following table shows each cyrillic letter and the
corresponding latin letter. The key used to type it when using jek850 is
also displayed:

CYR 850:a b c  d e f g h  i j k l m n o p q   r s t u  v w  x  y  z å ä  ö
CYR 852:à á æ  ä å ô ã õ  è é ê ë ì í ó ï ù   ð ñ ò þ  â ø  ö  ÷  ç î ÿ  ú ü
SOUND:  A B ZH D E F G CH I J K L M N O P SHT R S T JO V SH TS TJ Z Å JA Ö
KEY:    A B C  D E F G H  I J K L M N O P Q   R S T U  V W  X  Y  Z Å Ä  Ö
ENGLISH KEY:           (A-Z SAME AS SWEDISH)                        [ ;  ' ]

On the line between the questions there is a ( ) with numbers inside. The
first (or only) number is the number of remaining words in the database. The
second number, if present, is the number of words that have not been
answered correctly. On the right side of the line there are numbers showing
the number of questions asked, the number of right answers and the right
answers in %. These numbers are not zeroed even when the database restarts
with all words. To begin at 0 again you should exit and restart the program.

When using old DAT-files, made for GLOSOR 3.xx (they do NOT begin with ***),
it will default to Bulgarian spelling+pronunciation to the left, and
Swedish and English to the right. With new DAT-files there can be different
languages in a file, and one can select which to use where with switches. If
no switches are used the first two languages in the file will be to the left
and the next two to the right. If the file does not have 4 languages it will
use some other rules to select them; just try it, or specify with switches.

When a question is answered correctly a * will mark that question. You may
also set an option that turns correctly answered questions green and wrong
answers red. If you specify a DAT-file with the /FIL: switch it will default
to having the /FELFÄRG option turned on; this can be forced off by using
/-FELFÄRG. If you think an answer is right, even though the computer thinks
you are wrong (like omitting one of two options), you may override it by
beginning the next answer with + or , The previous question
will be marked with + (instead of *) and, if enabled, the color made green.
The score will also be increased by 1.

Normally the vowel in the stressed syllable will be marked with highlighted
text, but this depends on what is put in the DAT-file and may vary. There are
also often some dim letters/words. These are used either to display
information like SING for singular, V for verb and others, or to make some
words optional in the answer if there are several words corresponding to one
word in the other language. The way the dim words are used might differ
between DAT-files, but that should be clear after using it a while. That one
of two words in an answer is dim does not necessarily mean that it's a less
used version of the word, but usually the most common is the one in normal
white. Usually words with the same meaning are separated with , and words of
different meaning are separated with ;. Usually XXX is used as a placeholder
where any word can be written. At the end of a phrase ... is often used
instead.

When all words in the database are done, it will restart. Depending on the
option it will ask all words again, or just the ones you answered wrong. You
may also specify whether the ones corrected using + should be asked again.
The score-counter is not reset. You will have to exit the program if you want
to begin at 0 again. If all words are answered correctly it will start over
with all words again. On the line between the questions "OMSTART" indicates
a restart, and "OMSTART, ALLA ORD" a restart with all words.
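
The restart rule can be sketched in a few lines of Python. This is only my
reading of the description here and of the /KANINTE, /+KANINTE and /-KANINTE
switches listed under COMMANDLINE-OPTIONS, not the actual code in GLOSOR. The
function name next_round and the status labels "right", "plus" and "wrong" are
made up for the illustration.

```python
def next_round(words, status, mode):
    """Pick the words for the next round (illustrative sketch only).

    status maps each word to "right", "plus" (corrected with +) or "wrong".
    mode is "KANINTE", "+KANINTE" or "-KANINTE", named after the switches.
    """
    if mode == "-KANINTE":
        return list(words)              # always restart with all words
    # Assumption: plain /KANINTE asks the + corrected words again,
    # while /+KANINTE leaves them out.
    retry = {"wrong"} if mode == "+KANINTE" else {"wrong", "plus"}
    again = [w for w in words if status.get(w) in retry]
    return again or list(words)         # nothing left to retry: all words


words = ["VODA", "NOSHT", "DEN"]
status = {"VODA": "right", "NOSHT": "plus", "DEN": "wrong"}
print(next_round(words, status, "KANINTE"))   # ['NOSHT', 'DEN']
print(next_round(words, status, "+KANINTE"))  # ['DEN']
```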

To exit the program answer with just Q or 0, or press Ctrl+Q.

It uses DOS to input keystrokes, because no internal function in QB allowed
ascii 224 to be entered (used for cyrillic A in my jek850). This means that
some keys don't work normally. F-keys, for example, send strange letters;
you'll just have to avoid hitting them! It also made me add the option to hit
the ESC-key to close the DOS-key-read, to be able to call up RELEX (an
excellent TSR dictionary, not by me, not for free) by hitting ALT+1. You
cannot type any letters before activating the normal key-routine again by
hitting ESC once more. A little tune indicates whether ESC switched it on or
off. While in off-mode you can still move the cursor, though. Use the arrows
to move one step, PgUp/PgDown to move one question up/down, or Home/End to
move to the first column of Swedish/English or Bulgarian. This takes advantage
of the fact that RELEX can search for the word under the cursor when hitting
ALT+1.

If you do not have a Bulgarian keyboard-driver, it's possible to use a switch
to reconfigure the keyboard into a Bulgarian one. It is made to work on a
Swedish keyboard, so some letters won't work otherwise.
If you type Spanish or another language with accented letters you may enter
them using the Ctrl-key together with a letter. The special letters
currently supported are ñ á é í ó and ú. It does not distinguish between
accented and normal letters while checking the spelling though, so "Año"
could be answered with "Ano" and it would be considered right by the program.
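
For the curious, a comparison that ignores both case and accents could look
roughly like the Python sketch below. GLOSOR itself is a DOS program and does
this its own way; the function names fold and same_spelling are made up here,
and the sketch works on Unicode text rather than on the DOS codepages.

```python
import unicodedata


def fold(text):
    """Uppercase the text and strip accents, so that N-tilde compares as N."""
    decomposed = unicodedata.normalize("NFD", text.upper())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_spelling(answer, expected):
    """True if the two words are equal when case and accents are ignored."""
    return fold(answer) == fold(expected)


print(same_spelling("Ano", "A\u00f1o"))   # True, the accent is ignored
print(same_spelling("Anno", "A\u00f1o"))  # False, the spelling differs
```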

To get a short help over the switches type GLOSOR /?
This help is displayed in Swedish.

You may make your own database of words. The program uses a special format
internally, but it includes a conversion that converts text-files of some
formats to its own format. GLOSOR /KONV?  will display some help about
this, in Swedish. Here is a description in English.
Note though that this conversion has not been updated to use other languages;
it still uses the same routines as in 3.2x, which was just for Bulgarian. To
use your own words in GLOSOR 4.XX you should edit the DAT-files as explained
in the section CUSTOM DAT-FILES IN GLOSOR 4.XX below.

If no ##-line (described later) is present the conversion defaults to a file
with the words in the order Bulgarian, English, Swedish. The Bulgarian should
be in the jek852 codepage (cyrillic on lowercase latin, as described in the
pronunciation-table above, the KEY-line) and the fields should be separated
with a comma (,). Each word/question is on one line, that is, CR/LF separates
each word/question. Several words with the same meaning should be separated
with . (which is then converted to , that is used in the program). The words
can be entered in 'TON'-format, that is [] around characters that should be
highlighted (stressed syllables) and {} around letters/words that should be
dimmed. There are some special things to keep in mind when using [] and {}:
If using a [] inside a {} you should type it like: NORMAL{,D[I]}{MMED}
That is, a closing } should always come directly after a closing ], then you
can put another opening { if there is more dim to come. If this is not done
the last half of the word will be displayed in normal, not dimmed. Also note
that when dimming a word you should also dim the appropriate spaces to make it
possible to answer without the dim. Like: ONE{ TWO} THREE  This will make one
space optional, and "ONE THREE" will be ok. If you type ONE {TWO} THREE  you
will have to answer "ONE TWO THREE" or "ONE  THREE", with two spaces, to get
it right. (This is currently not done correctly in the 1600-word data-file).
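
To see why the placement of the space matters, here is a small Python sketch
of how the two accepted answers can be derived from a TON-formatted word. It
is an illustration under my own assumptions, not the routine used by GLOSOR:
the names accepted_answers and is_correct are invented, and the , and ;
alternatives are not handled.

```python
from typing import Tuple


def accepted_answers(ton: str) -> Tuple[str, str]:
    """Return (answer without the dim parts, complete answer)."""
    short, full = [], []
    dim = False
    for ch in ton:
        if ch == "{":
            dim = True            # start of a dimmed part
        elif ch == "}":
            dim = False           # end of a dimmed part
        elif ch in "[]":
            continue              # highlight marks are not part of the text
        else:
            full.append(ch)
            if not dim:
                short.append(ch)
    return "".join(short), "".join(full)


def is_correct(answer: str, ton: str) -> bool:
    """Accept either form, ignoring case as the program does."""
    short, full = accepted_answers(ton)
    return answer.upper() in (short.upper(), full.upper())


print(accepted_answers("ONE{ TWO} THREE"))   # ('ONE THREE', 'ONE TWO THREE')
print(accepted_answers("ONE {TWO} THREE"))   # ('ONE  THREE', 'ONE TWO THREE')
```

The second call shows the two spaces that are left behind when the space is
not dimmed together with the word.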

Example of a source-file:
v[å]da,WATER,VATTEN
nåq {F},NIGHT
Convert it using GLOSOR /KONV
(If just this is specified, without a filename, it will look for GLOSOR.TON
in the format described above.)
You may specify a filename like GLOSOR /KONV:MINFIL.EXT  and it will convert
this file to GLOSOR.DAT (the internally used file). Note that it will empty
the remaining records of a previous GLOSOR.DAT but not truncate the file.
Therefore it is normally best to delete GLOSOR.DAT before making a conversion.

If you start the first line in the source-file with ## this indicates that a
parameter-line is present. This line tells the conversion how to handle the
file. Any line in the file that begins with # will be ignored (except, of
course, the first one). This can only be done if a ## line is present first.

If a line in either file-format does not contain a Bulgarian (spelled) word,
or contains neither a Swedish nor an English word, that line will be skipped.

Description of the ## parameter line:

##852TON.;,BULENGSVE
^ ^  ^  ^  ^
| |  |  |  |
| |  |  |  2, 3 or 4 fields in the order they appear in the source-file.
| |  |  |  BUL=Bulgarian (spelling), ENG=English, SVE=Swedish, UTL=Bulgarian
| |  |  |  pronunciation. (If UTL is not present, pronunciation is calculated
| |  |  |  using grammar rules that normally give the right result)
| |  |  Separators. First the one used between words of the same meaning (,
| |  |  in the program), then the one used between words of different meaning
| |  |  (;) and third the one used between the language-fields.
| |  |  If # is entered as the language-field-separator the file is considered
| |  |  to be fixed-spaced instead of character-delimited. See later.
| |  |  (If you don't use word-separators enter the defaults , and ; )
| |  Formatting-type. 'TON' is [] around highlighted letters, 'U22' is ASCII 22
| |  around highlighted. Both use {} around dim letters/words. 'TXT' means no
| |  formatting (highlight or dim) is present in the source.
| CodePage (EK.CPI) Only '850' or '852' is supported. (jek850 and jek852).
| This is for the BUL and UTL fields. ENG and SVE should be uppercase-ascii.
## Indicates that this is a parameter-line.

The above is for character-delimited files. For fixed-space files enter # as
the field-separator character, and put the language-fields with # in front at
the start position of each field, like this:
##852TON,;#BUL          #ENG                       #SVE
voda                    WATER                      VATTEN
The first field, #BUL in the example, begins at position 1 but should be
entered directly after the options on the parameter-line as shown here.
You must have a Bulgarian (BUL) field, even if you do not intend to use
cyrillic. You might just enter anything there, like a . , and then enter a
pronunciation in the UTL field if using only latinized text when practising.
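
As an illustration of the ## parameter-line, the Python sketch below reads
both the character-delimited and the fixed-space variant and splits one
source line into its fields. It is written from the description above, not
taken from GLOSOR. The names ParamLine, parse_param_line and split_record are
made up, and it assumes that a fixed-space field starts in the column where
its # stands on the parameter-line (the first field in column 1).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ParamLine:
    codepage: str      # '850' or '852'
    markup: str        # 'TON', 'U22' or 'TXT'
    same_sep: str      # between words of the same meaning
    diff_sep: str      # between words of different meaning
    field_sep: str     # between language-fields, '#' means fixed-space
    fields: List[Tuple[str, Optional[int]]]  # (code, 0-based start or None)


def parse_param_line(line: str) -> ParamLine:
    line = line.rstrip("\r\n")
    if not line.startswith("##") or len(line) < 14:
        raise ValueError("not a ## parameter line")
    same_sep, diff_sep, field_sep = line[8], line[9], line[10]
    fields: List[Tuple[str, Optional[int]]] = []
    if field_sep == "#":
        pos = 10                       # the '#' in front of the first field
        while pos != -1:
            start = 0 if pos == 10 else pos
            fields.append((line[pos + 1:pos + 4], start))
            pos = line.find("#", pos + 1)
    else:
        names = line[11:].strip()      # e.g. BULENGSVE
        fields = [(names[i:i + 3], None) for i in range(0, len(names), 3)]
    return ParamLine(line[2:5], line[5:8], same_sep, diff_sep, field_sep,
                     fields)


def split_record(params: ParamLine, record: str) -> Dict[str, str]:
    record = record.rstrip("\r\n")
    if params.field_sep == "#":
        starts = [start for _, start in params.fields] + [len(record)]
        values = [record[a:b].strip() for a, b in zip(starts, starts[1:])]
    else:
        values = [v.strip() for v in record.split(params.field_sep)]
    return {name: value for (name, _), value in zip(params.fields, values)}


p = parse_param_line("##852TON.;,BULENGSVE")
print(split_record(p, "voda,WATER,VATTEN"))
# {'BUL': 'voda', 'ENG': 'WATER', 'SVE': 'VATTEN'}
```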

If you have my xxx.G files and the G.BAT you can see some examples of
conversion. (These files are either supplied with the program or in a special
archive, probably called G.ZIP, from the same internet-site)

CUSTOM DAT-FILES IN GLOSOR 4.XX
-------------------------------
The conversion is currently not updated for other languages; instead one
should edit the DAT-files directly. A new feature is that version 4.XX may
have the DAT-file specified with the /FIL: switch, so one does not have to
make a new conversion each time one wants to use a different set of words.
All lines, including the first, of the DAT-file should be of a fixed length,
ended with CR-LF (a new-line). The first line specifies the sizes and
languages of the fields. It has the following format:
***050 ENG437 SPA437 SWE437 BUL888 ******
It always starts with *** to show that it is a version 4 file. Next follows
the field-width as a three-digit number; all fields (columns) should be of the
same width. Then there is a space (or any single character) followed by a
description of each field. The description is a three-letter code telling
what language to use, and then a 3-digit number specifying the codepage. After
that there is a space (or other single character) before the next
column-specifier. When all columns have been specified there should be ******
to show that there are no more fields in the file. The rest of the line should
be filled with spaces until the CR-LF. In this case there are four fields,
with 50 characters in each, so the total length of each line should always be
200 characters, plus the 2 characters for the CR-LF newline.
For latin languages you use 437 as the codepage, for cyrillic languages you
use 850 or 852 (my special codepages), 888 (KOI-8) or 866 (CP866/Alternative)
to tell what codepage is used in the file. Note that only uppercase-letters 
are supported in cyrillic. If you specify 000 or UTL as a codepage that will
mean it is a pronunciation-field of the language. Normally there is always
a normal version of the same language in the file if a pronunciation is
specified, e.g. "***050 BUL866 BULUTL ENG437 ******". If there is a word
present in the UTL-field it will be used, otherwise the pronunciation will
be calculated using rules. The pronunciation is displayed in Swedish when
it's calculated. The names of the languages can be whatever one wants, but
SPA and BUL have a special meaning, to know what rules for pronunciation
to use, and so do SWE and ENG, since they will never show a
pronunciation-field. Any language other than these will show the same word in
a pronunciation-field as in the normal field if one selects to use a
pronunciation-field.

It is possible to have any number of languages and fields in a file; the
options on the commandline specify which of them to use.
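
If you want to check a hand-edited DAT-file, the first line can be read with
a few lines of Python. This is a sketch based only on the format described in
this section, not the code GLOSOR uses. The function names parse_dat_header
and split_dat_record are invented, and the example record is just made-up
sample data.

```python
from typing import List, Tuple


def parse_dat_header(header: str) -> Tuple[int, List[Tuple[str, str]]]:
    """Return (field width, [(language, codepage), ...]) from the first line."""
    if not header.startswith("***"):
        raise ValueError("not a version 4 DAT-file")
    width = int(header[3:6])
    fields = []
    pos = 7                            # skip the single separator character
    while header[pos:pos + 6] != "******":
        spec = header[pos:pos + 6]
        if len(spec) < 6:
            raise ValueError("missing ****** terminator")
        fields.append((spec[:3], spec[3:]))
        pos += 7                       # 6 characters plus one separator
    return width, fields


def split_dat_record(line: str, width: int, count: int) -> List[str]:
    """Cut one fixed-length line into its fields, dropping the padding."""
    return [line[i * width:(i + 1) * width].rstrip() for i in range(count)]


width, fields = parse_dat_header("***050 ENG437 SPA437 SWE437 BUL888 ******")
print(width, fields)
print("line length without CR-LF:", width * len(fields))   # 200

record = "".join(w.ljust(width) for w in ["WATER", "AGUA", "VATTEN", "VODA"])
print(split_dat_record(record, width, len(fields)))
```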

COMMANDLINE-OPTIONS
-------------------
There are several options that can be used to modify the way GLOSOR works.
First there are the conversion-options, which should be used alone and do
not start the actual practice-program. This one is explained above:
/KONV[:fil] Convert text-file (Default GLOSOR.TON) to GLOSOR.DAT (Database)
There is one more conversion available:
/SPAB       Export words to modified SPAB Windows program
This will make files for a specially modified version of a Windows program
that was made for German words. Since this is not my program, and therefore
not available from me, I won't bother to explain more about this option.

Then there are all the options and toggles that modify the behaviour of the
practice-program. They can be entered in any order:
/FIL:glosfile.dat Specifies which file to use, GLOSOR.DAT is the default.
          Also when this switch is used the default values will change
          for some options. /FELFÄRG will be turned on, /CAPS turned off,
          /CP without a codepage will default to 850 instead of 852 and
          /+KANINTE will be selected. The old 3.2X-defaults will be used
          if /FIL: is not specified.
/CP[:NNN] CodePage NNN for cyrillic is loaded and should be used
          If /CP is not specified only latinized cyrillic will be used,
          If /CP is used without number it will default to 852 or 850. You
          may specify 850 (jek850), 852 (jek852), 888 (KOI-8), 866 (CP866/
          Alternative russian) or 437 (only latin). 0 is the same as 437.
          You will have to load the cyrillic driver before starting the
          program. If /FIL is not specified, and /CP is, it will turn on
          the /CAPS-option automatically. Otherwise it defaults to off.
/CGA      Use CGA graphic-mode (Monochrome). For some cyrillic-drivers.
/TILL     Always ask questions in the known (Left) language.
/FRÅN     Always ask the question in the unknown (Right) language.
/SVE      Same as /TILL, used for 3.XX compatibility. (ask in Swedish)
/BUL      Same as /FRÅN, used for 3.XX compatibility. (ask in Bulgarian)
/-LAT     Do not show latinized translation of the cyrillic words.
/EJLAT    Same as /-LAT used for 3.XX compatibility.
/LAT      Turn on latin words, but this is already the default.
/BULGKEY  (Or /BK) Translate Swedish keyboard to standard Bulgarian layout
          (if no Bulgarian keyboard-driver is present.)
/FÄRG     Show different colors depending on the question's language
/-FÄRG    Turn this off, but it already defaults to off.
/FELFÄRG  Change the color depending on if the answer was correct or not
/-FELFÄRG Turn this off. The default depends on the /FIL-switch.
/MONO     Only uses black and white colors, for monochrome displays.
/KANINTE  Only reuse words that were incorrectly answered on the next round
/+KANINTE As above but do not reuse words that are + corrected either
/-KANINTE Always restart with all words. The default depends on /FIL-switch.
/PIP      Make a different sound indicating if an answer was right or wrong.
/-PIP     Turn this off, but it's off as default.
/1:SVE    Specify what languages to use. You may specify /1: /2: /3: and /4:
          1 and 2 are the known languages to the left, and 3 and 4 are the
          languages you are learning, to the right. After the : you write
          the 3-letter language-code, or UTL for pronunciation. If UTL is
          used it will select a UTL-field corresponding to the other language
          in the pair 1/2 or 3/4, e.g. /3:BUL /4:UTL will make 4 use BULUTL.
          You may also use /1:1 to specify the first column/field in the file
          for language 1. If a language specified on the commandline isn't
          present in the file it will show a pronunciation of the other
          language in the pair there instead, if that is available.

This program was not primarily made to be released; I made it for my own use.
But if you like you can make suggestions. I won't say I'll do anything you
ask for, but maybe.
My e-mail address is probably:  joakim THE-EMAIL-THINGY ek.to  
This program, and KBJEK and JEKFONT, can most likely be downloaded from:
http://www.ek.to/files

THIS PROGRAM IS DISTRIBUTED AS FREEWARE 'AS-IS' WITHOUT ANY WARRANTY 
WHATSOEVER REGARDING FUNCTIONALITY OR ANYTHING ELSE. I DO NOT TAKE 
RESPONSIBILITY FOR ANYTHING THAT CAN HAPPEN WHILE USING THIS PROGRAM,
NOR CAN I GUARANTEE THAT THE CONTENTS OF THE DATA-FILES ARE CORRECT.

Regards jek
99-09-28