Khmer OD: Help
Help
This page contains a list of topics related to typing and searching for words. If you would like to contact us, please use this contact page. If you are new to Khmer Unicode typing, see this tutorial: How to Type Khmer Unicode (PDF).
This site provides three features that can assist with searching:
- Full Text Search: search for any words found in the definition or compound words. Don't forget to use this if you cannot seem to find the word you are looking for.
- Auto Complete: as you are typing in the search box, it displays suggested words from the dictionaries.
- Spell checking based on dictionary words to suggest a correct spelling -- known as "Did you mean x?"
Typing Order
The order in which characters are typed is important for searching. Words often appear identical on screen even though they were typed differently. There should be only one correct way to encode a word; otherwise searching becomes too complex, if not impossible. Below is the rule from the PDF above:
Rule: Consonant + Coeng Consonant(s) + Consonant-shifter + Vowel + sign(s)
Here are some tips:
- Order of subscripts (Coeng Consonants) and Consonant-shifter Example:
- ម៊្ហែត (ម+៊+្ហ+ែ+ត) vs ម្ហ៊ែត(ម+្ហ+៊+ែ+ត -- Incorrect)
- Order of subscripts example (No clear rule which of the subscripts should go first):
- សំស្ក្រឹត vs. សំស្រ្កឹត (look identical but prefer: សំ+ស+្ក+្រ+ឹ+ត not សំ+ស+្រ+្ក+ឹ+ត)
- ចង្ក្រាន : ច+ង+្ក+្រ+ា+ន (Notice: ្ក is before ្រ)
- Inconsistencies: ហ្រ្វ័ង (ហ+្រ + ្វ + ... for Chuon Nath) vs ហ្វ្រ័ង (ហ+្វ+្រ+... for Headley)
Hint: use Auto Complete to help find the word, or use the "Start with" search option.
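The rule above can be sketched as a regular expression over Unicode code points. This is a rough illustration, not the site's own validator: the name `follows_typing_order` is hypothetical, and the character ranges are assumptions taken from the Unicode Khmer block. It checks only the general order. It does not decide which of two subscripts should come first, and some dictionary entries deviate from the rule, so a failed check is a hint to look closer rather than a verdict.

```python
import re

# Assumed character classes from the Unicode Khmer block (U+1780-U+17FF).
BASE = "\u1780-\u17B3"        # consonants and independent vowels
CONSONANT = "\u1780-\u17A2"   # consonants only (allowed after COENG)
COENG = "\u17D2"              # subscript marker
SHIFTER = "\u17C9\u17CA"      # muusikatoan, triisap
VOWEL = "\u17B6-\u17C5"       # dependent vowels
SIGN = "\u17C6-\u17C8\u17CB-\u17D1\u17DD"  # other signs

# Consonant + Coeng Consonant(s) + Consonant-shifter + Vowel + sign(s)
CLUSTER = f"[{BASE}](?:{COENG}[{CONSONANT}])*[{SHIFTER}]?[{VOWEL}]*[{SIGN}]*"
WORD = re.compile(f"(?:{CLUSTER})+")


def follows_typing_order(word: str) -> bool:
    """Return True if every cluster in `word` matches the order above."""
    return WORD.fullmatch(word) is not None


# SA + triisap + vowel II: shifter before vowel, as the rule asks.
print(follows_typing_order("\u179F\u17CA\u17B8"))  # True
# SA + vowel II + triisap: shifter after vowel, out of order.
print(follows_typing_order("\u179F\u17B8\u17CA"))  # False
```

Writing the strings as `\u` escapes makes the typed order visible even when two words render identically.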
Typing with subscript ដ and ត ([ជើង ត] and [ជើង ដ])
Since subscript ដ and subscript ត look the same (្ដ), typing them correctly requires extra knowledge about the spelling. Example:
- ម្តង vs ម្ដង (they look the same, but the correct typing is: ម+្+ដ+ង)
- អន្តរ (ជើង ត)
- ប្រតិបត្តិ (ជើង ត)
Please note that in the Headley dictionary, these subscripts are entered as [ជើង ត]. This is incorrect, so you need to change your query accordingly to find them. We may make corrections in the future.
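Because the two subscripts render the same, the only reliable way to tell them apart is to inspect the code points. The sketch below is illustrative only: `subscript_names` and `with_coeng_ta` are hypothetical helper names, and the second one assumes that retyping every subscript ដ as subscript ត is the query change needed for the Headley entries described above.

```python
import unicodedata

COENG = "\u17D2"  # KHMER SIGN COENG, the subscript marker
DA = "\u178A"     # KHMER LETTER DA
TA = "\u178F"     # KHMER LETTER TA


def subscript_names(word: str) -> list[str]:
    """Unicode names of every letter typed right after a COENG."""
    return [
        unicodedata.name(ch)
        for prev, ch in zip(word, word[1:])
        if prev == COENG
    ]


def with_coeng_ta(query: str) -> str:
    """Hypothetical query variant: each subscript DA retyped as subscript TA."""
    return query.replace(COENG + DA, COENG + TA)


word = "\u1798\u17D2\u178A\u1784"   # MO + COENG + DA + NGO
print(subscript_names(word))        # ['KHMER LETTER DA']
print(subscript_names(with_coeng_ta(word)))  # ['KHMER LETTER TA']
```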
Below is a snippet, translated from Khmer, of the grammar packaged with the Electronic Chuon Nath dictionary version 2 by លោក មៀច ប៉ុណ្ណ.
Use of subscript letters (ការប្រើប្រាស់ជើងអក្សរ)
Subscript “ត្ត” and subscript “ដ្ដ” have the same shape and can be confused, because past usage seems to have been somewhat mixed up, as in the words ស្ដូកស្ដឹង, ផ្តិល(ទឹក), ផ្ដាច់ផ្ដិល.
However, scholars have specified that subscript (ត្ត) is used when attached beneath the letter (ន), where it is always pronounced as subscript (ត្ត), as in ប៉ុន្តែ, កន្តែរ៉ែ, កន្តាំង, អន្តរាយ, បន្តិចបន្តួច, etc.
The exceptions are four words in which subscript (្ត) attached to (ន) takes the sound “ដ”: សន្តោស, សន្តាន, ចិន្តា, អន្តរធាន.
But when subscript (្ត) is attached beneath the consonant (ណ), it takes the sound “ដ” instead, as in សណ្តែក, សណ្ដាន់, កណ្ដៀវ, កណ្ដុរ, កណ្ដាល, អណ្តូង, អណ្តើក[2], etc.
ព្យាងតម្រួត ឬ ព្យាងរាយ
There are some discrepancies in spelling with regard to ព្យាងតម្រួត. For example:
- កម្សត់ (Chuon Nath) កំសត់ (Headley)
- កម្សាន្ត (Chuon Nath) កំសាន្ត (Headley)
See discussions here: ( ហេតុអ្វីយើងចាំបាច់សរសេរព្យាង្គតម្រួត? and លោក-លី-សុវី-វិភាកអក្សរស)
Word Segmentation
Khmer text does not use spaces to separate words. Many authors use an invisible space (the zero-width space, char \u200B) to separate phrases into words so that the search engine can search for words properly. The Chuon Nath dictionary on this site uses this technique. This also makes full text search possible, because each segmented word is indexed. It is difficult to determine what should be considered a word. Here are some examples:
- បែកការ -- compound word (two words)
- ស្រាប់តែ -- one word
- ឆបោក -- not in Chuon Nath, the root word is ឆ
- ឡោមព័ទ្ធ -- only in Headley not Chuon Nath
- សុគតិ vs សុគតិភព (one word: shown in a sub-entry when doing a Full Text Search)
- សុគតិភព -- compound word for Chuon Nath, សុគតិភព -- one word for Headley
- ភាសាក្លិង្គ (not found) use ក្លិង្គ
- គូព្រេង -- compound word for Chuon Nath, Headley is one word (គូព្រេង)
- ត្រីខ -- not in Chuon Nath, but a Full Text Search will show it in a Headley sub-entry
Compound word notes: Chuon Nath uses a separator for compound words in headwords, but Headley and English/Khmer do not.
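The segmentation technique described in this section can be sketched in a few lines. This is only an illustration under assumptions, not the site's indexing code: `segment` and `build_index` are hypothetical names, the entries are made-up placeholders, and real definitions also contain visible spaces and punctuation that this sketch ignores.

```python
ZWSP = "\u200B"  # ZERO WIDTH SPACE, the invisible word separator


def segment(text: str) -> list[str]:
    """Split text on zero-width spaces, dropping empty pieces."""
    return [piece for piece in text.split(ZWSP) if piece]


def build_index(entries: dict[str, str]) -> dict[str, set[str]]:
    """Map each segmented word to the headwords whose text contains it."""
    index: dict[str, set[str]] = {}
    for headword, text in entries.items():
        for word in segment(text):
            index.setdefault(word, set()).add(headword)
    return index


# Placeholder data: "entry-1" holds two words joined by a zero-width space.
entries = {
    "entry-1": "\u1794\u17C2\u1780" + ZWSP + "\u1780\u17B6\u179A",
    "entry-2": "\u1780\u17B6\u179A",
}
index = build_index(entries)
print(sorted(index["\u1780\u17B6\u179A"]))  # ['entry-1', 'entry-2']
```

A word typed without the separator is indexed as one unit, which is why the same text can count as a compound in one dictionary and as a single word in another.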
Commonly Mistyped Words
The following are commonly mistyped words found in our logs.
- The diacritic ័ goes after the initial letter of the syllable: e.g. ការិយាល័យ, not ការិយាលយ័
- ស៊ី is ស + ៊ + ី (not ស + ុ + ី )
- Special cases:
អ៊្ហុះ : use អ៊ + ្ហ + ុះ
អ្ហុះ ! : use អ្ហ -- more entries start with jerg HA followed by treysap (see the tutorial above)
- Independent vowels are their own characters: e.g. រឭក not រព្ញក (using ្ ញ)
- ខ្វល់់ -- notice the big bontaq (the bontaq was typed twice)
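A doubled mark like the one in the last item is hard to see but easy to detect in code. The following is a minimal sketch, assuming that any dependent vowel or sign in the range U+17B6 to U+17D3 typed twice in a row is a mistake; `doubled_marks` is a hypothetical helper, not something this site exposes.

```python
def doubled_marks(word: str) -> list[tuple[int, str]]:
    """Return (position, code point) for each Khmer mark repeated back to back."""
    hits = []
    for i in range(1, len(word)):
        ch = word[i]
        # Only dependent vowels and signs (U+17B6-U+17D3) are checked;
        # a repeated base consonant can be legitimate spelling.
        if ch == word[i - 1] and "\u17B6" <= ch <= "\u17D3":
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits


# KHA + COENG + VO + LO + bantoc + bantoc (bantoc typed twice)
print(doubled_marks("\u1781\u17D2\u179C\u179B\u17CB\u17CB"))  # [(5, 'U+17CB')]
# The same word with a single bantoc
print(doubled_marks("\u1781\u17D2\u179C\u179B\u17CB"))        # []
```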