Title | : | Modified Grapheme Encoding and Phonemic Rule to Improve PNNR-Based Indonesian G2P |
Author | : | SUYANTO (1); Prof. Dra. Sri Hartati, M.Sc., Ph.D. (2); Prof. Drs. Agus Harjoko, M.Sc., Ph.D. (3) |
Date | : | 2016 |
Keyword | : | Modified grapheme encoding; phonemic rule; Indonesian grapheme-to-phoneme conversion; pseudo nearest neighbour rule |
Abstract | : | Grapheme-to-phoneme conversion (G2P) is very important in both speech recognition and speech synthesis. The existing Indonesian G2P based on the pseudo nearest neighbour rule (PNNR) has two drawbacks: the grapheme encoding does not accommodate all Indonesian phonemic rules, and the PNNR must select the best phoneme from all possible conversions even though these candidates could first be filtered by phonemic rules. In this paper, a modified partial orthogonal binary grapheme encoding and a phonemic-based rule are proposed to improve the performance of PNNR-based Indonesian G2P. Evaluation using 5-fold cross-validation, with each fold using 40K words to develop the model and 10K words for evaluation, shows that the two proposed concepts together reduce the phoneme error rate (PER) by a relative 13.07%. A more detailed analysis shows that most errors come from the grapheme ⟨e⟩, which can be converted into either /E/ or /ə/ depending on context. Four prefixes, 'ber', 'me', 'per', and 'ter', produce many ambiguous conversions with base words, and further errors come from similar compound words that pronounce the grapheme ⟨e⟩ differently. A stemming procedure can be applied to reduce those errors. |
Group of Knowledge | : | Computer Science |
Original Language | : | English |
Level | : | International |
Status | : | Published |
No | Title |
---|---|
1 | Paper_58-Modified_Grapheme_Encoding_and_Phonemic_Rule.pdf (Document Type: [PAK] Full Document) |