How it works

  • IntonTrainer is a software system designed to train learners in producing a variety of intonation patterns of speech.
  • The system is based on comparing the melodic (tonal) portraits of a reference phrase and a phrase spoken by the learner and involves active learner-system interaction.
  • Since parametric representation of intonation features of the speech signal faces fundamental difficulties, the paper intends to show how these difficulties can be overcome.
  • The main algorithms that the proposed training system uses to analyze and compare intonation features are considered.
  • A set of reference sentences is given which represents the basic intonation patterns of English speech and their main varieties.
  • The system’s interface is presented and the results of the system operation are illustrated.


    Importance of intonation learning

  • A current linguistic idea is that a foreign accent is more evident and stable in intonation than in segmental sounds.
  • A foreign accent in intonation emerges mainly as a result of prosodic interference, an inevitable ‘by-product’ of bilingualism, and particularly from the influence of the prosodic patterns of the learner’s native language on those of the target language.
  • Considering the variety of functions of intonation in speech and its potential socio-cultural effects, deviations in this area can lead to serious semantic losses in communication.
  • Incorrect intonation is often the cause of the wrong impression that a non-native speaker makes.
  • This study is concerned with the progress achieved in providing additional visual feedback, as well as a quantitative assessment of accuracy, to help non-native learners eliminate intonation errors.


    Five tasks that we should solve

  • Task 1. An adequate comparison of the pattern signal and a spoken one, which is usually characterized by non-linear time deformation and whose beginning and end are not known beforehand.
    This problem is solved by applying a modified method of continuous dynamic time warping (CDTW) to the two signals.
  • Task 2. Automatic segmentation of the analyzed signal into the areas for which the notion of F0 is relevant to forming the tonal contour of the phrase (the segments of vowels and most of the sonorants).
    This problem is solved by a non-linear transfer of segment markers from the previously marked pattern phrase onto the uttered phrase.
  • Task 3. Precise calculation of F0 of the pattern speech signal and of that produced by the learner within a very wide voice range {30 – 1000 Hz}, covering both male and female voices.
    This task is solved by using known modern methods of extracting F0 from a speech signal.
  • Task 4. Automatic interpolation of F0 values on the segments for which measuring F0 is invalid, i.e. on most of the consonants.
    This task is solved by using well-known mathematical interpolation formulas for finding intermediate values.
  • Task 5. An adequate calculation of a similarity measure between the pattern signal and the uttered one, given their differences in duration and F0 voice range.
    This task is solved by representing intonation curves as unified melodic portraits (UMPs), described below. The similarity measure of two UMPs is calculated as the vector distance between the curves.
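
Tasks 1 and 2 can be illustrated with a minimal Python sketch. It uses the classic discrete dynamic time warping recurrence, not the modified continuous variant (CDTW) that the system applies, and it assumes that the start and end of both contours are already known. The function names `dtw` and `transfer_marker` and the sample contours are hypothetical.

```python
from math import inf


def dtw(reference, spoken):
    """Classic discrete DTW (an assumption, not the system's CDTW).

    Returns the accumulated cost and the alignment path as a list of
    (reference_index, spoken_index) pairs.
    """
    n, m = len(reference), len(spoken)
    # cost[i][j]: minimal accumulated cost of aligning the first i
    # reference samples with the first j spoken samples.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(reference[i - 1] - spoken[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # reference advances
                                 cost[i][j - 1],      # spoken advances
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [(cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1)]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


def transfer_marker(path, ref_index):
    """Map a segment marker from the reference onto the spoken signal."""
    for r, s in path:
        if r == ref_index:
            return s  # first spoken sample aligned with the marker
    raise ValueError("marker index is outside the reference signal")


# Hypothetical contours: the learner speaks the same shape more slowly.
pattern = [1, 2, 3, 2, 1]
learner = [1, 1, 2, 3, 3, 2, 1]
total_cost, warping_path = dtw(pattern, learner)
nucleus_start_in_learner = transfer_marker(warping_path, 2)
```

Transferring markers along the warping path is what lets a single manual segmentation of the pattern phrase serve every learner attempt.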


    UMP stylization model

  • According to the UMP model, a phrase is represented by one or more AUs (often referred to as accent groups). Each AU can be composed of one or more words and consists of a pre-nucleus (all phonemes preceding the main stressed vowel), a nucleus (the main stressed vowel) and a post-nucleus (all phonemes following the main stressed vowel).
  • The model makes it possible to represent the intonation constructions of a given language as a set of melodic patterns in the normalized {Time – Frequency} space.
  • Time normalization is performed by bringing the pre-nucleus, nucleus and post-nucleus elements of each AU to standard time lengths.
  • For fundamental frequency normalization, F0 min and F0 max are determined within the ensemble of melodic contours produced by a given speaker.
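
The two normalization steps can be sketched as follows. This is an illustration under stated assumptions, not the system's implementation: each AU element is resampled to the same number of points (an equal share of the normalized time axis), F0 is scaled linearly between F0 min and F0 max, and two portraits are compared with a root-mean-square Euclidean distance. The names `resample`, `to_ump` and `ump_distance` are hypothetical.

```python
import math


def resample(values, n):
    """Linearly resample a non-empty F0 sequence to n >= 2 points."""
    if n < 2 or not values:
        raise ValueError("need n >= 2 and at least one F0 value")
    if len(values) == 1:
        return [float(values[0])] * n
    out = []
    for k in range(n):
        pos = k * (len(values) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def to_ump(pre, nucleus, post, f0_min, f0_max, points_per_part=10):
    """Build a normalized portrait from the three elements of one AU.

    f0_min and f0_max are the speaker's limits, determined beforehand
    over the ensemble of that speaker's melodic contours.
    """
    span = f0_max - f0_min
    if span <= 0:
        raise ValueError("f0_max must exceed f0_min")
    curve = []
    for part in (pre, nucleus, post):
        # Time normalization: every element gets the same length.
        for f0 in resample(part, points_per_part):
            # Frequency normalization (linear scaling is an assumption).
            curve.append((f0 - f0_min) / span)
    return curve


def ump_distance(a, b):
    """RMS Euclidean distance between two portraits of equal length."""
    if len(a) != len(b):
        raise ValueError("portraits must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical data: a lower voice and a higher, slower one, same melody.
teacher = to_ump([100, 110], [120, 180, 150], [110, 100], 100, 200)
student = to_ump([200, 210, 220], [240, 360, 300], [220, 210, 200], 200, 400)
similarity_gap = ump_distance(teacher, student)  # close to zero here
```

Because both duration and voice range are normalized away, the remaining distance reflects only the shape of the melody.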


    Graphical representation of UMP

    The normalized space for UMP is presented as a rectangle with axes (TN, F0N).
    On the abscissa TN, the interval [0, 1/3] is the pre-nucleus,
    [1/3, 2/3] is the nucleus, and
    [2/3, 1] is the post-nucleus.

    On the ordinate F0N, the interval [0, 1/3] is the low level,
    [1/3, 2/3] is the mid level, and
    [2/3, 1] is the high level.
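
A small sketch of how a point in the normalized space could be assigned to one of the nine cells of this grid. The function name `region` is hypothetical, and placing a boundary value such as 1/3 in the upper interval is an assumption, since the intervals above share their endpoints.

```python
def region(t_n, f0_n):
    """Return (AU element, pitch level) for a point in normalized space."""
    parts = ("pre-nucleus", "nucleus", "post-nucleus")
    levels = ("low", "mid", "high")

    def third(x):
        # Index of the third of [0, 1] that x falls into (0, 1 or 2).
        if not 0.0 <= x <= 1.0:
            raise ValueError("normalized coordinates must lie in [0, 1]")
        return min(int(x * 3), 2)

    return parts[third(t_n)], levels[third(f0_n)]


# A point halfway through the AU with a pitch near the speaker's maximum.
cell = region(0.5, 0.9)
```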



    Acoustic Database

  • The developed prototype of the system is implemented in two variants for use in multimedia course books for advanced learners of English and Russian intonation.
  • This system makes it possible for students not only to listen to phrases pronounced with standard intonation but also to observe the model F0(t) and A0(t) curves on the display.
  • Students can also reproduce these phrases, compare their F0(t) and A0(t) curves with the original ones and obtain a numerical evaluation of their similarity.
  • The models are sample phrases spoken by male and female voices, taken from multimedia course books available on the Internet.
  • The database contains several samples of common types of utterances for different intonation patterns, as well as several samples of conversational speech and a piece of narrative prose.


    The basic types of English tonal patterns used

    Type of tone   | No | Pitch varieties | Types of utterances (common usage)                                        | Typical example
    ---------------+----+-----------------+---------------------------------------------------------------------------+-----------------------
    Rising         |  1 | Mid Wide        | General, Elliptical questions, Tags                                       | Is it di+fficult?
                   |  2 | Low Wide        | General questions, Tags, Non-finality                                     | Can I speak to Ma+ry?
                   |  3 | High Narrow     | Interrogative repetitions                                                 | Na+tive?
                   |  4 | Low Narrow      | Statements, Tags                                                          | Ye+sterday.
    Falling        |  5 | High Wide       | Statements, Imperatives, Special questions                                | Li+sten to me, please!
                   |  6 | Mid Wide        | Statements, Imperatives, Tags, Special questions                          | Whe+re is she?
                   |  7 | Low Narrow      | Statements, Imperatives, Tags                                             | It’s in the So+uth.
    Falling-Rising |  8 | Undivided       | Imperatives, Questions, Statements, Non-finality, Conversational formulas | They are re+ady.
                   |  9 | Divided         |                                                                           | No+t no+w.
    Rising-Falling | 10 | Undivided       | Statements, Special questions                                             | It’s wo+nderful.

    * In this table the [+] sign indicates the position of the nuclear vowel of the phrase.


    Block diagram of the training system



    Illustration of speech signals marking:

    the phrase “It’s Sa+turday” pronounced by the teacher (above) and by a learner (below)

    PreNucleus - NUCLEUS - PostNucleus



    Illustration of F0 trajectory processing

    for the teacher’s phrase “It’s Sa+turday”:
    original (light curve line) and interpolated, smoothed and normalized (dark curve line)

    PreNucleus - NUCLEUS - PostNucleus
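
The interpolation step of this processing can be sketched as follows, assuming linear interpolation (only one of the well-known formulas that could be used) and an F0 track in which frames without a valid F0 are marked as `None`. The function name `interpolate_f0` and the sample track are hypothetical; smoothing and normalization are not shown.

```python
def interpolate_f0(track):
    """Fill frames without a valid F0 (None) from their voiced neighbours.

    Gaps between voiced frames are filled linearly; leading and trailing
    gaps hold the nearest voiced value.
    """
    voiced = [i for i, value in enumerate(track) if value is not None]
    if not voiced:
        raise ValueError("the track contains no voiced frames")
    out = list(track)
    first, last = voiced[0], voiced[-1]
    for i in range(first):
        out[i] = track[first]
    for i in range(last + 1, len(track)):
        out[i] = track[last]
    for left, right in zip(voiced, voiced[1:]):
        for i in range(left + 1, right):
            # Weighted average of the two surrounding voiced frames.
            out[i] = (track[left] * (right - i)
                      + track[right] * (i - left)) / (right - left)
    return out


# Hypothetical track in Hz: a consonant interrupts the voiced stretch.
raw = [None, 100.0, None, None, 130.0, None]
filled = interpolate_f0(raw)
```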



    Illustration of F0 curve comparison (correct pronunciation)

    between the teacher’s (light curve line) and a student’s (dark curve line) phrase “It’s Saturday” in real time space (above) and in UMP-space (below)



    Illustration of F0 curve comparison (incorrect pronunciation)

    between the teacher’s (light curve line) and a student’s (dark curve line) phrase “It’s Saturday” in real time space (above) and in UMP-space (below)



    User Interface - Starting page



    Selection box - “TONE TYPES”



    “TONE TYPES” – FallRise

    (example)



    Selection box - “TONES COMPARISON”



    “TONES COMPARISON” – Rise & Fall

    (example)



    Selection box – “TONE USAGE”



    “TONE USAGE” – Imperative

    (example)



    Selection box – “DIALOGUES”



    “DIALOGUES” - Fragment

    (example)



    Selection box – “LITERATURE”



    “LITERATURE” – Fragment

    (example)



    CONCLUSIONS

  • The IntonTrainer system is implemented in C++ using the Qt framework.
  • It can be compiled for Windows (versions XP through 10) and for Linux (Debian).
  • At present, experiments are being conducted in which students use the IntonTrainer system to learn the intonation of Russian and English.
  • Preliminary results indicate that its use is highly effective.
