System.Speak correctly speaking characters?

Ryan_Hartz · January 24, 2023, 6:05pm

Is there a way for System.Speak to verbalize certain characters, such as “-” for minus, “x” for times, and “/” for divided by? I have text that a student would like to be spoken aloud, and when testing the speak function, it seems to disregard these characters (or says “ex” for x instead of times). Some of the text includes math formulas, which I’d like to be read appropriately.

Also, I’m not crazy about the robotic voice. Any solutions for a more human voice? I considered recording my own voice reading the 300+ articles, but that would be quite time-consuming, and I haven’t begun to think about how to implement this without including 300+ audio recordings in the app. Hosting these somewhere and calling them up via URL or something? Ideas?

Christian_Schmitz · January 24, 2023, 6:30pm

macOS or Windows?

See NSSpeechSynthesizerMBS class in MBS Xojo Plugins.
There you can set options like the number mode:

e.g.

dim s as new NSSpeechSynthesizerMBS
dim e as NSErrorMBS

call s.setObjectForProperty(s.NSSpeechModeLiteral, s.NSSpeechNumberModeProperty, e)
msgBox s.objectForProperty(s.NSSpeechNumberModeProperty, e)

Ryan_Hartz · January 24, 2023, 6:33pm

It’ll be for both Mac and Windows. I was checking out your NSSpeechSynthesizerMBS example and thought this might be viable, but that’s for Mac only and not Windows, right?

Christian_Schmitz · January 24, 2023, 7:01pm

We also have WinSpeechMBS class for Windows.

But I think you may also just try some inline commands.

Arnaud_N · January 25, 2023, 6:01am

Another solution would be to replace the characters:

Var s as string="3x4=12" 'From the user

System.Speak s.ReplaceAll("x","times")

As for any solution, you’ll have to know whether “x” is for “times” or a letter for an equation (if there can be).
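
One way to approach the ambiguity above is to treat “x” as “times” only when it sits between two digits. The following is a minimal sketch, not a complete solution: it assumes Xojo's built-in RegEx class, the sample string is made up for illustration, and it will still misread an expression such as “2x3” where x is meant as a variable.

```xojo
Var s As String = "3x4 + x = 12x5" ' hypothetical input from the user

Var re As New RegEx
' A digit, optional spaces, "x", optional spaces, then (lookahead) another digit.
' The lookahead leaves the second digit unconsumed so chains like 2x3x4 also match.
re.SearchPattern = "(\d)\s*x\s*(?=\d)"
' Keep the first digit and pad the word with spaces so it is spoken as its own word
re.ReplacementPattern = "\1 times "
re.Options.ReplaceAllMatches = True

' The lone "x" in "+ x =" has no digit on either side, so it is left alone
System.Speak(re.Replace(s))
```

Padding the replacement with spaces also avoids fusing the word with the digits on either side of it.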

Greg_O · January 25, 2023, 11:34am

You’d need to put spaces around the replacements, otherwise you’ll get:

3times4equals12

And it may not handle that in all cases.

Ryan_Hartz · January 25, 2023, 1:42pm

This is helpful guys. Thanks for the tip. This works well when there is one math operation, but how do you capture all of the troubling ones? i.e. minus (-), times (x), and divided by (/)

Such as:
4 x 5 / 2

Ryan_Hartz · January 25, 2023, 1:46pm

Ah. Found you can do multiple ReplaceAll in the same line

System.Speak(s.Text.ReplaceAll(" x ","times").ReplaceAll(" / ", "divided by").ReplaceAll(" - ", "minus"))

Something a tad odd I just discovered. When using the above, if there is a multi-digit number, it speaks each digit individually:

50 - 10 spoken as “five zero minus one zero”

When using the standard System.Speak(s), 50 - 10 speaks as “fifty ten”

Is there a way to correct the ReplaceAll method to speak multiple digits correctly?
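
One thing worth checking first: the ReplaceAll calls above replace the operator together with its surrounding spaces, so “50 - 10” is handed to the synthesizer as “50minus10”. It is only an assumption that this fused token is what triggers the digit-by-digit reading, but keeping the spaces in the replacement text is a cheap experiment. A minimal sketch, using a plain string in place of the control's Text property:

```xojo
Var s As String = "50 - 10" ' stands in for the text being read

' Same chain as above, but each replacement keeps a space on both sides
Var spoken As String = s.ReplaceAll(" x ", " times ") _
  .ReplaceAll(" / ", " divided by ") _
  .ReplaceAll(" - ", " minus ")

System.Speak(spoken) ' the synthesizer now receives "50 minus 10"
```

If the numbers are still read digit by digit after this change, converting them to words by hand is the fallback.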

Greg_O · January 26, 2023, 2:01am

You’ll need to convert the numbers to English. I seem to remember doing this in school somewhere along the line. You’ll need a method that can convert 1-19 and 20, 30, 40, 50, 60, 70, 80 and 90. You’ll also need Hundred, Thousand, million, billion, etc.

Then you need to break down numbers into their component parts so that for 134 you get 100 30 and 4 to give you One Hundred Thirty Four.

You’ll need to figure out where the special areas are, things like 90,000 being Ninety Thousand whereas 9,000 is Nine Thousand instead of Ninety Hundred.

Ryan_Hartz · January 26, 2023, 2:59am

That is interesting Greg. Thanks for the tip. I’ll see if I can come up with something to speak the numbers correctly

Arnaud_N · January 26, 2023, 9:36am

It looks like the speech process has several modes. Even if you don’t have the MBS plugins, take a look there: Monkeybread Xojo plugin - NSSpeechSynthesizerMBS class
And look at the NSSpeechCharacterModeProperty property (for example), which states:

When the character-processing mode is NSSpeechModeNormal, input characters are spoken as you would expect to hear them. When the mode is NSSpeechModeLiteral, each character is spoken literally, so that the word “cat” is spoken “C–A–T”.

Also, please re-read Christian’s reply (1st answer in this thread) where his answer looks to suit your needs.

David_Cox · January 26, 2023, 9:55am

Have a look at forum.xojo.com/25029-function-to-convert-currency-to-string/0

Here is a Method (two actually) I created to solve the Number to words conundrum. It is overkill since it also handles currency types, but use it if it’s helpful:

Protected Function getNumberToWordsWAD(myNumber As String) As String
  'an alternative: forum.xojo.com/25029-function-to-convert-currency-to-string/0
  Var tempString, tempString2, tempNumber, ReturnResults, temptrillions, tempbillions, tempmillions, tempthousands, temphundreds, currencyMark, currencyCents As String
  Var tempInt, tempInt2, decimalLocation As Integer
  Var tempDouble As Double
  Var oneDollar, oneCent As Boolean = False
  
  If myNumber.trim = "" Then
    Return ""
  End If
  
  tempDouble = Double.FromString(CommonStrings.getRemoveWAD(myNumber, "All", "Non Numbers"))
  
  'trillion, billion, million, thousand, hundred point 
  '-$###,###,###,###,##0.############
  
  tempNumber = tempDouble.ToString("000000000000000.#########")
  
  decimalLocation = tempNumber.IndexOf(0, ".")
  If decimalLocation < 0 Then decimalLocation = tempNumber.Length + 1
  tempString = tempNumber.Left(decimalLocation) 'chop off the decimal
  'tempString = String.Left("000000000000000" + tempString, 15) 'pad out the number
  
  'Work out the Trillions
  tempString2 = tempString.Middle(0, 3)
  If tempString2 = "000" Then
    temptrillions = ""
  Else
    temptrillions = CommonStrings.getNumberToWordWAD(tempString2.Val)
  End If
  'Work out the Billions
  tempString2 = tempString.Middle(3, 3)
  If tempString2 = "000" Then
    tempbillions = ""
  Else
    tempbillions = CommonStrings.getNumberToWordWAD(tempString2.Val)
  End If
  'Work out the Millions
  tempString2 = tempString.Middle(6, 3)
  If tempString2 = "000" Then
    tempmillions = ""
  Else
    tempmillions = CommonStrings.getNumberToWordWAD(tempString2.Val)
  End If
  'Work out the Thousands
  tempString2 = tempString.Middle(9, 3)
  If tempString2 = "000" Then
    tempthousands = ""
  Else
    tempthousands = CommonStrings.getNumberToWordWAD(tempString2.Val)
  End If
  
  'Work out the Hundreds
  tempString2 = tempString.Middle(12, 3)
  If tempString2 = "000" Then
    temphundreds = ""
  Else
    temphundreds = CommonStrings.getNumberToWordWAD(tempString2.Val)
  End If
  
  ReturnResults = ""
  'positive or negative?
  If myNumber.IndexOf(0, "-") >= 0 Then ReturnResults = ReturnResults + "negative "
  
  'Add the pieces together
  If temptrillions > "" Then ReturnResults = ReturnResults + temptrillions + " trillion, "
  If tempbillions > "" Then ReturnResults = ReturnResults + tempbillions + " billion, "
  If tempmillions > "" Then ReturnResults = ReturnResults + tempmillions + " million, "
  If tempthousands > "" Then ReturnResults = ReturnResults + tempthousands + " thousand, "
  
  If temphundreds > "" Then
    If ReturnResults > "" And ReturnResults <> "negative " Then
      If ReturnResults.Right(2) = ", " Then ReturnResults = ReturnResults.Left(ReturnResults.Length - 2) 'Remove the trailing comma
      ReturnResults = ReturnResults + " and " + temphundreds
    Else
      ReturnResults = ReturnResults + temphundreds
    End If
  End If
  
  If ReturnResults = "" Then ReturnResults = " zero "
  If ReturnResults.Right(2) = ", " Then ReturnResults = ReturnResults.Left(ReturnResults.Length - 2) 'Remove the trailing comma in case there are no values below the top value
  
  'Work out the Currency
  'minus trillion, billion, million, thousand, hundred point point 12345 dollars
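  If tempNumber.Left(decimalLocation).Val = 1 Then oneDollar = True 'decimalLocation is a 0-based index, so Left(decimalLocation) is the whole integer part; used below to turn 'one dollars' into 'one dollar'
  If tempNumber.Middle(decimalLocation).ToDouble.ToString(".00") = ".01" Then oneCent = True 'compare only the fractional part, starting at the decimal point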
  
  currencyMark = ""
  If myNumber.IndexOf(0, "$") >= 0 Then
    If oneDollar Then currencyMark = "dollar" Else currencyMark = "dollars"
    If oneCent Then currencyCents = "cent" Else currencyCents = "cents"
  ElseIf myNumber.IndexOf(0, "€") >= 0 Then
    If oneDollar Then currencyMark = "euro" Else currencyMark = "euros"
    If oneCent Then currencyCents = "cent" Else currencyCents = "cents"
  ElseIf myNumber.IndexOf(0, "¥") >= 0 Then
    currencyMark = "yen"
  ElseIf myNumber.IndexOf(0, "£") >= 0 Then
    If oneDollar Then currencyMark = "pound" Else currencyMark = "pounds"
    currencyCents = "pence"
  ElseIf myNumber.IndexOf(0, "₨") >= 0 Then
    currencyMark = "rupees"
  ElseIf myNumber.IndexOf(0, "৲") >= 0 Then
    currencyMark = "bengali rupee marks"
  ElseIf myNumber.IndexOf(0, "৳") >= 0 Then
    currencyMark = "bengali rupees"
  ElseIf myNumber.IndexOf(0, "૱") >= 0 Then
    currencyMark = "gujarati rupees"
  ElseIf myNumber.IndexOf(0, "௹") >= 0 Then
    currencyMark = "tamil rupees"
  ElseIf myNumber.IndexOf(0, "﷼") >= 0 Then
    currencyMark = "rial"
  ElseIf myNumber.IndexOf(0, "₩") >= 0 Then
    currencyMark = "won"
  ElseIf myNumber.IndexOf(0, "฿") >= 0 Then
    currencyMark = "baht"
  ElseIf myNumber.IndexOf(0, "₮") >= 0 Then
    currencyMark = "tugrik"
  ElseIf myNumber.IndexOf(0, "₱") >= 0 Then
    currencyMark = "peso"
  ElseIf myNumber.IndexOf(0, "៛") >= 0 Then
    currencyMark = "khmer riel"
  ElseIf myNumber.IndexOf(0, "₭") >= 0 Then
    currencyMark = "kip"
  ElseIf myNumber.IndexOf(0, "₦") >= 0 Then
    currencyMark = "naira"
  ElseIf myNumber.IndexOf(0, "₴") >= 0 Then
    currencyMark = "hryvnia"
  ElseIf myNumber.IndexOf(0, "₲") >= 0 Then
    currencyMark = "Guarani"
  ElseIf myNumber.IndexOf(0, "₪") >= 0 Then
    currencyMark = "new sheqels"
  ElseIf myNumber.IndexOf(0, "₡") >= 0 Then
    currencyMark = "colons"
  ElseIf myNumber.IndexOf(0, "₫") >= 0 Then
    currencyMark = "dong"
  ElseIf myNumber.IndexOf(0, "₵") >= 0 Then
    currencyMark = "cedi"
  ElseIf myNumber.IndexOf(0, "₣") >= 0 Then
    currencyMark = "French francs"
  ElseIf myNumber.IndexOf(0, "₤") >= 0 Then
    currencyMark = "lira"
  ElseIf myNumber.IndexOf(0, "₧") >= 0 Then
    currencyMark = "peseta"
  ElseIf myNumber.IndexOf(0, "₠") >= 0 Then
    currencyMark = "euros"
  ElseIf myNumber.IndexOf(0, "₢") >= 0 Then
    currencyMark = "cruzeiro"
  ElseIf myNumber.IndexOf(0, "₳") >= 0 Then
    currencyMark = "austral"
  ElseIf myNumber.IndexOf(0, "₯") >= 0 Then
    currencyMark = "drachma"
  ElseIf myNumber.IndexOf(0, "₥") >= 0 Then
    currencyMark = "mill"
  ElseIf myNumber.IndexOf(0, "₰") >= 0 Then
    currencyMark = "pfennig"
  End If
  
  'Work out the Decimal point value e.g. point 12345'
  tempString = tempNumber.Middle(decimalLocation + 1) 'chop off the integer and the decimal point to see if there is anything left!
  If tempString = "" Then
    If currencyMark > "" Then ReturnResults = ReturnResults + " " + currencyMark 'No decimal so just add the currency
    
  ElseIf currencyMark = "" Then
    ReturnResults = ReturnResults + " point "
    tempInt2 = tempString.Length
    For tempInt = 1 To tempInt2
      If tempString.Middle(tempInt - 1, 1).Val = 0 Then
        ReturnResults = ReturnResults + "zero " 'normally returns blank, so set it manually
      Else
        ReturnResults = ReturnResults + CommonStrings.getNumberToWordWAD(tempString.Middle(tempInt - 1, 1).Val) + " "
      End If
    Next
  Else
    'round to two decimal points then add the currencyMark
    tempString = tempNumber.Middle(decimalLocation) 'chop off the integer, but keep the decimal point
    tempString = tempString.ToDouble.ToString("0.00") 'Get the full decimal to round up or down to two decimal points
    tempString = tempString.Right(2)
    If currencyCents = "" Then
      If tempString <> "00" Then
        ReturnResults = ReturnResults + " point "
        
        tempInt2 = tempString.Length
        For tempInt = 1 To tempInt2
          If tempString.Middle(tempInt - 1, 1).Val = 0 Then
            ReturnResults = ReturnResults + "zero " 'normally returns blank, so set it manually
          Else
            ReturnResults = ReturnResults + CommonStrings.getNumberToWordWAD(tempString.Middle(tempInt - 1, 1).Val) + " "
          End If
        Next
        
        ReturnResults = ReturnResults + " " + currencyMark 'point one two lira
      End If
    Else
      ReturnResults = ReturnResults + " " + currencyMark + " "
      
      If tempString <> "00" Then ReturnResults = ReturnResults + " and " + CommonStrings.getNumberToWordWAD(tempString.Val) + " " + currencyCents
      
    End If
  End If
  
  ReturnResults = CommonStrings.getRemoveExcessWAD(ReturnResults, " ")
  ReturnResults = CommonStrings.getSentenceCaseWAD(ReturnResults)
  
  Return ReturnResults
    
End Function

Protected Function getNumberToWordWAD(tempNumber As Integer) As String
  Var ReturnResults, Hundreds, Tens As String 'tempString, Units
  
  'Hundreds
  Select Case tempNumber.ToString("000").Left(1)
  Case "0"
    Hundreds = ""
  Case "1"
    Hundreds = "one hundred"
  Case "2"
    Hundreds = "two hundred"
  Case "3"
    Hundreds = "three hundred"
  Case "4"
    Hundreds = "four hundred"
  Case "5"
    Hundreds = "five hundred"
  Case "6"
    Hundreds = "six hundred"
  Case "7"
    Hundreds = "seven hundred"
  Case "8"
    Hundreds = "eight hundred"
  Case "9"
    Hundreds = "nine hundred"
  End Select
  
  'Tens
  Select Case tempNumber.ToString("000").Middle(1, 1)
  Case "0"
    'Do nothing
  Case "1" 'teens
    Select Case tempNumber.ToString("000").Right(1)
    Case "0"
      Tens = "ten"
    Case "1"
      Tens = "eleven"
    Case "2"
      Tens = "twelve"
    Case "3"
      Tens = "thirteen"
    Case "4"
      Tens = "fourteen"
    Case "5"
      Tens = "fifteen"
    Case "6"
      Tens = "sixteen"
    Case "7"
      Tens = "seventeen"
    Case "8"
      Tens = "eighteen"
    Case "9"
      Tens = "nineteen"
    End Select
  Case "2"
    Tens = "twenty"
  Case "3"
    Tens = "thirty"
  Case "4"
    Tens = "forty"
  Case "5"
    Tens = "fifty"
  Case "6"
    Tens = "sixty"
  Case "7"
    Tens = "seventy"
  Case "8"
    Tens = "eighty"
  Case "9"
    Tens = "ninety"
  End Select
  
  If tempNumber.ToString("000").Middle(1, 1) = "1" Then
    'Do nothing since handled above
  Else
    'Units
    Select Case tempNumber.ToString("000").Right(1)
    Case "0"
      'Do nothing
    Case "1"
      Tens = Tens + " one"
    Case "2"
      Tens = Tens + " two"
    Case "3"
      Tens = Tens + " three"
    Case "4"
      Tens = Tens + " four"
    Case "5"
      Tens = Tens + " five"
    Case "6"
      Tens = Tens + " six"
    Case "7"
      Tens = Tens + " seven"
    Case "8"
      Tens = Tens + " eight"
    Case "9"
      Tens = Tens + " nine"
    End Select
  End If
  
  If Hundreds > "" Then
    If Tens > "" Then
      ReturnResults = Hundreds + " and " + Tens
    Else
      ReturnResults = Hundreds
    End If
  Else
    If Tens > "" Then
      ReturnResults = Tens
    Else
      ReturnResults = "" '"zero"
    End If
  End If
  
  Return ReturnResults
    
End Function
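
To tie this back to the original question, here is a rough sketch of how the method above might be combined with the operator replacements before calling System.Speak. SpeakableExpression is a hypothetical helper name, not part of David's code. The sketch assumes the tokens are separated by single spaces, that the two methods live in a module named CommonStrings (as the calls above suggest), and that the helpers they rely on but which are not shown here (getRemoveWAD, getRemoveExcessWAD, getSentenceCaseWAD) are also in the project.

```xojo
' Hypothetical helper: turns an expression such as "50 - 10" into words
Function SpeakableExpression(expression As String) As String
  Var tokens() As String = expression.Split(" ")
  
  For i As Integer = 0 To tokens.LastIndex
    Select Case tokens(i)
    Case "x"
      tokens(i) = "times"
    Case "/"
      tokens(i) = "divided by"
    Case "-"
      tokens(i) = "minus"
    Else
      ' Only numeric tokens are converted; ordinary words pass through unchanged
      If IsNumeric(tokens(i)) Then
        tokens(i) = CommonStrings.getNumberToWordsWAD(tokens(i))
      End If
    End Select
  Next
  
  Return String.FromArray(tokens, " ")
End Function
```

It could then be called as System.Speak(SpeakableExpression("50 - 10")). Splitting on spaces keeps the sketch short, but it means input such as “50-10” would need to be spaced out first.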

Christian_Schmitz · January 26, 2023, 11:00am

For macOS, you can use embedded commands in the text:

Use Embedded Speech Commands to Fine-Tune Spoken Output

As described in Control Speech Quality Using Embedded Speech Commands, you use embedded commands to fine-tune the pronunciation of individual words in the text your application passes to a synthesizer. Even if you use only a few of the embedded speech commands described in this section, you may significantly increase the understandability of your application’s spoken output. This section provides an overview of embedded speech command syntax, lists the available commands, and illustrates how to use them to achieve different effects.

Note that some embedded speech commands have functional equivalents provided by the Carbon selector mechanism (for a complete list of available selectors, see Speech Synthesis Manager Reference.) This means that to achieve some effects, you can either insert the embedded command in the text, or you can pass the equivalent selector to the Carbon SetSpeechInfo function. If you use the SetSpeechInfo function (described in Adjust Speech Channel Settings Using the Carbon Speech Synthesis API), the effect applies to all speech passing through the current speech channel, subject to synthesizer capabilities. If you use the embedded command to achieve the same effect, however, it applies only to the word immediately preceded by the embedded command.

Embedded Speech Command Delimiters

When processing an input string or buffer, speech synthesizers look for special strings of characters called command delimiters. These character strings are usually defined to be pairings of printable characters that do not typically appear in the text. One character string is defined as the begin command delimiter and another character string is defined as the end command delimiter. When the synthesizer encounters the begin command delimiter string, it interprets the characters following it as one or more embedded commands until it reaches the end command delimiter string.

The default begin and end command delimiter strings recognized by the MacinTalk synthesizer are “[[” and “]],” respectively. You can change these strings if necessary, but you should take care to use printable characters that you do not expect to see in the text your application processes. Also, if you change the default delimiters, be sure to change them back to the default characters when you have finished with the text, because the change is persistent for the current speech channel. For example, if you expect square brackets to appear in the text you’ll be sending to the synthesizer, you can change the default command delimiters to strings containing other printable characters that do not naturally occur in your text.

You can disable the processing of all embedded commands by setting both the begin and end command delimiters to two NUL bytes. You might want to do this if your application speaks text over which you have no control and you’re absolutely sure the text contains no embedded commands. To disable processing of embedded commands programmatically, use the soCommandDelimiter selector with the SetSpeechInfo function, as shown below:

// Create a structure to hold the new delimiter values
DelimiterInfo MyNewDelimiters;
MyNewDelimiters.startDelimiter[0] = 0;
MyNewDelimiters.startDelimiter[1] = 0;
MyNewDelimiters.endDelimiter[0] = 0;
MyNewDelimiters.endDelimiter[1] = 0;
SetSpeechInfo(CurrentSpeechChannel, soCommandDelimiter, &MyNewDelimiters);

Overview of Embedded Speech Command Syntax

Note: This section describes enough of the embedded command syntax for you to be able to understand the examples in this document. For a formal description of the syntax of embedded speech commands and their parameters, see Syntax of Embedded Speech Commands.

All embedded commands consist of a 4-character command code and a parameter, enclosed by the begin and end command delimiter strings. For example, the emph command requires a parameter that tells the synthesizer to increase or decrease the emphasis with which to speak the next word, as shown below:

[[emph +]] The + parameter tells the synthesizer to increase emphasis for the following word.

More than one command may occur within a single pair of delimiter strings if they are separated by semicolons, as shown below:

[[emph +; rate 165]] Together, these commands tell the synthesizer to speak the following word or phrase with increased emphasis and at a rate of 165 words per minute.

A parameter may consist of a string, a numeric type, or an operating-system type, and may be accompanied by the + or - characters (the exact format of a parameter depends on the command with which it’s associated). Some commands allow you to use the parameter to specify either an absolute value or a relative value. For example, the volm command allows you to specify a particular volume or an amount by which to increase or decrease the current volume, as shown below:

[[volm 0.3]] This command sets the volume with which the following word is spoken to 0.3.

[[volm +0.1]] This command increases the volume with which the following word is spoken by 0.1.

The speech synthesizer ignores all whitespace within an embedded command, so you may insert as many spaces as you need to make your command text more readable.

In addition, this document uses the following characters to express the syntax of embedded speech commands (these characters do not appear in actual embedded speech commands):

The < and > characters enclose items that represent logical units, such as string, character, integer, or real value. When you insert an embedded command in your text, you replace the logical unit with an actual value. For example, you might replace “<RealValue>” with 3.0. For precise definitions of each logical unit, see the formal description of the syntax in Syntax of Embedded Speech Commands.

The | character means “or” and appears between members in a list of possible items, any single one of which may be used. For example, the emph command accepts either the + character or the - character for its parameter. Therefore, the syntax of the emph command is expressed as emph + | -.

The [ and ] characters enclose an optional item or list of items. For example, the rate command accepts the optional addition of the + or - character to its numerical parameter to indicate a change relative to the current value. Therefore, the syntax of the rate command is expressed as rate [+ | -] <RealValue>.

Items followed by an ellipsis character (…) may be repeated one or more times.

The OS X Embedded Speech Commands

Table 3-1 describes the embedded speech commands, their parameters, equivalent speech information selectors (if they exist), and in which versions of OS X the commands are available. The syntax of each command in Table 3-1 is expressed using the conventions described in Overview of Embedded Speech Command Syntax.

Note: All embedded speech commands, except for ctxt, are available in OS X v10.0 and later. The ctxt command is available in OS X v10.4 and later.

Table 3-1 Embedded speech commands

Command: char
Syntax: char NORM | LTRL
Selector: soCharacterMode

The character mode command sets the word-speaking mode of the speech channel. When the NORM parameter is used, the synthesizer attempts to automatically convert words into speech. This is the most basic function of the synthesizer. When the LTRL parameter is used, the synthesizer speaks the individual characters of every word, number, and symbol following the command (all other embedded commands are processed normally). For example, to cause the synthesizer to speak the word “cat” as “C-A-T,” you would include the following in a text buffer or string:

[[char LTRL]] cat [[char NORM]]

Command: cmnt
Syntax: cmnt [<Character>...]
Selector: None

The comment command is ignored by speech synthesizers. It enables you to add arbitrary content to the text buffer that will never be included in the spoken output. Note that the comment text itself must be included within the begin and end command delimiters of the cmnt command.

[[cmnt This is a comment that will be ignored by the synthesizer.]]

Command: ctxt
Syntax: ctxt [WSKP | WORD | NORM | TSKP | TEXT]
Selector: None

The context command allows you to identify the context of a word to help the synthesizer generate the correct pronunciation of that word, even if no other words in the surrounding phrase or sentence are spoken. Because the pronunciation of words can be different depending on the context in which they appear, you can use the context command to specify the pronunciation used in a particular context.

The context command recognizes two modes: word-by-word and text fragment. In both modes, you use the appropriate “skip” parameter (WSKP or TSKP) to identify the text that provides context and the WORD or TEXT parameter to identify the word or phrase whose pronunciation is affected by the context. The synthesizer parses the entire phrase or sentence to determine the correct pronunciation of the word or phrase, but does not speak the portions of the text marked as “skipped.” Use the [[ctxt NORM]] command to signal a return to the default input-processing mode.

In word-by-word mode, the synthesizer parses the complete text selection to determine the part of speech (such as noun or verb) of the specified word. The synthesizer pronounces the word according to its part of speech, but it does not make any intonation or duration adjustments to the pronunciation. For example, the word “coordinates” is pronounced differently depending on whether it is used as a noun or a verb. The two sentences below illustrate how to use the context command to tell the synthesizer which pronunciation of the word to use:

[[ctxt WSKP]] GPS provides [[ctxt WORD]] coordinates. [[ctxt NORM]]

[[ctxt WSKP]] The post office [[ctxt WORD]] coordinates [[ctxt WSKP]] its deliveries. [[ctxt NORM]]

In text fragment mode, the synthesizer parses the complete text selection to determine the part of speech and the intonation and duration of the specified word or phrase. For example, the different pronunciations of the phrase “first step” are informed by the context provided by the surrounding words in the following two sentences:

[[ctxt TSKP]] Your [[ctxt TEXT]] first step [[ctxt TSKP]] should be to relax. [[ctxt NORM]]

[[ctxt TSKP]] To relax should be your [[ctxt TEXT]] first step. [[ctxt NORM]]

Command: dlim
Syntax: dlim <BeginDelimiter> <EndDelimiter>
Selector: soCommandDelimiter

The delimiter command changes the character sequences that indicate the beginning and end of all subsequent embedded speech commands. The new delimiters take effect after the command list containing the dlim command has been completely processed. If the delimiter strings are empty, an error is generated. If you want to disable embedded command processing for the remainder of the text buffer, you can pass two NUL bytes in the BeginDelimiter and EndDelimiter parameters.

[[dlim $ $]]

Command: emph
Syntax: emph + | -
Selector: None

The emphasis command causes the synthesizer to speak the next word with greater or less emphasis than it is currently using. The + parameter increases emphasis and the - parameter decreases emphasis.

For example, to emphasize the word “not” in the following phrase, use the emph command as follows:

Do [[emph +]] not [[emph -]] over tighten the screw.

Command: inpt
Syntax: inpt TEXT | PHON | TUNE
Selector: soInputMode

The input mode command switches the input-processing mode to textual mode, phoneme mode, or TUNE format mode. Note that some synthesizers may define additional speech input modes you can use. The default input-processing mode is textual, and you should always use the [[inpt TEXT]] command to revert to textual mode after you’re finished providing content in one of the other modes. In phoneme mode, the synthesizer interprets characters as representing phonemes (listed in Phonemes). In the TUNE format mode, the synthesizer recognizes the same set of phonemes but also interprets additional information that specifies a precise spoken contour, or tune, for the words. For more information about the TUNE format, see Use the TUNE Format to Supply Complex Pitch Contours.

For example, to supply the phonemic representation of a name that synthesizers frequently mispronounce, you can use the inpt command as follows:

My name is [[inpt PHON]] AY1yIY2SAX [[inpt TEXT]].

Command: nmbr
Syntax: nmbr NORM | LTRL
Selector: soNumberMode

The number mode command sets the number-speaking mode of the synthesizer. The NORM parameter causes the synthesizer to speak the number 46 as “forty-six,” whereas the LTRL parameter causes the synthesizer to speak the same number as “four six.”

For example, to make it clear that the following 7-digit number is a phone number, you can use the nmbr command to tell the synthesizer to say each digit separately, as follows:

Please call me at [[nmbr LTRL]] 5551990 [[nmbr NORM]].

Command: pbas
Syntax: pbas [+ | -] <RealValue>
Selector: soPitchBase

The baseline pitch command changes the current speech pitch for the speech channel to the specified real value. If the pitch value is preceded by the + or - character, the speech pitch is adjusted relative to its current value. Baseline pitch values are always positive numbers in the range of 1.000 to 127.000.

Command: pmod
Syntax: pmod [+ | -] <RealValue>
Selector: soPitchMod

The pitch modulation command changes the modulation range for the speech channel, based on the specified modulation-depth real value.

Command: rate
Syntax: rate [+ | -] <RealValue>
Selector: soRate

The speech rate command sets the speech rate on the speech channel to the specified real value. Speech rates fall in the range 0.000 to 65535.999, which translates into a range of 50 to 500 words per minute. If the rate is preceded by a + or - character, the speech rate is increased or decreased relative to its current value.

Command: rset
Syntax: rset <32BitValue>
Selector: soReset

The reset command resets the speech channel’s voice and attributes to default values. The parameter has no effect; it should be set to 0.

Command: slnc
Syntax: slnc <32BitValue>
Selector: None

The silence command causes the synthesizer to generate silence for the specified number of milliseconds. You might want to insert extra silence between two sentences to allow listeners to fully absorb the meaning of the first one. Note that the precise timing of the silence will vary among synthesizers.

Command: sync
Syntax: sync <32BitValue>
Selector: soSyncCallback

The synchronization command causes an application’s synchronization callback procedure to be executed. The callback is made as the audio corresponding to the next word begins to sound. The 32-bit value is set by the application and is passed to the callback procedure.

You can use the sync command to trigger a callback at times other than those defined by the built-in callbacks (such as the phoneme and speech-done callbacks). For example, you might want to perform some custom processing each time a date is spoken to highlight its place on a graphical timeline. To do this, you would define a synchronization callback procedure and refcon values, and insert a sync command after each date in the text, as follows:

In 1066 [[sync 0x000000A1]], William the Conqueror invaded England and by 1072 [[sync 0x000000A2]], the whole of England was conquered and united.

Command: vers
Syntax: vers <32BitValue>
Selector: None

The format version command tells the speech synthesizer which embedded command format version will be used by all subsequent embedded speech commands.

Command: volm
Syntax: volm [+ | -] <RealValue>
Selector: soVolume

The speech volume command sets the speech volume on the current speech channel to the specified real value. If the volume value is preceded by a + or - character, the speech volume is increased or decreased relative to its current value.

Command: xtnd
Syntax: xtnd <OSType> [<Parameter> ...]
Selector: soSynthExtension

The synthesizer-specific xtnd command enables other synthesizer-specific commands to be embedded in the text. The first parameter (OSType) must be the creator ID of the synthesizer. The remaining optional parameters are synthesizer-specific.

I just tried it by speaking “Do [[emph +]] not [[emph -]] over tighten the screw.” It still seems to work after all these years.
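
For reference, a small sketch of what passing embedded commands from Xojo might look like, using the nmbr example from the table above. It assumes System.Speak hands the string unchanged to the macOS synthesizer, which the test above suggests but which is not documented in this thread. The commands are macOS-specific, so the sketch falls back to spaced-out digits on other platforms.

```xojo
#If TargetMacOS Then
  ' Embedded commands are interpreted by the macOS speech synthesizer
  System.Speak("Please call me at [[nmbr LTRL]] 5551990 [[nmbr NORM]].")
#Else
  ' Elsewhere the [[...]] markers would likely be read aloud or ignored,
  ' so the digits are separated by hand instead
  System.Speak("Please call me at 5 5 5 1 9 9 0.")
#EndIf
```

On Windows, the WinSpeechMBS class mentioned earlier may offer its own way to control pronunciation, which this sketch does not attempt to cover.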