How to record text-to-speech on Mac OS X 10.5
I found an interesting article about the history of "intellectual property".
My friend didn't have time to read it, but was interested in hearing it.
Mac OS X has a reasonable text-to-speech engine, so I wanted to create a
sound file for him. The process is not difficult, but it's not obvious.
It will cost you only a bit of time.
[Edit: Later that year, I found out that you can use the command line
program 'say' to do the same with much less effort. Type 'man say'
to find out how. The instructions below are still useful for capturing
audio from the output mix, though. At least the bits with the *s.]
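A minimal sketch of the 'say' route, wrapped in Python 3 so the command is
easy to reuse. It assumes the -v (voice), -o (output file) and -f (input
file) options described in 'man say'. The file names article.txt and
article.aiff are placeholders, and the helper function names are made up
for illustration.

```python
import shutil
import subprocess


def build_say_command(text_file, out_file, voice="Alex"):
    """Return the argv list for `say` that reads text_file aloud into out_file."""
    # -v picks the voice, -o writes the audio to a file instead of the
    # speakers, and -f reads the text to speak from a file.
    return ["say", "-v", voice, "-o", out_file, "-f", text_file]


def record_speech(text_file, out_file, voice="Alex"):
    """Run `say` if it is installed; return True when a recording was made."""
    if shutil.which("say") is None:  # not on a Mac, or say is missing
        return False
    subprocess.run(build_say_command(text_file, out_file, voice), check=True)
    return True


# Placeholder file names; this only prints the command, it does not run it.
print(" ".join(build_say_command("article.txt", "article.aiff")))
```

Calling record_speech("article.txt", "article.aiff") on a Mac should leave
you with an AIFF file that you can convert to the format of your choice.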
1. Install Soundflower
Soundflower opens a conduit between the audio output device and the audio input
device. This allows you to pipe the audio output of one application to the
input of another. You could also do this using
a short cable connecting the headphone jack to the microphone jack, or by
simply playing the audio through speakers and recording through a microphone
in a quiet location.
The software solution keeps the signal digital the whole way, with no
conversion to analog and back, and therefore gives the highest fidelity.
Download Soundflower.
It's free, open source, and licensed under the GPL. It comes as a .dmg,
so to install it, mount the volume and run the .mpkg installer.
2. Install Audacity
Audacity lets you record audio and encode it in various formats. You can
probably use GarageBand, but I found this easier (less feature-encumbered).
Audacity is also free, open source, and GPL'd.
Download the .dmg, mount it, and move Audacity.app into your Applications
folder or a subfolder.
3. Configure System Preferences *
System | Speech | Text to Speech
To me, the Alex voice sounds best. Keep in mind that the Speaking Rate is
a tradeoff between recording length and intelligibility.
I chose a bit faster than normal, because I don't want to feel as though I'm
waiting for Alex to finish its sentences.
Select "Speak selected text when the key is pressed" and set a hot key.
Hardware | Sound | Output
Select "Soundflower (2ch)" as the sound output device.
4. Select the text to be spoken
Open the reader or web site, and select your chosen text. You may need
to futz with the text to get TTS to interpret it better. For example,
sed 's/copyleft/copy-left/gi'
(replace "copyleft" with
"copy-left") to avoid having "copyleft" spelled out.
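If you would rather clean up the text in a script, the same substitution
can be sketched in Python 3. This is only an illustration: the function
name and its default arguments are made up, and you would add one call per
word that TTS stumbles over.

```python
import re


def hyphenate_for_tts(text, word="copyleft", spoken="copy-left"):
    """Replace every occurrence of word, ignoring case, with a TTS-friendly spelling."""
    # re.escape keeps any punctuation in `word` from being read as regex syntax.
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    return pattern.sub(spoken, text)


print(hyphenate_for_tts("Copyleft is a play on the word copyright."))
```

Like the sed command, this ignores case when matching and always writes
the replacement in lower case.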
5. Run Audacity *
Set Preferences:
Set Audio I/O | Recording | Device to Soundflower (2ch) and
Channels to 1 (Mono).
For voice, you don't need to waste a lot of disk space and bandwidth, so:
Set Quality | Sampling | Default Sample Rate to 8000 Hz and
Default Sample Format to 16-bit.
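To get a feel for what these settings save, here is a rough Python 3
calculation of the uncompressed recording size. The 50-minute length is
just an example, and the comparison against 44100 Hz stereo (CD quality)
is my assumption about what you would otherwise record at. The function
name is made up.

```python
def pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size in bytes of uncompressed PCM audio with the given settings."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds


seconds = 50 * 60  # an example 50-minute recording

voice = pcm_bytes(8000, 16, 1, seconds)      # 48,000,000 bytes
cd_like = pcm_bytes(44100, 16, 2, seconds)   # 529,200,000 bytes

print("8000 Hz mono:    %d MB" % (voice // 1000000))
print("44100 Hz stereo: %d MB" % (cd_like // 1000000))
```

That is roughly an eleven-fold saving before any compression. The price is
that 8000 Hz can only carry frequencies up to 4000 Hz, which is fine for
speech but not for music.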
Back on the main Audacity interface, click record, switch to your text reader,
and press your TTS hot key.
5. Wait
You won't be able to hear the audio, but you'll see the signal in
Audacity. For reference, the article mentioned above took 50 minutes.
If you find a formula relating word count to length, let me know.
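As a first guess at such a formula, the length should be about the word
count divided by the speaking rate. The Python 3 sketch below assumes the
rate can be expressed in words per minute. The 9000 words and 180 words
per minute are invented numbers, not measurements of the article above,
and the function names are made up.

```python
def estimated_minutes(word_count, words_per_minute):
    """Estimate the reading time in minutes from a word count and a speaking rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return word_count / words_per_minute


def measured_rate(word_count, minutes):
    """Work backwards: words per minute from a recording you already made."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return word_count / minutes


# Invented example: 9000 words at 180 words per minute.
print(estimated_minutes(9000, 180))  # 50.0
```

Expect the real recording to run somewhat longer, since the voice pauses
at punctuation. Running measured_rate on one finished recording should
give you a rate that fits your own Speaking Rate setting.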
6. Trim the file
Once the signal drops off, alerting you to the end of the reading,
stop the recording, and delete the extra seconds at the beginning
and minutes at the end of the track. Then export the sound file
using the patent-unencumbered encoding of your choice.
Here is the result in Ogg Vorbis.
If you choose to edit the sound file further, you may notice that
Alex takes a breath before beginning a sentence, and a shorter
one when it encounters a comma.
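Trimming by hand in Audacity is easy enough, but the idea behind it can be
sketched in a few lines of Python 3: find the first and last samples that
are louder than some threshold and keep everything in between. The
threshold of 500 (on a 16-bit scale) and the sample values are arbitrary
illustrations, and the function name is made up.

```python
def speech_bounds(samples, threshold=500):
    """Return (start, end) slice indices enclosing every sample louder than threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        return (0, 0)  # nothing but silence
    return (loud[0], loud[-1] + 1)


# Toy signal: quiet lead-in, some speech, quiet tail.
samples = [0, 3, -2, 900, -1200, 40, 700, 1, 0, 0]
start, end = speech_bounds(samples)
trimmed = samples[start:end]
print(start, end, trimmed)  # 3 7 [900, -1200, 40, 700]
```

Quiet samples inside the speech (the 40 above) are kept, so pauses between
sentences survive; only the lead-in and the tail are cut.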
7. Play
Optionally, you can use GarageBand to create a podcast complete with
artwork and chapter markers.
8. Support FLOSS
Support your favourite open source software project, because it
lets you do such great things with that expensive computer.
9. Improve the guide
If you find an error, need clarification, or use this method to
record something neat, let me know. I'm sure you can figure out
my email.
Created: 2009-06-13
Last time I changed this date: 2011-04-25