System Requirements
We recommend installing Integrated Communication Platform only on computers that meet at least the following software and hardware requirements.
System resources are consumed mainly during speech recognition (SR) and text-to-speech (TTS) synthesis, so the following sections concentrate on those two services.

Hardware Requirements for Text-to-Speech Synthesis and Speech Recognition
Integrated Communication Platform can be resource intensive. It is especially important that SR engines have enough RAM and disk space to respond quickly to user requests. When the engine fails to respond quickly, users speak additional commands into the system, which creates a spiraling degradation in performance: the worse things get, the worse things get. It will not take much of this before you decide that our software is more trouble than it's worth!

Our text-to-speech engines can also tax the system. While TTS engines do not always require a great deal of memory to operate, insufficient processor speed can result in halting or unintelligible playback of text.

For these reasons, it is important to establish clear hardware and software requirements when installing Integrated Communication Platform. The user's workstation must have all the memory and hard disk space needed for the SR and TTS services to work properly. There are three general categories of workstation resources that should be reviewed:
- General hardware (processor and RAM)
- Software (operating system and speech engines)
- Sound cards, microphones, and speakers
The following three sections provide some general guidelines to follow when establishing minimum resource requirements.

General Hardware Requirements
Speech systems can tax processor and RAM resources. SR services require varying levels of resources depending on the type of SR engine installed and the level of services implemented. TTS engine requirements are more stable, but they also depend on the TTS engine installed.

The SR and TTS engines currently available with our application can be installed on systems with a 486/33 processor and an additional 1MB of RAM. However, overall PC performance with this configuration is poor, and it is not recommended.

A good suggested configuration is a Pentium processor (P60 or better) with at least 16MB of total RAM. Systems that support dictation SR services require the most computational power. It is not unreasonable to expect such a workstation to use 32MB of RAM and a P100 or higher processor. Obviously, the more resources, the better the performance.

SR Processor and Memory Requirements
In general, SR systems that implement command and control services need only an additional 1MB of RAM (not counting the application's RAM requirement). Dictation services should get at least another 8MB of RAM, preferably more. The type of speech sampling, the analysis performed, and the size of the recognition vocabulary all affect the minimum resource requirements. The table shows published minimum processor and RAM requirements of speech recognition services.

Table. Published minimal processor and RAM requirements of SR services.
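As a rough illustration of the additional-RAM figures quoted in this section (1MB for command and control, at least 8MB for dictation), the arithmetic can be sketched in Python. This is only a sketch: the function names and the `baseline_mb` parameter (RAM already claimed by the operating system and the application) are hypothetical and are not part of Integrated Communication Platform.

```python
# Additional RAM (in MB) quoted in this section for each kind of SR service.
# The dictation figure is a floor ("at least another 8MB").
SR_ADDITIONAL_RAM_MB = {
    "command_and_control": 1,
    "dictation": 8,
}


def sr_ram_needed_mb(baseline_mb, service):
    """Return the baseline RAM plus the additional RAM quoted for the SR service.

    baseline_mb is a hypothetical input: the RAM already needed by the
    operating system and any loaded applications.
    """
    return baseline_mb + SR_ADDITIONAL_RAM_MB[service]


def has_enough_ram(installed_mb, baseline_mb, service):
    """Return True if the installed RAM covers the baseline plus the SR service."""
    return installed_mb >= sr_ram_needed_mb(baseline_mb, service)
```

For example, with an assumed 16MB baseline, `sr_ram_needed_mb(16, "dictation")` comes to 24MB, so a 32MB workstation passes the check and a 16MB workstation does not.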
These memory requirements are in addition to the requirements of the operating system and any loaded applications. For Windows 95, the minimum is 12MB, with 16MB recommended and 24MB preferred. For Windows NT, the minimum is 16MB, with 24MB recommended and 32MB preferred.

TTS Processor and Memory Requirements
TTS engines do not place as much demand on workstation resources as SR engines. Usually TTS services require only a 486/33 processor and 1MB of additional RAM. However, the grammar and prosody rules can demand as much as another 1MB, depending on the complexity of the language being spoken. It is interesting to note that probably the most complex and demanding language for TTS processing is English, primarily because of its irregular spelling patterns.

Most TTS engines use speech synthesis to produce the audio output, but advanced systems can use diphone concatenation. Since diphone-based systems rely on a set of actual voice samples to reproduce written text, they can require an additional 1MB of RAM. To be safe, it is a good idea to require 2MB of additional RAM, with a recommendation of 4MB for advanced TTS systems.

Software Requirements: Operating Systems and Speech Engines
The general software requirements are rather simple. The Microsoft Speech API is implemented on Windows 32-bit operating systems, so the user needs Windows 95, or Windows NT 3.5 or later, on the workstation.

The most important software requirements for implementing speech services are the SR and TTS engines. An SR/TTS engine is the back-end processing module. Our application is the front end, and SPEECH.DLL acts as the broker between the two processes.

Our application software ships with a bundle of text-to-speech and speech recognition engines, so users do not need to install any additional engines.

Sound Cards, Microphones, and Speakers
Complete speech-capable workstations need three additional pieces of hardware:
- A sound card
- A microphone
- Speakers or headphones
Just about any sound card can support SR/TTS engines. Any of the major vendors' cards are acceptable, including Sound Blaster and its compatibles, Media Vision, ESS Technology, and others. Any card that is compatible with Microsoft's Windows Sound System is also acceptable.

A few speech recognition engines still need a DSP (digital signal processor) card. While it may be preferable to work with newer cards that do not require DSP handling, there are advantages to using DSP technology. DSP cards handle some of the computational work of interpreting speech input, which can actually reduce the resource requirements for providing SR services. In systems where speech is a vital source of process input, DSP cards can noticeably boost performance.

SR engines require a microphone for audio input. This is usually a directional microphone mounted on the PC base. Other options include a lavaliere microphone draped around the neck, or a headset microphone that includes headphones. Depending on the audio card installed, the user may also be able to use a telephone handset for input.

Most multimedia systems ship with a suitable microphone, either built into the PC or as an external device that plugs into the sound card. It is also possible to purchase high-grade unidirectional microphones from audio retailers. Depending on the microphone and the sound card used, you may need an amplifier to boost the input to levels usable by the SR engine.

The quality of the audio input is one of the most important factors in the successful implementation of speech services on a PC. If the system will be used in a noisy environment, close-talk microphones should be used. This reduces extraneous noise and improves the recognition capabilities of the SR engine.

Speakers or headphones are needed to play back TTS output. In private office spaces, free-standing speakers provide the best sound reproduction and the least risk of ear damage from high playback levels. However, in larger offices, or in areas where the playback can disturb others, headphones are preferred.

Modem and Telephony Card Issues
Basic data modems can support Assisted Telephony services (outbound dialing) but are usually able to support only limited inbound call handling.

Voice-data modems are a new breed of low-cost modems whose additional features come close to those of the higher-priced telephony cards. These modems are usually capable of supporting the Basic Telephony services and many of the Supplemental Telephony services. The key to success with voice-data modems is getting a good service provider interface for your card. We recommend a voice-data modem for our application.

Finally, telephony cards offer the greatest level of service compatibility. Telephony cards usually support all of the Basic Telephony and all of the Supplemental Telephony services, including phone device control. Most telephony cards also offer multiple lines on a single card. This makes them ideal for supporting commercial-grade telephony applications.
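The three device classes described in this section can be summarized as a small lookup table. The Python sketch below is illustrative only and rests on several assumptions that are not stated in the text: the names `DEVICE_SUPPORT` and `first_suitable_device` are hypothetical, "limited inbound call handling" is modeled as partial Basic Telephony support, voice-data modems and telephony cards are assumed to support Assisted Telephony as well, and the devices are listed from simplest to most capable.

```python
from typing import Dict, Optional

# Support levels used in this sketch.
NONE, PARTIAL, FULL = 0, 1, 2

# Device classes listed from simplest to most capable (assumed ordering).
# Levels not stated in the text are assumptions, as noted in the lead-in.
DEVICE_SUPPORT: Dict[str, Dict[str, int]] = {
    "data modem":       {"assisted": FULL, "basic": PARTIAL, "supplemental": NONE},
    "voice-data modem": {"assisted": FULL, "basic": FULL,    "supplemental": PARTIAL},
    "telephony card":   {"assisted": FULL, "basic": FULL,    "supplemental": FULL},
}


def first_suitable_device(needs: Dict[str, int]) -> Optional[str]:
    """Return the first device class that meets every required support level.

    needs maps a service name ("assisted", "basic", "supplemental") to the
    minimum level required. Returns None if no listed device qualifies.
    """
    for device, support in DEVICE_SUPPORT.items():
        if all(support[service] >= level for service, level in needs.items()):
            return device
    return None
```

Under these assumptions, an application that needs full Basic Telephony plus some Supplemental services, written as `first_suitable_device({"basic": FULL, "supplemental": PARTIAL})`, lands on the voice-data modem, which matches the recommendation above.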
Send mail to askazad@hotmail.com with questions or comments about this web site.