Listen to work faster

Jan. 1, 2006

As voice recognition technology penetrates more food distribution operations, wholesalers develop more questions about using the new systems. Some of the answers indicate that voice recognition is an expensive technology. At the same time, it is a technology that many users believe pays for itself with improved warehouse accuracy and productivity.

According to Allan Kohl, managing partner of KOM International, two vendors can be considered in the top tier of voice recognition systems. These are Vocollect, based in Pittsburgh, and Voxware from Lawrenceville, New Jersey. KOM International is a consulting firm based in Montreal, Canada. “All we sell is advice,” Kohl says. “We do not sell equipment or software and have no financial arrangements with any company that does. We trust that makes our advice unbiased.”

Voice directed order selection has been available since the 1980s and has been used in a number of industrial applications. The first major installation in a retail chain distribution warehouse was done by Wal-Mart in 1996 at its location in Clarksville, Arkansas, Kohl says. Since that first installation, Wal-Mart has integrated Vocollect voice technology into 125 of its warehouses. A wide range of distribution operations, including chain stores, wholesale food distributors, and logistics operators such as UPS and the US Postal Service, now use voice recognition systems.

Proven technology

At this point in its development, voice recognition can be considered a mature and proven technology, Kohl says. Worldwide, voice recognition systems have been installed in 550 distribution centers in 22 countries. The systems are capable of handling partial case selection as well as full case quantities.

The primary advantage of voice recognition technology is that it leaves a worker's hands free to perform the intended jobs of driving a pallet jack and selecting orders, Kohl says. In many applications, voice technology eliminates paper from the warehouse. Workers don't carry around a lot of paper or make mistakes by reading the wrong thing.

The systems are simple. The radio device is usually worn on a belt. In addition, the worker wears a headset with earphone and microphone. For hygienic reasons, headsets are usually dedicated to specific individual workers, Kohl says. These two components provide a radio link to the warehouse management system.

Once selection orders have been loaded into the system, a voice prompts the order selector to go to a specific warehouse location. On arrival, the selector verifies the location by reading back a random series of check digits posted at the selection slot. After location verification, the system issues instructions to pick a given number of cartons. When selection is complete, the worker says “ready” into the microphone or “pick zero” if the instruction was not understood completely. Those two statements can be customized by the user to make the system easy for workers to use, Kohl says.
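The dialogue described above can be sketched as a small loop. This is only an illustration, not any vendor's software: the names PickTask and run_pick, the prompt wording, and the choice to repeat the pick instruction after “pick zero” are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PickTask:
    slot: str          # warehouse location the selector is sent to
    check_digits: str  # digits posted at the selection slot
    quantity: int      # number of cartons to pick


def run_pick(task, spoken_responses):
    """Walk one pick through the voice dialogue.

    Returns the list of prompts issued and True if the pick was confirmed.
    All prompt wording here is hypothetical.
    """
    responses = iter(spoken_responses)
    prompts = [f"Go to {task.slot}"]

    # Location verification: keep asking until the check digits match.
    for spoken in responses:
        if spoken == task.check_digits:
            break
        prompts.append("Wrong location, say check digits again")
    else:
        return prompts, False  # selector never verified the slot

    prompts.append(f"Pick {task.quantity}")

    # Pick confirmation: "ready" completes the pick; "pick zero" is
    # assumed here to make the system repeat the instruction.
    for spoken in responses:
        if spoken == "ready":
            return prompts, True
        if spoken == "pick zero":
            prompts.append(f"Pick {task.quantity}")
    return prompts, False


task = PickTask(slot="aisle 12 slot 4", check_digits="47", quantity=5)
prompts, confirmed = run_pick(task, ["74", "47", "pick zero", "ready"])
```

In this run the selector first reads the digits back incorrectly, then verifies the slot, asks for the instruction again, and finally confirms the pick.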

With voice systems, workers don't have to read or fumble with paper. They don't have to press keys on a transmitter console. “Hands are free; eyes are free, so the worker focuses entirely on the picking activity,” Kohl says. “This ease of use is particularly important in hostile environments such as a freezer. A worker wearing gloves usually has a little trouble getting a finger on the tiny buttons of a radio frequency terminal.”

Two speech systems

Voice systems can be speaker dependent or speaker independent, Kohl says. The two possibilities have important differences. With a speaker dependent system, each worker develops a specific voice template before beginning to use the system. This template resides permanently in the computer so that the system will always recognize the worker's voice and speech patterns. Speaker dependent systems are usually preferred in environments where worker accuracy is vital, he says.

Speaker independent systems are much like other automated voice systems such as directory assistance on the telephone. With a speaker independent system, workers must be trained to understand the system, whereas a speaker dependent system is trained to understand each worker. A speaker independent system limits the number of words that a worker can use to communicate with the computer. The main disadvantage of a speaker independent system is that more time is required to use it, Kohl says.

Synthetic speech systems

Voice directed order selection uses synthetic speech, which can be produced in two ways. Some systems use text-to-speech technology, which results in a computer voice speaking to the worker. The other technology is digitized speech that uses a human voice to communicate with the worker. “Hopefully, digitized speech will use a pleasant voice,” Kohl says. “However, digitized speech works best in applications that utilize a limited vocabulary.”

The big benefit from voice recognition systems is improved worker accuracy. No matter what level of error rate a distributor has, installing voice technology will reduce the number of worker errors, Kohl says. “We consistently see reductions of 60% to 75% in error rates following voice recognition system installation,” he says.

Productivity improvement depends on the application. “We have seen improvements of as much as 200 cases per hour, starting at 100 cases per hour and increasing to 300 cases per hour,” Kohl says. “However, that particular example does not come from the food industry. One big variable in productivity improvement depends on whether or not labels are eliminated from the order selection process.”

Preventing worker mistakes

While the technology is great, workers can still make mistakes. The system may tell a worker to pull five cases, and the selector may actually take only four cases. Workers also sometimes still pull product from the wrong slot. “That happens most often when the selector has the verification digits memorized or reads them into the system before actually arriving at the selection slot,” Kohl says. “To keep that from happening, some warehouse operators make the verification digits hard to see from a distance. For instance, the slot identification may be on the front of the rack and the check digits may be posted somewhere back in the rack.”

Two factors have a big impact on order selector productivity. The first is the slot location, Kohl says. If the product is easy to reach, productivity goes up. If the selector has to reach up to the picking slot, it takes more time. Reducing travel time between picking slots helps as well. The second factor involves the verification digits. The sequence should be no longer than three digits; two check digits are ideal for best productivity, he says.

Pricing varies according to warehouse size and the number of workers involved. “For illustration purposes, we base pricing estimates on an operation with 50 or fewer workers in a warehouse with less than 100,000 sq ft or on a warehouse larger than 100,000 sq ft with more than 50 selectors,” Kohl says. “The first step requires installation of a radio frequency network. In the small operation, that may cost $40,000 to $50,000, while the same installation in a larger environment may cost $70,000 to $100,000. The hardware, software, and database for a small operation may cost $15,000 to $30,000 and escalate to as much as $30,000 to $40,000 for a larger installation.

“The professional services required to install a system seem to be independent of size, usually ranging from $20,000 to $40,000 per site. The hardware and software required for each user can run between $5,200 and $6,200 per worker in a small warehouse and can cost as much as $4,800 to $5,200 per user in a large system. Total installation cost might be $270,000 for a small warehouse or as much as $555,000 for a much larger system. In a small warehouse with 25 order selectors, the cost per user could be as high as $11,000 per worker. As systems become larger, the cost per user falls, perhaps to as low as $7,400 per user in a large warehouse with 75 selectors.”
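The per-user figures Kohl quotes follow from simple division, and the totals can be cross-checked against the component ranges. The sketch below is an illustration only: the function names are hypothetical, and summing the radio network, system, services, and per-user ranges into a single total is an assumption about how the estimate was built, not something the article states.

```python
def total_cost_range(rf_network, system, services, per_user, selectors):
    """Sum (low, high) dollar ranges into a total installation range.

    Every argument except selectors is a (low, high) tuple in dollars.
    """
    low = rf_network[0] + system[0] + services[0] + per_user[0] * selectors
    high = rf_network[1] + system[1] + services[1] + per_user[1] * selectors
    return low, high


def cost_per_user(total_cost, selectors):
    """Spread a total installation cost evenly across selectors."""
    return total_cost / selectors


# Small warehouse, 25 selectors (component ranges as quoted in the article).
small_range = total_cost_range(
    (40_000, 50_000), (15_000, 30_000), (20_000, 40_000), (5_200, 6_200), 25
)  # (205000, 275000)

# Large warehouse, 75 selectors.
large_range = total_cost_range(
    (70_000, 100_000), (30_000, 40_000), (20_000, 40_000), (4_800, 5_200), 75
)  # (480000, 570000)

small_per_user = cost_per_user(270_000, 25)  # 10800.0
large_per_user = cost_per_user(555_000, 75)  # 7400.0
```

Under these assumptions, the quoted totals of $270,000 and $555,000 both fall near the top of the computed ranges, and the small-warehouse figure of about $10,800 per user matches the "as high as $11,000" estimate once rounded.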

Unlike some radio frequency applications, voice directed order selection does not replace traditional warehouse management systems. Voice recognition simply acts as a link between workers and the warehouse management system.

Two large vendors

The two big players in voice recognition are Vocollect and Voxware. Of the two, Vocollect has a larger presence in the market with an estimated 57,000 units already in the field in 400 distribution centers. Voxware is smaller, but gaining sales with 7,000 units installed in 150 distribution centers. Both companies have installations in multiple countries.

In every case, this is rocket science combined with a little bit of black magic, Kohl says. These systems must consistently identify a large number of individual voices using different accents and sometimes different languages. They must do this in a noisy environment with the constant hum of freezer fans and honking fork trucks in the background.

Voice recognition is expensive, but does offer advantages. On the customer side of the equation, voice directed systems provide better warehouse accuracy and fewer shipping errors. A high level of shipping accuracy certainly makes customers feel better about their vendors, Kohl says.

Voice systems offer internal advantages as well. In many cases, distributors can eliminate the use of picking labels, which at least reduces the cost of printing them. Worker ergonomics are improved, because selectors have their hands free to do the job, Kohl says. When the job is easier, productivity increases. In addition, voice systems can alert other warehouse workers when selection slots need to be replenished. The biggest advantage lies in worker training: the learning cycle is much shorter than it is for hardware that must be manipulated by hand.

KOM International has hard data from some of its clients, Kohl says. For instance, Hannaford Brothers, a supermarket chain in the northeast, has reported an 18.2% increase in selection productivity and a 64.1% reduction in selection errors after using voice recognition for five weeks. Likewise, Price Chopper, another store chain, reports a 15% productivity increase along with a 67% reduction in grocery selection errors and an 83% error reduction rate for perishables after 18 weeks. Both these companies had strong performance records to start, he says.
