The one feature for our survey platform requested more than any other is the ability to automatically call a list of phone numbers and ask whoever picks up to answer a survey. It’s one thing to ask people to click on a link and fill out a web survey. They have to be in front of a computer to do that. It’s another thing to ask them to call into a number and answer some questions over the phone. They can be anywhere to do that, but, hey, they might not be in the mood to bother.
It’s a whole different thing to call them.
As mentioned in a previous post, web surveys can frequently present skewed data because the respondent set is self-selecting. Calling people up at home is still the most reliable way to get someone’s attention. Once someone has picked up the phone, the marginal effort to then answer a handful of questions is usually low enough that most people will suffer through the process.
Anyway, we’ve been hard at work here at Plum and we just finished designing, building, and testing this new capability. The coolest thing about building a new product is being able to quickly iterate on it based on requests from real users. Of course it helps a great deal to have feedback to work from; otherwise you’re designing in a vacuum. It also helps to have a team here at Plum capable of quickly iterating on a complex product like Plum Surveys. Fortunately we’ve been getting the feedback and we have the team, so it’s been fun watching this particular product evolve.
The next dev cycle for Plum Surveys concludes the first week of October. I’ll let you in on the new features next week.
No Comments »
Last week I talked about audio encoding formats but did not address how the decoder knows how the encoder encoded the audio (try saying that ten times quickly!).
There are really only two ways to address this issue. Method one is encapsulating the data with the encoding descriptors. Method two is…to guess.
The only encapsulation file format supported by the Plum IVR is Microsoft .wav which is derived from a format called RIFF. People often think that Microsoft .wav is both a file format and an audio encoding format. It isn’t. .wav/RIFF is independent of the audio encoding. Without getting into too much detail, you can think of the .wav/RIFF format as merely an envelope; the data enclosed within the envelope can be encoded any number of ways from PCM or u-law (as mentioned last week) to MP3 to various proprietary audio encoding formats. Thus, it’s important if you are going to create a .wav file that you also make sure that the audio is encoded using one of the formats mentioned last week.
That all said, you could also just send the IVR raw audio data and have the IVR guess at the format. You do, however, have to give the IVR a bit of a hint in the form of an appropriate file name extension. If you encoded your audio data as 8kHz 16-bit PCM mono, just slap a “.pcm” on the end of the filename and the IVR will assume that’s the format. On the other hand, if you recorded your audio data as 8kHz 8-bit u-law mono, add “.ul” to the end of your filename. These types of files are often referred to as “raw, headerless” files because there’s no metadata whatsoever in the file; it’s all pure audio data. The downside to this is that there’s nothing to stop you from recording 11kHz 8-bit PCM stereo but still naming the file “whatever.pcm”. The IVR will load it, assume from the extension that the data is 8kHz 16-bit PCM mono, and produce some noisy garbage over your phone lines.
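To make the guessing game concrete, here is a minimal Python sketch of extension-based format assumption. The table and function names are made up for illustration and are not the Plum IVR's actual code; only the two extensions and their formats come from the paragraph above.

```python
import os

# Hypothetical lookup table: what a decoder would assume from the extension alone.
RAW_FORMATS = {
    ".pcm": {"encoding": "linear PCM", "bits": 16, "rate_hz": 8000, "channels": 1},
    ".ul": {"encoding": "u-law", "bits": 8, "rate_hz": 8000, "channels": 1},
}

def assumed_format(filename):
    """Return the format implied by the extension, or None if it is unknown."""
    ext = os.path.splitext(filename)[1].lower()
    return RAW_FORMATS.get(ext)

def assumed_duration_seconds(filename, size_bytes):
    """How long the decoder will think a raw, headerless file runs."""
    fmt = assumed_format(filename)
    if fmt is None:
        raise ValueError("no format can be guessed for " + filename)
    bytes_per_second = fmt["rate_hz"] * fmt["channels"] * fmt["bits"] // 8
    return size_bytes / bytes_per_second

# Ten seconds of 11025 Hz, 8-bit, stereo audio is 11025 * 2 * 1 * 10 bytes.
mislabeled_size = 11025 * 2 * 1 * 10
print(assumed_duration_seconds("whatever.pcm", mislabeled_size))
```

Nothing in the file contradicts the extension, so the decoder happily treats ten seconds of stereo audio as nearly fourteen seconds of 16-bit mono. That mismatch is the noisy garbage.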
One final thing to mention is MP3s. The Plum IVR can handle MP3s just fine; however, we often hear complaints about the decline in audio quality between what someone hears when their MP3 is played over their headphones and what is ultimately heard over the phone. Bear in mind: the phone system was never intended to transmit high-fidelity audio. That’s why we usually recommend the lossless formats instead: the application developer can better control the ultimate sound quality when what they hear through headphones closely matches what callers will hear over the phone.
So what would I recommend as the audio encoding format and file encapsulation format? We usually recommend .wav encapsulation of a 16-bit linear PCM, 8kHz, mono audio file. A) the file is self-describing, and B) “16-bit linear PCM” is common to all audio production software. Ideally we’d prefer to recommend u-law instead of 16-bit linear, but u-law often confuses people because it’s sometimes referred to as “mu-law” or sometimes “μ-law”. As usual, our support forum at http://support.plumgroup.com/ is always there to help you work out any audio production issues you might have.
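For anyone who wants to check a file before uploading it, here is a small sketch using Python's standard `wave` module. It writes a test tone in the recommended format and then reads the descriptors back out of the header. The function names and the tone itself are just illustrative choices. Note that the `wave` module only understands linear PCM, so this sketch will not open a u-law .wav file.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # Hz
SAMPLE_WIDTH = 2    # bytes per sample, i.e. 16-bit
CHANNELS = 1        # mono

def make_tone_wav(seconds=1.0, freq=440.0):
    """Return the bytes of a .wav file holding a sine tone as 16-bit linear PCM, 8kHz, mono."""
    count = int(SAMPLE_RATE * seconds)
    pcm = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)))
        for i in range(count)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm)
    return buf.getvalue()

def describe_wav(data):
    """Read the descriptors stored in the .wav/RIFF envelope."""
    with wave.open(io.BytesIO(data), "rb") as w:
        return {
            "channels": w.getnchannels(),
            "bits": w.getsampwidth() * 8,
            "rate_hz": w.getframerate(),
            "frames": w.getnframes(),
        }

wav_bytes = make_tone_wav()
print(describe_wav(wav_bytes))
```

The point is that the decoder never has to guess: the channel count, bit depth, and sample rate all travel inside the envelope along with the audio.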
No Comments »
Creating pre-recorded audio files is a complicated and involved process that’s exacerbated by the fact most people don’t have a firm grasp of how an audio file format is specified in the first place.
When audio is recorded on a computer, it is encoded as a series of numbers that, when read and decoded by the IVR, can be converted back into sound. In order for this data to be encoded and then, in turn, successfully decoded and converted into sound, the encoder and decoder both need to agree on a set of descriptors for what the numbers represent.
The typical descriptors for an audio file recorded in a non-lossy format are as follows:
- audio format: linear PCM, u-law, a-law are all examples of audio formats which each specify a different way to map from a numerical data point in a file to a real sound generated by a speaker.
- bit depth: the number of bits used to specify each data point. Linear PCM, for instance, is usually 8 or 16 bits. u-law and a-law are always 8-bits.
- number of channels: 1 for mono, 2 for stereo, etc.
- frequency: the number of data points written to the audio file per channel per second, also known as the sample rate. This is measured in hertz (Hz).
The Plum IVR can handle audio files that are 16-bit linear PCM, 8-bit u-law, or 8-bit a-law, single channel (mono) recordings sampled at 8000 Hz. These descriptors are important for the IVR for a few reasons. First, if you try to use an audio file that was not recorded with an acceptable encoding, the Plum IVR will not be able to play it. Second, when you initially record your file, it’s always preferable to record it in one of these formats so you won’t have to re-encode the file and possibly introduce noise artifacts into your audio file. Third, these three formats were chosen because they could all be re-encoded with minimal or zero quality loss to 8000Hz mono 8-bit u-law, the standard audio encoding format used by the U.S. public telephone system.
This leads to the final question: how do the encoder and decoder agree on the encoding format for the data? We shall discuss encapsulation next week…
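To show what "a different way to map from a numerical data point to a real sound" means in practice, here is a rough Python sketch of the u-law companding curve with its usual constant of 255. This is the textbook continuous formula, not the exact segmented 8-bit byte layout that telephone equipment uses, and the helper names are my own.

```python
import math

MU = 255  # companding constant used by u-law

def mulaw_compress(x):
    """Compress a linear sample in [-1.0, 1.0] onto the u-law curve (also [-1.0, 1.0])."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y):
    """Invert mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize_8bit(v):
    """Round a value in [-1.0, 1.0] to the nearest of 256 evenly spaced levels."""
    code = round((v + 1.0) / 2.0 * 255)
    return code / 255 * 2.0 - 1.0

def roundtrip_linear(x):
    """Store a sample as plain 8-bit linear and read it back."""
    return quantize_8bit(x)

def roundtrip_mulaw(x):
    """Store a sample as 8 bits on the u-law curve and read it back."""
    return mulaw_expand(quantize_8bit(mulaw_compress(x)))

quiet = 0.01
print(roundtrip_linear(quiet), roundtrip_mulaw(quiet))
```

For a quiet sample like 0.01, the plain 8-bit round trip lands at roughly 0.0118 while the u-law round trip lands at roughly 0.0102. Spending more of the 8 bits on quiet sounds is why u-law holds up so well for speech.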
No Comments »
Posted by: andykuan in IVR
Our fine engineers here at Plum have added a new question type to our survey application: the transfer question type. The transfer question type allows the survey designer to insert a phone call anywhere in a survey. This feature, when paired with skip logic, is quite powerful indeed as I’ll discuss further down in this post.
But first there were some design challenges associated with adding this feature:
- It’s only available for the IVR version of a survey. We felt there wasn’t a good web equivalent of making a phone call and decided, rather than coming up with a weak web counterpart to call transfer that no one will use, it’d be better to simply make this a phone-only feature.
- What is the “result” of a transfer question? For this iteration of the Plum survey application, a transfer question will return the length of the call. If there’s sufficient interest, we’ve considered returning a recording of the call transfer as the result similar to how a recording is returned for the comment question type.
- If the caller hangs up during a transfer, they might miss out on questions that occur afterwards. However, it’s natural to hang up during a call transfer if it’s the last question in a survey. We decided that if questions remain after the transfer, data will not be saved for this respondent, just as if they had given up in the middle of a web survey without finishing it. If the call transfer was the last question in the survey, we’ll return the length of the call up to the point where they hung up, the survey is considered completed, and the data is saved in the database.
Of course, these design choices are fairly minor matters. The ability to transfer a phone-based survey taker to any phone number based on choices they previously made opens up numerous possibilities for using the Plum survey application as both an enhanced survey tool and a general IVR tool.
First I’ll offer an example of how one could use the transfer question type to enhance an existing survey. Let’s say you’re a call center that wants to ask your customers how satisfied they were with the rep that they reached. Frequently this determination of whether to take a survey occurs at the end of the phone call. This leaves open the possibility that the rep could game the system by only mentioning the satisfaction survey to callers with whom they’ve had a good call.
With the Plum survey application, you could ask the caller if they want to take a survey before they speak with a rep. It would look something like this:
1. Ask caller if they want to take a survey after the call. If yes, go to step 2. If no, go to step 3.
2. Transfer the call to a rep. After the rep hangs up, go to step 4.
3. Transfer the call to a rep. After the rep hangs up, end the survey.
4. Proceed with asking the caller some questions about the conversation they just had with a rep. Once the caller has answered all of the questions, end the survey.
Thus, the survey is no longer just a call destination after you’re done talking to a rep. The survey application becomes the entire call, from the first question, through the conversation with the rep, to the satisfaction questions themselves.
Second I’ll offer an example of using the Plum survey application as an IVR autoattendant/call director. You can think of an autoattendant as a series of questions that the IVR asks a caller to figure out where to transfer their call. So even though the Plum survey application isn’t explicitly intended to be used as an autoattendant, it can certainly be used in that manner now that a call transfer is just another question type.
Imagine the following autoattendant structure:
- Choose a language: English or Spanish
  - If English, choose sales, billing, or technical support in English
    - If sales, transfer them to the English sales line
    - If billing, transfer them to the English billing line
    - If support, transfer them to the English support line
  - If Spanish, choose sales, billing, or technical support in Spanish
    - If sales, transfer them to the Spanish sales line
    - If billing, transfer them to the Spanish billing line
    - If support, transfer them to the Spanish support line
There are six different possible phone numbers to which to direct the caller. The “survey” would end up looking something like this:
1. Ask the caller if they want English or Spanish. If they choose English, skip to step 2. If they choose Spanish, skip to step 6.
2. In English, ask if they want sales, billing, or technical support. Skip to step 3 if sales, step 4 if billing, or step 5 if support.
3. Transfer call to English sales. Once the conversation is over, end the survey.
4. Transfer call to English billing. Once the conversation is over, end the survey.
5. Transfer call to English support. Once the conversation is over, end the survey.
6. In Spanish, ask if they want sales, billing, or technical support. Skip to step 7 if sales, step 8 if billing, or step 9 if support.
7. Transfer call to Spanish sales. Once the conversation is over, end the survey.
8. Transfer call to Spanish billing. Once the conversation is over, end the survey.
9. Transfer call to Spanish support. Once the conversation is over, end the survey.
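The nine steps above can be modeled as a table of questions with skip logic. The sketch below is only an illustration in Python, not how the Plum survey application stores surveys. The class, the field names, and the destination labels are all invented, and the Spanish menu is wired to skip to its own transfer steps (7 through 9).

```python
from dataclasses import dataclass, field

@dataclass
class Question:
    prompt: str
    kind: str = "choice"                       # "choice" or "transfer"
    skips: dict = field(default_factory=dict)  # answer -> next step number
    transfer_to: str = ""                      # made-up label for a phone line

# Hypothetical encoding of the autoattendant "survey".
SURVEY = {
    1: Question("English or Spanish?", skips={"english": 2, "spanish": 6}),
    2: Question("Sales, billing, or support?",
                skips={"sales": 3, "billing": 4, "support": 5}),
    3: Question("", kind="transfer", transfer_to="english-sales"),
    4: Question("", kind="transfer", transfer_to="english-billing"),
    5: Question("", kind="transfer", transfer_to="english-support"),
    6: Question("Ventas, facturación o soporte?",
                skips={"sales": 7, "billing": 8, "support": 9}),
    7: Question("", kind="transfer", transfer_to="spanish-sales"),
    8: Question("", kind="transfer", transfer_to="spanish-billing"),
    9: Question("", kind="transfer", transfer_to="spanish-support"),
}

def route(survey, answers):
    """Follow the skip logic for a list of answers; return the line the caller reaches."""
    remaining = iter(answers)
    step = 1
    while True:
        question = survey[step]
        if question.kind == "transfer":
            # Every transfer in this survey ends the survey once the call is over.
            return question.transfer_to
        step = question.skips[next(remaining)]

print(route(SURVEY, ["spanish", "billing"]))
```

Two choice questions and six transfer questions are all it takes, which is why a call transfer being "just another question type" goes such a long way.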
Thus, by adding the transfer question type, the Plum survey application is now a fairly general tool. Yes, there are capabilities missing that would make it potentially a completely generalized tool: stateful control-flow logic, large user-defined grammars, and direct data integration to name a few. And, no, not all of them will be built into the Plum survey application. But even as-is, most users should be able to design and build many simple non-integrated applications quickly and cost-effectively using a tool that relies on a simple survey paradigm.
We’ve got more features on the way, so stay tuned.
Posted by: andykuan in IVR
It strikes me as odd that IVR technology has been used exclusively for business purposes. Here at Plum, we’ve worked on applications that were intended for entertainment purposes, but none of those apps were games. With the explosion of mobile gaming, you’d think there’d be more games that could be played not on your phone but through your phone.
The few IVR games I’ve seen fall into a couple camps. The first type shoehorns a game that’s better served via a visual medium into something over the phone. IVR blackjack is an example of this. It works okay because there aren’t that many cards to keep track of but it’s still pushing the limits of what’s reasonable over a phone (imagine trying to play Texas Hold’em over the phone). Trivia games are another type of game that has made it onto the phone. While trivia does, in fact, translate well into a phone application, trivia games are not really games per se. They’re more like surveys. There’s no gameplay or strategy involved.
The problem here is that phone games have been derivative of games better suited for another medium. We might be better off considering what VoiceXML IVRs are actually good at. The two compelling capabilities of a good IVR platform are speech recognition and data integration. Unfortunately the only thing I can come up with, given those two features, is a game where you’d learn incantations that you’d have to speak into your phone which could be used to trigger events in the real world. Imagine Ali Baba saying “open sesame” into his cell phone to automatically open his garage door: the IVR would listen for “open sesame” as well as a set of other possible incantations (e.g. “close sesame”, “presto change-o”, “abracadabra”) and, upon recognizing the phrase, would make a call to a web service running on a web server with X-10 controls which would, in turn, open the garage door.
That’s a bit more of a toy than a game though, but I think there’s some kernel of a game in there.
No Comments »
Recently one of our customers encountered a strange problem with their VoiceXML application. Their application had three pages: a start page, a dynamic page that the start page attempted to fetch, and an end page that would be fetched in case the dynamic page took too long to process. Ideally the following should’ve occurred whenever the dynamic page failed to return:
- The Plum IVR platform fetches the start page and sees the VoiceXML directive instructing it to fetch the dynamic page.
- The IVR platform tries to fetch the dynamic page but after a short timeout, gives up.
- The platform then fetches the end page and plays the message telling the caller that the service is currently unavailable.
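The intended control flow is simple enough to model in a few lines of Python. On a VoiceXML platform this lives in the markup and the platform's fetch settings rather than in code like this, so treat the function and page names below as made up. The `fetch` argument is any callable that returns a page or raises `TimeoutError`.

```python
def fetch_with_fallback(fetch, primary_url, fallback_url, timeout_s=5.0):
    """Try the primary page; if it times out, fetch the fallback page instead."""
    try:
        return fetch(primary_url, timeout_s)
    except TimeoutError:
        # The primary request is abandoned here, so the fallback must not
        # depend on anything the hung request still holds.
        return fetch(fallback_url, timeout_s)

def fake_fetch(url, timeout_s):
    """Stand-in fetcher for the demo: the dynamic page always hangs."""
    if url.endswith("/dynamic"):
        raise TimeoutError(url + " did not answer within " + str(timeout_s) + "s")
    return "contents of " + url

print(fetch_with_fallback(fake_fetch, "http://example.test/dynamic",
                          "http://example.test/end"))
```

The comment in the `except` branch is the whole story of this post: the fallback only works if it is truly independent of the request that just timed out.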
For some reason, once the platform failed to fetch the dynamic page, it also failed to fetch the end page.
Now at first, we weren’t certain what the source of the problem was. We wrote a similar application on our own servers and failed to replicate the behavior. Fortunately our customer’s setup did fail reliably (which addressed the number one tool for troubleshooting: a way to trigger the bug over and over.)
After trying to use some pretty weak tools to debug the issue (including typing in “netstat -an” over and over), we broke out a packet sniffer. Specifically, we broke out tcpdump. With tcpdump, we were then able to trace the HTTP sessions between the IVR and our customer’s web server. And what did we find? To fetch the end page, our platform was attempting to reuse the socket that was hung on the dynamic script. This, of course, wouldn’t work. It’s all well and good to reuse a socket for another request if the previous request has completed. Otherwise the IVR is shouting at deaf ears on the customer web server.
Having thus isolated the problem to unexpected reuse of a busy socket, we knew the problem actually came from one of the libraries against which our platform is compiled: libcurl.
Now before I continue, it should be said that libcurl is awesome. We used to use libwww and found it to be an unmanageable mess. libcurl is simple, fast, full-featured, and well-documented. But sometimes even the best software has a bug or two. Well, in this case just one.
We decided to write a small test application in PHP that would, using the curl interface embedded in PHP, attempt to fetch those three pages in succession. This is the number two tool for troubleshooting: a simplified analogue of the problem that can reliably reproduce the bug. Sure enough our test application reproduced the issue and now, by using a scripting language like PHP, we could insert all manner of debugging information into our test script and immediately retest.
With debugging turned on, we discovered that libcurl was quite intentionally reusing the socket. Since libcurl itself was printing the debugging messages, we searched the libcurl source code (because it’s Open Source) for where this debugging message came from. And what did we discover? A slight flaw in the libcurl logic where they should’ve typed “!=” (i.e. not equal to) instead of “==”. We changed one character and “voila” we fixed the bug.
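To show how a single flipped comparison can produce exactly this symptom, here is a toy connection pool in Python. To be clear, this is not libcurl's code and not the actual line we patched. It is a made-up illustration of the class of bug, with invented names throughout.

```python
from dataclasses import dataclass

@dataclass
class Connection:
    host: str
    busy: bool = False  # True while a request is still waiting on its response

def pick_reusable_buggy(pool, host):
    """Flawed selection: the comparison is backwards, so it hands out busy sockets."""
    for conn in pool:
        if conn.host == host and conn.busy == True:  # should be "!="
            return conn
    return None

def pick_reusable(pool, host):
    """Corrected selection: only an idle connection to the same host may be reused."""
    for conn in pool:
        if conn.host == host and conn.busy != True:
            return conn
    return None

# The dynamic page has hung, so the only open connection is still busy.
pool = [Connection("customer.example", busy=True)]
print(pick_reusable_buggy(pool, "customer.example"))  # returns the hung connection
print(pick_reusable(pool, "customer.example"))        # None, so open a fresh socket
```

In everyday Python you would write `not conn.busy`, but spelling out `== True` and `!= True` makes the one-character difference visible.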
Mind you, after the “voila” comes a couple days of testing, building a patch kit, scheduling system maintenance across our infrastructure, and deploying the fix to production — but we’re discussing debugging today, not operations.
The marketing moral of this story? In addition to owning the platform source code in-house, which allows us to apply patches within days of identifying a bug, we also take advantage of open source libraries, which allows us to fix bugs in third-party software to which our platform links. The benefits of being able to quickly and directly modify our IVR platform code aren’t confined to bug fixes either. We have, in the past, added VoiceXML extensions based on customer requests. When our marketing team says that we own our IVR platform, now you know why it’s important.
Posted by: andykuan in general
The problem with buzzwords is that, by their very nature, they get overused and overextended and eventually lose all meaning altogether. And then people still use those buzzwords and it irritates me. It’s kind of like people who use big words to sound erudite. It’s pretentious and usually it’s a feeble attempt to obfuscate their ignorance. </sarcasm>
Today I’m going to pick on SaaS: Software as a Service. It’s a blatant relabeling of ASP. I read an article where the writer extols the superiority of the SaaS model over the ASP model — as if the inherent business models and not execution, available technology, and market conditions spelled failure for ASPs and imminent success for any company that promotes SaaS.
The first set of lame arguments: “ASPs were not necessarily concerned about providing shared services to multiple tenants.” and that ASPs lacked “the required amount of application and business domain knowledge regarding the applications they were running.” The first argument is a bit contradictory: so if an ASP isn’t concerned with providing shared services to multiple tenants, does that mean that it’s providing shared services to one tenant — so the one tenant is sharing it with…whom precisely? Seems more like a pathetic ASP that could only find one customer than the standard model by which ASPs operated. And since when is an ASP defined as a business that’s run by people who don’t know the business they’re serving? That’s just a bad business. I’m certain there are SaaS businesses that are similarly run by people without “the required amount of…knowledge”.
The second set of lame arguments states that ASPs had simple HTML interfaces whereas SaaS solutions are better because they’re designed specifically for the web. Look, if you could write a simple HTML interface in 1997 that meant you were building something specifically for the web. Just because developers have access to a far richer set of web client tools and capabilities now than were available 10 years ago doesn’t indicate a better business model. It indicates better tools. ASPs in ‘97 and today’s SaaS solutions both use web technology. The difference isn’t the model, it’s the tools.
Finally, the writer argues that ASPs rushed their products to market before taking into consideration “performance, security, customization and integration issues”. This is a corollary to one of the first arguments. And my counter is the same: ASPs aren’t defined by how products are released to market. They’re defined by what they sell and how they sell, not how well they sell. Any company that rushes a product to market without concern for their infrastructure is in trouble.
Both ASP and SaaS vendors sell software that’s delivered to the customer over the Internet from a vendor’s data center. This data center infrastructure is shared across many of the vendor’s customers and is even frequently shared by many different vendors. Any nuanced differences between ASPs and SaaS solutions are not only marginal, but usually not even understood by the suits spouting these buzzwords. So do me a favor, quit worrying about what something’s called and try to actually say something meaningful. Thanks.
Posted by: andykuan in IVR
We just rolled out our new app: a survey building tool. We recognized an underserved market: people who need a tool to build IVR surveys as easily as they can build web surveys. The market is absolutely crowded with competitors in the web survey space mostly because, like most web technologies, it’s cheap and easy to build such a tool and it’s cheap and easy for all of your competitors to build such a tool as well.
A recent article in Business Week discusses the pitfalls of relying on web surveys. Because the sample set is both self-selecting and confined to Internet users, the resulting data can be skewed. Telephone surveys avoid many of these sampling issues because a) most everyone has a phone and b) surveys can be initiated by the sampler, rather than the samplee (yeah, I know that might not be a real word).
Anyway, we now have a powerful survey building tool that allows you to create surveys that can be administered via the web and the phone. And we’re lucky that we’re in the particular market space that we’re in: competitors in the web survey market won’t be able to compete with our IVR and telephony experience and infrastructure while competitors in the IVR space really don’t care about the application needs of small- and medium-sized businesses.
Check it out: http://www.plumvoice.com/landing/surveys.php
Posted by: andykuan in IVR
Way back in 1961, Bell Labs introduced the T-1. We’re talking 47 years ago. For those of you unfamiliar with the T-carrier system, prior to 1961, all telco traffic was analog signaling over copper wires. Fast forward to now and you’ll still find people ordering a half-dozen phone lines from the phone company to plug into their IVR instead of buying a T-1.
As a rule of thumb, once you need to order 8 POTS (plain-old telephone service) lines from the phone company, it’s usually just as cost-effective to buy a T-1 that can handle 24 calls. A single metered phone line from Verizon will cost you around 55 dollars per month. A single T-1 with all 24 ports activated will cost you around 450 dollars per month. And no, I didn’t make up those numbers to work out so evenly.
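Here is that rule of thumb as a few lines of Python arithmetic. The two prices are the rough monthly figures quoted above, and the variable names are mine.

```python
import math

POTS_MONTHLY = 55   # dollars per metered POTS line, per month
T1_MONTHLY = 450    # dollars per T-1 with all ports activated, per month
T1_CHANNELS = 24

def pots_cost(lines):
    """Monthly cost of a given number of POTS lines."""
    return lines * POTS_MONTHLY

# Smallest number of POTS lines that costs at least as much as one T-1.
break_even_lines = math.ceil(T1_MONTHLY / POTS_MONTHLY)

print(pots_cost(8))               # 440, within ten dollars of the T-1
print(break_even_lines)           # 9
print(T1_MONTHLY / T1_CHANNELS)   # 18.75 dollars per channel
```

At 8 lines the two options are within ten dollars of each other, and the T-1 still has 16 channels to spare.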
The immediate advantage to a T-1 is the fact that it’s a single cable from the demarc to your IVR. Instead, with POTS lines, you have to contend with 8 demarc ports with 8 wires going to 8 IVR ports in a nasty snarl of silver satin wiring. Plus all of those POTS lines are subject to noise, static, and other problems that only occur on analog lines. This makes replacing 8 POTS lines with a T-1 a gimme. Same cost, less hassle.
I would, however, argue for replacing as few as 4 POTS lines with a T-1 if you’re planning on taking full advantage of your Plum Voice IVR.
First, 4 POTS lines is still four times more wiring to tangle up than a single T-1 line. You’ll still save yourself headaches.
Second, POTS lines don’t know what number was dialed (the DNIS) so you can only ever run one IVR application on any given line. So if you have 4 POTS lines you can only run 4 IVR applications. With a T-1, the IVR knows what number was dialed when a call comes in on a channel and can then fire up the right application for that particular call. You can have hundreds — even thousands — of phone numbers assigned to your T-1 and any call made to any of these numbers will be sent over a free channel on your T-1 and then directed to the appropriate application.
Third, even if you were to only have 4 applications running on your 4 POTS lines, each application only has one channel available to it. If two people try and call the same number, one gets through and the other gets a busy — just like your plain old telephone service at home. On a T-1, you have 24 channels that can be used for any of your applications. Capacity is shared because, with DNIS-based application routing, every call identifies its own destination.
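The difference between one-application-per-line and DNIS-based routing over shared channels can be sketched in Python. This is a toy model with invented class names and fictional 555 numbers, not how any real telephony stack is written.

```python
class PotsBank:
    """Each dialed number is a physical line wired to exactly one application."""

    def __init__(self, line_apps):
        self.line_apps = dict(line_apps)  # dialed number -> application
        self.in_use = set()

    def incoming_call(self, number):
        if number in self.in_use:
            return None  # caller hears a busy signal
        self.in_use.add(number)
        return self.line_apps[number]

class T1Trunk:
    """Any free channel can carry a call; the DNIS picks the application."""

    def __init__(self, dnis_apps, channels=24):
        self.dnis_apps = dict(dnis_apps)  # DNIS -> application
        self.free_channels = channels

    def incoming_call(self, dnis):
        if self.free_channels == 0:
            return None  # all channels in use
        self.free_channels -= 1
        return self.dnis_apps[dnis]

apps = {"555-0101": "survey", "555-0102": "autoattendant"}
pots, t1 = PotsBank(apps), T1Trunk(apps)

# Two callers dial the same number at the same time.
print(pots.incoming_call("555-0101"), pots.incoming_call("555-0101"))
print(t1.incoming_call("555-0101"), t1.incoming_call("555-0101"))
```

On the POTS bank the second caller gets a busy. On the T-1 both calls reach the survey, with 22 channels still free for whichever number rings next.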
Fourth, POTS lines don’t tell you when they’re dead. They just quietly stop working. Both ends of a T-1 — your carrier’s switch and your IVR — are constantly trading status information. If the carrier switch ever fails, the IVR will know immediately and you’ll be able to immediately respond to the situation before your callers discover the failure for you.
Fifth, and last, ANI (commonly referred to as caller ID) requires two rings of a POTS line before it’s available to an analog IVR. Sure, call setup time is a minor matter as most callers are willing to wait 10 extra seconds for the IVR to pick up, but wouldn’t you rather have immediate transmission of ANI data and immediate call acceptance?
So unless you’re running only one application and have a high tolerance for messy wiring, you should go out and replace anything larger than a 4-port analog IVR with an IVR with a digital T-1 interface.
Posted by: andykuan in ASR, IVR
Speech recognition is cool. I’m still saying that after working here at Plum for 8 years. It’s cool that the IVR can listen for any US city-state combination being spoken and accurately and reliably recognize it. It’s cool that the IVR can listen for 90% of the given names used in the US. And it’s cool that I can drive an IVR app with just my voice while…um…driving.
But just because something’s cool doesn’t mean it should be used everywhere. Speech rec has been around long enough that IVR designers should know better by now but, alas, that’s not the case in practice. The worst possible example of this is using speech recognition for any IVR application that will be called from a noisy, talky environment: several airlines use ASR in their flight status lines. It’s a strikingly ill-advised decision to rely on speech recognition in an environment like an airport concourse. Even when I’m connected to a live agent while at an airport, I have trouble hearing them and vice versa. Replace that human agent with an ASR engine and you’ll discover there’s nothing more irritating than hearing the airline IVR say, “please say your destination city?” and then having the airport PA system say “San Francisco” loud enough for the airline IVR to hear and accept.
How do most of the airlines solve this problem? They rely on the callers getting frustrated and hitting “0” to get a human being on the line. But if you’re going to do that, why even bother with an IVR in the first place? Now, instead of having customers calling their agents directly and costing them money, they’re irking their customers first and then transferring them to an agent and still costing them money. It’s a lose-lose situation.
So what’s my solution?
- Don’t use speech recognition; instead use DTMF when your callers are likely to be in noisy environments where accuracy is required. People are pretty used to text-messaging now, so if they have to spell “Boston” on their telephone keypad, they won’t be befuddled.
- If you absolutely want speech rec in an IVR app that’s to be often called from a noisy environment, design your IVR app to be modal: switching from voice-and-dtmf mode to dtmf-only mode if it seems the caller’s speech input is consistently erroneous.
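Both suggestions are easy to sketch in Python. The first function spells a word using the standard telephone keypad letters. The second piece is a made-up mode tracker that drops to dtmf-only after a couple of consecutive recognition failures. The threshold and the class name are assumptions for illustration, and a real app would do this in its VoiceXML logic.

```python
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {letter: digit for digit, letters in KEYPAD.items() for letter in letters}

def to_keypad_digits(word):
    """Spell a word as the DTMF digits a caller would press (other characters are skipped)."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.upper() if ch in LETTER_TO_DIGIT)

class InputMode:
    """Start in voice-and-dtmf; switch to dtmf-only after repeated speech failures."""

    def __init__(self, max_misses=2):
        self.max_misses = max_misses  # assumed threshold, tune per application
        self.misses = 0
        self.mode = "voice-and-dtmf"

    def record(self, recognized_ok):
        """Note the outcome of one speech attempt and return the mode to use next."""
        self.misses = 0 if recognized_ok else self.misses + 1
        if self.misses >= self.max_misses:
            self.mode = "dtmf-only"  # stays dtmf-only for the rest of the call
        return self.mode

print(to_keypad_digits("Boston"))  # 267866

mode = InputMode()
print(mode.record(False), mode.record(False))
```

Once the tracker flips to dtmf-only it stays there for the rest of the call, on the theory that the airport PA system is not going to get any quieter.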