The short version
- Signed language involved? Video or on-site — always. There is nothing for an audio line to carry, so the phone is never an option for ASL.
- For spoken languages, OPI wins on speed and ubiquity; VRI wins on fidelity. Match the modality to the encounter, not to the price sheet.
- Both bill per minute. The cheaper line item that fails the encounter, and has to be redone, is the expensive one.
- Make modality a field on the job, so intake, interpreter matching, and billing all follow from it automatically.
Remote interpreting gets talked about as one thing, but agencies dispatch two very different services under that umbrella: video remote interpreting (VRI) and over-the-phone interpreting (OPI). They differ in equipment, in cost, in connect time — and, most importantly, in which encounters they can actually serve. Choosing between them isn’t a matter of preference. It’s a short decision tree, and the first branch does half the work.
This guide walks the whole tree: the absolute rule at the top, an honest side-by-side, where real encounters tend to fall, what makes each modality succeed on the day, and how to wire the decision into your operation so the right answer is also the easy one.
Start with the language
If the encounter involves American Sign Language or any other signed language, the phone is not an option. Not a worse option — not an option. Signed languages are visual languages; there is nothing for an audio line to carry. Every ASL request routes to video or to an on-site interpreter, full stop.
This sounds obvious written down, and yet it’s the single most common remote-interpreting dispatch error, usually made by a well-meaning client-side scheduler who selected “phone” because it was the cheaper line item. The fix belongs in intake, not in training memos: when the requested language is signed, phone simply shouldn’t be selectable. A form that makes the mistake impossible beats a policy that asks everyone to remember it.
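One way to make the mistake impossible is to derive the selectable modalities from the requested language, so the form never offers phone for a signed language. The following is a minimal sketch, not any platform's actual API: the `Modality` enum, the `SIGNED_LANGUAGES` set, and `allowed_modalities` are all hypothetical names chosen for illustration.

```python
from enum import Enum


class Modality(Enum):
    ON_SITE = "on_site"
    VRI = "vri"
    OPI = "opi"


# Hypothetical list for illustration; a real system would more likely
# carry an "is signed" flag on each entry in its language table.
SIGNED_LANGUAGES = {"American Sign Language", "British Sign Language"}


def allowed_modalities(language: str) -> list[Modality]:
    """Return the modalities an intake form should offer for a language."""
    if language in SIGNED_LANGUAGES:
        # Signed languages are visual, so phone is never offered.
        return [Modality.ON_SITE, Modality.VRI]
    return [Modality.ON_SITE, Modality.VRI, Modality.OPI]
```

An intake form built on a function like this would populate its modality dropdown from the returned list, which leaves nothing for a scheduler to get wrong.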
The side-by-side
Once you’re in spoken-language territory, both modalities work — the question becomes which one serves the encounter better.
| Factor | VRI | OPI |
|---|---|---|
| Setup | Camera-equipped device, dependable bandwidth, a place to position the screen | Any phone, immediately |
| Connect time | Short, but the room has to be ready | Fastest option available |
| Visual context | Interpreter sees facial expressions, gestures, documents, the room | Voice only |
| Signed languages | Fully supported | Not possible |
| Interpreter pool | Wide: remote work removes geography, but only video-equipped interpreters qualify | Widest: any interpreter with a phone line |
| Typical settings | Clinical consults, counseling, education meetings, anything with documents or demonstration | Confirmations, reminders, hotlines, brief intake, field calls |
| Typical billing | Per-minute, or a scheduled block | Per-minute |
The pattern that falls out of that table: OPI wins on speed and ubiquity; VRI wins on fidelity. A pharmacy confirming a pickup time needs a fast connection and thirty seconds of Spanish — that’s a phone call. A difficult conversation between a physician and a family, where tone and expression carry as much as the words, deserves video even though both would technically “work.”
Where real encounters fall
Most requests aren’t at the poles. Picture a spectrum with phone at one end and video at the other; the useful habit is asking two questions about the middle: how much is riding on nuance, and is there anything to look at? Stakes and visual content both push toward video; brevity and urgency pull toward the phone. An intake conversation with forms on the table sits mid-spectrum: workable by phone, better on video. When an encounter genuinely straddles the line, err upward in fidelity: nobody has ever regretted having more communication channel than a conversation needed.
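The two questions above, together with the language rule, can be written down as a small function. This is a rough sketch under stated assumptions: the parameter names, the order of the checks, and the string labels are illustrative choices, not a prescribed algorithm.

```python
def recommend_modality(
    signed: bool,
    high_stakes: bool,
    visual_material: bool,
    brief: bool,
    camera_available: bool = True,
) -> str:
    """Suggest 'vri', 'opi', or 'on_site' for an encounter (illustrative only)."""
    if signed:
        # Signed languages never go to phone: video, or on-site if no camera.
        return "vri" if camera_available else "on_site"
    if not camera_available:
        # Spoken language and no reliable camera or bandwidth: phone still works.
        return "opi"
    if high_stakes or visual_material:
        # Nuance or something to look at pushes toward video.
        return "vri"
    if brief:
        # Short and transactional: the phone is the fast path.
        return "opi"
    # Genuinely in the middle: err upward in fidelity.
    return "vri"
```

For example, a pharmacy confirming a pickup time (`brief=True`, nothing to look at, low stakes) comes back as `"opi"`, while an intake conversation with forms on the table (`visual_material=True`) comes back as `"vri"`.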
Reach for OPI when
- The interaction is short and transactional — confirmations, reminders, basic intake questions.
- Calls arrive unpredictably and in volume, and holding for video setup would create a queue.
- The environment has no reliable camera or bandwidth — field work, older facilities, a caller on a flip phone.
- Minutes matter more than nuance, and the fastest path to any qualified interpreter is the whole point.
Phone work has its own craft. Because nobody can see anybody, the interpreter depends entirely on turn-taking discipline: identify who is speaking, keep one voice going at a time, and use a handset or a good speakerphone rather than a laptop mic across the room. The thirty seconds spent on that setup is the difference between a clean call and three rounds of “could you repeat that?”
Reach for VRI when
- Any signed language is involved. (Always.)
- The conversation is longer, sensitive, or high-stakes, and non-verbal communication matters.
- Participants are referencing documents, forms, or anything physical the interpreter should see.
- You want closer to an on-site experience without travel time or mileage on the invoice.
VRI succeeds or fails on logistics that take two minutes to get right. Mount the screen at face height where the person needing interpretation — not just the provider — can see it comfortably. Light faces, not backs of heads. Check bandwidth before the appointment, and agree on a fallback: if video degrades mid-encounter, who reconnects, and does the session drop to audio or pause? A one-line fallback plan turns a technical hiccup into a ten-second pause instead of a cancelled appointment.
OPI wins on speed and ubiquity; VRI wins on fidelity. The mistake is treating them as interchangeable.
The economics, briefly
Both modalities are usage-billed, typically by the minute, and OPI generally carries the lower per-minute rate: no video infrastructure, the widest possible interpreter pool, near-instant connections. That difference is real, and across high-volume transactional calls the savings add up.
It’s also the wrong place to optimize a consequential encounter. The costs that dwarf a per-minute delta are the hidden ones: the conversation that has to be rescheduled because tone was lost, the extra fifteen minutes of clarification, the participant who leaves without having understood. Price OPI and VRI correctly, bill them by the minute — and choose between them by fit. The cheap call that fails is the most expensive item on the invoice.
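A bit of arithmetic makes the point concrete. The rates below are placeholders invented purely for illustration (they are not market figures or quotes from any agency), and `session_cost_cents` is a hypothetical helper; amounts are kept in whole cents to avoid rounding noise.

```python
# Placeholder rates, assumed for illustration only.
OPI_RATE_CENTS_PER_MIN = 100
VRI_RATE_CENTS_PER_MIN = 150


def session_cost_cents(rate_cents_per_min: int, minutes: int, attempts: int = 1) -> int:
    """Cost of an encounter billed per minute, repeated `attempts` times."""
    return rate_cents_per_min * minutes * attempts


# A 20-minute consequential conversation, done once on video.
vri_once = session_cost_cents(VRI_RATE_CENTS_PER_MIN, 20)

# The same conversation on the phone, then repeated because it failed.
opi_redone = session_cost_cents(OPI_RATE_CENTS_PER_MIN, 20, attempts=2)

# Or the phone call that succeeds only after 15 extra minutes of clarification.
opi_stretched = session_cost_cents(OPI_RATE_CENTS_PER_MIN, 20 + 15)
```

Under these assumed numbers, the single video session comes to 3000 cents, the repeated phone call to 4000, and the stretched phone call to 3500, before counting anyone's rescheduled time. With different rates the gap changes, but the shape of the comparison is the same.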
When neither is right
An honest modality guide includes the third answer. Some encounters call for an interpreter in the room: tactile interpreting, proceedings where everything rides on precision, long sessions with several people moving around a space, environments where technology can’t be trusted — or simply a participant who has said they communicate best with someone physically present. Remote modalities extend an agency’s reach; they don’t retire the original service. The strongest operations offer all three and route each request to the one it actually needs.
Make modality a first-class fact
The decision tree only helps if your platform treats its output as data. A request should carry its modality from intake through assignment through billing: video jobs matched to interpreters equipped and willing to work on camera, phone jobs billed at phone rates, and signed-language requests physically unable to end up on an audio line. On platforms where VRI and OPI run alongside on-site scheduling, dispatchers stop making the modality call from memory — the request itself knows what it is, and the rest of the pipeline follows.
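As a sketch of what "modality as data" might look like, the record below carries the modality on the job itself, rejects a signed-language phone job at creation, and filters the interpreter pool by it. All names here (`Job`, `Interpreter`, `eligible_interpreters`, the modality strings) are hypothetical and stand in for whatever a given platform actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpreter:
    name: str
    languages: frozenset
    modalities: frozenset  # modalities this interpreter is equipped and willing to work


@dataclass
class Job:
    language: str
    modality: str  # "on_site", "vri", or "opi"
    signed_language: bool = False

    def __post_init__(self) -> None:
        # A signed-language job can never exist on an audio line.
        if self.signed_language and self.modality == "opi":
            raise ValueError("signed-language jobs cannot be routed to phone")


def eligible_interpreters(job: Job, pool: list) -> list:
    """Keep only interpreters who cover both the job's language and its modality."""
    return [
        i for i in pool
        if job.language in i.languages and job.modality in i.modalities
    ]
```

Because matching reads the modality from the job, a video request only ever surfaces interpreters who work on camera, and the dispatcher never has to remember to check.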
Two services, one umbrella, one decision tree. Ask about the language first, the context second, and the logistics last — and put the tree into your intake form so the right answer is the easy one.
Frequently asked questions
Can ASL interpreting be done over the phone?
No. American Sign Language is a visual language — there is nothing for an audio line to carry — so every ASL request routes to video remote interpreting or an on-site interpreter. Good intake systems make phone unselectable when the requested language is signed.
Is VRI or OPI cheaper?
Both are typically billed per minute, and OPI generally carries the lower per-minute rate: no video infrastructure, a larger interpreter pool, faster connections. But choose by fit, not rate: a video-worthy conversation forced onto the phone often has to be repeated, and the redo costs more than the modality difference ever saved.
What equipment does VRI require?
A camera-equipped device — tablet, laptop, or cart — with dependable bandwidth, positioned so the interpreter and the person needing interpretation can clearly see each other. Stable mounting, decent lighting on faces, and a quick audio fallback plan cover most real-world failure modes.
When is on-site interpreting the right call instead of VRI?
When the encounter is long, legally or medically high-stakes, involves several people moving around a room, requires tactile interpreting, or takes place somewhere technology is unreliable. VRI approximates presence; it does not replace it for every situation — and honoring a participant’s stated preference for an in-person interpreter is part of doing the job well.
Can a booking switch between OPI and VRI?
It should be able to. When modality is a first-class field on the job, changing it updates the interpreter match and the rate together — an upgrade from phone to video is a field change, not a cancel-and-rebook.
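A minimal sketch of that idea, assuming a hypothetical `Job` record and placeholder rates: the rate is derived from the modality rather than stored separately, so changing one field keeps the booking's identity and moves the rate with it.

```python
from dataclasses import dataclass, replace

# Placeholder per-minute rates in cents, assumed for illustration only.
RATE_CENTS_PER_MIN = {"opi": 100, "vri": 150}


@dataclass(frozen=True)
class Job:
    job_id: str
    language: str
    modality: str

    @property
    def rate_cents_per_min(self) -> int:
        # Derived from modality, so it can never drift out of sync.
        return RATE_CENTS_PER_MIN[self.modality]


def change_modality(job: Job, new_modality: str) -> Job:
    """Return the same booking with a new modality; the rate follows automatically."""
    if new_modality not in RATE_CENTS_PER_MIN:
        raise ValueError(f"unknown modality: {new_modality}")
    return replace(job, modality=new_modality)
```

Upgrading a phone booking then looks like `change_modality(job, "vri")`: the `job_id` is unchanged, and `rate_cents_per_min` reflects the video rate. Re-running interpreter matching against the new modality would be the next step in a real pipeline and is left out here.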