Livekit latency
Posted by MostMulberry4716@reddit | LocalLLaMA | 12 comments
Livekit playground latency
I've built my own agent, but in the deployment phase I'm seeing much higher latency than in the console trial. Since I'm using LiveKit inference in both cases, this seems odd. The extra latency is especially noticeable when the agent calls tools. I've run several experiments and can't find the problem. Hosting on LiveKit servers should, if anything, improve latency, not make it worse.
The tests I've already run:
- Use the SIP trunk (the service I actually want to reach), since the playground might be more of a debugging tool than a production one
- Deploy the agent forcing: job_executor_type = JobExecutorType.THREAD
- Deploy the provided base agent to see whether it performed better
- Use the base playground to compare my results with the "best" possible
At this point I'm stuck. As mentioned on the page, the expected latency with LiveKit is 1.5 to 2.5 seconds. I get that in the console, but in the playground and over SIP trunking, which is what I'll use in production, I see up to 5 seconds. That's not tolerable for a conversation; the ideal would be around 1 second. I hope to receive a satisfactory answer and that the problem can be solved.
If you're interested in the geolocation and server distance parameters: everything is in eu-central.
RemoveSuperb1503@reddit
If you found a solution for this, let us know.
new-to-reddit-accoun@reddit
Did you end up finding a solution?
RemoveSuperb1503@reddit
I just don’t use LiveKit and built my own infrastructure.
new-to-reddit-accoun@reddit
Nice. What’s your latency now? On LiveKit mine is still at 5 seconds. I’m thinking of trying Pipecat next.
toxyspam33@reddit
Hi man. How was Pipecat? I’m thinking of switching from LiveKit to my own solution.
new-to-reddit-accoun@reddit
I never got to Pipecat! I paused my experimentation at LiveKit, which I’m not happy with at all.
toxyspam33@reddit
Sad to hear that. That’s why I’m going to switch to Twilio and Python directly. LiveKit is awesome for debugging, but the LLMs are too slow and they only offer OpenAI Realtime (which I don’t have; I prefer Gemini).
new-to-reddit-accoun@reddit
I was using just Twilio, but you’ll still need something for turn detection etc.
Chris_LiveKit@reddit
It is hard to diagnose your issue without full details of your setup. But since you say it is fine in the console and has issues when deployed, I would start with:
I've seen folks have problems with burstable instances like AWS t3 and t4g, which don’t provide full CPU performance continuously. Use m5, c5, c6i, or similar families for consistent CPU performance. Other factors can introduce latency too. If you are specifically having problems during function calls that take time to produce the data the LLM needs to respond, you can reply with an initial filler like "one second" and then with whatever the function call returns.
Hosting your agent in the same region as the inference service will also be important to help minimize latency.
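The filler-response idea above can be sketched without depending on LiveKit specifics. This is a minimal asyncio pattern; `speak` and `slow_tool_call` are placeholders standing in for the agent's TTS output and the slow tool, not real LiveKit API calls:

```python
import asyncio

async def speak(text: str) -> None:
    # Placeholder for the agent's TTS output (not a real LiveKit call).
    print(f"agent: {text}")

async def slow_tool_call() -> str:
    # Stands in for a function/tool call that takes a while to return.
    await asyncio.sleep(0.2)
    return "Your order ships tomorrow."

async def answer_with_filler() -> str:
    # Kick off the slow tool call immediately...
    task = asyncio.create_task(slow_tool_call())
    # ...and fill the silence while it runs, instead of waiting mutely.
    if not task.done():
        await speak("One second, let me check that.")
    result = await task
    await speak(result)
    return result

final = asyncio.run(answer_with_filler())
```

The point is that the user hears something within a few hundred milliseconds even when the tool itself takes seconds, which changes perceived latency far more than it changes measured latency.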
JustinPooDough@reddit
I would love some input on this, although it is the wrong sub. I have an agent I've built that I love, but the latency is anywhere from 1.2 seconds to 5–6 seconds. I'm in the middle of using Langfuse to try to identify the culprit, which I believe to be TTS.
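For anyone else hunting the slow stage, even a crude per-stage timer will tell you whether STT, LLM, or TTS dominates before reaching for a full tracing tool like Langfuse. The stage names and sleeps below are hypothetical stand-ins for real pipeline calls:

```python
import time
from contextlib import contextmanager

timings: dict[str, float] = {}

@contextmanager
def stage(name: str):
    # Record wall-clock time spent inside each pipeline stage.
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start

# Hypothetical pipeline stages, simulated with sleeps; in a real agent
# these would wrap the actual STT, LLM, and TTS calls.
with stage("stt"):
    time.sleep(0.05)
with stage("llm"):
    time.sleep(0.10)
with stage("tts"):
    time.sleep(0.30)

slowest = max(timings, key=timings.get)
print(f"slowest stage: {slowest}")
```

Wrapping each real call this way turns "latency is 5 seconds somewhere" into a per-stage breakdown you can act on.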
new-to-reddit-accoun@reddit
Did you find a solution?
new-to-reddit-accoun@reddit
OP, did you find a solution?