I'm not an expert. Can't we exploit the fact that LLMs don't need to receive audio as a continuous, uninterrupted stream? Couldn't we just send the data and pipe it into the LLM, deduplicating if anything gets resent?
You’re absolutely correct. A jitter buffer is necessary for a human listener, but an LLM isn’t aware of a time lapse, just like it isn’t aware of the time since your last message in the conversation (unless the chat harness explicitly informs it).
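A rough sketch of the "send and deduplicate" idea, assuming each audio chunk carries a sequence number (the thread doesn't specify a protocol, so `ChunkReceiver`, `receive` and `assemble` are made-up names for illustration). Chunks can arrive late, out of order, or twice; the receiver only cares about ending up with each one exactly once, in order, before handing the audio to the model.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkReceiver:
    """Hypothetical receiver: stores chunks by sequence number, drops repeats."""
    chunks: dict[int, bytes] = field(default_factory=dict)

    def receive(self, seq: int, payload: bytes) -> bool:
        """Store a chunk; return False if this sequence number was already seen."""
        if seq in self.chunks:
            return False  # resent chunk, ignore it
        self.chunks[seq] = payload
        return True

    def assemble(self) -> bytes:
        """Concatenate chunks in sequence order, regardless of arrival order."""
        return b"".join(self.chunks[s] for s in sorted(self.chunks))


receiver = ChunkReceiver()
# Out-of-order arrival plus two resends (seq 2 and seq 0 arrive twice).
for seq, payload in [(0, b"he"), (2, b"o!"), (1, b"ll"), (2, b"o!"), (0, b"he")]:
    receiver.receive(seq, payload)

audio = receiver.assemble()  # the bytes that would be piped into the model
```

This doesn't handle chunks that never arrive; a real setup would still need some rule for when to stop waiting and request a resend.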
I haven't tried Löve, but I somehow enjoyed reading through the README.md: no AI slop, just a natural writing style with tiny indicators showing the authors' enthusiasm in creating software.
It's still possible to let users start typing right from the beginning; just delay sending the characters until the checks are complete. Hold them in memory until then.
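A minimal sketch of that "hold them in memory" approach, assuming the checks signal completion through a single call. `DeferredInput`, `type` and `checks_complete` are hypothetical names, not any framework's API.

```python
class DeferredInput:
    """Hypothetical input box: accepts typing at once, sends only after checks pass."""

    def __init__(self, send):
        self._send = send      # callback that actually transmits text
        self._pending = []     # keystrokes held in memory until checks finish
        self._ready = False

    def type(self, text):
        """Accept input immediately; transmit now only if checks are done."""
        if self._ready:
            self._send(text)
        else:
            self._pending.append(text)

    def checks_complete(self):
        """Flush everything typed so far, in order, then send directly from now on."""
        self._ready = True
        for text in self._pending:
            self._send(text)
        self._pending.clear()


sent = []
box = DeferredInput(sent.append)
box.type("h")
box.type("i")
held_before = list(sent)   # snapshot: nothing has been sent yet
box.checks_complete()      # buffered keystrokes are flushed here
box.type("!")              # sent straight through
```

The user never sees a disabled input box; the only thing that waits is the network side.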
This was actually one of the reasons why Instagram felt smooth.
Another thing: Facebook/Instagram have also detected when a person uploads an image and then deletes it, and taken that as a sign that they are insecure. In the case of TEENAGE girls, they then actually record that in their profile (that they are insecure) and show them beauty products....
I really like telling this example because people in real life, and even online, get so shocked. I mean, they know Facebook is bad, but they don't know it's this bad.
[Also a bit offtopic, but I really like how in item?id=3913919 the 391 comes up twice :-) , it's a good item id ]