Description
Capacitor Version 7.4.3
INTRODUCTION
Hello from Australia!
My name is Scott. I'm using the @openai/agents-realtime SDK and I'm having a major problem with it in a Capacitor app.
It works perfectly on an Android device, in the Google Chrome browser, and in the Safari browser, but on an iOS device there is a bug that I cannot eradicate.
The app creates a WebRTC connection on the client side and streams audio from the client to OpenAI. When the user speaks into the microphone, OpenAI replies in audio. It's awesome.
THE PROBLEM
The problem is that when audio first starts to play on the iOS device, it stops, starts, stops, and starts again during the first playback. After the first playback, it works perfectly.
This is definitely a Capacitor bug, because the same code works perfectly in the Safari and Chrome browsers and in a Capacitor app on an Android device, but it malfunctions on the first audio playback in a Capacitor app on an iOS device.
THE LIKELY CAUSE
Capacitor apps on iOS run inside a wrapper (WKWebView) that uses the same WebKit engine as Safari. I believe the error doesn't occur in the Safari browser because Safari's audio handling has been built to cope with this situation, while the WKWebView wrapper used by frameworks like Capacitor has not yet. Let me explain this in more detail.
Safari (the full browser) runs in a long-lived WebKit process that usually has the AVAudioSession already active (or quickly activated) after a user gesture. When WebRTC/audio starts, Safari often has the route + sample rate “hot,” so first playback is smooth.
WKWebView, on the other hand (the wrapper that Capacitor and many other frameworks currently use), starts cold, so the first audio playback from a WebRTC connection stops and starts a few times while the audio session is set up from scratch. The fix should be fairly easy on your end: pre-activate the AVAudioSession inside Capacitor.
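Until a native fix exists, one web-side workaround that could be tried is to warm up the audio output with a short silent buffer during the user gesture, before the WebRTC session connects. The snippet below is an untested sketch under that assumption: `warmUpAudio` is a hypothetical helper (it is not part of Capacitor or @openai/agents-realtime), it uses only the standard Web Audio API, and it is not confirmed to remove the stutter in WKWebView.

```javascript
// Hypothetical helper: plays ~100 ms of silence so the audio output is
// already running before the first real audio arrives.
// Call it from inside a click/tap handler, because iOS keeps an
// AudioContext suspended until a user gesture.
async function warmUpAudio(
  AudioContextCtor = globalThis.AudioContext || globalThis.webkitAudioContext
) {
  // No Web Audio support in this environment: nothing to warm up.
  if (!AudioContextCtor) return null;

  const ctx = new AudioContextCtor();

  // Resume the context if the platform created it in the suspended state.
  if (ctx.state === 'suspended') {
    await ctx.resume();
  }

  // A freshly created AudioBuffer is zero-filled, i.e. silence.
  const frames = Math.round(ctx.sampleRate / 10); // about 100 ms
  const silence = ctx.createBuffer(1, frames, ctx.sampleRate);

  const source = ctx.createBufferSource();
  source.buffer = silence;
  source.connect(ctx.destination);
  source.start();

  // Return the context so the caller can keep it alive for the session.
  return ctx;
}
```

The intended use is to call `await warmUpAudio()` in the same tap handler that starts the voice session, just before connecting. Whether this makes the route and sample rate "hot" in the way Safari does is an assumption that needs checking on a real iOS device.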
RESOURCES
In the project I am sharing via GitHub, the SDK is loaded from a CDN, so you don't have to install it. The documentation for the SDK is below if you want to read it:
https://openai.github.io/openai-agents-js/guides/voice-agents/quickstart/
I made a YouTube video showing how it works on everything except an iOS device.
I have already built a simple project with Capacitor, JavaScript, and HTML using this library from OpenAI, and you can use it to debug the problem. The only thing you need is your own OpenAI client secret key, which you can create in the OpenAI console here:
https://platform.openai.com/api-keys
Also, because the CDN SDK uses WebRTC, the app must be served from a local HTTPS server, not HTTP. You will therefore need to create two certificate files using the Unix commands below from the root directory:
- Install mkcert (once)
  brew install mkcert
- Install the local root CA (once)
  mkcert -install
- Create a cert specifically for your IP (replace 123.456.78.9 with the IP address of your machine)
  mkcert -cert-file ip.pem -key-file ip-key.pem 123.456.78.9
I have already completed the imports for these two certificates, so you just need to create them and keep them in the root directory.
I have already made all the iOS and Android configurations so that you can run this project on your own device via Xcode and Android Studio, and I have already set up the capacitor.config and vite.config.js files. You will only need to replace my IP address, which is still in those files, with your own.
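For orientation, a Vite config that serves over HTTPS with the two mkcert files typically looks like the sketch below. It is an illustration only and may differ from the vite.config.js in the repository: the `host: true` setting and the overall layout are assumptions, while the file names ip.pem and ip-key.pem and port 5173 are the ones used in this report.

```javascript
// vite.config.js (illustrative sketch; the file in the repository may differ)
import fs from 'node:fs';

export default {
  server: {
    // Assumption: listen on all addresses so a phone on the same
    // network can reach the dev server by the machine's IP.
    host: true,
    port: 5173,
    // Certificate and key created by mkcert in the project root.
    https: {
      key: fs.readFileSync('ip-key.pem'),
      cert: fs.readFileSync('ip.pem'),
    },
  },
};
```

The paths are relative to the directory Vite is started from, which is why the certificate files need to stay in the root directory.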
Here is the project I have built for you so you can use it to debug the issue:
https://github.com/scottywm/capacitorTest.git
To run the project you need to use Vite on port 5173. Just run the command below from the root directory:
npx vite --port 5173
If you need help getting the project started or have other questions, I'm happy to chat with you over a Zoom meeting. Just email me at [email protected].
Please action this ASAP. I've been working on my app all year and I'm almost finished, and this bug in Capacitor is a fatal flaw, so please fix it.
Other API Details
Platforms Affected
- iOS
- Android
- Web
Current Behavior
The Capacitor app works perfectly on an Android device, and the same code works perfectly in the Chrome and Safari browsers, but when I use Capacitor to build it into an app, it malfunctions on an iOS device when audio first starts playing. The first audio playback on an iOS device stops, starts, stops, and starts again. This is terrible!
Expected Behavior
The audio must play smoothly on both iOS and Android devices, without stops and starts.
Project Reproduction
https://github.com/scottywm/capacitorTest.git
Additional Information
I've been working on this app for nine months straight now and this error is fatal, so it must get fixed ASAP!!!