Today is the first time I'm running your project. I wanted to see the control panel, but instead a Python script computes something for a long time and then crashes with an error.
I did everything as described in the quick start section to access the panel.
python app.py
C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\huggingface_hub\file_download.py:949: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
vocab.json: 899kB [00:00, 2.79MB/s]
merges.txt: 456kB [00:00, 12.6MB/s]
tokenizer_config.json: 100%|████████████████████████████████████████████████████████| 25.0/25.0 [00:00<00:00, 3.36kB/s]
config.json: 100%|████████████████████████████████████████████████████████████████████| 481/481 [00:00<00:00, 73.9kB/s]
C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\timm\models\layers\__init__.py:49: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
Caching examples at: 'F:\AI\Soft\AudioLDM2\gradio_cached_examples\22'
Caching example 1/4
Loading AudioLDM-2: audioldm_48k
Loading model on cpu
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
IMPORTANT: You are using gradio version 3.50.2, however version 4.44.1 is available, please upgrade.
--------
audioldm_48k.pth: 100%|███████████████████████████████████████████████████████████| 5.36G/5.36G [04:42<00:00, 18.9MB/s]
C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\torch\functional.py:513: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:3610.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\torchaudio\transforms\_transforms.py:580: UserWarning: Argument 'onesided' has been deprecated and has no influence on the behavior of this module.
warnings.warn(
+ Use extra condition on UNet channel using Film. Extra condition dimension is 512.
DiffusionWrapper has 262.70 M params.
C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\torch\nn\utils\weight_norm.py:134: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
WeightNorm.apply(module, name, dim)
F:\AI\Soft\AudioLDM2\audioldm2\pipeline.py:172: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
checkpoint = torch.load(resume_from_checkpoint, map_location=device)
Running DDIM Sampling with 200 timesteps
DDIM Sampler: 100%|██████████████████████████████████████████████████████████████████| 200/200 [17:07<00:00, 5.14s/it]
INFO: clap model calculate the audio embedding as condition
Similarity between generated audio and text:
-0.20 -0.21 -0.22
Choose the following indexes as the output: [0]
C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\gradio\processing_utils.py:183: UserWarning: Trying to convert audio automatically from float32 to 16-bit int format.
warnings.warn(warning.format(data.dtype))
Traceback (most recent call last):
File "app.py", line 309, in <module>
gr.Examples(
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\gradio\helpers.py", line 75, in create_examples
examples_obj.create()
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\gradio\helpers.py", line 286, in create
client_utils.synchronize_async(self.cache)
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\gradio_client\utils.py", line 540, in synchronize_async
return fsspec.asyn.sync(fsspec.asyn.get_loop(), func, *args, **kwargs) # type: ignore
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\fsspec\asyn.py", line 103, in sync
raise return_result
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\fsspec\asyn.py", line 56, in _runner
result[0] = await coro
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\gradio\helpers.py", line 347, in cache
prediction = await Context.root_block.process_api(
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\gradio\blocks.py", line 1550, in process_api
result = await self.call_function(
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\gradio\blocks.py", line 1185, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\anyio\_backends\_asyncio.py", line 2364, in run_sync_in_worker_thread
return await future
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\anyio\_backends\_asyncio.py", line 864, in run
result = context.run(func, *args)
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\gradio\utils.py", line 661, in wrapper
response = f(*args, **kwargs)
File "app.py", line 48, in text2audio
waveform = [
File "app.py", line 49, in <listcomp>
gr.make_waveform((sample_rate, wave[0]), bg_image="bg.png") for wave in waveform
File "C:\ProgramData\miniconda3\envs\audioldm\lib\site-packages\gradio\helpers.py", line 866, in make_waveform
raise RuntimeError("ffmpeg not found.")
RuntimeError: ffmpeg not found.
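The traceback ends in `gr.make_waveform`, which raises `RuntimeError: ffmpeg not found.` A minimal sketch for checking this from the same conda environment, assuming the error simply means that no `ffmpeg` executable is visible on PATH. The helper names `find_executable` and `describe_ffmpeg` are hypothetical and are not part of AudioLDM2 or Gradio.

```python
import shutil


def find_executable(name="ffmpeg"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def describe_ffmpeg(path):
    """Turn the lookup result into a readable status line."""
    if path is None:
        # Assumption: a missing executable on PATH is what triggers the RuntimeError.
        return "ffmpeg not found on PATH; install it or add its folder to PATH"
    return f"ffmpeg found at {path}"


if __name__ == "__main__":
    print(describe_ffmpeg(find_executable()))
```

If this reports that ffmpeg is missing, installing ffmpeg and making it visible on PATH before running `python app.py` again would be the first thing to try.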
Hi,
What I did (following the quick start section):
Web APP
Prepare running environment
conda create -n audioldm python=3.8; conda activate audioldm
pip3 install git+https://github.com/haoheliu/AudioLDM2.git
git clone https://github.com/haoheliu/AudioLDM2; cd AudioLDM2
Start the web application (powered by Gradio)
python3 app.py
A link will be printed out. Click the link to open the browser and play.
Result: the console output and traceback shown above.