Vocal3D

Vocal3D is a library for the real-time reconstruction of human vocal folds using a single-shot structured-light system. It is joint work of the Chair of Visual Computing of the Friedrich-Alexander University of Erlangen-Nuremberg and the Phoniatric Division of the University Hospital Erlangen. This code accompanies the paper Real-Time 3D Reconstruction of Human Vocal Folds via High-Speed Laser-Endoscopy.

Dataset

The HLE Dataset described in the paper is hosted here on GitHub.
We will add it to CERN's Zenodo platform at a later stage.

Prerequisites

Make sure that you have Python >=3.5 installed. A CUDA-capable GPU is recommended but not required. However, getting PyTorch3D to work inside the NURBS_Diff module without CUDA may require some tinkering.

Installation

First, make sure that conda is installed and clone this repository, including its submodules:

git clone https://github.com/Henningson/Vocal3D.git
cd Vocal3D
git submodule update --init --recursive

Generate a new conda environment and activate it:

conda create --name Vocal3D python=3.8
conda activate Vocal3D

Then, install the necessary packages with

pip install opencv-python-headless matplotlib scikit-learn tqdm geomdl PyQt5 pyqtgraph ninja
pip install -U fvcore
conda install -c bottler nvidiacub
conda install -c conda-forge igl
conda install -c conda-forge kornia
conda install -c conda-forge trimesh
pip install pycallgraph2

Install PyTorch and PyTorch3D. ATTENTION: first make sure that your installed CUDA version matches one of the available cudatoolkit versions. List the available cudatoolkit versions with conda search cudatoolkit and check your installed CUDA version with nvcc --version. Then replace XX.Y below with the matching version:

conda install pytorch torchvision torchaudio cudatoolkit=XX.Y -c pytorch
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
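
The version check described above can also be scripted. The following is a minimal sketch, not part of Vocal3D: the helper names parse_cuda_release and installed_cuda_release are made up for illustration, and it assumes that nvcc --version prints a line containing "release X.Y", as current CUDA toolkits do.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_cuda_release(nvcc_output: str) -> Optional[str]:
    """Return the CUDA release (e.g. "11.3") found in `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def installed_cuda_release() -> Optional[str]:
    """Run `nvcc --version` if nvcc is on PATH; return None otherwise."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found on PATH; no CUDA toolkit detected")
    else:
        # Whether conda actually offers this version must still be checked
        # with `conda search cudatoolkit`.
        print(f"Detected CUDA release {release}; try cudatoolkit={release}")
```

The detected release is only a candidate for XX.Y; it still has to appear in the output of conda search cudatoolkit.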

Download and install NURBS_Diff

pip install pybind11
git clone https://github.com/anjanadev96/NURBS_Diff.git
cd NURBS_Diff
pip install .
cd ..

Download and install our fork of Victor Cornillère's PyIGL Viewer.
It adds some shader code that we use for a more domain-specific visualization.

pip install git+https://github.com/Henningson/PyIGL_viewer.git

Finally, install our lightweight C++ ARAP implementation:

conda install -c conda-forge eigen
cd PybindARAP
python setup.py install
cd ..

Usage

An example video and calibration files are given in the assets folder. From the repository root, unzip the example folder with unzip assets/sample_data.zip -d assets/ and run the example using

python source/main.py

Things to note

If you are using the supplied viewer, note that the pipeline will generally run slower, because every step is computed in succession (think of it as a debug view). You will still be able to generate results in a matter of seconds, provided you do not use a PC that is barely able to run MS-DOS.

We supply three segmentation algorithms in this repository: one designed specifically for the silicone videos (which are included in the sample_data.zip file), the one by Koc et al., and a Neural Segmentator based on a U-Net architecture. For first tests, we recommend the U-Net one, as it is generally the most robust (albeit the slowest). A pre-trained model is included in the assets folder.

Implementing your own segmentation algorithm

If you want to integrate your own segmentation algorithm into the viewer, we supply a BaseSegmentator class, from which your segmentation class may inherit. The necessary functions to override are marked by #TODO: Implement me. Please have a look at the supplied segmentation algorithms for some inspiration.
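
To illustrate the inheritance pattern, here is a minimal, self-contained sketch. It does not use the repository's BaseSegmentator: the stand-in base class, the segment method name, and the nested-list frame format are all assumptions made for illustration, so check the actual class for the functions marked #TODO: Implement me. The example also assumes that the glottis appears darker than the surrounding tissue, which may not hold for your recordings.

```python
from abc import ABC, abstractmethod
from typing import List

# Grayscale frame as rows of 0-255 intensities. Plain lists keep the sketch
# dependency-free; they stand in for whatever image type the viewer passes.
Image = List[List[int]]


class BaseSegmentatorStandIn(ABC):
    """Hypothetical stand-in; NOT the repository's BaseSegmentator."""

    @abstractmethod
    def segment(self, frame: Image) -> Image:
        """Return a binary mask (1 = glottis, 0 = background)."""
        # TODO: Implement me


class ThresholdSegmentator(BaseSegmentatorStandIn):
    """Marks pixels darker than a fixed threshold as glottis."""

    def __init__(self, threshold: int = 40) -> None:
        self.threshold = threshold

    def segment(self, frame: Image) -> Image:
        # Assumption: the glottal gap is darker than the vocal folds.
        return [[1 if px < self.threshold else 0 for px in row] for row in frame]


# Tiny synthetic frame: bright tissue with a dark vertical gap in the middle.
frame = [
    [200, 190, 180],
    [195, 10, 185],
    [190, 20, 180],
]
mask = ThresholdSegmentator(threshold=40).segment(frame)
print(mask)
```

A fixed threshold is only a placeholder for real segmentation logic; the point is that the subclass overrides every method the base class leaves unimplemented.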

Limitations

Due to the moisture on top of human vocal folds, the mucous tissue in in-vivo data often generates specular highlights that degrade the performance of segmentation algorithms. Furthermore, the segmentation algorithm by Koc et al. that we supply in this repository requires well-captured data in which the glottis can be accurately differentiated from the vocal folds. We are currently working on a system-specific segmentation algorithm that can deal with these harsh cases.

Citation

Please cite this paper if this work helps you with your research:

@InProceedings{10.1007/978-3-031-16449-1_1,
  author="Henningson, Jann-Ole and Stamminger, Marc and D{\"o}llinger, Michael and Semmler, Marion",
  title="Real-Time 3D Reconstruction of Human Vocal Folds via High-Speed Laser-Endoscopy",
  booktitle="Medical Image Computing and Computer Assisted Intervention -- MICCAI 2022",
  year="2022",
  pages="3--12",
  isbn="978-3-031-16449-1"
}

A PDF of the paper is included in the assets/ folder of this repository. You can also find it on Springer Link.
