allen dataset: Create colab example with Allen demo dataset. #139
abdelrahman725 wants to merge 1 commit into sensorium-competition:main from
Conversation
Found 1 changed notebook. Review the changes at https://app.gitnotebooks.com/sensorium-competition/experanto/pull/139
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Force-pushed from 19b829d to 241f53f
Pull request overview
Adds a new Colab-oriented example notebook demonstrating how to export an Allen dataset via allen-exporter and then use Experanto to load and interpolate multiple modalities.
Changes:
- Added `examples/allen_example.ipynb` with Micromamba-based setup for `allensdk` compatibility in Colab.
- Added walkthrough code to export an Allen experiment, load it with `experanto.Experiment`, and run interpolation for all devices or a single device.
Force-pushed from 241f53f to a19b38b
Updates: Addressed only the relevant Copilot comments.
schewskone left a comment
There was a problem hiding this comment.
Thanks a lot for the PR, the Colab setup and environment instantiation look good.
Please address the comments to make the example even more intuitive.
| "import numpy as np\n", | ||
| "\n", | ||
| "# Query 100 time points spread evenly over 10 seconds\n", | ||
| "times = np.linspace(0, 10, 100)\n", | ||
| "\n", | ||
| "\n", | ||
| "# device can be: screen, treadmill, eye_tracker, or responses\n", | ||
| "screen = exp.interpolate(times, device=\"screen\")\n", | ||
| "treadmill = exp.interpolate(times, device=\"treadmill\")\n", | ||
| "eye_tracker = exp.interpolate(times, device=\"eye_tracker\")\n", | ||
| "responses = exp.interpolate(times, device=\"responses\")\n", | ||
| "\n", | ||
| "# Change here what device you want to see its interpolated signals\n", | ||
| "print(responses)" |
A visualization of the results would be great here. Printing the screen values is not very intuitive.
Here is some code I use in one of my notebooks to check the interpolation in an interactive video plot:
```python
# get the experiment class from experanto
from experanto.experiment import Experiment

# set experiment folder as root
root_folder = '../../data/allen_data/experiment_951980471'
# initialize experiment object
e = Experiment(root_folder)

%load_ext autoreload
%autoreload 2
%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from IPython.display import HTML

times = np.arange(310., 340., 0.5)
video = e.interpolate(times, device="screen")  # shape: (T, C, H, W)

# Clip and convert to uint8 to avoid matplotlib clipping warnings
video = np.clip(video, 0, 255).astype(np.uint8)
video = video.transpose(0, 2, 3, 1)  # (T, H, W, C)

n_frames, height, width, channels = video.shape
print(f"Video shape: {video.shape}")

# Handle grayscale vs color
is_grayscale = (channels == 1)
if is_grayscale:
    video = video[..., 0]  # Now shape: (T, H, W)

fig, ax = plt.subplots()

# Initialize with appropriate cmap
if is_grayscale:
    img = ax.imshow(video[0], cmap='gray', vmin=0, vmax=255)
else:
    img = ax.imshow(video[0])
ax.axis('off')

def update(frame):
    img.set_array(video[frame])
    ax.set_title(f'Frame {frame}')
    return [img]

ani = animation.FuncAnimation(fig, update, frames=n_frames, interval=50, blit=True)
plt.close(fig)
HTML(ani.to_jshtml())
```
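If `to_jshtml()` gets slow for long clips, a lighter-weight sanity check is a static strip of evenly spaced frames. A sketch, not part of experanto — the `frame_strip` helper and its defaults are my own; `video` would be the uint8 `(T, H, W)` or `(T, H, W, C)` array built above:

```python
import numpy as np
import matplotlib.pyplot as plt

def frame_strip(video, n_cols=6):
    """Show n_cols evenly spaced frames of a (T, H, W) or (T, H, W, C)
    uint8 video as a single row of thumbnails (hypothetical helper)."""
    idx = np.linspace(0, len(video) - 1, n_cols).astype(int)
    fig, axes = plt.subplots(1, n_cols, figsize=(2 * n_cols, 2))
    for ax, i in zip(axes, idx):
        # cmap/vmin/vmax are ignored by matplotlib for RGB input
        ax.imshow(video[i], cmap='gray', vmin=0, vmax=255)
        ax.set_title(f"frame {i}")
        ax.axis('off')
    return fig
```

Calling `frame_strip(video)` after the clip-and-transpose step renders one row of thumbnails instead of an animation.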
I believe there is something wrong with the timestamp-to-frame mapping.
Running only the following:

```python
times = np.arange(310., 340., 0.5)
video = e.interpolate(times, device="screen")
```

tries to open the file data/allen_data/experiment_951980471/screen/data/00061.npy, which I can clearly see doesn't exist. The same happens with the video timestamps 4484., 4500. you mentioned below; it tries to open another non-existent file, 09631.npy.
Worth mentioning that this issue occurs in both cases:
- on my local machine, where allensdk is installed from PyPI, which is quite outdated (Nov 2023)
- on a Colab notebook, where allensdk is installed from GitHub and should be new
Hmm, should we adjust the arguments to multi_session_export, since this is the function responsible for exporting the data?

```python
cache, ids = multi_session_export(1, val_rate=0.2, subsample_frac=1)
```
@abdelrahman725 please read the code above.
You need to modify it, specifically the root_folder.
Tom just gave an example of how he visualised things; you need to match this example with the experiment created from the Allen data.
I would never use that code without reading and modifying it first. root_folder is not related here; I've already loaded the experiment. The problem is only with the timestamps.
As I explained, those specific .npy files shown in the logs are not physically in /data; I searched for them.
Don't worry, I'm still investigating and working on it. I just wanted to share this just in case @schewskone knows something
Hi sorry for the late response, this is my bad. I always used #81 when working with the allen data and forgot that it implements a small change to data loading that is not yet in the main branch. Hence the allen-data is not compatible with the main branch. Sorry for the confusion, I will address this in a small PR later today.
This is interesting.
I actually solved the problem by modifying _parse_trials():

```python
...
modality = metadata.get("modality", "")
data_file_name = self.root_folder / "data" / f"{image_name}{file_format}"
...
```

The main issue is that the field image_name inside experiment_951980471/screen/combined_meta.json was actually ignored, and the consecutive JSON file keys (e.g. "00000", "00001") were used instead, later leading it to load the wrong file path.
While exp.interpolate ran successfully, that introduced another issue with np.clip(video, 0, 255).astype(np.uint8).
Anyway, thanks @schewskone for letting me know, I think I will use your branch until it gets merged to main, for the sake of completing the notebook.
Ah, ok if it's gonna be a small PR today, that's good.
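On the `np.clip(...).astype(np.uint8)` issue mentioned above: interpolated screen output can come back as floats containing NaNs for gaps, and casting NaNs straight to uint8 produces garbage values. A hedged sketch of a safer conversion — the helper name and the NaN-to-zero policy are my own assumptions, not experanto API:

```python
import numpy as np

def to_uint8_frames(video):
    """Convert interpolated float frames to uint8 for display.
    NaNs (gaps in the recording) are mapped to 0 before clipping,
    so the later astype(np.uint8) cast is well defined."""
    video = np.nan_to_num(np.asarray(video, dtype=float), nan=0.0)
    return np.clip(video, 0, 255).astype(np.uint8)
```

This drops in where the notebook currently does the clip-and-cast, e.g. `video = to_uint8_frames(video)`.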
@abdelrahman725 should this thread be resolved now?
If yes - please resolve it
| "treadmill = exp.interpolate(times, device=\"treadmill\")\n", | ||
| "eye_tracker = exp.interpolate(times, device=\"eye_tracker\")\n", | ||
| "responses = exp.interpolate(times, device=\"responses\")\n", |
Maybe it would be nice to have one cell per modality besides screen, where you briefly sanity check the shapes and print/plot a few values if it makes sense
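One way to keep those per-modality cells minimal is a small summary helper, sketched below. The helper and the fields it reports are my own suggestion, not part of experanto; it works on whatever array `exp.interpolate(...)` returns:

```python
import numpy as np

def summarize(name, arr):
    """Minimal sanity check for one interpolated modality: shape,
    NaN fraction, and value range over the finite entries."""
    arr = np.asarray(arr, dtype=float)
    finite = arr[np.isfinite(arr)]
    info = {
        "shape": arr.shape,
        "nan_frac": float(np.isnan(arr).mean()),
        "min": float(finite.min()) if finite.size else float("nan"),
        "max": float(finite.max()) if finite.size else float("nan"),
    }
    print(f"{name}: {info}")
    return info

# e.g. one call per modality cell:
# summarize("treadmill", treadmill)
# summarize("eye_tracker", eye_tracker)
# summarize("responses", responses)
```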
| "import numpy as np\n", | ||
| "\n", | ||
| "# Query 100 time points spread evenly over 10 seconds\n", | ||
| "times = np.linspace(0, 10, 100)\n", |
It would be nice to give the option to query over multiple time ranges by defining this cell as a function, i.e.

```python
def interpolate_screen(experiment, timestamps) -> np.ndarray
```

For the Allen dataset this would be nice, as initially there are only images and later on there are videos.
Sample video time for experiment 951980471: times = np.arange(4484., 4500., 0.5)
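A sketch of that wrapper, assuming only that `experiment` exposes the `interpolate(times, device=...)` method used throughout this thread; the `(T, C, H, W)` shape comment follows the convention mentioned above:

```python
import numpy as np

def interpolate_screen(experiment, timestamps) -> np.ndarray:
    """Interpolate screen frames at the given timestamps (seconds).
    Returns the interpolated array, shape (T, C, H, W) by the
    convention used in this thread."""
    timestamps = np.asarray(timestamps, dtype=float)
    return np.asarray(experiment.interpolate(timestamps, device="screen"))

# Reuse the same cell for the image-only segment and a video segment:
# images = interpolate_screen(exp, np.linspace(0, 10, 100))
# video  = interpolate_screen(exp, np.arange(4484., 4500., 0.5))
```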
@abdelrahman725 please merge current main into your branch :)
Force-pushed from a19b38b to e1d31b2
Updates: Addressed review comments. I decided to leave
Force-pushed from e1d31b2 to 42e7c07
@abdelrahman725 please take a look at why the tests are failing. It's a bit suspicious because adding a notebook should not change the tests, but they were fine before. Maybe merging current main in there would be enough to solve it. Also please re-request review from me and Tom when the PR is ready for another review.
@abdelrahman725 we found the issue with the tests; it's now fixed.
Force-pushed from 42e7c07 to 418c801
@abdelrahman725 At the end of the notebook in sections
@pollytur
Yes, as I mentioned above, I'm not sure what "please print shapes only and use plots for the rest" means, and for which devices?
It means that it's nice to have plots of eye-tracker / treadmill channel activity over time, the same as you have above for the neurons.
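A minimal sketch of such a time-course plot — the `plot_signal` helper is illustrative, not experanto API, and it assumes the interpolated signal is either 1-D or `(T, C)`:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_signal(times, values, name):
    """Plot one 1-D behavioural signal (e.g. treadmill speed or an
    eye-tracker channel) against the query times."""
    values = np.asarray(values)
    fig, ax = plt.subplots(figsize=(8, 2))
    if values.ndim == 1:
        ax.plot(times, values)
    else:  # (T, C): one line per channel
        for c in range(values.shape[1]):
            ax.plot(times, values[:, c], label=f"channel {c}")
        ax.legend(loc="upper right", fontsize="small")
    ax.set_xlabel("time (s)")
    ax.set_title(name)
    return fig

# e.g. after the interpolation cell:
# plot_signal(times, treadmill, "treadmill")
# plot_signal(times, eye_tracker, "eye tracker")
```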
@abdelrahman725 like also here
For printing both neuron activity and behaviour activities please use
@schewskone you downloaded the Visual Coding 2P Allen dataset, right? https://allenswdb.github.io/physiology/ophys/visual-coding/vc2p-background.html There was no water licking or behaviour in there, or was there? @abdelrahman725 wait for @schewskone's reply, but I think the explanation you give at the top is for the wrong dataset (e.g.
In the first markdown there is:

Cheat sheet
See this helpful [cheat sheet](https://github.com/trendinafrica/TReND-CaMinA/blob/main/notebooks/Zambia25/07-to-10-Mon-toThu-AllenTutorial/Visual%20Coding%202P%20Cheat%20Sheet%20October2018.pdf) from CAMINA school

Cheat sheet for what? Give one sentence of context on what they would find there.
Also add brief explanations in most of the markdown sections, for example:
- Download and export Allen dataset: what is the difference between download and export?
- Load experiment: what happens there (e.g. data was downloaded into this folder and we are now loading it with the help of experanto). Yes, I know it's trivial from a coding perspective, but the goal of example notebooks is to be extremely clear and self-explanatory.
- It would be nice to have a comment on the tiers split and what these ids mean here.
@pollytur I agree, I can print the shapes in the "Interpolate from all devices" cell and maybe other minimal stats, but I prefer to include other details, like NaNs, in each corresponding modality cell below, before plotting.
Sure, you are welcome to plot some details as suggested above. But this should be minimal, meaningful prints instead of a plain display of array parts.
Screen and response interpolation LGTM. Please address the print of the interpolation output and finish the TODOs.
Visual Behavior is the correct dataset, not Visual Coding. I'm not entirely sure about the difference, but I followed the tutorials from here, which are for the Visual Behavior dataset.
Description
Create a Google Colab notebook to showcase the whole pipeline of downloading, exporting, and interpolating the Allen dataset.
Follow up
Are there any visualizations you want in the notebook? If yes, please explain why they are needed; this helps me learn really quickly while building!
Closes #102