Description
I'm building an environment to use in an HPC setting. The goal is to build a Docker container with several Python packages and then convert it to a Singularity image for use on the university's Slurm HPC cluster.
One thing I've noticed, which isn't great, is that the install size of Repast4py is huge: the Docker image is ~8.2 GB. Here are the image layers:
> docker history 799316543ca4
IMAGE CREATED CREATED BY SIZE COMMENT
799316543ca4 9 minutes ago /bin/sh -c #(nop) ENV PYTHONPATH=/repast4py… 0B
91a65220423d 9 minutes ago /bin/sh -c env CC=mpicxx CXX=mpicxx pip inst… 8.16GB
1e7b9cd039e3 19 minutes ago /bin/sh -c pip install -r ./requirements.txt 199MB
6ccb1de7ca54 21 minutes ago /bin/sh -c #(nop) COPY file:ab16ddc3a986b259… 283B
a45824896792 25 minutes ago /bin/sh -c apt-get update && apt-get ins… 340MB
73b513f59526 2 years ago /bin/sh -c #(nop) CMD ["python3"] 0B
<missing> 2 years ago /bin/sh -c set -ex; savedAptMark="$(apt-ma… 9.51MB
<missing> 2 years ago /bin/sh -c #(nop) ENV PYTHON_GET_PIP_SHA256… 0B
<missing> 2 years ago /bin/sh -c #(nop) ENV PYTHON_GET_PIP_URL=ht… 0B
<missing> 2 years ago /bin/sh -c #(nop) ENV PYTHON_SETUPTOOLS_VER… 0B
<missing> 2 years ago /bin/sh -c #(nop) ENV PYTHON_PIP_VERSION=21… 0B
<missing> 2 years ago /bin/sh -c cd /usr/local/bin && ln -s idle3… 32B
<missing> 2 years ago /bin/sh -c set -ex && savedAptMark="$(apt-… 29.5MB
<missing> 2 years ago /bin/sh -c #(nop) ENV PYTHON_VERSION=3.10.0 0B
<missing> 2 years ago /bin/sh -c #(nop) ENV GPG_KEY=A035C8C19219B… 0B
<missing> 2 years ago /bin/sh -c set -eux; apt-get update; apt-g… 3.11MB
<missing> 2 years ago /bin/sh -c #(nop) ENV LANG=C.UTF-8 0B
<missing> 2 years ago /bin/sh -c #(nop) ENV PATH=/usr/local/bin:/… 0B
<missing> 2 years ago /bin/sh -c #(nop) CMD ["bash"] 0B
<missing> 2 years ago /bin/sh -c #(nop) ADD file:ece5ff85ca549f0b1… 80.4MB
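For longer histories, the layers can be ranked by size rather than read by eye. The following is only a sketch: `to_bytes` and `rank_layers` are hypothetical helper names, and it assumes the sizes are printed in decimal units (B, kB, MB, GB) as in the output above.

```shell
# Convert a docker-history size such as "8.16GB" or "283B" to bytes.
# Assumes decimal units (1 kB = 1000 B), as in the output above.
to_bytes() {
  printf '%s\n' "$1" | awk '{
    n = $0 + 0                      # leading numeric part, e.g. 8.16
    u = $0; sub(/^[0-9.]+/, "", u)  # trailing unit, e.g. GB
    m = 1
    if (u == "kB") m = 1e3
    else if (u == "MB") m = 1e6
    else if (u == "GB") m = 1e9
    printf "%.0f\n", n * m
  }'
}

# Read "SIZE<TAB>CREATED_BY" lines on stdin and print them largest first.
rank_layers() {
  while IFS="$(printf '\t')" read -r size cmd; do
    printf '%s\t%s\t%s\n' "$(to_bytes "$size")" "$size" "$cmd"
  done | sort -rn | cut -f2-
}

# Usage (requires Docker; image ID taken from the output above):
# docker history --format '{{.Size}}\t{{.CreatedBy}}' 799316543ca4 | rank_layers
```

With the history above, the first line printed would be the 8.16GB `pip inst…` layer.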
And the Dockerfile (similar to the one in the repast git repo):
FROM python:3.10.0-slim
RUN apt-get update && \
    apt-get install -y mpich && \
    rm -rf /var/lib/apt/lists/*
# Install the python requirements
COPY ./requirements.txt ./requirements.txt
RUN pip install -r ./requirements.txt
# Install repast4py
RUN env CC=mpicxx CXX=mpicxx pip install repast4py
# Set the PYTHONPATH to include the /repast4py folder which contains the core folder
ENV PYTHONPATH=/repast4py/src
It's clear that the 8.16 GB layer comes from the "RUN env CC=mpicxx CXX=mpicxx pip install repast4py" command.
Using an image this large isn't impossible, but it causes problems with storing it in a free repository with limited space, building it with CI/CD, moving it to nodes in the cluster, etc.
Is there a simple way to reduce the build size? I can't really believe that repast4py needs 8 GB of compiled C code to run.
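To narrow down where the space goes, the largest directories inside the image could be listed. This is a rough sketch, not something I have verified against this build: `largest_entries` is a hypothetical helper, and the paths in the usage comments are assumptions (the default site-packages location in the python:3.10 image and pip's default cache directory when running as root).

```shell
# Print the N largest entries directly under DIR (sizes in KiB), largest first.
largest_entries() {
  dir=$1
  n=${2:-10}
  du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}

# Usage (requires Docker; the paths below are assumptions about this image):
# docker run --rm 799316543ca4 du -sh /root/.cache/pip
# docker run --rm 799316543ca4 sh -c \
#   'du -sk /usr/local/lib/python3.10/site-packages/* | sort -rn | head'
```

If pip's download cache turns out to account for a meaningful share, adding `--no-cache-dir` to the `pip install` commands would keep it out of the layer. If most of the size sits in installed dependencies instead, that flag alone would not help much.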