Files
Verso/server-ce/Dockerfile-base
T
claude 83b6b323c3
Build and Deploy Verso / deploy (push) Successful in 17m0s
Add cv2/tqdm to base; implement per-project Python venvs (Design B, Phase 1)
Base image: add opencv-python-headless (cv2) and tqdm to the bundled
scientific stack, and python3-venv (needed to build per-project venvs).

Per-project dependencies: a project's requirements.txt is now installed into a
venv cached by its sha256 (python3 -m venv --system-site-packages, so the
bundled stack stays visible and only extra packages are installed); QuartoRunner
points Quarto at it via QUARTO_PYTHON. A per-hash flock serialises concurrent
builds; pip output is merged into output.log; on failure the render falls back
to the base interpreter. Venvs live under PYTHON_VENVS_DIR
(default /var/lib/overleaf/data/python-venvs).

Gating: PythonVenvGate.userCanInstallPython restricts installs to the project
owner + invited collaborators (ignorePublicAccess excludes anonymous/link
users), threaded to CLSI as allowPythonInstall on the editor compile,
presentation export, and publish paths. Behind OVERLEAF_ENABLE_PROJECT_PYTHON_VENV
(enabled in the deployment). Design doc updated; Phase 2 (egress policy) and
Phase 3 (venv eviction) remain.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 13:14:47 +00:00

166 lines
8.6 KiB
Plaintext

# --------------------------------------------------
# Overleaf Base Image (sharelatex/sharelatex-base)
# --------------------------------------------------
FROM phusion/baseimage:noble-1.0.3
# Makes sure LuaTeX cache is writable
# -----------------------------------
ENV TEXMFVAR=/var/lib/overleaf/tmp/texmf-var
# Update to ensure dependencies are updated
# ------------------------------------------
ENV REBUILT_AFTER="2026-05-21"
# Install dependencies
# --------------------
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
# Technically, we are using potentially stale package-lists with the below line.
# Practically, apt refreshes the lists as needed and release builds run in fresh CI VMs without the cache.
--mount=type=cache,target=/var/lib/apt/lists,sharing=locked true \
# Enable caching: https://docs.docker.com/reference/dockerfile/#example-cache-apt-packages
&& rm -f /etc/apt/apt.conf.d/docker-clean && echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache \
&& apt-get update \
&& apt-get install -y \
unattended-upgrades \
build-essential wget net-tools unzip time poppler-utils optipng strace nginx git python3 python-is-python3 zlib1g-dev libpcre3-dev gettext-base libwww-perl ca-certificates curl gnupg \
qpdf \
# upgrade base-image, batch all the upgrades together, rather than installing them on-by-one (which is slow!)
&& unattended-upgrade --verbose --no-minimal-upgrade-steps \
# install Node.js https://github.com/nodesource/distributions#nodejs
&& mkdir -p /etc/apt/keyrings \
&& curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg \
&& echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_22.x nodistro main" | tee /etc/apt/sources.list.d/nodesource.list \
&& apt-get update \
&& apt-get install -y nodejs \
\
&& rm -rf \
# We are adding a custom nginx config in the main Dockerfile.
/etc/nginx/nginx.conf \
/etc/nginx/sites-enabled/default
# Install Quarto (bundles Typst for PDF rendering — no LaTeX needed)
# ------------------------------------------------------------------
ARG QUARTO_VERSION=1.6.39
RUN curl -fsSL "https://github.com/quarto-dev/quarto-cli/releases/download/v${QUARTO_VERSION}/quarto-${QUARTO_VERSION}-linux-amd64.deb" -o /tmp/quarto.deb \
&& dpkg -i /tmp/quarto.deb \
&& rm /tmp/quarto.deb \
&& mkdir -p /var/www/.cache/quarto /var/www/.local/share \
&& chown -R www-data:www-data /var/www/.cache /var/www/.local
# Pre-install popular Quarto extensions
# -----------------------------------------------------------------------
# Extensions land in /opt/quarto-extensions/_extensions/<author>/<name>/.
# QuartoRunner copies them into each project's compile dir (no-clobber,
# so user-uploaded extensions in their project always take precedence).
# To add more: append another line: && quarto add --no-prompt <author>/<repo>
# -----------------------------------------------------------------------
RUN mkdir -p /opt/quarto-extensions \
&& cd /opt/quarto-extensions \
\
# Typst document formats
&& quarto add --no-prompt igorlima/charged-ieee \
\
# RevealJS presentation plugins (official Quarto extensions)
&& quarto add --no-prompt quarto-ext/fontawesome \
&& quarto add --no-prompt quarto-ext/attribution \
&& quarto add --no-prompt quarto-ext/pointer \
&& quarto add --no-prompt quarto-ext/drop \
\
&& chown -R www-data:www-data /opt/quarto-extensions
# Install Jupyter so Quarto can execute Python code cells in documents/decks
# -----------------------------------------------------------------------
# Quarto runs ```{python}``` cells through a Jupyter kernel. It uses the system
# python3 it detected (/usr/bin/python3), so Jupyter must be installed there.
# We install only the headless execution stack Quarto needs (jupyter-client +
# nbclient/nbformat + the ipykernel kernel + pyyaml, which Quarto's own
# /opt/quarto/share/jupyter wrapper imports), not the notebook/lab servers, and
# register a system-wide "python3" kernelspec under /usr/local/share/jupyter so
# it is discoverable regardless of HOME/XDG. Noble's Python is externally
# managed (PEP 668), hence --break-system-packages in this controlled image.
# The runtime user (www-data) writes Jupyter's runtime/connection files under
# its HOME (/var/www/.local), which is made writable in the Quarto step above.
# python3-venv is needed so a project's requirements.txt can be installed into
# a per-project venv (see QuartoRunner / PythonVenvGate).
RUN apt-get update \
&& apt-get install -y python3-pip python3-venv \
&& pip3 install --no-cache-dir --break-system-packages \
jupyter-core jupyter-client nbclient nbformat ipykernel pyyaml \
&& python3 -m ipykernel install --prefix /usr/local --name python3 --display-name "Python 3" \
# Bundle the common scientific-Python stack so most decks "just work" without
# any per-project install. matplotlib renders headless (Agg) automatically;
# opencv-python-headless is the GUI-less OpenCV build (provides cv2) suited to
# a server. To add more later, append to this list (the cheapest way to cover
# a library many projects need).
&& pip3 install --no-cache-dir --break-system-packages \
numpy pandas scipy matplotlib seaborn scikit-learn sympy plotly tabulate \
opencv-python-headless tqdm \
&& rm -rf /var/lib/apt/lists/* /root/.cache
# Install decktape + headless Chromium (for exporting RevealJS decks to PDF)
# -----------------------------------------------------------------------
# decktape drives a headless Chromium (via Puppeteer) to print the rendered
# reveal.js slides to a faithful, one-slide-per-page PDF. Chromium is the
# open-source engine (BSD); decktape is MIT, Puppeteer Apache-2.0 — all
# permissive and AGPL-compatible. They are invoked as a separate process
# (QuartoRunner runs `decktape ...`), never linked into the app.
#
# Puppeteer downloads its Chromium into PUPPETEER_CACHE_DIR during the global
# install; we put it in a world-readable /opt path so the www-data runtime user
# can launch it. Playwright is used only as a robust, distro-aware installer for
# Chromium's system libraries (handles Ubuntu Noble's t64 package renames).
ENV PUPPETEER_CACHE_DIR=/opt/puppeteer
RUN npm install -g decktape \
&& npx --yes playwright@latest install-deps chromium \
&& chmod -R a+rX /opt/puppeteer \
&& rm -rf /root/.npm /root/.cache
# Install TeX Live (for compiling .tex projects with latexmk)
# -----------------------------------------------------------------------
# Verso compiles .qmd with Quarto and .tex with latexmk; both engines live
# side by side.
#
# MINIMAL install (current): the upstream-Overleaf approach — scheme-basic
# (~300 MB) plus a few essential packages via tlmgr. Fast to build and small.
# Many documents that need extra packages (tikz, beamer, siunitx, extra
# fonts, ...) will NOT compile out of the box; users can be told to keep
# those projects in Quarto/Typst for now.
#
# TO GO FULL LATER (when the project is mature): change
# selected_scheme scheme-basic -> scheme-full
# and optionally drop the explicit `tlmgr install` line. That single change
# restores a complete LaTeX toolchain at the cost of size/build time.
# Alternatively add individual packages to the `tlmgr install` list below.
# -----------------------------------------------------------------------
ARG TEXLIVE_MIRROR=https://mirror.ox.ac.uk/sites/ctan.org/systems/texlive/tlnet
ENV PATH="${PATH}:/usr/local/texlive/bin/x86_64-linux"
RUN mkdir /install-tl-unx \
&& curl -sSL ${TEXLIVE_MIRROR}/install-tl-unx.tar.gz \
| tar -xzC /install-tl-unx --strip-components=1 \
&& echo "tlpdbopt_autobackup 0" >> /install-tl-unx/texlive.profile \
&& echo "tlpdbopt_install_docfiles 0" >> /install-tl-unx/texlive.profile \
&& echo "tlpdbopt_install_srcfiles 0" >> /install-tl-unx/texlive.profile \
&& echo "selected_scheme scheme-basic" >> /install-tl-unx/texlive.profile \
&& echo "TEXDIR /usr/local/texlive" >> /install-tl-unx/texlive.profile \
&& /install-tl-unx/install-tl \
-profile /install-tl-unx/texlive.profile \
-repository ${TEXLIVE_MIRROR} \
&& /usr/local/texlive/bin/x86_64-linux/tlmgr install \
--repository ${TEXLIVE_MIRROR} \
latexmk \
texcount \
&& rm -rf /install-tl-unx
# Set up overleaf user and home directory
# -----------------------------------------
RUN adduser --system --group --home /overleaf --no-create-home overleaf && \
mkdir -p /var/lib/overleaf && \
chown www-data:www-data /var/lib/overleaf && \
mkdir -p /var/log/overleaf && \
chown www-data:www-data /var/log/overleaf && \
mkdir -p /var/lib/overleaf/data/template_files && \
chown www-data:www-data /var/lib/overleaf/data/template_files