Commit Graph

110 Commits

Author SHA1 Message Date
yu-i-i
fdc092371b Sandboxed compiles: convert DockerRunner to ESM 2026-05-19 15:51:36 +02:00
yu-i-i
4239c2f3a8 Sandboxed compiles: set imageName to default if undefined in project 2026-05-19 15:51:36 +02:00
yu-i-i
773f3a15f6 Sandboxed Compiles: support configurable texlive image root via env var 2026-05-19 15:51:34 +02:00
yu-i-i
7ddcc0da61 Refactor Sandboxed Compiles 2026-05-19 15:49:16 +02:00
Mathias Jakobsen
9c97876268 [web]+clsi] Allow docx import via pandoc (#32004)
Co-authored-by: Jakob Ackermann <jakob.ackermann@overleaf.com>
GitOrigin-RevId: 246b3290ec04867f71545b1a7c5d95d0f68379ff
2026-03-27 09:06:23 +00:00
Jakob Ackermann
3aa69c6ffa [k8s] clsi-cache: double the number of shards (#32323)
* [k8s] clsi-cache: double the number of shards

* [monorepo] add missing clsi-cache env vars to dev-env

* [clsi] flip direction of clsi-cache shard migration

* [clsi] remove upper bound from clsi-cache shard migration

GitOrigin-RevId: a325a11c3ac9e22a12ad2d8ea802b91d2e175e24
2026-03-20 09:07:11 +00:00
Jakob Ackermann
bb3e1286db [clsi] parse env var for download concurrency as int (#31959)
GitOrigin-RevId: dbd91315318e385d691a6b59efeb3bf22f559fa2
2026-03-06 09:12:25 +00:00
Jakob Ackermann
81b7121408 [clsi] initial implementation of compile from history (#31883)
* [clsi] initial implementation of compile from history

* [clsi] copy changes

* [saas-e2e] extend test case with nested folder

* [saas-e2e] add test case for tracked changes

* [web] fix accumulating changes from multiple chunks

* [web] optimize size check for compile request payload

* [clsi] deduplicate globalBlobs

* [clsi] add validation for request body details

* [clsi] add metrics for compile from history

* [clsi] download binary files concurrently

* [clsi] skip download of empty file blob

* [clsi] break down e2e compile time metric by compileFromHistory

GitOrigin-RevId: 0dadef93e89d8a172c35cb130a1042d9d1bec42a
2026-03-06 09:12:07 +00:00
Jakob Ackermann
6c6e8d9a97 [monorepo] switch all output file reads to clsi-nginx (#31691)
* [monorepo] switch all output file reads to clsi-nginx

* [clsi-lb] allow gallery download requests

* [terraform] clsi: use nginx.conf from clsi service

* [clsi] fix flakey tests

* [clsi] replace alias with rewrite and root in nginx config

* [k8s] clsi-lb: expose download port on internal service

* [web] add explicit endpoint for downloading all output files

Serve the output.zip endpoint from clsi.

* [clsi] fix regex for latexqc submission ids

Previously, we only handled template submission ids.

GitOrigin-RevId: 6c3b21b01ec41ae767530b14aac31fbe3d640dd5
2026-02-24 09:07:12 +00:00
Jakob Ackermann
0ee8b25298 [k8s] clsi-cache: migrate to StatefulSet (#30886)
* [k8s] clsi-cache: migrate to StatefulSet

* clsi-cache: optimize ILB services for GKE subsetting

Update the new clsi-cache internal load balancer services
to use optimal settings for GKE subsetting (NEG backends):

- set allocateLoadBalancerNodePorts: false (not needed with NEGs)
- set externalTrafficPolicy: Local (preserve source IP, keep traffic in zone)
- add trafficDistribution: PreferClose (zone affinity)

These settings ensure traffic from CLSI VMs stays within the same zone
when possible, reducing latency and cross-zone network costs.

* [k8s] clsi-cache: add missing resource paths

* [clsi] exclude readOnly clsi-cache shards

---------

Co-authored-by: Daniel Kontsek <daniel.kontsek@overleaf.com>
GitOrigin-RevId: 34f18b319a0e859ff149a135131c95a44bc674d6
2026-01-27 09:05:50 +00:00
Jakob Ackermann
866e67cef1 [k8s] clsi-cache tweaks (#30949)
* [k8s] clsi-cache: switch parent-app label to 'clsi-cache-legacy'

* [k8s] clsi-cache: add service account from kustomization.yaml

* [k8s] clsi-cache: consolidate on a single array of CLSI_CACHE_INSTANCES

* [clsi-cache] make prettier happy

GitOrigin-RevId: 4082a71df591904cfe437c4bde74759ddd83634c
2026-01-23 09:06:15 +00:00
Andrew Rumble
cd7da983d1 Merge pull request #30232 from overleaf/ar/convert-clsi-to-es-modules
[clsi] convert to ES modules

GitOrigin-RevId: fb7fa52cc8f678ee31be352e62a5dff95e88008b
2026-01-22 09:06:23 +00:00
Brian Gough
d24f37d3a4 Merge pull request #28880 from overleaf/bg-add-time-option-to-clsi
add latexmk `-time` option to clsi and record performance logs

GitOrigin-RevId: 467473859359913da73f83e10b63b45603ea175c
2025-10-09 08:06:12 +00:00
Jakob Ackermann
25577379fc [clsi] add env var override for seccomp profile (#25894)
GitOrigin-RevId: 6ef9a5c1f9149147641abb9fe1798b1b41a14c05
2025-05-23 13:25:58 +00:00
Antoine Clausse
b667cef262 Revert "Update defaultHighWaterMark to 64KiB (Node 22's default) (#25522)" (#25789)
This reverts commit 19d731abf683066654027de3a4f9ac0b8916f22c.

GitOrigin-RevId: eb7c45ab45e02054601b607a4bfeb432424a1837
2025-05-22 08:07:38 +00:00
Jakob Ackermann
ec1bd69605 [clsi-cache] remove non sharded instances (#25645)
* Revert "[clsi-cache] only use sharding from updated project editor tabs (#25326)"

This reverts commit 1754276bed3186c0536055c983e32476cc90d416.

* [clsi-cache] remove non sharded instances

GitOrigin-RevId: aa3ac46140dfc1722a3350cf7071e5b11af61199
2025-05-16 08:05:02 +00:00
Antoine Clausse
aa002369cb Update defaultHighWaterMark to 64KiB (Node 22's default) (#25522)
* Set defaultHighWaterMark to 16KiB

This is already the default in Node 20

* Set defaultHighWaterMark to 64KiB

Per https://github.com/overleaf/internal/pull/25522#issuecomment-2872035192

GitOrigin-RevId: 19d731abf683066654027de3a4f9ac0b8916f22c
2025-05-14 08:05:16 +00:00
Jakob Ackermann
5cc0895c56 [clsi] enable keepAlive on global HTTP agents (#25350)
GitOrigin-RevId: c9478b405ac32ca55aeb3bcf9f24052477464667
2025-05-07 08:08:10 +00:00
Jakob Ackermann
6881ba956a [clsi-cache] only use sharding from updated project editor tabs (#25326)
GitOrigin-RevId: 1754276bed3186c0536055c983e32476cc90d416
2025-05-07 08:06:39 +00:00
Jakob Ackermann
5ce1685b5b [clsi-cache] shard each zone into three instances (#25301)
* [clsi-cache] shard per zone into three instances

Keep the old instance as read fallback. We can remove it in 4 days.

Disk size: 2Ti gives us the maximum write throughput of 240MiB/s on a
N2D instance with fewer than 8 vCPUs.

* [clsi] fix format

* [k8s] clsi-cache: bring back storage-classes

* [k8s] clsi-cache: fix reference to zonal storage-classes

* [k8s] clsi-cache: add logging configs

* [clsi] improve sharding

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

* [clsi] fix sharding

Index needs to be positive.

* [clsi] fix sharding

The random part is static per machine/process.

* [clsi] restrict clsi-cache to user projects

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

* [k8s] clsi-cache: align CLSI_CACHE_NGINX_HOST with service LB

---------

Co-authored-by: Brian Gough <brian.gough@overleaf.com>
GitOrigin-RevId: 1efb1b3245c8194c305420b25e774ea735251fb3
2025-05-07 08:06:16 +00:00
Jakob Ackermann
a9780ccf96 [clsi] merge sandboxed compiles config from Server Pro and SaaS (#25062)
* [clsi] merge sandboxed compiles config from Server Pro and SaaS

* [clsi] reorder fallback env vars

Co-authored-by: Mathew Evans <matt.evans@overleaf.com>

* [server-pro] bump version of expected release with these changes

---------

Co-authored-by: Mathew Evans <matt.evans@overleaf.com>
GitOrigin-RevId: bada93fec89bcc3f2bab85b6e60b2e27de88b9c2
2025-04-28 08:05:21 +00:00
Jakob Ackermann
d99ba08d01 [clsi] run SyncTeX in specific output dir rather than compile dir (#24690)
* [clsi] drop support for docker-in-docker

* [clsi] run SyncTeX in specific output dir rather than compile dir

* [clsi] store output.synctex.gz outside of tar-ball in clsi-cache

* [clsi] add documentation for rewriting of docker bind-mounts

* [server-pro] update env vars for sandboxed compiles in sample config

GitOrigin-RevId: 8debd7102ac612544961f237aa4ff1c530aa3da3
2025-04-10 08:05:26 +00:00
Jakob Ackermann
b538d56591 [clsi-cache] backend (#24388)
* [clsi-cache] initial revision of the clsi-cache service

* [clsi] send output files to clsi-cache and import from clsi-cache

* [web] pass editorId to clsi

* [web] clear clsi-cache when clearing clsi cache

* [web] add split-tests for controlling clsi-cache rollout

* [web] populate clsi-cache when cloning/creating project from template

* [clsi-cache] produce less noise when populating cache hits 404

* [clsi-cache] push docker image to AR

* [clsi-cache] push docker image to AR

* [clsi-cache] allow compileGroup in job payload

* [clsi-cache] set X-Zone header from latest endpoint

* [clsi-cache] use method POST for /enqueue endpoint

* [web] populate clsi-cache in zone b with template data

* [clsi-cache] limit number of editors per project/user folder to 10

* [web] clone: populate the clsi-cache unless the TeXLive release changed

* [clsi-cache] keep user folder when clearing cache as anonymous user

* [clsi] download old output.tar.gz when synctex finds empty compile dir

* [web] fix lint

* [clsi-cache] multi-zonal lookup of single build output

* [clsi-cache] add more validation and limits

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

* [clsi] do not include clsi-cache tar-ball in output.zip

* [clsi-cache] fix reference after remaining constant

Co-authored-by: Alf Eaton <alf.eaton@overleaf.com>

* [web] consolidate validation of filename into ClsiCacheHandler

* [clsi-cache] extend metrics and event tracking

- break down most of the clsi metrics by label
  - compile=initial - new compile dir without previous output files
  - compile=recompile - recompile in existing compile dir
  - compile=from-cache - compile using previous clsi-cache
- extend segmentation on compile-result-backend event
  - isInitialCompile=true - found new compile dir at start of request
  - restoredClsiCache=true - restored compile dir from clsi-cache

* [clsi] rename metrics labels for download of clsi-cache

This is in preparation for synctex changes.

* [clsi] use constant for limit of entries in output.tar.gz

Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>

* [clsi-cache] fix cloning of project cache

---------

Co-authored-by: Brian Gough <brian.gough@overleaf.com>
Co-authored-by: Alf Eaton <alf.eaton@overleaf.com>
Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>
GitOrigin-RevId: 4901a65497af13be1549af7f38ceee3188fcf881
2025-04-10 08:05:17 +00:00
Andrew Rumble
13b4e6333c Merge pull request #23874 from overleaf/ar-use-gcp-pre-emptible-signal
[clsi] Use GCP pre-emptible metadata instead of hostname

GitOrigin-RevId: 2df305e68f2999c9d3bde051dbb533025478800f
2025-03-07 09:04:45 +00:00
Andrew Rumble
59b708e3f7 Allow draining to be prevented in CLSI agent
Co-authored-by: Brian Gough <brian.gough@overleaf.com>
Co-authored-by: Christopher Hoskin <mans0954@users.noreply.github.com>
GitOrigin-RevId: f55de7783cb1c14108eb347eebb74ec329180000
2025-02-26 09:04:26 +00:00
Brian Gough
6672372828 remove sentry from backend services (#20752)
* remove sentry from backend services - no longer required

* Remove Sentry integration from logging manager

* Remove Sentry from clsi default settings

* Remove `initializeErrorReporting` in libraries/logger

* Remove `@sentry/node` from `libraries/logger`

---------

Co-authored-by: Antoine Clausse <antoine.clausse@overleaf.com>
GitOrigin-RevId: 8149a885f5258804b93ae39cde7b7333e992532a
2024-11-27 09:04:50 +00:00
Andrew Rumble
11213d85fe Merge pull request #21883 from overleaf/ar-disable-keep-alive-remaining-services
Disable keepAlive in all services

GitOrigin-RevId: 9a3bac37e3fb09ee64b05cfda300dfe8d8672aad
2024-11-15 09:05:52 +00:00
Antoine Clausse
7f48c67512 Add prefer-node-protocol ESLint rule (#21532)
* Add `unicorn/prefer-node-protocol`

* Fix `unicorn/prefer-node-protocol` ESLint errors

* Run `npm run format:fix`

* Add sandboxed-module sourceTransformers in mocha setups

Fix `no such file or directory, open 'node:fs'` in `sandboxed-module`

* Remove `node:` in the SandboxedModule requires

* Fix new linting errors with `node:`

GitOrigin-RevId: 68f6e31e2191fcff4cb8058dd0a6914c14f59926
2024-11-11 09:04:51 +00:00
Liangjun Song
5d472e9b38 limit the number of concurrent compile requests in clsi (#19717)
GitOrigin-RevId: 17909a4dd0717ea4a75288f734ddef19c7d6592e
2024-08-06 08:04:59 +00:00
Jakob Ackermann
a486f7e481 Merge pull request #18537 from overleaf/jpa-cleanup-synctex
[clsi] cleanup unused SYNCTEX_BIN_HOST_PATH

GitOrigin-RevId: 062aed9dceb1388fab0e59770aded5a440eaa3bd
2024-05-29 08:03:57 +00:00
Jakob Ackermann
a540754f6e Merge pull request #18116 from overleaf/jpa-bulk-replace-localhost
[misc] bulk replace localhost with 127.0.0.1

GitOrigin-RevId: d238f3635302e8ff5500d611108c4d1bef216726
2024-04-26 08:04:39 +00:00
Jakob Ackermann
51994579cd Merge pull request #17472 from overleaf/jpa-clsi-debug
[clsi] disable load shedding in dev-env

GitOrigin-RevId: 8ebd1d6d8f45947690d3c4480ccd734e02f1d8be
2024-03-12 09:03:50 +00:00
Eric Mc Sween
25f75bbc41 Merge pull request #11661 from overleaf/jpa-cleanup-clsi
[clsi] delete unused DbQueue module and related config option

GitOrigin-RevId: d9f1f84f82149efe1d7cdd75b2b45ba78e4defd2
2023-02-09 14:34:42 +00:00
Jakob Ackermann
a78bcee15f Merge pull request #8135 from overleaf/jpa-refactor-zonal-download
[misc] refactor handling of zone prefix in compile response

GitOrigin-RevId: f1f33d7d257854176f383bb5d786710f6b09f737
2022-05-26 08:03:53 +00:00
Jakob Ackermann
9a71372c36 Merge pull request #8071 from overleaf/jpa-clsi-zonal
[misc] prefix output file downloads with /zone/X

GitOrigin-RevId: ba59e97ae0284c68ba551dd49dc5d3daa4d61aa9
2022-05-25 08:08:39 +00:00
Simon Detheridge
fe1cb010b4 Merge pull request #5718 from overleaf/spd-built-images
Support prebuilt images and add overrides for using mapped folders

GitOrigin-RevId: b1a1f7fe61ead1c08f3b8eef35ef90926ca42f36
2022-02-16 11:31:30 +00:00
Christopher Hoskin
d8f4780679 Merge pull request #5839 from overleaf/csh-issue-5677-clsi-env
Add CLSI env var to indicate running in CLSI

GitOrigin-RevId: 9d55b001a32ecf7fa792023312c31bd9e6b64646
2021-11-19 09:02:52 +00:00
Jakob Ackermann
94c208311c Merge pull request #5351 from overleaf/jpa-clsi-drop-sqlite
[clsi] goodbye sqlite db

GitOrigin-RevId: e7f16920c44e74a425b92884b48a60272dfb015b
2021-10-07 08:03:50 +00:00
Brian Gough
dc576f3b6f Merge pull request #4943 from overleaf/bg-increase-max-print-line
allow setting texlive max_print_line variable

GitOrigin-RevId: 7588fed6aa5868a6ed6b6121cbd6f9c008c2aa0f
2021-09-22 08:03:21 +00:00
Jakob Ackermann
56a3b0dcde Merge pull request #4819 from overleaf/jpa-pdf-caching-new-event-loop
[clsi] pdf-caching: move cpu intensive work onto a new event loop

GitOrigin-RevId: 4cb5cd4528fa1c5df6a8e91f9caa38cb64d94463
2021-08-25 08:03:38 +00:00
Brian Gough
262793c04f add option for apparmor profile 2021-07-21 14:53:35 +01:00
Jakob Ackermann
f285e503b4 [misc] run format_fix and lint:fix 2021-07-13 12:04:48 +01:00
Jakob Ackermann
516525126b [UrlFetcher] do not override domain for clsi-perf requests 2021-07-02 09:17:29 +01:00
Jakob Ackermann
b09e52510f [misc] bail out from pdf caching processing after 10s or earlier
...for fast compiles.
2021-06-23 14:20:04 +01:00
Jakob Ackermann
b456ea726d [misc] merge pdf caching into main (#226)
* wip generate directory for hash content

* cleanup, remove console logging

* add content caching module

* Return PDF stream ranges with compile response

* Return the PDF file size in the compile response

* PDF range endpoint

* [misc] WIP: pdf caching: preserve the m-time on static content files

* [misc] WIP: pdf caching: improve browser caching, emit caching headers

* [misc] WIP: pdf caching: do not emit very small chunks <1kB

* [misc] keep up with moving output files into a separate directory

* [OutputCacheManager] add global feature flag for enabling pdf caching

* [misc] add contentId into the URL for protecting PDF stream contents

* [misc] support PDF stream caching for anonymous users

* [misc] add per-request feature flag for enabling PDF stream caching

* [misc] enable pdf caching in CI and emit metrics at the end of run

* [misc] expose compile stats and timings to the frontend

* [misc] log an error in case saving output files fails

* [misc] add metrics for pdf bandwidth and pdf caching performance

* [misc] add a dark mode to the pdf caching for computing ranges only

* [misc] move pdf caching metrics into ContentCacheMetrics

* [misc] add a config option for the min chunk size of pdf ranges

Co-authored-by: Brian Gough <brian.gough@overleaf.com>
Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>
2021-05-13 14:07:54 +01:00
Brian Gough
fc11574698 add uncaughtException handler 2021-01-26 14:08:29 +00:00
Brian Gough
692dbc8d6b add output directory 2021-01-22 11:05:52 +00:00
Jakob Ackermann
aa02df7b81 [misc] apply review feedback
- move setting into clsi.docker namespace
- rename the variable for images to allowedImages / ALLOWED_IMAGES
- add an additional check for the image name into the DockerRunner

Co-Authored-By: Brian Gough <brian.gough@overleaf.com>
2020-06-30 12:01:21 +01:00
Jakob Ackermann
8846efe7ce [misc] RequestParser: restrict imageName to an allow list and add tests 2020-06-26 13:28:09 +01:00
Brian Gough
85f4f348dd add default settings to remove wordcount and synctex containers 2020-06-15 15:49:38 +01:00