252 Commits

Author SHA1 Message Date
yu-i-i
330c458bf5 Revert "[clsi] remove locking from docker actions (#32373)"
This reverts commit d66a856baa.
2026-05-19 15:51:39 +02:00
yu-i-i
fdc092371b Sandboxed compiles: convert DockerRunner to ESM 2026-05-19 15:51:36 +02:00
Mathias Jakobsen
ce6f9b8e8c Merge pull request #33705 from overleaf/mj-clsi-cwd-for-conversions
[clsi] Add cwd argument to CommandRunner and use to simplify conversions

GitOrigin-RevId: 5333e3262a99e602ab5470ae1e23facb5b28a170
2026-05-19 08:04:51 +00:00
Mathias Jakobsen
6b28a4ee5a Merge pull request #33560 from overleaf/mj-conversion-cleanup
[clsi+web] Small cleanups and improvements to conversions / exports

GitOrigin-RevId: 300adfbb91e89f754ee7f835db792ccb50b27613
2026-05-12 08:06:17 +00:00
Mathias Jakobsen
32da6548c8 Merge pull request #33277 from overleaf/mj-pandoc-clsi-two-step-download
[clsi] Use clsi-nginx for downloading pandoc exports

GitOrigin-RevId: b6013fae6f53c7af714634d700ceed491d724653
2026-05-08 08:09:18 +00:00
Mathias Jakobsen
ae31ad218c Merge pull request #33104 from overleaf/mj-pandoc-arguments
[clsi] Add pandoc arguments for better conversions

GitOrigin-RevId: 76cddc5959237d6d2801c56471d8d3f63d111200
2026-05-08 08:09:12 +00:00
Mathias Jakobsen
5dc67db403 Merge pull request #33089 from overleaf/ds-export-md-files-pandoc
[WEB + CLSI] Download as markdown

GitOrigin-RevId: 181eddf2513e9c5edacbab37e93f9cac2191ee1a
2026-05-08 08:09:07 +00:00
Mathias Jakobsen
eddcc5a42e Merge pull request #32857 from overleaf/ds-pandoc-import-md
[WEB + CLSI] Import markdown files using pandoc

GitOrigin-RevId: adad7831ddb13a8fcb8063871166bde13cbbf1b6
2026-05-08 08:09:02 +00:00
Jakob Ackermann
37cc65ec7e [web] consolidate clsi downloads and add zod validation (#33069)
* [web] consolidate clsi downloads and add zod validation

* [validation-tools] make prettier happy

* [web] make clsiServerId optional

* [web] fix type of buildId

* [web] gracefully handle ObjectId

* [web] fix type of buildId

* [monorepo] address review feedback

- cjs export
- update module path in comments
- skip adding ?clsiserverid if not set
- allow nested output file download for submissions and add tests

* [web] address review feedback

* [web] cache one more zod schema

* [web] fix unit tests

GitOrigin-RevId: 0a1e618955983e035defd6d3c0528b81e0e85c95
2026-05-05 08:06:05 +00:00
Anna Claire Fields
0d64a88a46 Yarn 4 Migration (#32253)
Migrates the Overleaf monorepo package manager from npm (v11) to Yarn 4 (v4.9.1) using node-modules linker mode.

GitOrigin-RevId: 50d32ab01955c15e29679eff9e9e9cfb897fab2d
2026-04-28 08:52:37 +00:00
Davinder Singh
be5a7b56c8 [WEB + CLSI] Download as docx file feature (#32851)
* using CLSI logic for fetching the project contents and skip the .zip export

* Use unique conversion directory for project-to-docx export to avoid corrupting the shared compile
  directory when a compile runs concurrently

* Remove X-Accel-Buffering header — not needed as CLSI does not run behind nginx

* moving log before sending the data

* Return CLSI stream directly instead of buffering to disk on web

  Previously convertProjectToDocx wrote the CLSI response to a temp file
  on disk, then the controller read it back to stream to the client.
  Now the stream is returned directly and piped to the response,
  avoiding unnecessary disk I/O on the web server.

* Use href redirect for docx export instead of fetching blob into memory

* making functions and files more generic so they can be reused for other document exports in the future

* adding export-docx split test

* adding unit tests

* adding cypress E2E test

* format:fix

* renaming the route to download from convert

* adding new icon for export docx button

* format:fix

* remove unused showExportDocumentErrorToast export and adding guard against invalid Content-Length header from CLSI

* format:fix

* refactor(clsi): move promisify(parse) into RequestParser

* refactor: generic conversion endpoint with type as route
  param

* refactor: use type→extension map for validated conversion types

* refactor(clsi): remove --standalone flag and fix rejection test

* fixing the href in cypress test

* renaming function

* adding type to Metrics.inc

* fix: rename exportProjectDocument, add WithLock wrapper and metrics type label

* format:fix

* fix: hide docx export from anonymous users and add WithLock wrapper

* format fix

* remove redundant Content-Length validation from DocumentConversionManager

* format:fix

* removing trailing icon

GitOrigin-RevId: e9764fefac2c4b625d23be9e942ea4a8b283c70d
2026-04-24 08:06:10 +00:00
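
The "Return CLSI stream directly instead of buffering to disk on web" item in be5a7b56c8 can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Overleaf code: `pipe_response`, `CHUNK_SIZE`, and the in-memory streams are hypothetical stand-ins for the CLSI response and the client response.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, not taken from the commit


def pipe_response(upstream, client, chunk_size=CHUNK_SIZE):
    """Copy upstream to client chunk by chunk, without a temp file.

    Returns the number of bytes forwarded.
    """
    sent = 0
    while True:
        chunk = upstream.read(chunk_size)
        if not chunk:  # an empty read means the upstream is exhausted
            break
        client.write(chunk)
        sent += len(chunk)
    return sent


# Usage: in-memory streams stand in for the CLSI response and the client.
upstream = io.BytesIO(b"docx-bytes" * 1000)
client = io.BytesIO()
pipe_response(upstream, client)
```

Only one chunk is held in memory at a time, and nothing is written to disk on the forwarding side.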
Andrew Rumble
72586d2ea2 Merge pull request #32497 from overleaf/ar-picomatch-4.0.4
[monorepo] picomatch 4.0.4

GitOrigin-RevId: 433c2b436123b3eff336ef6597a67c7dccc9d6ba
2026-04-01 08:05:59 +00:00
Jakob Ackermann
d66a856baa [clsi] remove locking from docker actions (#32373)
* [clsi] remove locking from docker actions

Start:
- We have an in-memory lock on the compile request

Destroy:
- as part of run: see above
- as part of cleanup: we check the last access time now, so it can no
  longer happen concurrently with compiling.

Co-authored-by: Anna Claire Fields <anna.fields@overleaf.com>

* [clsi] update comment

---------

Co-authored-by: Anna Claire Fields <anna.fields@overleaf.com>
GitOrigin-RevId: a58df45416ae31c0b38d5efec7f9371d747303df
2026-03-27 09:06:28 +00:00
Mathias Jakobsen
9c97876268 [web+clsi] Allow docx import via pandoc (#32004)
Co-authored-by: Jakob Ackermann <jakob.ackermann@overleaf.com>
GitOrigin-RevId: 246b3290ec04867f71545b1a7c5d95d0f68379ff
2026-03-27 09:06:23 +00:00
Jakob Ackermann
a9c413857a [clsi] avoid destroying containers of recently accessed projects (#32186)
* [clsi] avoid destroying containers of recently accessed projects

Co-authored-by: Anna Claire Fields <anna.fields@overleaf.com>

* [clsi] gracefully handle missing access time during container cleanup

* [clsi] fix cyclic import

---------

Co-authored-by: Anna Claire Fields <anna.fields@overleaf.com>
GitOrigin-RevId: 8195b6fccbe26d2fd673d38356af5d44cf4042a3
2026-03-18 09:07:01 +00:00
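
The idea in a9c413857a (skip containers of recently accessed projects during cleanup, and tolerate a missing access time) might look roughly like the sketch below. It is written in Python for brevity; the names, the one-hour threshold, and the choice to treat a missing access time as idle are all assumptions, not details from the commit.

```python
MAX_IDLE_SECONDS = 60 * 60  # hypothetical threshold, not taken from the commit


def containers_to_destroy(containers, last_access, now, max_idle=MAX_IDLE_SECONDS):
    """Return the names of containers whose project has been idle too long.

    containers:  dict of container name -> project id
    last_access: dict of project id -> unix timestamp of the last access
    A project with no recorded access time is treated as idle (assumption).
    """
    stale = []
    for name, project_id in containers.items():
        accessed_at = last_access.get(project_id)
        if accessed_at is None or now - accessed_at > max_idle:
            stale.append(name)
    return stale


# Usage: p1 is idle, p2 was accessed recently, p3 has no access time.
containers = {"c1": "p1", "c2": "p2", "c3": "p3"}
last_access = {"p1": 1_000, "p2": 9_000}
stale = containers_to_destroy(containers, last_access, now=10_000, max_idle=3_600)
```

Looking up the access time with `.get` avoids raising when the time is missing, which is one way to read "gracefully handle".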
Jakob Ackermann
81b7121408 [clsi] initial implementation of compile from history (#31883)
* [clsi] initial implementation of compile from history

* [clsi] copy changes

* [saas-e2e] extend test case with nested folder

* [saas-e2e] add test case for tracked changes

* [web] fix accumulating changes from multiple chunks

* [web] optimize size check for compile request payload

* [clsi] deduplicate globalBlobs

* [clsi] add validation for request body details

* [clsi] add metrics for compile from history

* [clsi] download binary files concurrently

* [clsi] skip download of empty file blob

* [clsi] break down e2e compile time metric by compileFromHistory

GitOrigin-RevId: 0dadef93e89d8a172c35cb130a1042d9d1bec42a
2026-03-06 09:12:07 +00:00
Jakob Ackermann
eca31afb4a [clsi] remove unused endpoints for downloading output files (#31692)
GitOrigin-RevId: a0cac10f3585414779b026f38c2af2773c80082f
2026-03-06 09:06:33 +00:00
Jakob Ackermann
6c6e8d9a97 [monorepo] switch all output file reads to clsi-nginx (#31691)
* [monorepo] switch all output file reads to clsi-nginx

* [clsi-lb] allow gallery download requests

* [terraform] clsi: use nginx.conf from clsi service

* [clsi] fix flakey tests

* [clsi] replace alias with rewrite and root in nginx config

* [k8s] clsi-lb: expose download port on internal service

* [web] add explicit endpoint for downloading all output files

Serve the output.zip endpoint from clsi.

* [clsi] fix regex for latexqc submission ids

Previously, we only handled template submission ids.

GitOrigin-RevId: 6c3b21b01ec41ae767530b14aac31fbe3d640dd5
2026-02-24 09:07:12 +00:00
Andrew Rumble
cd7da983d1 Merge pull request #30232 from overleaf/ar/convert-clsi-to-es-modules
[clsi] convert to ES modules

GitOrigin-RevId: fb7fa52cc8f678ee31be352e62a5dff95e88008b
2026-01-22 09:06:23 +00:00
Jakob Ackermann
3f9a7cf463 [clsi] consolidate metrics for clsi-perf (#30746)
* [clsi] remove all clsi-perf/health-check metrics

* [clsi] always emit E2E compile time metric

* [clsi] do not collect metrics for clsi-cache-template compiles

* [clsi] fix unit tests: request.metricsOpts always exists

* [clsi] use a gauge for the e2e compile time metric of clsi-perf

Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>

* [clsi] remove metrics for binary file downloads from clsi-perf

---------

Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>
GitOrigin-RevId: 7995512e57c802086350e3d1a0ec5213ecdf0a05
2026-01-19 09:06:34 +00:00
Brian Gough
67aa42a57a Merge pull request #29650 from overleaf/bg-update-clsi-tests-to-2025
update clsi acceptance tests to use texlive 2025.1 by default

GitOrigin-RevId: d69e97132e87873a8b91c39494c545250298d935
2025-11-13 09:06:23 +00:00
Jakob Ackermann
5140fff347 [clsi] gracefully handle fast exit of synctex/wordcount containers (#29505)
* [clsi] gracefully handle fast exit of synctex/wordcount containers

* [clsi] do not change container options in-place for logging

GitOrigin-RevId: 0b685310a3c72f8f46125fefaa30c1ddb19e7b07
2025-11-05 09:06:40 +00:00
Eric Mc Sween
d66c73a29e Merge pull request #29176 from overleaf/em-clsi-image-timings
CLSI: Capture image processing timings
GitOrigin-RevId: 28c2f73f260f2e82a64751bb46655e7546a458ef
2025-10-20 08:05:42 +00:00
Eric Mc Sween
f09a494e56 Merge pull request #29106 from overleaf/bg-fix-capdrop-in-docker-runner
fix capdrop in docker runner

GitOrigin-RevId: 1e8c81723a9e152ec85a3a2776965891fbe07606
2025-10-16 08:06:47 +00:00
Eric Mc Sween
9813bc4b51 Merge pull request #28992 from overleaf/em-compile-metrics-runs
Add metric measuring the execution time of each latexmk rule

GitOrigin-RevId: fcb7215f7f53063e6fe046c01bbcc81e6441c064
2025-10-13 08:07:07 +00:00
Eric Mc Sween
74524db293 Merge pull request #28909 from overleaf/em-compile-metrics
Use histograms to track CLSI compile times

GitOrigin-RevId: cf25f1e6d2094186f419acc70748f0c71b6c3240
2025-10-13 08:07:02 +00:00
Brian Gough
58094ebcd6 Merge pull request #28988 from overleaf/bg-add-file-info-to-performance-logs
add latexmk fdb file info to performance logs

GitOrigin-RevId: 3cc5709cd10fd55c2cd8aff7754fb7868aacdf0c
2025-10-13 08:05:23 +00:00
Brian Gough
da3f366643 Merge pull request #28959 from overleaf/bg-exclude-health-checks-from-performance-logs
exclude health checks from performance logs

GitOrigin-RevId: 88db63e00b32b2b015ee25c7d555546ed7d9a95b
2025-10-13 08:05:18 +00:00
Brian Gough
d24f37d3a4 Merge pull request #28880 from overleaf/bg-add-time-option-to-clsi
add latexmk `-time` option to clsi and record performance logs

GitOrigin-RevId: 467473859359913da73f83e10b63b45603ea175c
2025-10-09 08:06:12 +00:00
Jakob Ackermann
d489e35782 [web] emit event when synctex mapping was downloaded from clsi-cache (#25424)
* [clsi] tell frontend when synctex mapping was downloaded from clsi-cache

* [web] emit event when synctex mapping was downloaded from clsi-cache

GitOrigin-RevId: 1f6b7e0faaa7dd76449aad566802da971a4cf9ed
2025-05-09 08:06:00 +00:00
Jakob Ackermann
5ce1685b5b [clsi-cache] shard each zone into three instances (#25301)
* [clsi-cache] shard per zone into three instances

Keep the old instance as read fallback. We can remove it in 4 days.

Disk size: 2Ti gives us the maximum write throughput of 240MiB/s on a
N2D instance with fewer than 8 vCPUs.

* [clsi] fix format

* [k8s] clsi-cache: bring back storage-classes

* [k8s] clsi-cache: fix reference to zonal storage-classes

* [k8s] clsi-cache: add logging configs

* [clsi] improve sharding

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

* [clsi] fix sharding

Index needs to be positive.

* [clsi] fix sharding

The random part is static per machine/process.

* [clsi] restrict clsi-cache to user projects

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

* [k8s] clsi-cache: align CLSI_CACHE_NGINX_HOST with service LB

---------

Co-authored-by: Brian Gough <brian.gough@overleaf.com>
GitOrigin-RevId: 1efb1b3245c8194c305420b25e774ea735251fb3
2025-05-07 08:06:16 +00:00
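
The sharding fixes in 5ce1685b5b ("Index needs to be positive") point at a common pitfall when a signed hash is reduced to a shard index. The sketch below is in Python and assumes a shard is chosen by hashing a key; the commit does not state how clsi-cache picks a shard, so `shard_index`, the SHA-256 hash, and the signed 32-bit interpretation are illustrative only.

```python
import hashlib

SHARD_COUNT = 3  # "three instances" per zone, from the commit title


def shard_index(key, shard_count=SHARD_COUNT):
    """Map a key to a stable shard index in [0, shard_count)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Read the first 4 bytes as a signed 32-bit integer to mimic hash
    # functions that can return negative values.
    value = int.from_bytes(digest[:4], "big", signed=True)
    # Take the absolute value before the modulo so the index is never
    # negative, also in languages where % keeps the sign of the dividend.
    return abs(value) % shard_count


# Usage: the same key always maps to the same shard, in any process.
index = shard_index("project-123")
```

Because the index depends only on the key, it does not vary between machines or processes.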
Antoine Clausse
fa62529d82 [clsi] Replace diskusage by fs (#24789)
* Replace `diskusage` by `fs` in clsi

* Replace `diskusage` by `fs` in clsi-cache

* Update disk space calculations to include block size in bytes

Co-authored-by: Jakob Ackermann <jakob.ackermann@overleaf.com>

* Add warning comments about Docker-for-Mac fs stats being off by a factor

---------

Co-authored-by: Jakob Ackermann <jakob.ackermann@overleaf.com>
GitOrigin-RevId: 02ea07e531b89bb3d10ddfe780348b19cbddad1f
2025-04-17 08:06:16 +00:00
Jakob Ackermann
4a17a1e713 [web] gracefully access compile stats for event (#24818)
* [web] gracefully access compile stats for event

* [clsi] always emit stats and timings

GitOrigin-RevId: 959e5fe1508245ffecfab1219fd86e53b210fca1
2025-04-14 08:04:51 +00:00
Jakob Ackermann
d99ba08d01 [clsi] run SyncTeX in specific output dir rather than compile dir (#24690)
* [clsi] drop support for docker-in-docker

* [clsi] run SyncTeX in specific output dir rather than compile dir

* [clsi] store output.synctex.gz outside of tar-ball in clsi-cache

* [clsi] add documentation for rewriting of docker bind-mounts

* [server-pro] update env vars for sandboxed compiles in sample config

GitOrigin-RevId: 8debd7102ac612544961f237aa4ff1c530aa3da3
2025-04-10 08:05:26 +00:00
Jakob Ackermann
b538d56591 [clsi-cache] backend (#24388)
* [clsi-cache] initial revision of the clsi-cache service

* [clsi] send output files to clsi-cache and import from clsi-cache

* [web] pass editorId to clsi

* [web] clear clsi-cache when clearing clsi cache

* [web] add split-tests for controlling clsi-cache rollout

* [web] populate clsi-cache when cloning/creating project from template

* [clsi-cache] produce less noise when populating cache hits 404

* [clsi-cache] push docker image to AR

* [clsi-cache] push docker image to AR

* [clsi-cache] allow compileGroup in job payload

* [clsi-cache] set X-Zone header from latest endpoint

* [clsi-cache] use method POST for /enqueue endpoint

* [web] populate clsi-cache in zone b with template data

* [clsi-cache] limit number of editors per project/user folder to 10

* [web] clone: populate the clsi-cache unless the TeXLive release changed

* [clsi-cache] keep user folder when clearing cache as anonymous user

* [clsi] download old output.tar.gz when synctex finds empty compile dir

* [web] fix lint

* [clsi-cache] multi-zonal lookup of single build output

* [clsi-cache] add more validation and limits

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

* [clsi] do not include clsi-cache tar-ball in output.zip

* [clsi-cache] fix reference after renaming constant

Co-authored-by: Alf Eaton <alf.eaton@overleaf.com>

* [web] consolidate validation of filename into ClsiCacheHandler

* [clsi-cache] extend metrics and event tracking

- break down most of the clsi metrics by label
  - compile=initial - new compile dir without previous output files
  - compile=recompile - recompile in existing compile dir
  - compile=from-cache - compile using previous clsi-cache
- extend segmentation on compile-result-backend event
  - isInitialCompile=true - found new compile dir at start of request
  - restoredClsiCache=true - restored compile dir from clsi-cache

* [clsi] rename metrics labels for download of clsi-cache

This is in preparation for synctex changes.

* [clsi] use constant for limit of entries in output.tar.gz

Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>

* [clsi-cache] fix cloning of project cache

---------

Co-authored-by: Brian Gough <brian.gough@overleaf.com>
Co-authored-by: Alf Eaton <alf.eaton@overleaf.com>
Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>
GitOrigin-RevId: 4901a65497af13be1549af7f38ceee3188fcf881
2025-04-10 08:05:17 +00:00
Jakob Ackermann
13bf214a3c [web] generate clsi buildId ahead of fetching project content (#24337)
* [web] generate clsi buildId ahead of fetching project content

The buildId's timestamp component will be used for cache invalidation.

* [clsi] strict validation for buildId

* [clsi] validate buildId parameter

GitOrigin-RevId: 88e8b2d48e78fa137b6dca7f2e6b93bbcf88a777
2025-03-24 10:46:02 +00:00
Jakob Ackermann
f7e716c826 [clsi] add metric for disk usage (#24303)
GitOrigin-RevId: e21b867a2fdaf54e9ec5b29b0f80b29349eb901c
2025-03-14 09:05:23 +00:00
Jakob Ackermann
d19c5e236f Merge pull request #22208 from overleaf/jpa-clsi-hash
[misc] clsi: read files from history-v1 with fallback to filestore

GitOrigin-RevId: c54bb128780198c14e7a63818f39fad62ce65d4e
2024-11-29 09:05:39 +00:00
Antoine Clausse
7f48c67512 Add prefer-node-protocol ESLint rule (#21532)
* Add `unicorn/prefer-node-protocol`

* Fix `unicorn/prefer-node-protocol` ESLint errors

* Run `npm run format:fix`

* Add sandboxed-module sourceTransformers in mocha setups

Fix `no such file or directory, open 'node:fs'` in `sandboxed-module`

* Remove `node:` in the SandboxedModule requires

* Fix new linting errors with `node:`

GitOrigin-RevId: 68f6e31e2191fcff4cb8058dd0a6914c14f59926
2024-11-11 09:04:51 +00:00
andrew rumble
d4911ea246 Stop waiting for finalize
The finalize promise only resolves once the archive is closed. We want
to stream as soon as we have the data, so waiting for it does not suit
us. We still want to log errors thrown by finalize; these should
already be propagated by archiver to the response.

GitOrigin-RevId: 4f9d727a00ead1be3caee62e1e0adba069a545c7
2024-09-24 08:05:28 +00:00
andrew rumble
487d9125a2 Improve stream error safety
GitOrigin-RevId: de4c512a62d304b3eeb2a1521aac173fa93d8411
2024-09-24 08:05:15 +00:00
andrew rumble
1409e32010 Move logging into ArchiveManager
GitOrigin-RevId: 71a28a0215c5f1a6975c9e2fcdcd476513df1cbc
2024-09-24 08:05:10 +00:00
andrew rumble
c387e60a28 Remove unnecessary symlink check
GitOrigin-RevId: 08d7295403a258818276b9fbd7666a20fbc2e00f
2024-09-24 08:05:06 +00:00
andrew rumble
c764566148 Allow all files to be in zip (in same directory)
GitOrigin-RevId: 14645a0c3db88faf00e2718b9574b5892ac3efcb
2024-09-24 08:04:53 +00:00
Liangjun Song
2133dde8bf remove dry run (#19820)
GitOrigin-RevId: b92e08da6654cdd37314f7c52a6946cc7ec8983a
2024-08-08 08:04:17 +00:00
Liangjun Song
5d472e9b38 limit the number of concurrent compile requests in clsi (#19717)
GitOrigin-RevId: 17909a4dd0717ea4a75288f734ddef19c7d6592e
2024-08-06 08:04:59 +00:00
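
One way to cap concurrent compile requests, as 5d472e9b38 describes, is a counting semaphore. The sketch below is in Python and is not the CLSI code: `CompileLimiter` is a hypothetical name, and refusing excess requests instead of queueing them is an assumption.

```python
import threading


class CompileLimiter:
    """Admit at most max_concurrent compiles at a time (illustrative only)."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def try_acquire(self):
        # Non-blocking: returns False immediately when every slot is taken.
        return self._slots.acquire(blocking=False)

    def release(self):
        # Frees one slot; call exactly once per successful try_acquire.
        self._slots.release()


# Usage: with a limit of 2, the third request is refused until a slot frees up.
limiter = CompileLimiter(2)
first = limiter.try_acquire()
second = limiter.try_acquire()
third = limiter.try_acquire()
```

A caller that gets `False` could answer with a "too many requests" style error or retry later; the commit title does not say which.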
Jakob Ackermann
9f68bc5660 Merge pull request #19296 from overleaf/jpa-issue-19290-3
[clsi] atomic writing of LaTeXMk output

GitOrigin-RevId: d81c497370587b98fc7ad282035cd59b0ae09ec8
2024-07-15 09:01:04 +00:00
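
"Atomic writing" in 9f68bc5660 usually means writing to a temporary file and then renaming it over the target, so readers never see a half-written file. The sketch below shows that general technique in Python. Whether CLSI does exactly this is an assumption, and `write_atomically` is a hypothetical helper.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write bytes to path via a temp file in the same directory plus a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the contents to disk before renaming
        os.replace(tmp_path, path)  # replaces the target in one step
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A concurrent reader sees either the old file or the complete new one, never a partial write.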
Jakob Ackermann
51a24601ec Merge pull request #19293 from overleaf/jpa-issue-19290-2
[clsi] fix parsing of the requested file in symlink validation

GitOrigin-RevId: 86cfe8d62bb99ed6844faee0ff4af507e571e04d
2024-07-15 09:00:59 +00:00
Andrew Rumble
80ede301fa Merge pull request #18474 from overleaf/ar-return-build-id-from-clsi-after-compile
[clsi] Return buildId after compiles

GitOrigin-RevId: 872048f4fea8f5a00b817e29bd26a444d179a45f
2024-05-27 10:24:06 +00:00
Christopher Hoskin
3342d672c2 Merge pull request #18397 from overleaf/em-revert-download-all-link
Revert "Merge pull request #18190 from overleaf/ar-add-download-all-l…

GitOrigin-RevId: 681eb2734636d76558e682dc85083bfcaa6b7d2d
2024-05-17 08:05:10 +00:00