* using CLSI logic for fetching the project contents and skip the .zip export
* Use unique conversion directory for project-to-docx export to avoid corrupting the shared compile
directory when a compile runs concurrently
* Remove X-Accel-Buffering header — not needed as CLSI does not run behind nginx
* moving log before sending the data
* Return CLSI stream directly instead of buffering to disk on web
Previously convertProjectToDocx wrote the CLSI response to a temp file
on disk, then the controller read it back to stream to the client.
Now the stream is returned directly and piped to the response,
avoiding unnecessary disk I/O on the web server.
* Use href redirect for docx export instead of fetching blob into memory
* making functions and files more generic so they can be used in future for other document exports as well
* adding export-docx split test
* adding unit tests
* adding cypress E2E test
* format:fix
* renaming the route to download from convert
* adding new icon for export docx button
* format:fix
* remove unused showExportDocumentErrorToast export and adding guard against invalid Content-Length header from CLSI
* format:fix
* refactor(clsi): move promisify(parse) into RequestParser
* refactor: generic conversion endpoint with type as route param
* refactor: use type→extension map for validated conversion types
* refactor(clsi): remove --standalone flag and fix rejection test
* fixing the href in cypress test
* renaming function
* adding type to Metrics.inc
* fix: rename exportProjectDocument, add WithLock wrapper and metrics type label
* format:fix
* fix: hide docx export from anonymous users and add WithLock wrapper
* format fix
* remove redundant Content-Length validation from DocumentConversionManager
* format:fix
* removing trailing icon
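The "type→extension map for validated conversion types" refactor listed above can be illustrated with a minimal sketch. The names `CONVERSION_EXTENSIONS` and `outputFilenameFor` are hypothetical and not taken from the codebase; only the `docx` type is named in this log.

```typescript
// Hypothetical map from a conversion type (the route param) to the output
// file extension. Only 'docx' appears in the commit log; other entries would
// be added here as new export types are supported.
const CONVERSION_EXTENSIONS: ReadonlyMap<string, string> = new Map<
  string,
  string
>([['docx', 'docx']])

// Returns the output filename for a validated type, or null when the type is
// not in the map (the caller would then reject the request).
function outputFilenameFor(type: string, baseName: string): string | null {
  const extension = CONVERSION_EXTENSIONS.get(type)
  if (extension === undefined) return null
  return `${baseName}.${extension}`
}
```

A `Map` is used instead of a plain object so that inherited property names such as `constructor` are never mistaken for valid conversion types.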
GitOrigin-RevId: e9764fefac2c4b625d23be9e942ea4a8b283c70d
* [clsi] handle draft mode and tikzexternalize as part of sync phase
* [clsi] emit empty string from SafeReader on ENOENT
* [clsi] persist history state after clearing dirty state without changes
GitOrigin-RevId: d9dcd2e6887017f7935b5e95bdbdc6e11a3b18f5
* [clsi] remove locking from docker actions
Start:
- We have an in-memory lock on the compile request
Destroy:
- as part of run: see above
- as part of cleanup: we check the last access time now, so it cannot
happen concurrently with compiling anymore.
Co-authored-by: Anna Claire Fields <anna.fields@overleaf.com>
* [clsi] update comment
---------
Co-authored-by: Anna Claire Fields <anna.fields@overleaf.com>
GitOrigin-RevId: a58df45416ae31c0b38d5efec7f9371d747303df
* [clsi] avoid server error when clearing cache while compiling
* [clsi] tweak API around releasing locks
Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>
---------
Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>
GitOrigin-RevId: d3f171467d3bc26941758dd333f30049b37a05c8
* [k8s] clsi-cache: double the number of shards
* [monorepo] add missing clsi-cache env vars to dev-env
* [clsi] flip direction of clsi-cache shard migration
* [clsi] remove upper bound from clsi-cache shard migration
GitOrigin-RevId: a325a11c3ac9e22a12ad2d8ea802b91d2e175e24
* handle old versions of latexmk in run count extraction
the log lines for the run number moved from stderr to stdout in TL2022
* extend SimpleLatexFileTest to include TL2017
* reset metrics for each scenario in SimpleLatexFileTests
* fix buildscript merge conflict
GitOrigin-RevId: fb74f2025d21ddf43be6a3b90ac6f7df4d975db6
* [clsi] only download history snapshot from clsi-cache when enabled
* [clsi-perf] migrate to compile from history mode
GitOrigin-RevId: 2dd54e032bd85d6335488741c039a5a1bd60090d
* [saas-e2e] test gallery templates with binary file
* [rails] add make target for fixing rubocop errors
* [rails] migrate compiles of conversions/submissions to history mode
* [rails] forward version to clsi request
* [rails] trim down compile request
* [saas-e2e] source v1 secrets after make install
GitOrigin-RevId: 65269e1df1051c9f3b4f1813d2e9dcf32a01be50
* [clsi] do not overwrite last access during initial scan
* [clsi] cleanup submission cache 5-10min after startup
* [clsi] address review comments
GitOrigin-RevId: e03beec1b3deaee50629ada72b0242a8a2b2ae66
* [clsi] initial implementation of compile from history
* [clsi] copy changes
* [saas-e2e] extend test case with nested folder
* [saas-e2e] add test case for tracked changes
* [web] fix accumulating changes from multiple chunks
* [web] optimize size check for compile request payload
* [clsi] deduplicate globalBlobs
* [clsi] add validation for request body details
* [clsi] add metrics for compile from history
* [clsi] download binary files concurrently
* [clsi] skip download of empty file blob
* [clsi] break down e2e compile time metric by compileFromHistory
GitOrigin-RevId: 0dadef93e89d8a172c35cb130a1042d9d1bec42a
* [monorepo] switch all output file reads to clsi-nginx
* [clsi-lb] allow gallery download requests
* [terraform] clsi: use nginx.conf from clsi service
* [clsi] fix flakey tests
* [clsi] replace alias with rewrite and root in nginx config
* [k8s] clsi-lb: expose download port on internal service
* [web] add explicit endpoint for downloading all output files
Serve the output.zip endpoint from clsi.
* [clsi] fix regex for latexqc submission ids
Previously, we only handled template submission ids.
GitOrigin-RevId: 6c3b21b01ec41ae767530b14aac31fbe3d640dd5
* [clsi] tweak logging for clsi-cache
- Use `clsi-cache` identifier on log line
- Add shard to context
- Record nFiles on "too many entries for tar" error
* [clsi] do not trip clsi-cache circuit breaker on ENOENT errors
These can happen when an output/compile-dir is purged while we download
files.
GitOrigin-RevId: ffa73ef312bce5232ef72e3b81966bb6e14d2255
* [k8s] clsi-cache: migrate to StatefulSet
* clsi-cache: optimize ILB services for GKE subsetting
Update the new clsi-cache internal load balancer services
to use optimal settings for GKE subsetting (NEG backends):
- set allocateLoadBalancerNodePorts: false (not needed with NEGs)
- set externalTrafficPolicy: Local (preserve source IP, keep traffic in zone)
- add trafficDistribution: PreferClose (zone affinity)
These settings ensure traffic from CLSI VMs stays within the same zone
when possible, reducing latency and cross-zone network costs.
* [k8s] clsi-cache: add missing resource paths
* [clsi] exclude readOnly clsi-cache shards
---------
Co-authored-by: Daniel Kontsek <daniel.kontsek@overleaf.com>
GitOrigin-RevId: 34f18b319a0e859ff149a135131c95a44bc674d6
* [clsi] remove all clsi-perf/health-check metrics
* [clsi] always emit E2E compile time metric
* [clsi] do not collect metrics for clsi-cache-template compiles
* [clsi] fix unit tests: request.metricsOpts always exists
* [clsi] use a gauge for the e2e compile time metric of clsi-perf
Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>
* [clsi] remove metrics for binary file downloads from clsi-perf
---------
Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>
GitOrigin-RevId: 7995512e57c802086350e3d1a0ec5213ecdf0a05
* [clsi] try harder at sending files off to a working clsi-cache shard
* [clsi] use a crc for generating a stable sequence of shards to try
Co-authored-by: Brian Gough <brian.gough@overleaf.com>
* [clsi] gradually migrate to crc based shard assignment
* [clsi] tweak selecting clsi-cache shard from crc
Co-authored-by: Brian Gough <brian.gough@overleaf.com>
* [clsi] bump rollout dates of new clsi-cache shard change
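A minimal sketch of how a CRC can yield a stable sequence of shards to try, as described in the commits above. How the real code derives the order from the checksum is not stated in this log; the rotation scheme and the names `crc32` and `shardSequence` are assumptions for illustration only.

```typescript
// Bitwise CRC-32 (reflected polynomial 0xEDB88320) over the low byte of each
// character. Sufficient for ASCII keys such as hex project ids.
function crc32(input: string): number {
  let crc = 0xffffffff
  for (let i = 0; i < input.length; i++) {
    crc ^= input.charCodeAt(i) & 0xff
    for (let k = 0; k < 8; k++) {
      crc = crc & 1 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1
    }
  }
  return (crc ^ 0xffffffff) >>> 0
}

// Assumption: the checksum picks the first shard and the remaining shards
// follow in list order, wrapping around. The same key always produces the
// same sequence, so a retry after a failed shard moves on to a predictable
// next candidate.
function shardSequence(key: string, shards: readonly string[]): string[] {
  if (shards.length === 0) return []
  const start = crc32(key) % shards.length
  return shards.slice(start).concat(shards.slice(0, start))
}
```

A caller would walk the returned list in order and stop at the first shard that accepts the upload.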
---------
Co-authored-by: Brian Gough <brian.gough@overleaf.com>
GitOrigin-RevId: 9386e170503b405580e4d0a8641832f3fcb1fa83
* [clsi] fix circuit breaker for clsi-cache
* [clsi] enable ts-check for CLSICacheHandler
* [clsi] limit the number of .blg files in clsi-cache to 50
* [clsi-cache] limit the number of files per job to 100
* [clsi-cache] explain early registration of buildId
* [clsi-cache] lock down downloads via nginx to project folder
GitOrigin-RevId: 081d0c40b08db3a384c4d765b71a50b973f42151
* [clsi] gracefully handle fast exit of synctex/wordcount containers
* [clsi] do not change container options in-place for logging
GitOrigin-RevId: 0b685310a3c72f8f46125fefaa30c1ddb19e7b07
Stage timeouts:
- frontend waits 5s
- web/clsi waits 4s
- clsi-cache waits 3s
This should ensure that the frontend can receive a valid response after
any of the backend requests has failed.
The circuit breaker will remain tripped for TIMEOUT + jitter of 0-3 times
the TIMEOUT of the respective service. This should keep the bulk of
traffic from failing, while still issuing occasional retries without
hammering the instances while they are down.
Also do not try the next backend when the abort signal has expired.
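The timeout and jitter arithmetic described above can be sketched as follows. The constant and function names are hypothetical, and the uniform distribution of the jitter is an assumption; the commit message only gives the 0-3x range.

```typescript
// Stage timeouts from the commit message, in milliseconds. Each outer stage
// waits one second longer than the stage it calls.
const STAGE_TIMEOUT_MS = {
  frontend: 5000,
  'web/clsi': 4000,
  'clsi-cache': 3000,
} as const

type Stage = keyof typeof STAGE_TIMEOUT_MS

// How long the breaker stays tripped after a failure: TIMEOUT plus a jitter
// of 0-3 times TIMEOUT. `random` is injectable so the result can be tested;
// a uniform draw is an assumption, not confirmed by the source.
function circuitBreakerHoldMs(
  stage: Stage,
  random: () => number = Math.random
): number {
  const timeout = STAGE_TIMEOUT_MS[stage]
  const jitter = random() * 3 * timeout
  return timeout + jitter
}
```

For `clsi-cache` this gives a hold time between 3s and 12s, so instances that tripped at the same moment do not all retry at once.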
GitOrigin-RevId: d612125616a9e416beff2f4c6d7f30066b5b9d6d
* [clsi] add stats and timings to compile response from clsi-cache
* [clsi] set downloadedFromCache when previously downloaded for synctex
Assumption: every compile will emit an output.log. When the output.log
is missing, but the output.synctex.gz exists, it must have been
downloaded from the cache.
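The assumption above reduces to a small predicate over the output files present. This is a sketch only: `inferDownloadedFromCache` is a hypothetical name, and the real check presumably inspects the output directory rather than an in-memory list.

```typescript
// Every compile is assumed to emit output.log. If output.synctex.gz exists
// without output.log, the synctex file must have come from clsi-cache.
function inferDownloadedFromCache(outputFiles: readonly string[]): boolean {
  const hasLog = outputFiles.includes('output.log')
  const hasSynctex = outputFiles.includes('output.synctex.gz')
  return hasSynctex && !hasLog
}
```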
GitOrigin-RevId: 41ea34880931e3c43dda3bc9eb26c0d02054894d
* [misc] fix "app" label in clsi-cache metrics in dev-env
* [clsi-cache] validate filePath when processing file
* [clsi-cache] meter ingress and egress bandwidth
Files are downloaded directly from nginx, hence we cannot meter egress
in clsi-cache easily.
GitOrigin-RevId: 24de8c41728f0e9c984113c1470dec6153e75f20
* [clsi] shed load when detecting out-of-disk condition
* [clsi] mark VM as unhealthy when detecting out-of-disk condition
GitOrigin-RevId: 25cda6785c0d973f50ec6206bee389804f35917e
* Revert "[clsi-cache] only use sharding from updated project editor tabs (#25326)"
This reverts commit 1754276bed3186c0536055c983e32476cc90d416.
* [clsi-cache] remove non sharded instances
GitOrigin-RevId: aa3ac46140dfc1722a3350cf7071e5b11af61199
* [clsi] tell frontend when synctex mapping was downloaded from clsi-cache
* [web] emit event when synctex mapping was downloaded from clsi-cache
GitOrigin-RevId: 1f6b7e0faaa7dd76449aad566802da971a4cf9ed