Commit Graph

8 Commits

Author SHA1 Message Date
Jakob Ackermann 8c39add865 [clsi-cache] meter ingress and egress bandwidth (#27143)
* [mics] fix "app" label in clsi-cache metrics in dev-env

* [clsi-cache] validate filePath when processing file

* [clsi-cache] meter ingress and egress bandwidth

Files are downloaded directly from nginx, hence we cannot meter egress
in clsi-cache easily.

GitOrigin-RevId: 24de8c41728f0e9c984113c1470dec6153e75f20
2025-07-16 08:05:59 +00:00
Jakob Ackermann eebda2427e [clsi-cache] fix path traversal (#25585)
* [clsi-cache] fix path traversal

* [clsi-cache] double down on path traversal validation

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

---------

Co-authored-by: Brian Gough <brian.gough@overleaf.com>
GitOrigin-RevId: 28a6a2024aae81e9b361db7918dc0c5381cd8246
2025-05-14 08:06:54 +00:00
Jakob Ackermann e25a69936e [clsi-cache] base64 encode X-All-Files header if needed (#25579)
* [clsi-cache] base64 encode X-All-Files header if needed

* [clsi-cache] add explicit error check

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

---------

Co-authored-by: Brian Gough <brian.gough@overleaf.com>
GitOrigin-RevId: bd3b6381b68398aac4a07e48cd69e6aa97e94f18
2025-05-14 08:06:49 +00:00
Jakob Ackermann 5ce1685b5b [clsi-cache] shard each zone into three instances (#25301)
* [clsi-cache] shard per zone into three instances

Keep the old instance as read fallback. We can remove it in 4 days.

Disk size: 2Ti gives us the maximum write throughput of 240MiB/s on a
N2D instance with fewer than 8 vCPUs.

* [clsi] fix format

* [k8s] clsi-cache: bring back storage-classes

* [k8s] clsi-cache: fix reference to zonal storage-classes

* [k8s] clsi-cache: add logging configs

* [clsi] improve sharding

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

* [clsi] fix sharding

Index needs to be positive.

* [clsi] fix sharding

The random part is static per machine/process.

* [clsi] restrict clsi-cache to user projects

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

* [k8s] clsi-cache: align CLSI_CACHE_NGINX_HOST with service LB

---------

Co-authored-by: Brian Gough <brian.gough@overleaf.com>
GitOrigin-RevId: 1efb1b3245c8194c305420b25e774ea735251fb3
2025-05-07 08:06:16 +00:00
Jakob Ackermann 86d310c741 [web] clsi-cache: fix download of .blg files (#25083)
GitOrigin-RevId: 69c8f789b8f8fa4b241c7563722e9a1cb6f86244
2025-04-25 08:05:37 +00:00
Jakob Ackermann d99ba08d01 [clsi] run SyncTeX in specific output dir rather than compile dir (#24690)
* [clsi] drop support for docker-in-docker

* [clsi] run SyncTeX in specific output dir rather than compile dir

* [clsi] store output.synctex.gz outside of tar-ball in clsi-cache

* [clsi] add documentation for rewriting of docker bind-mounts

* [server-pro] update env vars for sandboxed compiles in sample config

GitOrigin-RevId: 8debd7102ac612544961f237aa4ff1c530aa3da3
2025-04-10 08:05:26 +00:00
Jakob Ackermann b831a0b3f7 [clsi-cache] frontend (#24389)
* [clsi-lb] forward ?clsiserverid=cache requests to clsi-cache

* [web] use clsi-cache in frontend

* [web] upgrade compile from cache to full compile when triggered inflight

* [web] fix pdf-js-viewer.spec.tsx tests -- add ?clsiserverid=foo to url

* [web] fix renamed reference after merge

* [web] fix download of other output files and use specific build

* [web] consolidate validation of filename into ClsiCacheHandler

* [web] remove unused projectName from getLatestBuildFromCache

* [web] avoid hitting the same clsi-cache instance first all the time

* [web] update documentation

GitOrigin-RevId: d48265a7ba89d6731092640e1492bc9f103f5c33
2025-04-10 08:05:22 +00:00
Jakob Ackermann b538d56591 [clsi-cache] backend (#24388)
* [clsi-cache] initial revision of the clsi-cache service

* [clsi] send output files to clsi-cache and import from clsi-cache

* [web] pass editorId to clsi

* [web] clear clsi-cache when clearing clsi cache

* [web] add split-tests for controlling clsi-cache rollout

* [web] populate clsi-cache when cloning/creating project from template

* [clsi-cache] produce less noise when populating cache hits 404

* [clsi-cache] push docker image to AR

* [clsi-cache] push docker image to AR

* [clsi-cache] allow compileGroup in job payload

* [clsi-cache] set X-Zone header from latest endpoint

* [clsi-cache] use method POST for /enqueue endpoint

* [web] populate clsi-cache in zone b with template data

* [clsi-cache] limit number of editors per project/user folder to 10

* [web] clone: populate the clsi-cache unless the TeXLive release changed

* [clsi-cache] keep user folder when clearing cache as anonymous user

* [clsi] download old output.tar.gz when synctex finds empty compile dir

* [web] fix lint

* [clsi-cache] multi-zonal lookup of single build output

* [clsi-cache] add more validation and limits

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

* [clsi] do not include clsi-cache tar-ball in output.zip

* [clsi-cache] fix reference after remaining constant

Co-authored-by: Alf Eaton <alf.eaton@overleaf.com>

* [web] consolidate validation of filename into ClsiCacheHandler

* [clsi-cache] extend metrics and event tracking

- break down most of the clsi metrics by label
  - compile=initial - new compile dir without previous output files
  - compile=recompile - recompile in existing compile dir
  - compile=from-cache - compile using previous clsi-cache
- extend segmentation on compile-result-backend event
  - isInitialCompile=true - found new compile dir at start of request
  - restoredClsiCache=true - restored compile dir from clsi-cache

* [clsi] rename metrics labels for download of clsi-cache

This is in preparation for synctex changes.

* [clsi] use constant for limit of entries in output.tar.gz

Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>

* [clsi-cache] fix cloning of project cache

---------

Co-authored-by: Brian Gough <brian.gough@overleaf.com>
Co-authored-by: Alf Eaton <alf.eaton@overleaf.com>
Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>
GitOrigin-RevId: 4901a65497af13be1549af7f38ceee3188fcf881
2025-04-10 08:05:17 +00:00