Open
Changes from 16 commits
28 commits
678d652
Add cache (300ms default) on response with computation of the file mi…
flovntp Mar 13, 2026
6992e19
Potential fix for pull request finding
flovntp Mar 13, 2026
ffefecd
Potential fix for pull request finding
flovntp Mar 13, 2026
43adca9
Potential fix for pull request finding
flovntp Mar 13, 2026
16bd005
Potential fix for pull request finding
flovntp Mar 13, 2026
1fba813
Potential fix for pull request finding
flovntp Mar 13, 2026
c014cba
Potential fix for pull request finding
flovntp Mar 13, 2026
035bc38
Optimization to not request Github if not modified
flovntp Mar 13, 2026
6806090
Potential fix for pull request finding
flovntp Mar 13, 2026
c51c771
Potential fix for pull request finding
flovntp Mar 13, 2026
a958f12
Potential fix for pull request finding
flovntp Mar 13, 2026
af5e123
Potential fix for pull request finding
flovntp Mar 13, 2026
56a33c9
Potential fix for pull request finding
flovntp Mar 13, 2026
e89a202
Potential fix for pull request finding
flovntp Mar 13, 2026
59b8f74
Potential fix for pull request finding
flovntp Mar 13, 2026
3c7b15a
Potential fix for pull request finding
flovntp Mar 13, 2026
2517d1d
Potential fix for pull request finding
flovntp Mar 24, 2026
b51219a
Potential fix for pull request finding
flovntp Mar 24, 2026
d544d0b
Potential fix for pull request finding
flovntp Mar 24, 2026
44bab99
Potential fix for pull request finding
flovntp Mar 24, 2026
2b1ef72
Potential fix for pull request finding
flovntp Mar 24, 2026
4aad402
Potential fix for pull request finding
flovntp Mar 24, 2026
e09c32b
Potential fix for pull request finding
flovntp Mar 24, 2026
825f20e
Potential fix for pull request finding
flovntp Mar 24, 2026
99bc57c
Potential fix for pull request finding
flovntp Mar 24, 2026
10f48ed
Potential fix for pull request finding
flovntp Mar 24, 2026
f01d4b8
Potential fix for pull request finding
flovntp Mar 24, 2026
656132c
Potential fix for pull request finding
flovntp Mar 24, 2026
4 changes: 4 additions & 0 deletions backend/.env.example
@@ -35,6 +35,10 @@ GITHUB_BASE_PATH=resources
# Optional: GitHub Personal Access Token for private repositories (not needed for public repos)
GITHUB_TOKEN=

# Cache Configuration
# TTL (Time To Live) in seconds for HTTP cache headers (default: 300 = 5 minutes)
CACHE_TTL=300

# Local Resources Path (when RESOURCES_MODE=local)
# Relative to backend folder
LOCAL_RESOURCES_PATH=../../../resources
174 changes: 174 additions & 0 deletions backend/doc/CONDITIONAL_FETCH_OPTIMIZATION.md
@@ -0,0 +1,174 @@
# Conditional Fetch Optimization

## Problem
Previously, `getGithubResourceWithMetadata` always downloaded and parsed the full file from GitHub *before* the application could decide to return a `304 Not Modified` response. Even when a client already held a valid cached copy and sent `If-None-Match`, the server still:
1. Fetched the entire file from GitHub
2. Parsed the JSON/YAML content
3. Only then checked if the client cache was valid
4. Returned 304 (but the expensive work was already done)

This was especially costly in GitHub mode where network latency and API rate limits are concerns.

## Solution
The optimization implements **upstream conditional requests** that propagate client cache validation headers to GitHub, allowing the entire fetch/parse cycle to be skipped when content hasn't changed.

### How it Works

#### 1. Extract Conditional Headers
```typescript
const conditionalHeaders = extractConditionalHeaders(req);
// Returns: { ifNoneMatch: "etag-value", ifModifiedSince: "date" }
```
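
The helper's implementation is not shown in this document. A minimal sketch of what such an extraction might look like follows, assuming Node's lower-cased header names; `RequestLike`, `firstValue`, and `extractConditionalHeadersSketch` are hypothetical names used only for illustration.

```typescript
// Shape taken from the ConditionalHeaders interface described in this document.
interface ConditionalHeaders {
  ifNoneMatch?: string;
  ifModifiedSince?: string;
}

// Hypothetical stand-in for the part of an Express request used here.
interface RequestLike {
  headers: Record<string, string | string[] | undefined>;
}

// Returns the first non-empty, trimmed header value, or undefined.
function firstValue(value: string | string[] | undefined): string | undefined {
  const raw = Array.isArray(value) ? value[0] : value;
  if (raw === undefined) return undefined;
  const trimmed = raw.trim();
  return trimmed === '' ? undefined : trimmed;
}

// Node lower-cases incoming header names, so lookups use lower-case keys.
function extractConditionalHeadersSketch(req: RequestLike): ConditionalHeaders {
  const result: ConditionalHeaders = {};
  const ifNoneMatch = firstValue(req.headers['if-none-match']);
  const ifModifiedSince = firstValue(req.headers['if-modified-since']);
  if (ifNoneMatch !== undefined) result.ifNoneMatch = ifNoneMatch;
  if (ifModifiedSince !== undefined) result.ifModifiedSince = ifModifiedSince;
  return result;
}
```

Omitting absent headers, rather than setting them to `undefined`, keeps the result safe to spread into an outgoing request.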

#### 2. Pass Headers to GitHub
```typescript
const { data, metadata, notModified } = await resourceManager.getResourceWithMetadata(
  'image/registry.json',
  conditionalHeaders
);
```

#### 3. GitHub Returns 304 (if unchanged)
When GitHub's content matches the client's cached ETag or hasn't been modified since the specified date, GitHub returns `304 Not Modified` with no body.
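
How `ResourceManager` talks to GitHub is not shown here, so the following is only a sketch of an upstream conditional request under stated assumptions. `FetchLike`, `UpstreamResponse`, `FetchResult`, and `conditionalFetch` are hypothetical names; the fetch function is injected so the logic can be exercised without network access.

```typescript
interface ConditionalHeaders {
  ifNoneMatch?: string;
  ifModifiedSince?: string;
}

// Minimal subset of a Fetch API response used by this sketch.
interface UpstreamResponse {
  status: number;
  headers: { get(name: string): string | null };
  text(): Promise<string>;
}

// Injected fetch function (hypothetical type).
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<UpstreamResponse>;

interface FetchResult {
  notModified: boolean;
  etag: string | null;
  body: string | null; // null when the upstream answered 304
}

async function conditionalFetch(
  url: string,
  conditional: ConditionalHeaders,
  fetchImpl: FetchLike
): Promise<FetchResult> {
  // Forward only the validators the client actually sent.
  const headers: Record<string, string> = {};
  if (conditional.ifNoneMatch) headers['If-None-Match'] = conditional.ifNoneMatch;
  if (conditional.ifModifiedSince) headers['If-Modified-Since'] = conditional.ifModifiedSince;

  const response = await fetchImpl(url, { headers });
  const etag = response.headers.get('etag');

  if (response.status === 304) {
    // No body was transferred, so there is nothing to parse.
    return { notModified: true, etag: etag ?? conditional.ifNoneMatch ?? null, body: null };
  }
  if (response.status !== 200) {
    throw new Error(`Upstream request failed with status ${response.status}`);
  }
  return { notModified: false, etag, body: await response.text() };
}
```

In Node 18 or later, the global `fetch` should be usable as the `fetchImpl` argument, since its response exposes `status`, `headers.get`, and `text()`.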

#### 4. Short-Circuit Response
```typescript
if (notModified) {
  return sendNotModified(res, metadata); // No parsing, immediate 304
}
// Only reached if content changed - process data normally
```
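
The body of `sendNotModified` is not included in this document. As a rough sketch, a 304 response usually repeats the validators and carries no body. The metadata fields (`etag`, `lastModified`) and the names `MetadataSketch`, `ResponseLike`, and `sendNotModifiedSketch` are assumptions made for illustration.

```typescript
// Hypothetical metadata shape; the real ResourceMetadata is not shown here.
interface MetadataSketch {
  etag?: string;
  lastModified?: string;
}

// Minimal stand-in for the Express response methods used by this sketch.
interface ResponseLike {
  setHeader(name: string, value: string): void;
  status(code: number): ResponseLike;
  end(): void;
}

function sendNotModifiedSketch(res: ResponseLike, metadata: MetadataSketch): void {
  // Repeat the validators so the client can refresh its stored entry.
  if (metadata.etag) res.setHeader('ETag', metadata.etag);
  if (metadata.lastModified) res.setHeader('Last-Modified', metadata.lastModified);
  // A 304 response has no body.
  res.status(304).end();
}
```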

### Request Flow

**Before (always fetches):**
```
Client → API: GET /images (If-None-Match: "abc123")
API → GitHub: GET registry.json (no conditional headers)
GitHub → API: 200 OK + 300KB JSON body
API: Parse JSON (expensive)
API: Check client ETag matches
API → Client: 304 Not Modified (but work already done!)
```

**After (conditional fetch):**
```
Client → API: GET /images (If-None-Match: "abc123")
API → GitHub: GET registry.json (If-None-Match: "abc123")
GitHub → API: 304 Not Modified (no body!)
API → Client: 304 Not Modified (skipped parsing entirely)
```

## Benefits

1. **Reduced Bandwidth**: No file download when content unchanged
2. **Faster Responses**: No JSON/YAML parsing overhead
3. **Lower GitHub API Usage**: Per GitHub's REST API documentation, an authenticated conditional request that returns 304 does not count against the primary rate limit
4. **Better Resource Utilization**: Server CPU time saved on parsing
5. **Improved Scalability**: Can handle more concurrent clients with same resources

## Implementation Details

### New Types & Interfaces

```typescript
export interface ConditionalHeaders {
  ifNoneMatch?: string;
  ifModifiedSince?: string;
}

export interface ResourceWithMetadata<T = any> {
  data: T;
  metadata: ResourceMetadata;
  notModified?: boolean; // true when upstream returns 304
}
```
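
Because `data` is still present in the type when `notModified` is true, callers have to remember to check the flag first. One possible way to make that check explicit is a tagged union, sketched below. This is an illustration only, not part of the PR; `FetchOutcome`, `toOutcome`, and the `etag` field on `ResourceMetadata` are hypothetical.

```typescript
// Hypothetical metadata shape; the real ResourceMetadata is defined elsewhere.
interface ResourceMetadata {
  etag?: string;
}

interface ResourceWithMetadata<T = unknown> {
  data: T;
  metadata: ResourceMetadata;
  notModified?: boolean;
}

// Tagged union: `data` is only reachable on the 'fresh' branch.
type FetchOutcome<T> =
  | { kind: 'not-modified'; metadata: ResourceMetadata }
  | { kind: 'fresh'; data: T; metadata: ResourceMetadata };

function toOutcome<T>(result: ResourceWithMetadata<T>): FetchOutcome<T> {
  return result.notModified === true
    ? { kind: 'not-modified', metadata: result.metadata }
    : { kind: 'fresh', data: result.data, metadata: result.metadata };
}
```

A route handler could then `switch` on `kind`, and the compiler would reject any access to `data` in the 304 branch.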

### Modified Functions

#### ResourceManager
- `getResourceWithMetadata(filePath, conditionalHeaders?)` - Added optional conditional headers parameter
- `getResourceRawWithMetadata(filePath, conditionalHeaders?)` - Added optional conditional headers parameter
- `getGithubResourceWithMetadata(filePath, conditionalHeaders?)` - Now passes headers to GitHub and handles 304
- `getGithubResourceRawWithMetadata(filePath, conditionalHeaders?)` - Same for raw content

#### CacheManager
- `extractConditionalHeaders(req)` - New helper to extract `If-None-Match`/`If-Modified-Since` from Express request

### Updated Routes
All routes using `getResourceWithMetadata` or `getResourceRawWithMetadata` were updated:
- `/images` and `/images/:id`
- `/extensions/php`, `/extensions/php/cloud`, `/extensions/php/cloud/:id`
- `/regions` and `/regions/:id`
- `/composable`
- `/openapi-spec`
- `/schema/upsun`, `/schema/image-registry`, `/schema/service-versions`, `/schema/runtime-versions`

## Testing

### Manual Test
```bash
# First request - gets full content with ETag
curl -i http://localhost:3000/images

# Note the ETag from response headers:
# ETag: "1773391306186-305001"

# Second request - uses If-None-Match
curl -i -H 'If-None-Match: "1773391306186-305001"' http://localhost:3000/images

# Should return:
# HTTP/1.1 304 Not Modified
# (no body)
```
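
The second request returns 304 because the `If-None-Match` value matches the resource's current ETag. A simplified sketch of that comparison follows, using the weak comparison that HTTP specifies for `If-None-Match`. The function names are hypothetical, and entity tags that contain a comma are deliberately not handled.

```typescript
// Strip the weak-validator prefix so that W/"x" and "x" compare equal.
function opaqueTag(etag: string): string {
  const trimmed = etag.trim();
  return trimmed.startsWith('W/') ? trimmed.slice(2) : trimmed;
}

// Returns true when the If-None-Match header value matches the current ETag,
// i.e. when the server may answer 304 Not Modified.
// Simplification: the header is split on commas.
function etagMatches(ifNoneMatch: string, currentEtag: string): boolean {
  if (ifNoneMatch.trim() === '*') return true; // "*" matches any current representation
  const current = opaqueTag(currentEtag);
  return ifNoneMatch.split(',').some((candidate) => opaqueTag(candidate) === current);
}
```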

### Logs to Monitor
When GitHub mode is active, logs will show:
```
"GitHub returned 304 Not Modified - cache still valid"
```

## Configuration

The optimization is **automatic** when using GitHub-backed resources:

- **Local mode** (`RESOURCES_MODE=local`): Serves files directly from disk; conditional headers are not used and responses always include the full body
- **GitHub mode** (`RESOURCES_MODE=github`): Passes conditional headers to GitHub and returns 304 if GitHub returns 304, skipping download and parse on cache hits

No configuration changes needed - it's transparent to clients.

## Performance Impact

### Expected Improvements (GitHub mode)
- **Cache hit latency**: ~200-500ms network roundtrip (vs 1000-2000ms fetch+parse)
- **Bandwidth savings**: 100% on cache hits (no body transferred)
- **CPU usage**: Eliminates JSON/YAML parsing on cache hits
- **GitHub rate limit**: Per GitHub's REST API documentation, 304 responses to authenticated conditional requests do not count against the primary rate limit

### Measurement
Monitor these metrics:
1. Response time for requests with `If-None-Match` header
2. GitHub API calls per minute
3. Server CPU utilization
4. Ratio of 304 vs 200 responses
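
As a small illustration of metric 4, the ratio can be computed from logged status codes. This is a sketch under the assumption that status codes are available as a plain array; `notModifiedRatio` is a hypothetical helper, not part of the codebase.

```typescript
// Share of 304 responses among all 200/304 responses; other statuses are ignored.
function notModifiedRatio(statusCodes: number[]): number {
  const relevant = statusCodes.filter((code) => code === 200 || code === 304);
  if (relevant.length === 0) return 0; // avoid division by zero
  const hits = relevant.filter((code) => code === 304).length;
  return hits / relevant.length;
}
```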

## Backward Compatibility

✅ **Fully backward compatible**
- Clients without `If-None-Match` header still get 200 responses
- Local mode continues to work as before
- No breaking changes to API responses or behavior

## Future Enhancements

Potential improvements:
1. **Client-side caching**: Build on the `Cache-Control: max-age=300` header that is already set
2. **ETag with content hash**: For even better cache validation
3. **Stale-while-revalidate**: Return stale content while checking for updates in background
4. **Metrics dashboard**: Track cache hit rates and performance gains
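
Items 1 and 3 could meet in the `Cache-Control` header itself. The sketch below clamps the TTL the same way `env.config.ts` does in this PR and optionally appends the standard `stale-while-revalidate` directive. `normalizeTtl` and `buildCacheControl` are hypothetical names, and the actual `setCacheHeaders` may build its header differently.

```typescript
// Same clamping as the CACHE_TTL handling in env.config.ts:
// non-finite or negative values become 0, fractions are floored.
function normalizeTtl(raw: number): number {
  return !Number.isFinite(raw) || raw < 0 ? 0 : Math.floor(raw);
}

// Builds a Cache-Control value from a TTL in seconds and an optional
// stale-while-revalidate window in seconds.
function buildCacheControl(ttlSeconds: number, staleWhileRevalidateSeconds?: number): string {
  const directives = [`max-age=${normalizeTtl(ttlSeconds)}`];
  if (staleWhileRevalidateSeconds !== undefined) {
    directives.push(`stale-while-revalidate=${normalizeTtl(staleWhileRevalidateSeconds)}`);
  }
  return directives.join(', ');
}
```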

## References

- RFC 7232: HTTP/1.1 Conditional Requests (obsoleted by RFC 9110: HTTP Semantics)
- [GitHub ETag documentation](https://docs.github.com/en/rest/overview/resources-in-the-rest-api#conditional-requests)
- Repository memory: `/memories/repo/caching-etag.md`
16 changes: 16 additions & 0 deletions backend/src/config/env.config.ts
@@ -48,6 +48,11 @@ class EnvironmentConfig {
TOKEN?: string;
};

// Cache Configuration
readonly cache: {
TTL: number;
};

// Documentation Configuration
readonly docs: {
BASE_URL: string;
@@ -94,6 +99,17 @@ class EnvironmentConfig {
TOKEN: process.env.GITHUB_TOKEN,
};

const rawCacheTTL = this.getNumber('CACHE_TTL', 300);
const cacheTTL =
!Number.isFinite(rawCacheTTL) || rawCacheTTL < 0
? 0
: Math.floor(rawCacheTTL);

// Cache Configuration (TTL in seconds, non-negative integer)
this.cache = {
TTL: cacheTTL, // 5 minutes by default when env var is not set
};

// Documentation Configuration
this.docs = {
BASE_URL: this.getString('DOCS_URL', 'https://developer.upsun.com/docs'),
5 changes: 5 additions & 0 deletions backend/src/middleware/httpLogger.middleware.ts
@@ -10,6 +10,11 @@ const requestLogger = logger.child({ component: 'HTTP' });
*/
export function httpLogger() {
return (req: Request, res: Response, next: NextFunction) => {
// Skip logging for Socket.IO polling requests (from dev tools/extensions)
if (req.path.startsWith('/socket.io')) {
return next();
}

const startTime = Date.now();

// Log incoming request
15 changes: 12 additions & 3 deletions backend/src/routes/composable.routes.ts
@@ -1,7 +1,7 @@
import { config } from '../config/env.config.js';
import { Request, Response } from 'express';
import { ApiRouter } from '../utils/api.router.js';
import { ResourceManager, logger } from '../utils/index.js';
import { ResourceManager, logger, extractConditionalHeaders, setCacheHeaders, sendNotModified } from '../utils/index.js';
import { sendErrorFormatted, sendFormatted } from '../utils/response.format.js';
import {
ComposableImageDto,
@@ -48,8 +48,14 @@ composableRouter.route({
},
handler: async (req: Request, res: Response) => {
try {
// Load composable resource
const composableData = await resourceManager.getResource('image/composable.json');
// Load composable resource with metadata, supporting conditional requests
const conditionalHeaders = extractConditionalHeaders(req);
const { data: composableData, metadata, notModified } = await resourceManager.getResourceWithMetadata('image/composable.json', conditionalHeaders);

// If upstream returned 304, respond with 304 (avoids unnecessary parsing)
if (notModified) {
return sendNotModified(res, metadata);
}

// Extract the "composable" object from the file
const composableRaw = composableData.composable;
@@ -69,6 +75,9 @@ composableRouter.route({
? ComposableImageSchemaDtoInternal.parse(composableRaw)
: ComposableImageSchemaDtoPublic.parse(composableRaw);

// Set cache headers
setCacheHeaders(res, metadata, config.cache.TTL);

// Send formatted response
sendFormatted<ComposableImageDto>(res, composableParsed);

41 changes: 36 additions & 5 deletions backend/src/routes/extension.routes.ts
@@ -3,7 +3,7 @@ import { Request, Response } from 'express';
import { registry, z } from 'zod';
import { extendZodWithOpenApi } from '@asteasolutions/zod-to-openapi';
import { ApiRouter } from '../utils/api.router.js';
import { ResourceManager, escapeHtml, logger } from '../utils/index.js';
import { ResourceManager, escapeHtml, logger, extractConditionalHeaders, setCacheHeaders, sendNotModified } from '../utils/index.js';
import { ErrorDetailsSchema, HeaderAcceptSchema } from '../schemas/api.schema.js';
import {
RuntimeExtensionListSchema,
@@ -49,7 +49,14 @@ extensionRouter.route({
},
handler: async (req: Request, res: Response) => {
try {
const data = await resourceManager.getResource('extension/php_extensions.json');
const conditionalHeaders = extractConditionalHeaders(req);
const { data, metadata, notModified } = await resourceManager.getResourceWithMetadata('extension/php_extensions.json', conditionalHeaders);

// If upstream returned 304, respond with 304 (avoids unnecessary parsing)
if (notModified) {
return sendNotModified(res, metadata);
}

const baseUrl = `${config.server.BASE_URL}`;

data.cloud = withSelfLink(data.cloud, (id) => `${baseUrl}${PATH}/cloud/${encodeURIComponent(id)}`);
@@ -58,6 +65,9 @@ extensionRouter.route({
_links: { self: `${baseUrl}${PATH}/cloud` }
};

// Set cache headers
setCacheHeaders(res, metadata, config.cache.TTL);

sendFormatted<RuntimeExtensionList>(res, data);
} catch (error: any) {
apiLogger.error({ error: error.message }, 'Failed to read PHP extensions');
@@ -92,12 +102,22 @@ extensionRouter.route({
},
handler: async (req: Request, res: Response) => {
try {
const data = await resourceManager.getResource('extension/php_extensions.json');
const conditionalHeaders = extractConditionalHeaders(req);
const { data, metadata, notModified } = await resourceManager.getResourceWithMetadata('extension/php_extensions.json', conditionalHeaders);

// If upstream returned 304, respond with 304 (avoids unnecessary parsing)
if (notModified) {
return sendNotModified(res, metadata);
}

const cloudExtensions: CloudExtensions = data?.cloud || {};

const baseUrl = `${config.server.BASE_URL}`;
const cloudExtensionsWithLinks = withSelfLink(cloudExtensions, (id) => `${baseUrl}${PATH}/cloud/${encodeURIComponent(id)}`);

// Set cache headers
setCacheHeaders(res, metadata, config.cache.TTL);

sendFormatted<CloudExtensions>(res, cloudExtensionsWithLinks);
} catch (error: any) {
apiLogger.error({ error: error.message }, 'Failed to read PHP Cloud extensions');
@@ -144,17 +164,28 @@ extensionRouter.route({
const { id } = req.params as { id: string };
const imageId = escapeHtml(id);

const data = await resourceManager.getResource('extension/php_extensions.json');
const conditionalHeaders = extractConditionalHeaders(req);
const { data, metadata, notModified } = await resourceManager.getResourceWithMetadata('extension/php_extensions.json', conditionalHeaders);

// If upstream returned 304, respond with 304 (avoids unnecessary parsing)
if (notModified) {
return sendNotModified(res, metadata);
}

const extensionEntry = data?.cloud?.[id];

if (!extensionEntry) {
sendErrorFormatted(res, {
return sendErrorFormatted(res, {
title: 'Extension not found',
detail: `Extension "${imageId}" not found. See extra.availableExtensions for a list of valid extension IDs.`,
status: 404,
extra: { availableExtensions: Object.keys(data?.cloud || {}) }
});
}

// Set cache headers
setCacheHeaders(res, metadata, config.cache.TTL);

sendFormatted<RuntimeExtensionVersion>(res, extensionEntry);
} catch (error: any) {
apiLogger.error({ error: error.message }, 'Failed to read PHP Cloud extensions');