
How to Build a File Upload System That Handles Large Files

File uploads seem simple until they are not. A basic <input type="file"> with a form submission works for profile photos. But the moment you need to handle 500MB video files, multiple concurrent uploads, progress bars, resumable uploads after network failures, and image optimization — you are building a real system.

We have built file upload systems for several projects. MindHyv handles profile images, business logos, and product photos. Trackelio accepts screenshot attachments with user feedback. VincelIO processes creator media kits with large PDF and video files. Each one taught us something about what breaks at scale.

Here is the architecture we have converged on, with working code.

Why Naive Uploads Break

The simplest approach — POST the file to your server, pipe it to storage — fails in predictable ways:

  1. Memory pressure. Your server buffers the entire file in memory. Ten users uploading 200MB files simultaneously means 2GB of RAM consumed just for uploads.
  2. Timeout risk. A 500MB file on a mediocre connection takes minutes. Load balancers, reverse proxies, and serverless function timeouts all conspire to kill long requests.
  3. No resumability. If the connection drops at 95%, the user starts over from zero.
  4. Wasted bandwidth. The file goes client -> your server -> storage. Your server is a middleman that adds latency and cost.

The fix is to stop routing files through your server entirely.

Presigned URLs: Upload Directly to Storage

Presigned URLs let the client upload directly to your storage provider (S3, Supabase Storage, GCS) without your server ever touching the file bytes. Your server just generates a temporary, signed URL that authorizes the upload.

Here is the flow:

Client                    Your API                  Storage (S3/Supabase)
  |                          |                            |
  |-- "I want to upload" --> |                            |
  |                          |-- generate signed URL ---> |
  |  <-- signed URL -------- |                            |
  |                          |                            |
  |-- PUT file directly --------------------------------> |
  |  <-- 200 OK ----------------------------------------- |
  |                          |                            |
  |-- "Upload complete" --> |                            |
  |                          |-- verify & save metadata   |

Your server never handles the file. It just authorizes the upload and records metadata afterward.

Server-Side: Generating Presigned URLs with Supabase

// api/upload/presign.ts
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

interface PresignRequest {
  filename: string;
  contentType: string;
  size: number;
}

const MAX_FILE_SIZE = 500 * 1024 * 1024; // 500MB
const ALLOWED_TYPES = [
  'image/jpeg', 'image/png', 'image/webp',
  'application/pdf',
  'video/mp4', 'video/quicktime',
];

export async function generatePresignedUrl(
  userId: string,
  req: PresignRequest
): Promise<{ uploadUrl: string; filePath: string }> {
  // Validate before generating URL
  if (req.size > MAX_FILE_SIZE) {
    throw new Error(`File too large. Maximum size is ${MAX_FILE_SIZE / 1024 / 1024}MB`);
  }

  if (!ALLOWED_TYPES.includes(req.contentType)) {
    throw new Error(`File type ${req.contentType} is not allowed`);
  }

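  // Generate a unique path to prevent collisions. Keep only a short
  // alphanumeric extension so the filename cannot inject path segments.
  const rawExt = req.filename.includes('.') ? req.filename.split('.').pop() ?? '' : '';
  const ext = /^[a-z0-9]{1,10}$/i.test(rawExt) ? rawExt.toLowerCase() : 'bin';
  const filePath = `${userId}/${crypto.randomUUID()}.${ext}`;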

  const { data, error } = await supabase.storage
    .from('uploads')
    .createSignedUploadUrl(filePath);

  if (error) throw error;

  return {
    uploadUrl: data.signedUrl,
    filePath,
  };
}
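
The same limits can also be checked in the browser before a presigned URL is even requested, which saves a round trip for files that will be rejected anyway. A minimal sketch, assuming the client keeps its own copy of the server-side constants; `FileLike` and `validateBeforeUpload` are hypothetical names, not part of the code above:

```typescript
// Mirrors the server-side limits from the presign endpoint.
const MAX_FILE_SIZE = 500 * 1024 * 1024; // 500MB
const ALLOWED_TYPES = [
  'image/jpeg', 'image/png', 'image/webp',
  'application/pdf',
  'video/mp4', 'video/quicktime',
];

// Structural subset of the browser's File type, so a real File can be passed directly.
interface FileLike {
  size: number;
  type: string;
}

// Returns a user-facing error message, or null when the file looks acceptable.
function validateBeforeUpload(file: FileLike): string | null {
  if (file.size === 0) return 'File is empty';
  if (file.size > MAX_FILE_SIZE) {
    return `File too large. Maximum size is ${MAX_FILE_SIZE / 1024 / 1024}MB`;
  }
  if (ALLOWED_TYPES.indexOf(file.type) === -1) {
    return `File type ${file.type || 'unknown'} is not allowed`;
  }
  return null;
}
```

This check is a convenience for the user only. The server-side validation still has to run, because anything in the browser can be bypassed.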

Client-Side: Uploading with Progress Tracking

// lib/upload.ts
interface UploadOptions {
  file: File;
  onProgress?: (percent: number) => void;
  onComplete?: (filePath: string) => void;
  onError?: (error: Error) => void;
}

export async function uploadFile({ file, onProgress, onComplete, onError }: UploadOptions) {
  try {
    // Step 1: Get presigned URL from our API
    const presignRes = await fetch('/api/upload/presign', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        filename: file.name,
        contentType: file.type,
        size: file.size,
      }),
    });

    if (!presignRes.ok) {
      throw new Error(await presignRes.text());
    }

    const { uploadUrl, filePath } = await presignRes.json();

    // Step 2: Upload directly to storage with progress tracking
    await uploadWithProgress(uploadUrl, file, onProgress);

    // Step 3: Confirm upload with our API
    const confirmRes = await fetch('/api/upload/confirm', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ filePath, filename: file.name, size: file.size }),
    });

    if (!confirmRes.ok) {
      throw new Error(await confirmRes.text());
    }

    onComplete?.(filePath);
  } catch (err) {
    onError?.(err instanceof Error ? err : new Error('Upload failed'));
  }
}

function uploadWithProgress(
  url: string,
  file: File,
  onProgress?: (percent: number) => void
): Promise<void> {
  return new Promise((resolve, reject) => {
    const xhr = new XMLHttpRequest();

    xhr.upload.addEventListener('progress', (e) => {
      if (e.lengthComputable && onProgress) {
        onProgress(Math.round((e.loaded / e.total) * 100));
      }
    });

    xhr.addEventListener('load', () => {
      if (xhr.status >= 200 && xhr.status < 300) {
        resolve();
      } else {
        reject(new Error(`Upload failed with status ${xhr.status}`));
      }
    });

    xhr.addEventListener('error', () => reject(new Error('Network error')));
    xhr.addEventListener('abort', () => reject(new Error('Upload aborted')));

    xhr.open('PUT', url);
    xhr.setRequestHeader('Content-Type', file.type);
    xhr.send(file);
  });
}

Yes, we use XMLHttpRequest for uploads. The Fetch API still does not support upload progress natively in all browsers. XHR is ugly but it works.
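
The /api/upload/confirm endpoint called in step 3 is not shown in this post. Here is a rough sketch of the checks it might perform before marking an upload complete. It assumes a hypothetical `StoredObjectLookup` function that wraps your storage provider's metadata call and resolves to null when nothing exists at the path. All type and function names here are illustrative:

```typescript
// Hypothetical shapes for illustration; none of these names come from a real SDK.
interface ConfirmRequest {
  filePath: string;
  filename: string;
  size: number; // size the client claims to have uploaded
}

interface StoredObject {
  sizeBytes: number;
}

// Looks up the object in storage; resolves to null when nothing exists at the path.
type StoredObjectLookup = (filePath: string) => Promise<StoredObject | null>;

type ConfirmResult =
  | { ok: true; sizeBytes: number }
  | { ok: false; reason: 'forbidden' | 'missing' | 'size_mismatch' };

async function confirmUpload(
  userId: string,
  req: ConfirmRequest,
  lookup: StoredObjectLookup
): Promise<ConfirmResult> {
  // Paths are generated as `${userId}/<uuid>.<ext>`, so the prefix identifies the owner.
  if (req.filePath.indexOf(`${userId}/`) !== 0) {
    return { ok: false, reason: 'forbidden' };
  }

  const stored = await lookup(req.filePath);
  if (!stored) return { ok: false, reason: 'missing' };

  // Compare against the size reported by storage, not just the client's claim.
  if (stored.sizeBytes !== req.size) {
    return { ok: false, reason: 'size_mismatch' };
  }

  return { ok: true, sizeBytes: stored.sizeBytes };
}
```

Only after a result with `ok: true` would the handler write the metadata row, using the size that storage reported.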

Chunked Uploads for Large Files

Presigned URLs work great for files under 100MB. For larger files, you need chunked uploads. The idea: split the file into pieces, upload each piece independently, reassemble on the server.

Why chunked uploads matter:

  • Resumability. If chunk 47 of 50 fails, retry just that chunk.
  • Parallelism. Upload 3-4 chunks simultaneously for faster throughput.
  • Memory efficiency. Only the chunks currently in flight are held in memory, never the whole file.

// lib/chunked-upload.ts
const CHUNK_SIZE = 5 * 1024 * 1024; // 5MB chunks

interface ChunkedUploadOptions {
  file: File;
  uploadId: string;
  onProgress?: (percent: number) => void;
  concurrency?: number;
}

export async function chunkedUpload({
  file,
  uploadId,
  onProgress,
  concurrency = 3,
}: ChunkedUploadOptions) {
  const totalChunks = Math.ceil(file.size / CHUNK_SIZE);
  let completedChunks = 0;

  // Get presigned URLs for all chunks
  const res = await fetch('/api/upload/multipart/urls', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      uploadId,
      filename: file.name,
      contentType: file.type,
      totalChunks,
    }),
  });

  if (!res.ok) {
    throw new Error(`Failed to get chunk URLs: ${res.status}`);
  }

  const { chunkUrls } = await res.json();

  // Upload chunks with controlled concurrency
  const results: { partNumber: number; etag: string }[] = [];
  const queue = Array.from({ length: totalChunks }, (_, i) => i);

  async function processChunk(chunkIndex: number) {
    const start = chunkIndex * CHUNK_SIZE;
    const end = Math.min(start + CHUNK_SIZE, file.size);
    const chunk = file.slice(start, end);

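    const response = await fetch(chunkUrls[chunkIndex], {
      method: 'PUT',
      body: chunk,
    });

    if (!response.ok) {
      throw new Error(`Chunk ${chunkIndex + 1} failed with status ${response.status}`);
    }

    // The bucket's CORS config must expose the ETag header, or this is null
    const etag = response.headers.get('ETag');
    if (!etag) {
      throw new Error(`Missing ETag for chunk ${chunkIndex + 1}`);
    }
    results.push({ partNumber: chunkIndex + 1, etag });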

    completedChunks++;
    onProgress?.(Math.round((completedChunks / totalChunks) * 100));
  }

  // Process with concurrency limit
  const executing = new Set<Promise<void>>();

  for (const chunkIndex of queue) {
    const promise = processChunk(chunkIndex).then(() => {
      executing.delete(promise);
    });
    executing.add(promise);

    if (executing.size >= concurrency) {
      await Promise.race(executing);
    }
  }

  await Promise.all(executing);

  // Complete the multipart upload
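  const completeRes = await fetch('/api/upload/multipart/complete', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      uploadId,
      parts: results.sort((a, b) => a.partNumber - b.partNumber),
    }),
  });

  if (!completeRes.ok) {
    throw new Error(`Failed to complete upload: ${completeRes.status}`);
  }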
}
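
The resumability benefit listed earlier depends on retrying a failed chunk, which the function above does not do on its own. One possible approach is a small retry wrapper with exponential backoff. `withRetry` is a hypothetical helper, and its default attempt count and delay are illustrative rather than tuned values:

```typescript
// Retries an async operation with exponential backoff.
// `attempts` and `baseDelayMs` are illustrative defaults, not tuned values.
async function withRetry<T>(
  operation: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // No delay after the final attempt; otherwise wait 1x, 2x, 4x ... the base delay.
      if (attempt < attempts - 1) {
        const delay = baseDelayMs * Math.pow(2, attempt);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

In the upload loop, `processChunk(chunkIndex)` could then become `withRetry(() => processChunk(chunkIndex))`. This only helps if `processChunk` throws when a PUT fails, for example by checking `response.ok`.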

The concurrency limiter is important. Without it, a 1GB file split into 5MB chunks would fire roughly 200 parallel requests, which thrashes the browser and can get you rate-limited by the storage provider.
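
The chunk count itself is worth computing explicitly, because a fixed 5MB chunk size stops working for very large files. The sketch below assumes S3's documented multipart limits (every part except the last must be at least 5 MiB, and an upload can have at most 10,000 parts). Other providers may differ, and `planChunks` is a hypothetical helper:

```typescript
const MIB = 1024 * 1024;
const MIN_PART_SIZE = 5 * MIB; // S3 minimum for every part except the last
const MAX_PARTS = 10000;       // S3 maximum number of parts per upload

interface ChunkPlan {
  chunkSize: number;
  totalChunks: number;
}

// Picks a chunk size that respects both limits and reports how many chunks result.
function planChunks(fileSize: number, preferredChunkSize: number = MIN_PART_SIZE): ChunkPlan {
  if (fileSize <= 0) throw new Error('fileSize must be positive');
  // Smallest chunk size that keeps the upload within MAX_PARTS.
  const sizeForPartLimit = Math.ceil(fileSize / MAX_PARTS);
  const chunkSize = Math.max(preferredChunkSize, MIN_PART_SIZE, sizeForPartLimit);
  return { chunkSize, totalChunks: Math.ceil(fileSize / chunkSize) };
}
```

For a 1 GiB file this gives 205 chunks of 5 MiB. For a 100 GiB file the chunk size grows automatically so the part count stays within the limit.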

Image Optimization on Upload

For MindHyv, every image uploaded by a business owner gets optimized before it reaches the end user. We do not optimize on upload (that would slow down the upload experience). We optimize on first access using a serverless function.

// api/images/[...path].ts
// On-demand image optimization with caching
// Note: `storage` is an app-level storage wrapper that is not shown in this post.
import sharp from 'sharp';

const WIDTHS = [320, 640, 960, 1280, 1920] as const;
type Width = typeof WIDTHS[number];

export async function optimizeImage(
  storagePath: string,
  width: Width,
  format: 'webp' | 'avif' = 'webp'
): Promise<Buffer> {
  // Check cache first
  const cacheKey = `optimized/${width}/${format}/${storagePath}`;
  const cached = await storage.get(cacheKey);
  if (cached) return cached;

  // Fetch original from storage
  const original = await storage.download(storagePath);

  // Optimize
  const optimized = await sharp(original)
    .resize(width, undefined, {
      withoutEnlargement: true,
      fit: 'inside',
    })
    .toFormat(format, {
      quality: format === 'webp' ? 80 : 65,
      effort: format === 'avif' ? 4 : 6,
    })
    .toBuffer();

  // Cache the result
  await storage.put(cacheKey, optimized, {
    contentType: `image/${format}`,
    cacheControl: 'public, max-age=31536000, immutable',
  });

  return optimized;
}

This means the original high-resolution image is always preserved, and we serve optimized variants on demand. The first request for a given size/format combo is slow (200-500ms for the optimization). Every subsequent request is served from cache.
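
Because each size/format combination is cached separately, the request handler should map arbitrary inputs onto the small fixed set of variants, or the cache fills up with near-duplicates. A minimal sketch of that mapping follows. `pickWidth` and `pickFormat` are hypothetical helpers, and falling back to WebP assumes every client you serve can display it:

```typescript
const WIDTHS = [320, 640, 960, 1280, 1920] as const;
type Width = typeof WIDTHS[number];

// Round a requested width up to the nearest allowed variant, capped at the largest one.
function pickWidth(requested: number): Width {
  let chosen: Width = WIDTHS[0];
  for (const w of WIDTHS) {
    chosen = w;
    if (w >= requested) break;
  }
  return chosen;
}

// Prefer AVIF when the Accept header advertises it, otherwise fall back to WebP.
function pickFormat(acceptHeader: string | undefined): 'avif' | 'webp' {
  return (acceptHeader || '').indexOf('image/avif') !== -1 ? 'avif' : 'webp';
}
```

A handler could then call `optimizeImage(path, pickWidth(requestedWidth), pickFormat(acceptHeader))`, so at most ten variants exist per original.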

For more on how we layer caching, see our post on caching strategies for web apps.

Database Schema for Upload Tracking

You need a database record for every upload. This is non-negotiable. Storage buckets do not give you searchable metadata, access control history, or association with your domain objects.

create table uploads (
  id uuid primary key default gen_random_uuid(),
  user_id uuid references auth.users(id) not null,
  storage_path text not null unique,
  original_filename text not null,
  content_type text not null,
  size_bytes bigint not null,
  status text not null default 'pending'
    check (status in ('pending', 'uploading', 'complete', 'failed', 'deleted')),
  metadata jsonb default '{}',
  created_at timestamptz default now(),
  completed_at timestamptz
);

-- Index for user's uploads
create index idx_uploads_user_id on uploads(user_id);

-- Index for cleanup jobs (find stale pending uploads)
create index idx_uploads_status_created
  on uploads(status, created_at)
  where status = 'pending';

-- RLS policy: users can only see their own uploads
alter table uploads enable row level security;

create policy "Users can view own uploads"
  on uploads for select
  using (auth.uid() = user_id);

create policy "Users can insert own uploads"
  on uploads for insert
  with check (auth.uid() = user_id);

The status field is critical. Uploads are a multi-step process — the presigned URL is generated (pending), the file is being uploaded (uploading), the upload completes and is confirmed (complete), or something goes wrong (failed). You need to track this so you can clean up orphaned uploads.
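
It can help to make the allowed status changes explicit in application code, so that a late or duplicate request cannot move a deleted upload back to complete. A small sketch is below. The transition graph is an assumption for illustration, since the schema only defines the set of statuses, and `canTransition` is a hypothetical helper:

```typescript
type UploadStatus = 'pending' | 'uploading' | 'complete' | 'failed' | 'deleted';

// Allowed next states for each status. This particular graph is an assumption
// for illustration; the schema only constrains which values are valid.
const TRANSITIONS: Record<UploadStatus, readonly UploadStatus[]> = {
  pending: ['uploading', 'complete', 'failed', 'deleted'],
  uploading: ['complete', 'failed', 'deleted'],
  complete: ['deleted'],
  failed: ['deleted'],
  deleted: [],
};

function canTransition(from: UploadStatus, to: UploadStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}
```

Here pending can move straight to complete because a simple single-request upload never reports an intermediate uploading state, and pending can move to deleted so a cleanup job can retire abandoned rows.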

Cleaning Up Orphaned Uploads

Users close browser tabs. Networks drop. Uploads get abandoned. If you do not clean up, your storage bill grows forever with files nobody uses.

We run a cleanup job on a schedule:

// jobs/cleanup-orphaned-uploads.ts
export async function cleanupOrphanedUploads() {
  const STALE_THRESHOLD_HOURS = 24;

  // Find uploads that have been pending for too long
  const { data: staleUploads, error } = await supabase
    .from('uploads')
    .select('id, storage_path')
    .eq('status', 'pending')
    .lt('created_at', new Date(Date.now() - STALE_THRESHOLD_HOURS * 60 * 60 * 1000).toISOString());

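  if (error) throw error;
  if (!staleUploads?.length) return;

  // Delete from storage
  const paths = staleUploads.map(u => u.storage_path);
  const { error: removeError } = await supabase.storage.from('uploads').remove(paths);
  if (removeError) throw removeError;

  // Update database records only after the files are gone
  const ids = staleUploads.map(u => u.id);
  const { error: updateError } = await supabase
    .from('uploads')
    .update({ status: 'deleted' })
    .in('id', ids);
  if (updateError) throw updateError;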

  console.log(`Cleaned up ${staleUploads.length} orphaned uploads`);
}

Security Considerations

File uploads are an attack vector. Here is what we validate:

1. Content type verification. Do not trust the Content-Type header from the client. Verify the actual file content on the server side after upload.

import { fileTypeFromBuffer } from 'file-type';

async function verifyFileType(storagePath: string, expectedType: string): Promise<boolean> {
  const file = await storage.download(storagePath);
  const buffer = Buffer.from(await file.arrayBuffer());

  // Read magic bytes to determine actual file type
  const detected = await fileTypeFromBuffer(buffer);

  if (!detected) return false;
  return detected.mime === expectedType;
}

2. Filename sanitization. Never use the original filename for storage. Generate a UUID-based path. If you display the original filename to users, sanitize it.

3. Size limits at every layer. Enforce in your presigned URL generation, in your storage bucket configuration, and in your upload client code. Defense in depth.

4. Virus scanning. For user-generated content that will be served to other users, consider running uploaded files through a virus scanner. ClamAV can be integrated into a post-upload processing pipeline.

5. Storage bucket configuration. Make your upload bucket private by default. Serve files through your API with proper authorization checks, or use signed download URLs with short expiry times.
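
For point 2, here is one way to sanitize an original filename before displaying it. This is a sketch rather than a complete defense: `sanitizeDisplayFilename` is a hypothetical helper, the character set it strips is a conservative guess, and the 255-character cap is an illustrative choice.

```typescript
// Produce a display-safe version of a user-supplied filename.
// The 255-character cap is an illustrative choice.
function sanitizeDisplayFilename(original: string, maxLength = 255): string {
  const cleaned = original
    // Keep only the last path segment, dropping any directory components.
    .split(/[\\/]/).pop()!
    // Remove control characters (including NUL and newlines).
    .replace(/[\u0000-\u001f\u007f]/g, '')
    // Replace characters that are unsafe in many filesystems or HTML contexts.
    .replace(/[<>:"|?*]/g, '_')
    .trim();

  // Fall back to a generic name if nothing usable is left.
  const safe = cleaned.length > 0 && cleaned !== '.' && cleaned !== '..' ? cleaned : 'file';
  return safe.slice(0, maxLength);
}
```

The sanitized name is for display and download headers only. The storage path should still be the generated UUID, and the name should still be escaped by your templating layer when rendered.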

The Upload Component

Here is a minimal but functional upload component in Svelte that ties everything together:

<script lang="ts">
  import { uploadFile } from '$lib/upload';

  let files: FileList | null = null;
  let progress = 0;
  let status: 'idle' | 'uploading' | 'complete' | 'error' = 'idle';
  let errorMessage = '';

  async function handleUpload() {
    if (!files?.[0]) return;

    status = 'uploading';
    progress = 0;

    await uploadFile({
      file: files[0],
      onProgress: (p) => (progress = p),
      onComplete: () => (status = 'complete'),
      onError: (err) => {
        status = 'error';
        errorMessage = err.message;
      },
    });
  }
</script>

<div class="upload-zone">
  {#if status === 'idle'}
    <input type="file" bind:files accept="image/*,.pdf,video/mp4,video/quicktime" />
    <button onclick={handleUpload} disabled={!files?.length}>
      Upload
    </button>
  {:else if status === 'uploading'}
    <div class="progress-bar">
      <div class="progress-fill" style="width: {progress}%"></div>
    </div>
    <span>{progress}% uploaded</span>
  {:else if status === 'complete'}
    <p>Upload complete</p>
  {:else if status === 'error'}
    <p class="error">{errorMessage}</p>
    <button onclick={() => (status = 'idle')}>Try again</button>
  {/if}
</div>

What We Would Do Differently

After building upload systems for multiple projects, here is what we wish we had known from the start:

  • Start with presigned URLs from day one. Do not route files through your server “just for now.” That temporary solution becomes permanent, and migrating away is painful.
  • Track uploads in your database from the beginning. Relying solely on storage bucket listings for file management does not scale.
  • Set up cleanup jobs early. Orphaned files accumulate faster than you expect.
  • Build the progress UI before you need it. Users uploading anything over 1MB need visual feedback. Do not ship without it.

File uploads are one of those features that every app needs and few apps get right. The presigned URL + chunked upload pattern we have described here handles everything from avatar photos to 500MB video files. It is the same architecture powering uploads across our projects, and it has held up well.

If you are building an app that needs reliable file handling, reach out at [email protected]. We have done this enough times to get it right the first time.