Deployment Guide

Comprehensive guide for deploying iTensor in production environments, including both frontend and backend components.

Overview

When it's time to publish iTensor for broader use, whether internally or on a public website, you'll need to deploy the documentation (and possibly the frontend UI) and the backend API. This guide covers hosting both on the official public website and outlines the steps and best practices for a smooth deployment.

Deploying the Documentation Site (Frontend)

iTensor's documentation (like this content) and possibly the user interface are built with Next.js (which can produce a static site or be server-rendered). We have a few options for deployment:

Static Export (Recommended for Docs)

If the documentation is primarily static content (Markdown/MDX pages in Docusaurus or static pages in Next.js), we can export it as a static website. Next.js can generate a static build with next export, provided the docs need no server-side rendering. In Next.js 13.3 and later, setting output: 'export' in next.config.js replaces the next export command, which was removed in Next.js 14.

Static sites can be hosted easily on services like GitHub Pages, Netlify, or Vercel. For example, if using Docusaurus for docs, running npm run build and npm run deploy can publish to GitHub Pages automatically. If using MkDocs, mkdocs build generates a site directory (the default output location) that you can upload.

Server-Rendered (if needed)

If the Next.js frontend has dynamic features (perhaps a live editor for tensor input), you might deploy it as a Node.js service. Platforms like Vercel are designed for Next.js – you can push the project to Vercel and it will handle building and serving.

Alternatively, you could run npm run build and then npm run start on a server (this requires a Node environment on the server). Set environment variables appropriately, such as NODE_ENV=production and the API base URL pointing to the backend.

Custom Domain

The official website (for example, itensor.org) can be set up to serve the documentation. If using a static host like GitHub Pages, you'd configure a CNAME record that points to GitHub's servers. If using Netlify or Vercel, you can add a custom domain in their dashboard. Ensure DNS records are updated accordingly. If the documentation is kept separate from the main site, a subdomain such as docs.itensor.org is the likely choice.

Continuous Deployment

It's good practice to automate the deployment. For instance, pushing to the main branch could trigger a GitHub Action that builds the docs and deploys them. Vercel automatically does this if connected to the repo (each push triggers a new build). Having automated deployment ensures the official site is always up-to-date with the latest docs after maintainers push changes.

Deploying the Backend API (Django)

Deploying the Django backend requires a production environment setup. In production, you typically do not use the simple runserver (that's for dev only). Instead, you use a WSGI or ASGI server with Django. Here are the steps and considerations:

1. Choose Hosting Environment

You can deploy Django to a cloud VM (AWS EC2, a DigitalOcean droplet, etc.), to a PaaS (Heroku, cyclic.sh, etc.), or containerize it for Docker/Kubernetes services. For an official site, a cloud VM or container service is likely the better fit because it offers more control over the environment and performance.

2. Production Settings

Update Django settings for production:

  • DEBUG = False
  • Set ALLOWED_HOSTS to your domain (e.g., ['api.itensor.org', 'itensor.org'])
  • Configure a proper database if needed (PostgreSQL is commonly used in prod instead of SQLite)
  • Set up secure settings: a strong SECRET_KEY, and security settings (HTTPS, secure cookies, etc.)
  • Enable CORS for the domain where the frontend is served (if docs site and API are on different subdomains, ensure CORS allows docs domain to call API)
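
As a rough illustration of the settings listed above, the sketch below gathers them in one helper. The function name production_settings, the DJANGO_SECRET_KEY variable name, and the docs.itensor.org origin are assumptions made for this example, not part of the iTensor codebase. CORS_ALLOWED_ORIGINS comes from the third-party django-cors-headers package, which would have to be installed and added to INSTALLED_APPS and MIDDLEWARE separately.

```python
import os


def production_settings(env):
    """Return production overrides as a dict, reading the secret from `env`."""
    secret_key = env.get("DJANGO_SECRET_KEY")  # assumed variable name
    if not secret_key:
        # Fail loudly rather than fall back to a development key.
        raise RuntimeError("DJANGO_SECRET_KEY must be set in production")
    return {
        "DEBUG": False,
        "SECRET_KEY": secret_key,
        "ALLOWED_HOSTS": ["api.itensor.org", "itensor.org"],
        # Redirect plain HTTP to HTTPS and send cookies over HTTPS only.
        "SECURE_SSL_REDIRECT": True,
        "SESSION_COOKIE_SECURE": True,
        "CSRF_COOKIE_SECURE": True,
        # Trust the proxy's X-Forwarded-Proto header so Django knows the
        # original request was HTTPS when TLS terminates at Nginx.
        "SECURE_PROXY_SSL_HEADER": ("HTTP_X_FORWARDED_PROTO", "https"),
        # django-cors-headers setting; the docs origin is an assumption.
        "CORS_ALLOWED_ORIGINS": ["https://docs.itensor.org"],
    }


# In settings.py the overrides could be applied like this:
# globals().update(production_settings(os.environ))
```

Keeping the secret in an environment variable means it never lands in version control, and raising on a missing value makes a misconfigured server fail at startup instead of running with a weak key.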

3. WSGI Server

Use Gunicorn or uWSGI as the app server. For example, install gunicorn (pip install gunicorn), then you can run gunicorn itensor.wsgi:application --bind 0.0.0.0:8000. In a real deployment, you'd integrate this in a service manager or container.

4. Reverse Proxy / Web Server

It's common to put Nginx (or Apache) in front of Gunicorn:

  • Nginx can serve static files (if any) and proxy API requests to Gunicorn
  • This also allows handling TLS (HTTPS) at the Nginx level. You'd get a certificate (using Let's Encrypt for example) for api.itensor.org and itensor.org and configure Nginx to use it, so all traffic is secure
  • Example: Nginx listens on 443 for api.itensor.org and passes requests to Gunicorn running on 127.0.0.1:8000. It can also listen on 443 for the docs domain if the static docs are served from the same machine, though the docs are often hosted on a different service entirely

5. Process Management

If on a VM, use systemd to run Gunicorn as a service that starts on boot and restarts if it crashes. Or use Docker: create a Dockerfile for the Django app, possibly one for Nginx, and use docker-compose or Kubernetes to manage containers.

6. Static Files (Django)

If the Django app has static files (for example, the admin interface or a minimal homepage), run python manage.py collectstatic to gather them, and let Nginx serve them from the collected directory. For the iTensor API, static files are unlikely to matter much unless the DRF browsable API or the admin site is enabled.

7. Scaling

In production, you might run multiple Gunicorn workers (processes) to handle concurrent requests. For CPU-bound tasks like numerical computing, more workers than CPU cores only adds contention, so start with 2-4 workers and adjust based on observed load. Gunicorn can spawn and supervise its own workers; if you run separate instances instead, use a tool like supervisor or systemd to manage them.
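
The WSGI server and worker settings from steps 3 and 7 can be kept in a Gunicorn configuration file, which is itself plain Python. The sketch below is one possible starting point: the helper pick_workers and the 120-second timeout are assumptions for illustration, and the bind address assumes Nginx runs on the same host and proxies to Gunicorn.

```python
# gunicorn.conf.py (Gunicorn reads plain Python variables from this file)
import multiprocessing


def pick_workers(cpu_count, low=2, high=4):
    """Clamp the worker count to the 2-4 starting range suggested above."""
    return max(low, min(high, cpu_count))


# Listen on localhost only; Nginx is expected to proxy public traffic here.
bind = "127.0.0.1:8000"
workers = pick_workers(multiprocessing.cpu_count())
# Tensor computations can be slow; 120 seconds is an assumed starting value.
timeout = 120
# "-" sends logs to stdout/stderr so systemd or Docker can capture them.
accesslog = "-"
errorlog = "-"
```

With this file in the project root, the server could be started with gunicorn -c gunicorn.conf.py itensor.wsgi:application.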

8. Monitoring

Set up monitoring and logging. Ensure logs from Django (especially error logs) are recorded, either in files or in a logging service. Monitor resource usage (CPU, memory), since tensor computations can be intensive. An uptime service or a simple script can confirm that the API is responding; a health check endpoint, such as /api/health/ returning OK, makes this easier.
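
A simple availability probe might look like the following sketch, which uses only the standard library. The URL is an assumption (the guide only suggests an endpoint such as /api/health/), and the opener parameter is a hypothetical hook added so the function can be exercised without network access.

```python
import urllib.error
import urllib.request

# Assumed URL: the guide only suggests an endpoint such as /api/health/.
HEALTH_URL = "https://api.itensor.org/api/health/"


def is_healthy(url=HEALTH_URL, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the endpoint answers with HTTP 200 within `timeout`."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts, and HTTP errors.
        return False
```

A cron job or monitoring agent could call is_healthy() every minute or so and raise an alert after a few consecutive failures.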

9. Domain and Networking

Point a subdomain (e.g., api.itensor.org) to the server's IP. Ensure firewall rules allow HTTP/HTTPS. Use HTTPS in production exclusively (redirect HTTP to HTTPS).

Integration of Frontend and Backend

There are a few ways to structure the official site:

Option 1: Separate Subdomains (Recommended)

Serve the docs and the API on separate subdomains (e.g., docs.itensor.org for the documentation UI and api.itensor.org for the API). This is clean and matches how many projects separate concerns. Just ensure CORS is configured so that the docs site can call the API domain.

Option 2: Single Domain with Path Routing

Serve everything on one domain (e.g., itensor.org). In this case, you might integrate the systems in one of these ways:

  • Django Serving Static Content: Host the static docs on the same server as Django (Django can serve a few static pages). For example, you could build the Next.js app as a static export, then have Django serve those files at the root URL. The API could then live under /api/. This way, one domain serves both. Having Django serve a whole React app is feasible, but it is less efficient than a CDN or specialized static hosting.
  • Reverse Proxy with Path Routing: Nginx can route itensor.org/api/... to the Django app and itensor.org/docs/... to a different service (or serve static files from disk). This keeps everything on one domain, which is convenient for users (only one base URL) but somewhat more complex to set up. Using subdomains may be easier.

We lean towards subdomains, both for clarity and because each part can then use a dedicated hosting solution. For instance, host the docs on GitHub Pages (at itensor.github.io or a custom domain such as docs.itensor.org) and host the API on an AWS or DigitalOcean server (api.itensor.org). The main domain itensor.org could then either redirect to docs.itensor.org (if we want the docs as the main site) or host a landing page.

Additional Considerations

Deployment of Simulation Modules (Future)

When future modules (like simulations) are added, they might introduce long-running tasks or additional services (for example, a background worker processing simulations). Deployment architecture might then involve additional components:

  • A task queue (Celery + Redis, etc.) for handling asynchronous jobs
  • More complex database needs for storing simulation state

For now, deploying the core system as described is sufficient, but we keep in mind that the architecture might evolve.

SEO and Discoverability

When deploying an official site, ensure we handle SEO if relevant:

  • The documentation should be crawlable (static sites help with that). Use proper meta tags and a sitemap if possible.
  • If using Next.js, consider using next build with prerendering for docs pages so that search engines can index the content.
  • The landing page (if any) should clearly direct users to documentation and possibly a "Get API Key" or "Try it Now" section.
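
If the chosen docs tooling does not generate a sitemap on its own, a small build-time script can produce one. The sketch below follows the sitemaps.org 0.9 format using only the standard library; the function name build_sitemap and the example paths are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, paths):
    """Return sitemap XML (as UTF-8 bytes) listing base_url joined to each path."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path in paths:
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        # Normalize slashes so the base and path join with exactly one "/".
        loc.text = base_url.rstrip("/") + "/" + path.lstrip("/")
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical usage during the docs build:
# with open("sitemap.xml", "wb") as fh:
#     fh.write(build_sitemap("https://docs.itensor.org", ["/", "/deployment"]))
```

The resulting sitemap.xml would be placed at the site root and referenced from robots.txt so that crawlers can find it.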

Testing the Live Deployment

Once deployed, test the live site thoroughly:

  • Can we retrieve the documentation pages?
  • Are all links working?
  • Can the frontend perform a sample tensor operation via the live API?
  • Are HTTPS certificates valid and not showing security warnings?
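
The certificate check in the last item can be scripted. The following sketch, using only the standard library, opens a verified TLS connection and reports how many days remain before the certificate expires. The function names are assumptions for illustration; an invalid or mismatched certificate makes the handshake raise an ssl.SSLCertVerificationError.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Convert a certificate 'notAfter' string into whole days remaining."""
    now = time.time() if now is None else now
    return int((ssl.cert_time_to_seconds(not_after) - now) // 86400)


def certificate_days_left(host, port=443, timeout=5.0):
    """Connect over TLS, verify the certificate, and return days until expiry."""
    context = ssl.create_default_context()  # verifies the chain and hostname
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return days_until_expiry(cert["notAfter"])
```

Calling certificate_days_left("api.itensor.org") after deployment, and again on a schedule, would give early warning before a certificate lapses.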