Google Drive Pipeline Template
Simple pipeline to sync Google Drive documents to the Swiss AI Hub data lake.
⚠️ OAuth Limitation: Google Drive's OAuth flow requires a one-time interactive browser authorization; a service account avoids it. Choose your setup method below.
Setup Options
Option A: Service Account (Recommended for Workspace)
Best for: Google Workspace domains with admin access
Steps:
- Create service account in Google Cloud Console
- Enable "Google Drive API"
- Download JSON key file → save as gdrive-service-account.json
- (Workspace only) Enable domain-wide delegation
- Share Drive folders with service account email
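The last step needs the service account's email address, which is stored in the `client_email` field of the downloaded key file. A minimal sketch for reading it without extra tools; the function name `sa_email` is hypothetical (not part of this template), and the match is a plain text pattern rather than a real JSON parser:

```shell
# Print the client_email field from a service account key file.
# Assumption: the value contains no escaped quotes (true for Google-issued keys).
sa_email() {
  local key_file="$1"
  # Fail early if the key file is missing or unreadable.
  [ -r "$key_file" ] || { echo "cannot read $key_file" >&2; return 1; }
  # Capture everything between the quotes that follow "client_email":
  sed -n 's/.*"client_email"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$key_file"
}
```

Running `sa_email gdrive-service-account.json` should print the address to share the Drive folders with.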
Configure:
bash
# .env
RCLONE_GDRIVE_NAME=gdrive
RCLONE_GDRIVE_TYPE=drive
RCLONE_GDRIVE_SERVICE_ACCOUNT_FILE=/secrets/gdrive-service-account.json
Mount key file in infra/docker-compose.dev.yml:
yaml
rclone:
  volumes:
    - ./gdrive-service-account.json:/secrets/gdrive-service-account.json:ro
Option B: OAuth Token (Works for Personal Accounts)
Best for: Personal Gmail accounts or simpler setup
Steps:
One-time authorization (run on your machine):
bash
docker exec -it rclone rclone config
# Choose: n (new remote)
# Name: gdrive
# Storage: drive
# client_id: (your client ID)
# client_secret: (your client secret)
# Follow browser prompts to authorize
Export the token:
bash
docker exec rclone rclone config show gdrive
Save to environment (copy the token field):
bash
# .env
RCLONE_GDRIVE_NAME=gdrive
RCLONE_GDRIVE_TYPE=drive
RCLONE_GDRIVE_CLIENT_ID=your-client-id
RCLONE_GDRIVE_CLIENT_SECRET=your-client-secret
RCLONE_GDRIVE_TOKEN={"access_token":"...","refresh_token":"..."}
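Before running the pipeline, it can help to confirm that the .env file defines every Option B variable and that the pasted token still includes a `refresh_token`, since the access token expires and cannot be renewed without one. A rough sketch; the function name `check_gdrive_env` is hypothetical, and the token check is a simple text match, not JSON validation:

```shell
# Return 0 if the given .env file (default: ./.env) defines all Option B
# variables with non-empty values and the token contains a refresh_token field.
check_gdrive_env() {
  local env_file="${1:-.env}"
  local status=0
  local key
  [ -r "$env_file" ] || { echo "cannot read $env_file" >&2; return 1; }
  for key in RCLONE_GDRIVE_NAME RCLONE_GDRIVE_TYPE \
             RCLONE_GDRIVE_CLIENT_ID RCLONE_GDRIVE_CLIENT_SECRET \
             RCLONE_GDRIVE_TOKEN; do
    # "^KEY=." requires at least one character after the equals sign.
    if ! grep -q "^${key}=." "$env_file"; then
      echo "missing or empty: $key" >&2
      status=1
    fi
  done
  if ! grep -q '^RCLONE_GDRIVE_TOKEN=.*"refresh_token"' "$env_file"; then
    echo "RCLONE_GDRIVE_TOKEN has no refresh_token field" >&2
    status=1
  fi
  return "$status"
}
```

Calling `check_gdrive_env .env` reports each problem on stderr and returns a non-zero status if anything is missing.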
Run Pipeline
bash
make playground