Beginner

Project Setup

In this first lesson, you will create the project structure, install all dependencies, set up Gmail API credentials for OAuth 2.0, configure OpenAI, create the SQLite database schema, and verify that everything is connected. By the end, you will have an initialized project and database, ready for the Flask server that powers your AI email assistant (the server itself is added in Lesson 6).

Architecture Overview

Before writing any code, let us understand the system we are building. The AI email assistant has five main components:

  • Gmail Connector: Authenticates via OAuth 2.0, fetches emails, parses threads and headers, and sends replies through the Gmail API.
  • Classification Engine: Uses an LLM to detect priority levels, assign category tags, and analyze sentiment for every email.
  • Draft Generator: Creates context-aware reply drafts that match your writing tone, with a template system for common responses.
  • Smart Features Module: Generates summaries, extracts action items, schedules follow-up reminders, and produces daily digests.
  • Web Dashboard: A Flask-powered interface for viewing your priority inbox, reviewing drafts, tracking action items, and sending replies.

Data flows through these components as follows:

Incoming Email (Gmail API)
    |
    v
[Email Fetcher] --OAuth 2.0--> [Gmail API]
    |
    v
[Email Parser] --> [SQLite DB]
    |
    +---> [Classifier]     --> priority, category, sentiment
    +---> [Draft Generator] --> suggested replies
    +---> [Smart Features]  --> summaries, action items
    |
    v
[Flask Web Dashboard]
    |
    v
User: Review, Edit, Send
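
The flow in the diagram can be read as a small pipeline: one parsed email goes in, and three independent AI stages each attach their result. The sketch below is illustrative only. `ParsedEmail`, `ProcessedEmail`, `classify`, `draft_reply`, `summarize`, and `process` are hypothetical names, and the keyword rule and string slicing are placeholders for the LLM calls built in later lessons.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedEmail:
    """Hypothetical stand-in for a parsed Gmail message."""
    gmail_id: str
    sender: str
    subject: str
    body_text: str


@dataclass
class ProcessedEmail:
    """Everything the dashboard would need to show for one email."""
    email: ParsedEmail
    classification: dict = field(default_factory=dict)
    draft: str = ""
    summary: str = ""


def classify(email: ParsedEmail) -> dict:
    # Placeholder rule; the real classifier would ask an LLM.
    urgent = "urgent" in email.subject.lower()
    return {"priority": "urgent" if urgent else "normal"}


def draft_reply(email: ParsedEmail) -> str:
    # Placeholder text; the real drafter would generate this with an LLM.
    return f"Hi, thanks for your message about '{email.subject}'."


def summarize(email: ParsedEmail) -> str:
    # Placeholder: the first 80 characters instead of an LLM summary.
    return email.body_text[:80]


def process(email: ParsedEmail) -> ProcessedEmail:
    """Run one parsed email through the three AI stages."""
    return ProcessedEmail(
        email=email,
        classification=classify(email),
        draft=draft_reply(email),
        summary=summarize(email),
    )
```

Because the three stages only read the parsed email and do not depend on each other, each one can be developed and tested in isolation, which is the reason the project keeps them in separate modules.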

Step 1: Create the Project Structure

Create the following directory structure. Each module handles one responsibility:

ai-email-assistant/
├── app/
│   ├── __init__.py
│   ├── config.py          # Configuration and environment variables
│   ├── database.py        # SQLite setup and models
│   ├── gmail/
│   │   ├── __init__.py
│   │   ├── auth.py        # OAuth 2.0 authentication
│   │   ├── client.py      # Gmail API client
│   │   └── parser.py      # Email parsing utilities
│   ├── ai/
│   │   ├── __init__.py
│   │   ├── classifier.py  # Priority, category, sentiment
│   │   ├── drafter.py     # Draft reply generation
│   │   └── smart.py       # Summaries, action items, reminders
│   ├── web/
│   │   ├── __init__.py
│   │   ├── routes.py      # Flask routes
│   │   └── templates/
│   │       ├── base.html
│   │       ├── inbox.html
│   │       └── draft.html
│   └── scheduler.py       # Background job scheduler
├── credentials/            # Gmail OAuth credentials (gitignored)
├── tests/
│   └── test_smoke.py
├── .env.example
├── .gitignore
├── requirements.txt
└── run.py                  # Application entry point

# Create the project directory structure (brace expansion requires bash or zsh)
mkdir -p ai-email-assistant/{app/{gmail,ai,web/templates},credentials,tests}

# Create all __init__.py files
touch ai-email-assistant/app/__init__.py
touch ai-email-assistant/app/gmail/__init__.py
touch ai-email-assistant/app/ai/__init__.py
touch ai-email-assistant/app/web/__init__.py
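
The `mkdir -p` command above depends on shell brace expansion, which is not available in the Windows Command Prompt. As a cross-platform alternative, the same skeleton could be created from Python. This is a minimal sketch; `scaffold`, `PACKAGES`, and `PLAIN_DIRS` are hypothetical names, not part of the project code.

```python
from pathlib import Path

# Package directories from the tree above (each gets an __init__.py).
PACKAGES = ["app", "app/gmail", "app/ai", "app/web"]
# Plain directories that are not Python packages.
PLAIN_DIRS = ["app/web/templates", "credentials", "tests"]


def scaffold(root: str = "ai-email-assistant") -> Path:
    """Create the directory skeleton; safe to run more than once."""
    base = Path(root)
    for rel in PACKAGES + PLAIN_DIRS:
        (base / rel).mkdir(parents=True, exist_ok=True)
    for rel in PACKAGES:
        # touch() creates an empty file and leaves an existing one alone.
        (base / rel / "__init__.py").touch(exist_ok=True)
    return base
```

Calling `scaffold()` from the directory that should contain the project produces the same folders and `__init__.py` files as the shell commands. The remaining files in the tree are created in the steps that follow.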

Step 2: Install Dependencies

Create the requirements file with all packages we need:

# requirements.txt
# Gmail API
google-api-python-client==2.131.0
google-auth-httplib2==0.2.0
google-auth-oauthlib==1.2.0

# OpenAI
openai==1.35.0

# Web framework
flask==3.0.3
flask-cors==4.0.1

# Database
sqlalchemy==2.0.30

# Background jobs
apscheduler==3.10.4

# Utilities
python-dotenv==1.0.1
beautifulsoup4==4.12.3
html2text==2024.2.26

# Development
pytest==8.2.2

# Create a virtual environment and install
cd ai-email-assistant
python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate
pip install -r requirements.txt
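
After installing, it can be useful to confirm that the active environment really contains the pinned versions, for example when the virtual environment was not activated. The sketch below uses the standard-library `importlib.metadata`. `parse_pins` and `check_pins` are hypothetical helpers, and the comparison is a plain string match that only understands `name==version` lines.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(requirements_text: str) -> dict:
    """Return {package: pinned_version} for lines of the form name==version."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, pinned = line.split("==", 1)
            pins[name.strip()] = pinned.strip()
    return pins


def check_pins(pins: dict) -> list:
    """Return human-readable problems; an empty list means every pin matches."""
    problems = []
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if installed != pinned:
            problems.append(f"{name}: installed {installed}, pinned {pinned}")
    return problems
```

A typical call would read the file first, for example `check_pins(parse_pins(open("requirements.txt").read()))`, and treat a non-empty result as a sign that `pip install -r requirements.txt` ran in a different environment.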

Step 3: Set Up Gmail API Credentials

You need a Google Cloud project with the Gmail API enabled. Follow these steps:

  1. Go to console.cloud.google.com and create a new project (or select an existing one).
  2. Navigate to APIs & Services > Library and enable the Gmail API.
  3. Go to APIs & Services > OAuth consent screen, configure it, and add your email as a test user (required while the app is in testing mode). The console asks for this before it lets you create an OAuth client ID.
  4. Go to APIs & Services > Credentials and click Create Credentials > OAuth client ID.
  5. Select Desktop app as the application type and give it a name.
  6. Download the JSON file and save it as credentials/client_secret.json.

Security: Never commit credentials/client_secret.json or credentials/token.json to version control. These files contain your OAuth secrets. The .gitignore we create below excludes them.
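
A quick way to catch a wrong download (for example, a Web application client instead of a Desktop app client) is to inspect the JSON before any OAuth code runs. The sketch below assumes the usual layout of a Desktop app client file, where the settings sit under a top-level `installed` object. `check_client_secret` and `REQUIRED_KEYS` are hypothetical names, and the check never prints the secret itself.

```python
import json
from pathlib import Path

# Keys the OAuth flow relies on (assumed layout of a Desktop app client file).
REQUIRED_KEYS = ("client_id", "client_secret", "auth_uri", "token_uri")


def check_client_secret(path: str = "credentials/client_secret.json") -> list:
    """Return a list of problems; an empty list means the file looks usable."""
    file = Path(path)
    if not file.is_file():
        return [f"{path} does not exist"]
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    # Desktop app clients nest their settings under "installed".
    section = data.get("installed") if isinstance(data, dict) else None
    if not isinstance(section, dict):
        return [f"{path} has no top-level 'installed' object "
                "(was 'Desktop app' selected?)"]
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in section]
```

Running `check_client_secret()` from the project root and getting an empty list is a reasonable signal that Step 3 succeeded. It does not prove the credentials are valid, only that the file has the expected shape.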

Step 4: Configuration Module

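Copy .env.example to .env and fill in your own values, then create the configuration module. It loads environment variables when it is imported and exposes a validate() method that checks the required values: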

# .env.example
OPENAI_API_KEY=sk-your-openai-api-key-here
OPENAI_MODEL=gpt-4o-mini
GMAIL_CREDENTIALS_PATH=credentials/client_secret.json
GMAIL_TOKEN_PATH=credentials/token.json
DATABASE_URL=sqlite:///email_assistant.db
FLASK_SECRET_KEY=change-this-to-a-random-string
FLASK_PORT=5000
POLL_INTERVAL_MINUTES=5
MAX_EMAILS_PER_FETCH=50

# app/config.py
"""Application configuration loaded from environment variables."""
import os
from dataclasses import dataclass, field
from dotenv import load_dotenv

load_dotenv()


@dataclass
class Config:
    """Central configuration for the AI email assistant."""

    # OpenAI
    openai_api_key: str = field(
        default_factory=lambda: os.getenv("OPENAI_API_KEY", "")
    )
    openai_model: str = field(
        default_factory=lambda: os.getenv("OPENAI_MODEL", "gpt-4o-mini")
    )

    # Gmail
    gmail_credentials_path: str = field(
        default_factory=lambda: os.getenv(
            "GMAIL_CREDENTIALS_PATH", "credentials/client_secret.json"
        )
    )
    gmail_token_path: str = field(
        default_factory=lambda: os.getenv(
            "GMAIL_TOKEN_PATH", "credentials/token.json"
        )
    )
    gmail_scopes: list = field(default_factory=lambda: [
        "https://www.googleapis.com/auth/gmail.readonly",
        "https://www.googleapis.com/auth/gmail.send",
        "https://www.googleapis.com/auth/gmail.modify",
    ])

    # Database
    database_url: str = field(
        default_factory=lambda: os.getenv(
            "DATABASE_URL", "sqlite:///email_assistant.db"
        )
    )

    # Flask
    flask_secret_key: str = field(
        default_factory=lambda: os.getenv(
            "FLASK_SECRET_KEY", "dev-secret-change-in-production"
        )
    )
    flask_port: int = field(
        default_factory=lambda: int(os.getenv("FLASK_PORT", "5000"))
    )

    # Scheduler
    poll_interval_minutes: int = field(
        default_factory=lambda: int(os.getenv("POLL_INTERVAL_MINUTES", "5"))
    )
    max_emails_per_fetch: int = field(
        default_factory=lambda: int(os.getenv("MAX_EMAILS_PER_FETCH", "50"))
    )

    def validate(self):
        """Validate required configuration values."""
        if not self.openai_api_key:
            raise ValueError("OPENAI_API_KEY is required. Set it in .env")
        if not os.path.exists(self.gmail_credentials_path):
            raise ValueError(
                f"Gmail credentials not found at {self.gmail_credentials_path}. "
                "Download from Google Cloud Console."
            )
        return True


# Singleton config instance
config = Config()
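
One rough edge in the module above: `int(os.getenv("FLASK_PORT", "5000"))` raises a generic `ValueError` that does not name the variable when the value is not numeric. A small helper could make that failure easier to diagnose. This is a sketch, not part of the lesson code; `env_int` and its `minimum` parameter are hypothetical.

```python
import os


def env_int(name: str, default: int, minimum: int = 1) -> int:
    """Read an integer environment variable, with a descriptive error."""
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default  # unset or blank: fall back to the default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
    if value < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {value}")
    return value
```

With this helper, a field such as `poll_interval_minutes` could use `default_factory=lambda: env_int("POLL_INTERVAL_MINUTES", 5)`, so a typo in .env produces an error that points at the offending variable.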

Step 5: Database Schema

Create the SQLite database with SQLAlchemy. We need tables for emails, classifications, drafts, action items, and follow-up reminders:

# app/database.py
"""SQLite database setup with SQLAlchemy ORM."""
from datetime import datetime
from sqlalchemy import (
    create_engine, Column, Integer, String, Text,
    DateTime, Boolean, Float, ForeignKey
)
# declarative_base lives in sqlalchemy.orm as of SQLAlchemy 2.0
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

from app.config import config

engine = create_engine(config.database_url, echo=False)
SessionLocal = sessionmaker(bind=engine)
Base = declarative_base()


class Email(Base):
    """Stored email metadata from Gmail."""
    __tablename__ = "emails"

    id = Column(Integer, primary_key=True, autoincrement=True)
    gmail_id = Column(String(64), unique=True, nullable=False, index=True)
    thread_id = Column(String(64), index=True)
    subject = Column(String(500))
    sender = Column(String(320))
    sender_name = Column(String(200))
    recipient = Column(String(320))
    date = Column(DateTime)
    snippet = Column(Text)
    body_text = Column(Text)
    body_html = Column(Text)
    labels = Column(String(500))  # Comma-separated Gmail labels
    is_read = Column(Boolean, default=False)
    has_attachments = Column(Boolean, default=False)
    fetched_at = Column(DateTime, default=datetime.utcnow)

    # Relationships
    classification = relationship(
        "Classification", back_populates="email", uselist=False
    )
    drafts = relationship("Draft", back_populates="email")
    action_items = relationship("ActionItem", back_populates="email")


class Classification(Base):
    """LLM-generated email classification."""
    __tablename__ = "classifications"

    id = Column(Integer, primary_key=True, autoincrement=True)
    email_id = Column(Integer, ForeignKey("emails.id"), unique=True)
    priority = Column(String(20))      # urgent, high, normal, low
    category = Column(String(50))      # meeting, task, fyi, personal, etc.
    sentiment = Column(String(20))     # positive, neutral, negative, urgent
    confidence = Column(Float)         # 0.0 to 1.0
    summary = Column(Text)            # One-line summary
    classified_at = Column(DateTime, default=datetime.utcnow)

    email = relationship("Email", back_populates="classification")


class Draft(Base):
    """AI-generated draft replies."""
    __tablename__ = "drafts"

    id = Column(Integer, primary_key=True, autoincrement=True)
    email_id = Column(Integer, ForeignKey("emails.id"))
    content = Column(Text, nullable=False)
    tone = Column(String(50))          # professional, friendly, brief, etc.
    status = Column(String(20), default="pending")  # pending, approved, sent
    created_at = Column(DateTime, default=datetime.utcnow)
    sent_at = Column(DateTime, nullable=True)

    email = relationship("Email", back_populates="drafts")


class ActionItem(Base):
    """Extracted action items from emails."""
    __tablename__ = "action_items"

    id = Column(Integer, primary_key=True, autoincrement=True)
    email_id = Column(Integer, ForeignKey("emails.id"))
    description = Column(Text, nullable=False)
    due_date = Column(DateTime, nullable=True)
    is_completed = Column(Boolean, default=False)
    extracted_at = Column(DateTime, default=datetime.utcnow)

    email = relationship("Email", back_populates="action_items")


class FollowUp(Base):
    """Follow-up reminders."""
    __tablename__ = "follow_ups"

    id = Column(Integer, primary_key=True, autoincrement=True)
    email_id = Column(Integer, ForeignKey("emails.id"))
    reminder_date = Column(DateTime, nullable=False)
    reason = Column(Text)
    is_dismissed = Column(Boolean, default=False)
    created_at = Column(DateTime, default=datetime.utcnow)


def init_db():
    """Create all tables."""
    Base.metadata.create_all(engine)
    print("Database initialized successfully.")


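def get_session():
    """Yield a database session and close it when the caller is done.

    This is a generator: iterate over it, or wrap it with
    contextlib.contextmanager, rather than using the return value
    as a session directly.
    """
    session = SessionLocal()
    try:
        yield session
    finally:
        session.close()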
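
Because `get_session` is a plain generator, it cannot be used directly in a `with` statement. One common pattern is a context manager that also commits on success and rolls back on error. The sketch below is an assumption about how later code might use sessions, not the lesson's own helper. `session_scope` is a hypothetical name, and it accepts any zero-argument factory (such as `SessionLocal`) whose result has `commit()`, `rollback()`, and `close()`.

```python
from contextlib import contextmanager


@contextmanager
def session_scope(session_factory):
    """Provide a session that commits on success and rolls back on error."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        # Runs in both cases, so the connection is always released.
        session.close()
```

With the models above, usage would look like `with session_scope(SessionLocal) as session: session.add(Email(gmail_id="abc123"))`. The commit happens when the block exits without an exception.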

Step 6: Application Entry Point

Create the main entry point. For now it prints the active configuration and initializes the database; starting the Flask server is added in Lesson 6:

# run.py
"""Application entry point."""
from app.config import config
from app.database import init_db


def main():
    """Initialize and start the AI email assistant."""
    print("AI Email Assistant - Starting up...")
    print(f"  OpenAI Model: {config.openai_model}")
    print(f"  Database: {config.database_url}")
    print(f"  Poll Interval: {config.poll_interval_minutes} minutes")

    # Initialize the database
    init_db()

    # Import Flask app (created in Lesson 6)
    # For now, just verify the setup works
    print("\nSetup complete! All components initialized.")
    print(f"Dashboard will be available at http://localhost:{config.flask_port}")


if __name__ == "__main__":
    main()

Step 7: Gitignore and Smoke Test

# .gitignore
venv/
__pycache__/
*.pyc
.env
credentials/
*.db
.DS_Store

# tests/test_smoke.py
"""Smoke tests to verify the setup."""
import os
from dotenv import load_dotenv

load_dotenv()


def test_openai_connection():
    """Verify OpenAI API key works."""
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model=os.getenv("OPENAI_MODEL", "gpt-4o-mini"),
        messages=[{"role": "user", "content": "Say 'hello' in one word."}],
        max_tokens=5
    )
    result = response.choices[0].message.content.strip().lower()
    assert "hello" in result, f"Unexpected response: {result}"
    print(f"OpenAI OK - response: {result}")


def test_database_creation():
    """Verify SQLite database can be created."""
    from app.database import init_db, engine
    from sqlalchemy import inspect

    init_db()
    inspector = inspect(engine)
    tables = inspector.get_table_names()
    expected = ["emails", "classifications", "drafts", "action_items", "follow_ups"]
    for table in expected:
        assert table in tables, f"Missing table: {table}"
    print(f"Database OK - tables: {tables}")


def test_gmail_credentials_exist():
    """Verify Gmail credentials file exists."""
    creds_path = os.getenv(
        "GMAIL_CREDENTIALS_PATH", "credentials/client_secret.json"
    )
    exists = os.path.exists(creds_path)
    print(f"Gmail credentials at {creds_path}: {'FOUND' if exists else 'MISSING'}")
    if not exists:
        print("  Download from Google Cloud Console > APIs & Services > Credentials")


if __name__ == "__main__":
    test_database_creation()
    test_openai_connection()
    test_gmail_credentials_exist()
    print("\nAll smoke tests passed!")
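
# Run the smoke tests from the project root; PYTHONPATH=. makes the app package importable
PYTHONPATH=. python tests/test_smoke.py   # Windows PowerShell: $env:PYTHONPATH="."; python tests/test_smoke.py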
# Expected:
# Database initialized successfully.
# Database OK - tables: ['action_items', 'classifications', 'drafts', ...]
# OpenAI OK - response: hello
# Gmail credentials at credentials/client_secret.json: FOUND
# All smoke tests passed!
Checkpoint: At this point you should have the full project structure created, all dependencies installed, and the database schema initialized. The smoke test should confirm OpenAI connectivity and database creation. If the Gmail credentials file is reported as MISSING, the smoke test still passes, but revisit Step 3 before the next lesson, where the OAuth flow reads that file.
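
To double-check the checkpoint by hand, the database file can be inspected without SQLAlchemy. The sketch below uses only the standard-library `sqlite3` module. `list_tables` is a hypothetical helper, and the default path assumes the `DATABASE_URL` from .env.example, which places `email_assistant.db` in the directory the app was started from.

```python
import sqlite3


def list_tables(db_path: str = "email_assistant.db") -> list:
    """Return the names of all user tables in a SQLite file, sorted by name."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

After `init_db()` has run, the result should contain the five table names defined in app/database.py. Note that `sqlite3.connect` creates an empty file if the path does not exist, so an empty list usually means the script was run from the wrong directory.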

Key Takeaways

  • The project uses a modular structure: gmail, ai, and web are separate packages with clear responsibilities, and scheduler.py is a standalone module.
  • Configuration is loaded from environment variables, and Config.validate() checks the required values so that missing ones fail fast.
  • SQLAlchemy provides an ORM layer over SQLite, making it easy to swap to PostgreSQL later for production.
  • The database schema captures emails, classifications, drafts, action items, and follow-ups as separate but related entities.

What Is Next

In the next lesson, you will build the Gmail integration — OAuth 2.0 authentication, fetching emails with pagination, parsing threads and headers, and storing email metadata in the database.