Project Setup
In this first lesson, you will create the project structure, install all dependencies, set up Gmail API credentials with OAuth 2.0, configure OpenAI, create the SQLite database schema, and verify everything is connected. By the end, you will have a verified foundation for the Flask server that will power your AI email assistant in later lessons.
Architecture Overview
Before writing any code, let us understand the system we are building. The AI email assistant has five main components:
- Gmail Connector: Authenticates via OAuth 2.0, fetches emails, parses threads and headers, and sends replies through the Gmail API.
- Classification Engine: Uses an LLM to detect priority levels, assign category tags, and analyze sentiment for every email.
- Draft Generator: Creates context-aware reply drafts that match your writing tone, with a template system for common responses.
- Smart Features Module: Generates summaries, extracts action items, schedules follow-up reminders, and produces daily digests.
- Web Dashboard: A Flask-powered interface for viewing your priority inbox, reviewing drafts, tracking action items, and sending replies.
Incoming Email (Gmail API)
|
v
[Email Fetcher] --OAuth 2.0--> [Gmail API]
|
v
[Email Parser] --> [SQLite DB]
|
+---> [Classifier] --> priority, category, sentiment
+---> [Draft Generator] --> suggested replies
+---> [Smart Features] --> summaries, action items
|
v
[Flask Web Dashboard]
|
v
User: Review, Edit, Send
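To make the flow in the diagram concrete, here is a minimal sketch of how one email might pass through the three analysis stages. Everything in it is hypothetical: the ProcessedEmail container, the stage functions, and their keyword rules are placeholders, and the real stages in later lessons call an LLM instead. The diagram shows the stages as independent branches; the sketch simply runs them one after another.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessedEmail:
    """Hypothetical container for one email as it moves through the pipeline."""
    subject: str
    body: str
    priority: str = "normal"
    draft: str = ""
    action_items: list[str] = field(default_factory=list)


def classify(email: ProcessedEmail) -> ProcessedEmail:
    # Placeholder rule; the real classifier is LLM-based.
    if "urgent" in email.subject.lower():
        email.priority = "urgent"
    return email


def draft_reply(email: ProcessedEmail) -> ProcessedEmail:
    # Placeholder text; the real draft generator is LLM-based.
    email.draft = f"Re: {email.subject}\n\nThanks, I will get back to you shortly."
    return email


def extract_actions(email: ProcessedEmail) -> ProcessedEmail:
    # Placeholder heuristic: lines starting with "please" count as action items.
    email.action_items = [
        line.strip()
        for line in email.body.splitlines()
        if line.strip().lower().startswith("please")
    ]
    return email


def process(email: ProcessedEmail) -> ProcessedEmail:
    """Run the classifier, draft generator, and smart features in sequence."""
    for stage in (classify, draft_reply, extract_actions):
        email = stage(email)
    return email


if __name__ == "__main__":
    result = process(ProcessedEmail(
        subject="URGENT: contract review",
        body="Hi,\nPlease review the attached contract.\nThanks",
    ))
    print(result.priority, result.action_items)
```

Keeping each stage as a function that takes and returns the same object makes it easy to swap a placeholder for the LLM-backed version later without touching the others.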
Step 1: Create the Project Structure
Create the following directory structure. Each module handles one responsibility:
ai-email-assistant/
├── app/
│ ├── __init__.py
│ ├── config.py # Configuration and environment variables
│ ├── database.py # SQLite setup and models
│ ├── gmail/
│ │ ├── __init__.py
│ │ ├── auth.py # OAuth 2.0 authentication
│ │ ├── client.py # Gmail API client
│ │ └── parser.py # Email parsing utilities
│ ├── ai/
│ │ ├── __init__.py
│ │ ├── classifier.py # Priority, category, sentiment
│ │ ├── drafter.py # Draft reply generation
│ │ └── smart.py # Summaries, action items, reminders
│ ├── web/
│ │ ├── __init__.py
│ │ ├── routes.py # Flask routes
│ │ └── templates/
│ │ ├── base.html
│ │ ├── inbox.html
│ │ └── draft.html
│ └── scheduler.py # Background job scheduler
├── credentials/ # Gmail OAuth credentials (gitignored)
├── tests/
│ └── test_smoke.py
├── .env.example
├── .gitignore
├── requirements.txt
└── run.py # Application entry point
# Create the project directory structure
mkdir -p ai-email-assistant/{app/{gmail,ai,web/templates},credentials,tests}
# Create all __init__.py files
touch ai-email-assistant/app/__init__.py
touch ai-email-assistant/app/gmail/__init__.py
touch ai-email-assistant/app/ai/__init__.py
touch ai-email-assistant/app/web/__init__.py
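The mkdir command above relies on shell brace expansion, which is not available in every shell (for example, the default Windows command prompt). As an alternative, here is a small cross-platform sketch using pathlib. The function name scaffold is an assumption for illustration; the directory and package lists are taken from the tree above.

```python
from pathlib import Path

# Directories and package markers taken from the project tree above.
DIRECTORIES = [
    "app/gmail",
    "app/ai",
    "app/web/templates",
    "credentials",
    "tests",
]
PACKAGES = ["app", "app/gmail", "app/ai", "app/web"]


def scaffold(root: str = "ai-email-assistant") -> Path:
    """Create the directory tree and empty __init__.py files under root."""
    base = Path(root)
    for directory in DIRECTORIES:
        # parents=True creates intermediate folders; exist_ok makes reruns safe.
        (base / directory).mkdir(parents=True, exist_ok=True)
    for package in PACKAGES:
        (base / package / "__init__.py").touch(exist_ok=True)
    return base


if __name__ == "__main__":
    print(f"Created {scaffold().resolve()}")
```

Running the script twice is harmless, because existing directories and files are left in place.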
Step 2: Install Dependencies
Create the requirements file with all packages we need:
# requirements.txt
# Gmail API
google-api-python-client==2.131.0
google-auth-httplib2==0.2.0
google-auth-oauthlib==1.2.0
# OpenAI
openai==1.35.0
# Web framework
flask==3.0.3
flask-cors==4.0.1
# Database
sqlalchemy==2.0.30
# Background jobs
apscheduler==3.10.4
# Utilities
python-dotenv==1.0.1
beautifulsoup4==4.12.3
html2text==2024.2.26
# Development
pytest==8.2.2
# Create a virtual environment and install
cd ai-email-assistant
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
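After installing, it can be useful to confirm that the active environment really contains the pinned versions, for example when the virtual environment was not activated. The following is a sketch using only the standard library; the check_versions helper and the choice of which pins to check are assumptions for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# A few of the pins from requirements.txt above.
PINNED = {
    "google-api-python-client": "2.131.0",
    "openai": "1.35.0",
    "flask": "3.0.3",
    "sqlalchemy": "2.0.30",
}


def check_versions(pinned: dict[str, str]) -> dict[str, str]:
    """Return a status per package: 'ok', 'missing', or a mismatch message."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if installed == expected:
            report[name] = "ok"
        else:
            report[name] = f"found {installed}, expected {expected}"
    return report


if __name__ == "__main__":
    for name, status in check_versions(PINNED).items():
        print(f"{name}: {status}")
```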
Step 3: Set Up Gmail API Credentials
You need a Google Cloud project with the Gmail API enabled. Follow these steps:
- Go to console.cloud.google.com and create a new project (or select an existing one).
- Navigate to APIs & Services > Library and enable the Gmail API.
- Go to APIs & Services > Credentials and click Create Credentials > OAuth client ID.
- Select Desktop app as the application type and give it a name.
- Download the JSON file and save it as credentials/client_secret.json.
- Go to OAuth consent screen and add your email as a test user (required for apps in testing mode).
Warning: Never commit credentials/client_secret.json or credentials/token.json to version control. These files contain your OAuth secrets. The .gitignore we create below excludes them.
Step 4: Configuration Module
Create a .env.example file that documents the environment variables, then the configuration module that loads them and provides a validate() method for checking the required values:
# .env.example
OPENAI_API_KEY=sk-your-openai-api-key-here
OPENAI_MODEL=gpt-4o-mini
GMAIL_CREDENTIALS_PATH=credentials/client_secret.json
GMAIL_TOKEN_PATH=credentials/token.json
DATABASE_URL=sqlite:///email_assistant.db
FLASK_SECRET_KEY=change-this-to-a-random-string
FLASK_PORT=5000
POLL_INTERVAL_MINUTES=5
MAX_EMAILS_PER_FETCH=50
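The FLASK_SECRET_KEY placeholder above should be replaced with a random value in your real .env file. One way to generate one is with Python's secrets module; the helper name generate_secret_key is an assumption for illustration.

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a random hex string suitable for use as FLASK_SECRET_KEY."""
    # token_hex returns two hex characters per byte of randomness.
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into your .env file (not into .env.example).
    print(f"FLASK_SECRET_KEY={generate_secret_key()}")
```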
# app/config.py
"""Application configuration loaded from environment variables."""
import os
from dataclasses import dataclass, field

from dotenv import load_dotenv

load_dotenv()


@dataclass
class Config:
    """Central configuration for the AI email assistant."""

    # OpenAI
    openai_api_key: str = field(
        default_factory=lambda: os.getenv("OPENAI_API_KEY", "")
    )
    openai_model: str = field(
        default_factory=lambda: os.getenv("OPENAI_MODEL", "gpt-4o-mini")
    )

    # Gmail
    gmail_credentials_path: str = field(
        default_factory=lambda: os.getenv(
            "GMAIL_CREDENTIALS_PATH", "credentials/client_secret.json"
        )
    )
    gmail_token_path: str = field(
        default_factory=lambda: os.getenv(
            "GMAIL_TOKEN_PATH", "credentials/token.json"
        )
    )
    gmail_scopes: list = field(default_factory=lambda: [
        "https://www.googleapis.com/auth/gmail.readonly",
        "https://www.googleapis.com/auth/gmail.send",
        "https://www.googleapis.com/auth/gmail.modify",
    ])

    # Database
    database_url: str = field(
        default_factory=lambda: os.getenv(
            "DATABASE_URL", "sqlite:///email_assistant.db"
        )
    )

    # Flask
    flask_secret_key: str = field(
        default_factory=lambda: os.getenv(
            "FLASK_SECRET_KEY", "dev-secret-change-in-production"
        )
    )
    flask_port: int = field(
        default_factory=lambda: int(os.getenv("FLASK_PORT", "5000"))
    )

    # Scheduler
    poll_interval_minutes: int = field(
        default_factory=lambda: int(os.getenv("POLL_INTERVAL_MINUTES", "5"))
    )
    max_emails_per_fetch: int = field(
        default_factory=lambda: int(os.getenv("MAX_EMAILS_PER_FETCH", "50"))
    )

    def validate(self):
        """Validate required configuration values."""
        if not self.openai_api_key:
            raise ValueError("OPENAI_API_KEY is required. Set it in .env")
        if not os.path.exists(self.gmail_credentials_path):
            raise ValueError(
                f"Gmail credentials not found at {self.gmail_credentials_path}. "
                "Download from Google Cloud Console."
            )
        return True


# Singleton config instance
config = Config()
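One detail worth knowing: the int(...) conversions in Config run when the module-level config = Config() is created, so a non-numeric value such as POLL_INTERVAL_MINUTES=five raises a bare ValueError at import time that does not name the variable. The sketch below shows one possible way to produce a clearer message. The env_int helper and its minimum parameter are assumptions for illustration, not part of the lesson's code.

```python
import os


def env_int(name: str, default: int, minimum: int = 1) -> int:
    """Read an integer environment variable, failing with a readable message."""
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        # Name the offending variable instead of surfacing int()'s generic error.
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
    if value < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {value}")
    return value


if __name__ == "__main__":
    os.environ["POLL_INTERVAL_MINUTES"] = "five"
    try:
        env_int("POLL_INTERVAL_MINUTES", 5)
    except ValueError as exc:
        print(f"Configuration error: {exc}")
```

A helper like this could be used inside the default_factory lambdas, for example default_factory=lambda: env_int("FLASK_PORT", 5000).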
Step 5: Database Schema
Create the SQLite database with SQLAlchemy. We need tables for emails, classifications, drafts, action items, and follow-up reminders:
# app/database.py
"""SQLite database setup with SQLAlchemy ORM."""
from datetime import datetime

from sqlalchemy import (
    create_engine, Column, Integer, String, Text,
    DateTime, Boolean, Float, ForeignKey
)
# In SQLAlchemy 2.0, declarative_base lives in sqlalchemy.orm;
# the old sqlalchemy.ext.declarative import path is deprecated.
from sqlalchemy.orm import declarative_base, sessionmaker, relationship

from app.config import config

engine = create_engine(config.database_url, echo=False)
SessionLocal = sessionmaker(bind=engine)
Base = declarative_base()


class Email(Base):
    """Stored email metadata from Gmail."""
    __tablename__ = "emails"

    id = Column(Integer, primary_key=True, autoincrement=True)
    gmail_id = Column(String(64), unique=True, nullable=False, index=True)
    thread_id = Column(String(64), index=True)
    subject = Column(String(500))
    sender = Column(String(320))
    sender_name = Column(String(200))
    recipient = Column(String(320))
    date = Column(DateTime)
    snippet = Column(Text)
    body_text = Column(Text)
    body_html = Column(Text)
    labels = Column(String(500))  # Comma-separated Gmail labels
    is_read = Column(Boolean, default=False)
    has_attachments = Column(Boolean, default=False)
    fetched_at = Column(DateTime, default=datetime.utcnow)

    # Relationships
    classification = relationship(
        "Classification", back_populates="email", uselist=False
    )
    drafts = relationship("Draft", back_populates="email")
    action_items = relationship("ActionItem", back_populates="email")


class Classification(Base):
    """LLM-generated email classification."""
    __tablename__ = "classifications"

    id = Column(Integer, primary_key=True, autoincrement=True)
    email_id = Column(Integer, ForeignKey("emails.id"), unique=True)
    priority = Column(String(20))   # urgent, high, normal, low
    category = Column(String(50))   # meeting, task, fyi, personal, etc.
    sentiment = Column(String(20))  # positive, neutral, negative, urgent
    confidence = Column(Float)      # 0.0 to 1.0
    summary = Column(Text)          # One-line summary
    classified_at = Column(DateTime, default=datetime.utcnow)

    email = relationship("Email", back_populates="classification")


class Draft(Base):
    """AI-generated draft replies."""
    __tablename__ = "drafts"

    id = Column(Integer, primary_key=True, autoincrement=True)
    email_id = Column(Integer, ForeignKey("emails.id"))
    content = Column(Text, nullable=False)
    tone = Column(String(50))  # professional, friendly, brief, etc.
    status = Column(String(20), default="pending")  # pending, approved, sent
    created_at = Column(DateTime, default=datetime.utcnow)
    sent_at = Column(DateTime, nullable=True)

    email = relationship("Email", back_populates="drafts")


class ActionItem(Base):
    """Extracted action items from emails."""
    __tablename__ = "action_items"

    id = Column(Integer, primary_key=True, autoincrement=True)
    email_id = Column(Integer, ForeignKey("emails.id"))
    description = Column(Text, nullable=False)
    due_date = Column(DateTime, nullable=True)
    is_completed = Column(Boolean, default=False)
    extracted_at = Column(DateTime, default=datetime.utcnow)

    email = relationship("Email", back_populates="action_items")


class FollowUp(Base):
    """Follow-up reminders."""
    __tablename__ = "follow_ups"

    id = Column(Integer, primary_key=True, autoincrement=True)
    email_id = Column(Integer, ForeignKey("emails.id"))
    reminder_date = Column(DateTime, nullable=False)
    reason = Column(Text)
    is_dismissed = Column(Boolean, default=False)
    created_at = Column(DateTime, default=datetime.utcnow)


def init_db():
    """Create all tables."""
    Base.metadata.create_all(engine)
    print("Database initialized successfully.")


def get_session():
    """Yield a database session and close it when the caller is done."""
    session = SessionLocal()
    try:
        yield session
    finally:
        session.close()
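To see what the unique=True on classifications.email_id and the ForeignKey declarations mean at the database level, here is a sketch in plain sqlite3 from the standard library (deliberately not SQLAlchemy) with cut-down versions of two tables. The helper names make_db and classify_once are assumptions for illustration. Note that SQLite only enforces foreign keys when PRAGMA foreign_keys is switched on for the connection; the sketch turns it on explicitly.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with simplified emails/classifications tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless this pragma is set.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE emails (
            id INTEGER PRIMARY KEY,
            gmail_id TEXT NOT NULL UNIQUE
        );
        CREATE TABLE classifications (
            id INTEGER PRIMARY KEY,
            email_id INTEGER UNIQUE REFERENCES emails(id),
            priority TEXT
        );
    """)
    return conn


def classify_once(conn: sqlite3.Connection, email_id: int, priority: str) -> bool:
    """Insert a classification; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO classifications (email_id, priority) VALUES (?, ?)",
                (email_id, priority),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = make_db()
    db.execute("INSERT INTO emails (gmail_id) VALUES ('abc123')")
    db.commit()
    print(classify_once(db, 1, "urgent"))  # first classification for email 1
    print(classify_once(db, 1, "low"))     # second one violates UNIQUE
    print(classify_once(db, 999, "low"))   # no such email, violates the foreign key
```

The unique constraint is what makes the Email.classification relationship one-to-one: a second classification for the same email is rejected by the database rather than silently stored.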
Step 6: Application Entry Point
Create the main entry point. For now it prints the active configuration and initializes the database; the Flask server is added in Lesson 6:
# run.py
"""Application entry point."""
from app.config import config
from app.database import init_db


def main():
    """Initialize and start the AI email assistant."""
    print("AI Email Assistant - Starting up...")
    print(f"  OpenAI Model: {config.openai_model}")
    print(f"  Database: {config.database_url}")
    print(f"  Poll Interval: {config.poll_interval_minutes} minutes")

    # Initialize the database
    init_db()

    # Import Flask app (created in Lesson 6)
    # For now, just verify the setup works
    print("\nSetup complete! All components initialized.")
    print(f"Dashboard will be available at http://localhost:{config.flask_port}")


if __name__ == "__main__":
    main()
Step 7: Gitignore and Smoke Test
# .gitignore
venv/
__pycache__/
*.pyc
.env
credentials/
*.db
.DS_Store
# tests/test_smoke.py
"""Smoke tests to verify the setup."""
import os

from dotenv import load_dotenv

load_dotenv()


def test_openai_connection():
    """Verify OpenAI API key works."""
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model=os.getenv("OPENAI_MODEL", "gpt-4o-mini"),
        messages=[{"role": "user", "content": "Say 'hello' in one word."}],
        max_tokens=5
    )
    result = response.choices[0].message.content.strip().lower()
    assert "hello" in result, f"Unexpected response: {result}"
    print(f"OpenAI OK - response: {result}")


def test_database_creation():
    """Verify SQLite database can be created."""
    from app.database import init_db, engine
    from sqlalchemy import inspect

    init_db()
    inspector = inspect(engine)
    tables = inspector.get_table_names()
    expected = ["emails", "classifications", "drafts", "action_items", "follow_ups"]
    for table in expected:
        assert table in tables, f"Missing table: {table}"
    print(f"Database OK - tables: {tables}")


def test_gmail_credentials_exist():
    """Verify Gmail credentials file exists."""
    creds_path = os.getenv(
        "GMAIL_CREDENTIALS_PATH", "credentials/client_secret.json"
    )
    exists = os.path.exists(creds_path)
    print(f"Gmail credentials at {creds_path}: {'FOUND' if exists else 'MISSING'}")
    if not exists:
        print("  Download from Google Cloud Console > APIs & Services > Credentials")


if __name__ == "__main__":
    test_database_creation()
    test_openai_connection()
    test_gmail_credentials_exist()
    print("\nAll smoke tests passed!")
# Run the smoke tests from the project root.
# The -m form puts the current directory on the import path so "from app..." resolves.
python -m tests.test_smoke
# Expected:
# Database initialized successfully.
# Database OK - tables: ['action_items', 'classifications', 'drafts', ...]
# OpenAI OK - response: hello
# Gmail credentials at credentials/client_secret.json: FOUND
# All smoke tests passed!
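The Gmail check in the smoke test only confirms that the credentials file exists. A slightly stricter sketch could also look at its contents. This assumes the usual layout of Google's downloaded OAuth client files, where a Desktop app client is stored under a top-level "installed" key and a web application client under "web". The check_client_secret helper and its status strings are hypothetical.

```python
import json
from pathlib import Path


def check_client_secret(path: str) -> str:
    """Return a short status string describing the OAuth client file at path."""
    file = Path(path)
    if not file.is_file():
        return "missing"
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return "not valid JSON"
    if not isinstance(data, dict):
        return "unrecognized format"
    if "web" in data:
        # Step 3 asks for a Desktop app client, not a web application client.
        return "web application client, expected Desktop app"
    installed = data.get("installed")
    if not isinstance(installed, dict):
        return "unrecognized format"
    absent = sorted({"client_id", "client_secret"} - installed.keys())
    return "ok" if not absent else f"missing fields: {absent}"


if __name__ == "__main__":
    print(check_client_secret("credentials/client_secret.json"))
```

A check like this catches the common mistake of choosing the wrong application type in the Cloud Console before the OAuth flow is attempted in the next lesson.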
Key Takeaways
- The project uses a clean modular structure: gmail, ai, and web are separate packages with clear responsibilities, and scheduler.py is a standalone module for background jobs.
- Configuration is loaded from environment variables, and Config.validate() lets you fail fast at startup when a required value is missing.
- SQLAlchemy provides an ORM layer over SQLite, making it easy to swap to PostgreSQL later for production.
- The database schema captures emails, classifications, drafts, action items, and follow-ups as separate but related entities.
What Is Next
In the next lesson, you will build the Gmail integration — OAuth 2.0 authentication, fetching emails with pagination, parsing threads and headers, and storing email metadata in the database.
Lilly Tech Systems