# FinSim Investment Engine - Comprehensive Documentation & Setup Guide

## 1. Complete Step-by-Step Setup & Execution Guide

**Prerequisites:**
- **Python 3.12** or higher installed on your Windows system.
- The **`uv`** package manager (highly recommended as the project uses a `uv.lock` file).

### Step 1: Open Terminal and Navigate to the Project Directory
Open PowerShell or Command Prompt and navigate to the project folder:
```powershell
cd C:\Users\Fares\OneDrive\Desktop\FinSim\investment_engine
```

### Step 2: Install Dependencies
The project is managed with `uv`, so use it to create the environment and install the locked dependencies.

If `uv` is not installed yet, install it first:
```powershell
pip install uv
```

Now sync the dependencies. This command creates a `.venv` virtual environment in the project folder and installs the exact versions pinned in `uv.lock` (FastAPI, LangChain, Polars, and so on):
```powershell
uv sync
```
*(If you prefer not to use `uv`, you can use standard pip instead: `python -m venv .venv`, then `.\.venv\Scripts\activate`, then `pip install -e .`. Note that pip resolves versions from `pyproject.toml` and does not read `uv.lock`.)*

### Step 3: Configure Environment Variables
The application needs API keys to talk to the AI platform (Groq) and the search platform (SerpApi).
1. Make sure you are in `C:\Users\Fares\OneDrive\Desktop\FinSim\investment_engine`.
2. Create a new text file named exactly `.env` (with a dot at the start).
3. Open `.env` in Notepad or VSCode and paste the following, replacing the placeholders with your actual keys:

```env
GROQ_API_KEY="your_groq_api_key_here"
SERPAPI_API_KEY="your_serpapi_api_key_here"
```
*(Note: Database credentials for the remote CapRover MySQL instance already have default values in `config.py`, so you do not need to add DB keys here unless you want to override them.)*
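
As a quick sanity check before starting the server, a minimal sketch like the one below can report which keys are still unset. The helper name `missing_keys` is hypothetical and not part of the project. It only inspects a mapping such as `os.environ` and does not read `.env` itself (the project's `config.py` handles that), so export the variables first or pass in your own dictionary.

```python
import os

# Key names taken from the .env template above.
REQUIRED_KEYS = ("GROQ_API_KEY", "SERPAPI_API_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required keys that are unset, blank, or still a placeholder.

    Hypothetical helper for illustration; not part of the FinSim code base.
    """
    missing = []
    for key in required:
        # Treat None as empty, then drop surrounding whitespace and quotes.
        value = (env.get(key) or "").strip().strip('"')
        # The template placeholders all start with "your_".
        if not value or value.startswith("your_"):
            missing.append(key)
    return missing


if __name__ == "__main__":
    for key in missing_keys(os.environ):
        print(f"Warning: {key} is not set")
```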

### Step 4: Run the Application
Start the FastAPI server. Because we used `uv`, we can use `uv run` to automatically use the virtual environment without needing to activate it manually.

```powershell
uv run python main.py
```

*Output should look like this:*
```text
Starting FinSim...
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [...]
INFO:     Started server process [...]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
```

### Step 5: Access the Web Interface
1. Open your web browser (Chrome, Edge, etc.).
2. Go to: [http://localhost:8000](http://localhost:8000)
3. You will see the FinSim UI dashboard. You can generate historical scenarios in the left panel and chat with the interactive AI in the right panel.

---

## 2. Granular File Descriptions

### Core Application Layer
- **`app.py`**
  - **Purpose:** The central nervous system of the FastAPI app.
  - **Details:** Mounts the `static/` folder to serve the UI on `/`. It defines the two main POST endpoints: `/generate` (which sequentially calls the Z-score engine, the AI scenario generator, and the database insert functions) and `/chat` (which talks to the interactive agent). It also creates the remote database table on startup if it does not already exist.
- **`main.py`**
  - **Purpose:** The immediate execution point.
  - **Details:** Calls `uvicorn.run("app:app", host="0.0.0.0", port=8000)`. It checks `config.py` upfront to warn you in the terminal if you forgot to set your `GROQ_API_KEY`.
- **`config.py`**
  - **Purpose:** Environment and configuration management.
  - **Details:** Uses `pydantic-settings` and automatically loads the `.env` file. Defines all default values, such as the Groq model name (`llama-3.3-70b-versatile`), the remote MySQL server and credentials for CapRover, and the default parameters for the Z-score logic (such as a 100-day window).
- **`models.py`**
  - **Purpose:** The strict data types (Pydantic).
  - **Details:** Enforces rigid shapes for all data flowing through the app. It holds models for HTTP requests (`GenerateRequest`), internal Z-Score calculations (`ZScoreEvent`), and highly nested JSON structures that the LLM is forced to output (`Scenario`, `ScenarioGenerationResult`).
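
To illustrate what "rigid shapes" means for the LLM output, here is a minimal standard-library sketch that validates a scenario payload. The project itself uses Pydantic; the dataclass below is only a stand-in, and the class name `ScenarioSketch` and the field names (`title`, `narrative`, `best_answer`, `rationale`, `decoys`) are assumptions based on the descriptions in this guide, not the actual definitions in `models.py`.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioSketch:
    """Illustrative stand-in for the Pydantic `Scenario` model (field names assumed)."""

    title: str
    narrative: str
    best_answer: str
    rationale: str
    decoys: tuple

    @classmethod
    def from_json(cls, raw):
        """Parse a JSON string and reject anything that does not match the shape."""
        data = json.loads(raw)
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object")
        text_fields = ("title", "narrative", "best_answer", "rationale")
        for name in text_fields:
            value = data.get(name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"missing or empty field: {name}")
        decoys = data.get("decoys")
        if not isinstance(decoys, list) or len(decoys) != 3:
            raise ValueError("expected exactly 3 decoy answers")
        if not all(isinstance(d, str) for d in decoys):
            raise ValueError("decoy answers must be strings")
        return cls(*(data[name] for name in text_fields), decoys=tuple(decoys))
```

Rejecting malformed output at this boundary means the database layer and the UI never have to handle a half-formed scenario. Pydantic performs the same kind of check declaratively from type annotations.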

### Business Logic (`services/` Directory)
- **`services/zscore_engine.py`**
  - **Purpose:** The high-speed quantitative volatility analyzer.
  - **Details:** Connects to Yahoo Finance (`yfinance`) to pull 5 years of daily stock prices. Uses `polars` (a fast DataFrame library written in Rust) to calculate rolling means, standard deviations, and the resulting Z-scores. Keeps only the data points whose Z-score exceeds the trigger thresholds. It then matches the dates against a hardcoded list of `KNOWN_EVENTS` (e.g. the 2008 Lehman Brothers collapse) to attach real historical context to the data points before returning them.
- **`services/scenario_gen.py`**
  - **Purpose:** Connects to Groq AI to generate MCQs.
  - **Details:** Takes the mathematical events found by `zscore_engine.py` and feeds them to the `llama-3.3-70b-versatile` model via LangChain. A massive system prompt forces the LLM to output pure JSON mapping exactly to the components required by the `Scenario` Pydantic model (Title, paragraph narrative, a best answer with rationale, and 3 decoy answers).
- **`services/database.py`**
  - **Purpose:** MySQL persistence layer.
  - **Details:** Sets up connection pooling to the `scenariodb.caprover.al-arcade.com` server. Includes SQL statements for `init_db()` (table creation) and `insert_scenario()` to log AI-generated MCQs robustly. Exports `get_random_scenario()` specifically for the chatbot to grab quiz questions.
- **`services/chat_agent.py`**
  - **Purpose:** The interactive LangChain ReAct (Reasoning and Acting) bot.
  - **Details:** Creates a conversational agent loop. It gives the AI tools: `@tool SerpApi_Search` for live web lookups (prices/news), and `@tool mcq_scenarios` to fetch DB questions. Maintains temporary session history in a dictionary `_sessions`, ensuring the bot remembers the last 20 messages per user. Complex extraction logic is included to pull the final response string from LangChain's diverse message structures.
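
The rolling Z-score step described for `zscore_engine.py` can be sketched in plain Python. This is not the project's Polars code: the function name `rolling_zscore_events`, the default threshold of 2.0, and the choice to compute the sample standard deviation over a trailing window that includes the current point are all assumptions made for illustration. This guide also does not say whether the engine works on raw prices or on returns, so the input here is just a generic list of numbers.

```python
from statistics import mean, stdev


def rolling_zscore_events(values, window=100, threshold=2.0):
    """Return (index, z) pairs where |z| exceeds the threshold.

    z is computed against the mean and sample standard deviation of the
    trailing `window` values ending at (and including) the current value.
    Illustrative sketch only; names and defaults are assumptions.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    events = []
    for i in range(window - 1, len(values)):
        chunk = values[i - window + 1 : i + 1]
        sigma = stdev(chunk)
        if sigma == 0:
            continue  # flat window: the Z-score is undefined, so skip it
        z = (values[i] - mean(chunk)) / sigma
        if abs(z) > threshold:
            events.append((i, z))
    return events


if __name__ == "__main__":
    series = [1.0] * 9 + [10.0]
    for index, z in rolling_zscore_events(series, window=10):
        print(f"index={index} z={z:.3f}")
```

A vectorized library such as Polars computes the same rolling statistics column-wise, which matters once the series covers several years of daily data.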

### Frontend (`static/` Directory)
- **`static/index.html`**
  - **Purpose:** The user-facing dashboard.
  - **Details:** A clean, zero-dependency HTML file styled entirely with CSS variables (dark theme). On the left, it contains a form matching `models.GenerateRequest` that fires JavaScript `fetch('/generate')` requests. On the right, it implements a scrollable chat UI that tracks session variables and POSTs arrays of strings to `fetch('/chat')`.

### Dev Tools & Meta Files
- **`pyproject.toml`**
  - **Purpose:** Python application package definitions.
  - **Details:** Specifies that the project requires Python >= 3.12 and declares the packages it depends on (fastapi, langchain, yfinance, etc.).
- **`uv.lock`**
  - **Purpose:** The reproducible dependencies file.
  - **Details:** Auto-generated by `uv`, it locks the exact versions and hashes of every package in the dependency tree so that developers sharing the project get the same environment.
- **`.python-version`**
  - **Purpose:** A tiny text file (just says `3.12`) telling version managers like `pyenv` or `uv` to use Python 3.12 by default here.
- **`debug_scenario.py`**
  - **Purpose:** Terminal debugging.
  - **Details:** A manual script to test the LangChain chat agent loop in isolation inside the terminal, skipping the FastAPI and HTML layer entirely. Great for diagnosing AI tool-calling prompt issues.
- **`test_extraction_mock.py`**
  - **Purpose:** Unit testing for parsing LangChain AI formats.
  - **Details:** LangChain AI messages can arrive as plain strings, lists of dicts, or nested objects. This script mocks such responses and runs them through the parsing logic copied from `chat_agent.py` to assert that it extracts plain text in every case without crashing.
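
The kind of defensive extraction this test exercises might look like the sketch below. It is not the code from `chat_agent.py`: the function name `extract_text` and the exact shapes handled (a plain string, a dict with a `text` key, a list mixing the two, or an object exposing a `content` attribute) are assumptions drawn from the description above.

```python
from types import SimpleNamespace


def extract_text(message):
    """Best-effort extraction of plain text from a message-like value.

    Hypothetical sketch; the shapes handled here are assumptions.
    """
    if isinstance(message, str):
        return message
    if isinstance(message, dict):
        # Content blocks are assumed to carry their text under a "text" key.
        return extract_text(message.get("text", ""))
    if isinstance(message, (list, tuple)):
        # Concatenate the text of each part; parts without text yield "".
        return "".join(extract_text(item) for item in message)
    if hasattr(message, "content"):
        # Message objects are assumed to expose their payload as `.content`.
        return extract_text(message.content)
    return ""  # unknown shape: return an empty string instead of raising


if __name__ == "__main__":
    fake = SimpleNamespace(content=[{"type": "text", "text": "Hello, "}, "world"])
    print(extract_text(fake))  # prints "Hello, world"
```

Returning an empty string for unknown shapes keeps the chat endpoint from crashing on an unexpected response, at the cost of hiding the problem, so logging the unrecognized value would be a sensible addition in real code.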
