Updated: 2025

🗂️ FileSense

✨ Reinforcement Learning (RL) Check

FileSense now includes an Epsilon-Greedy Bandit agent that decides when a Gemini API call is worth making.
Note: The project is currently pivoting to local SFT models to avoid Gemini Free Tier limits.

FileSense is essentially a Python script that generates vector embeddings with BAAI/bge-base-en-v1.5 and combines local vector search with Google Gemini to classify and organize files. See Model Comparison.

Quick Overview

FileSense is a self-organizing file organizer that sorts documents by meaning, not just by filename. Unlike standard organizers, it uses SentenceTransformers and FAISS to compare the content of each file against existing folder categories.

🤖 Generative Labeling

Uses Google Gemini to analyze unknown files and auto-create specific folder categories.

🟣 Reinforcement Learning

An Epsilon-Greedy Bandit agent learns a policy that balances speed against accuracy.

🧠 Semantic Search

Vector embeddings capture that "Newton" belongs in "Physics" without explicit rules.

⚡ Live Indexing

The FAISS index is rebuilt dynamically whenever new labels are generated.

🚀 Quick Start:

Double-click FileSense_Launcher.bat to start the app without using the command line.

Demo Video

A short walkthrough demo.

How It Works

The system follows a decision pipeline: Identify → RL Agent Decisions → Search OR Generate → Move.
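To make the pipeline concrete, here is a minimal sketch of the four stages wired together. It is not the code in classify_process_file.py: the function name `organize`, the callables `search`, `allow_generate` and `generate`, and the `threshold` value are all hypothetical stand-ins for the vector search, the RL agent and the Gemini call, and OCR is left out.

```python
import shutil
import tempfile
from pathlib import Path


def organize(path, root, search, allow_generate, generate, threshold=0.8):
    """Identify -> agent decision -> search OR generate -> move (illustrative only)."""
    path = Path(path)
    text = path.read_text(errors="ignore")        # Identify: plain text only, no OCR here
    label, score = search(text)                   # local vector search (stand-in callable)
    if score < threshold and allow_generate(score):
        label = generate(text)                    # generative fallback (stand-in callable)
    target_dir = Path(root) / label
    target_dir.mkdir(parents=True, exist_ok=True)
    # Move: shutil.move returns the destination path
    return Path(shutil.move(str(path), str(target_dir / path.name)))


def demo():
    """Run the sketch on a throwaway file and return the chosen folder name."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "newton_notes.txt"
        src.write_text("Newton's laws of motion")
        moved = organize(
            src,
            Path(tmp) / "sorted",
            search=lambda text: ("Misc", 0.4),    # pretend the index match was weak
            allow_generate=lambda score: True,    # pretend the agent permits the call
            generate=lambda text: "Physics",      # pretend the model proposed a label
        )
        return moved.parent.name


print(demo())
```

Passing the three stages in as callables keeps the sketch runnable offline; in the real project they would be backed by FAISS, the bandit agent and Gemini respectively.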

1️⃣ Semantic Classification

Files are read (as plain text or via OCR), encoded into vectors, and compared against the folder_embeddings.faiss index. High-similarity matches are sorted immediately.
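The matching step can be illustrated with a small pure-Python sketch. This is an assumption-heavy stand-in: the project uses FAISS over BAAI/bge-base-en-v1.5 embeddings, while the example below uses tiny hand-written vectors, a brute-force cosine comparison, and a made-up `SIMILARITY_THRESHOLD` (the real cutoff is not stated here).

```python
import math

# Hypothetical cutoff; the actual threshold used by FileSense is not given in this page.
SIMILARITY_THRESHOLD = 0.8


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(file_vec, folder_vecs, threshold=SIMILARITY_THRESHOLD):
    """Return (best_folder, score, is_confident) for one file embedding."""
    best_folder, best_score = None, -1.0
    for folder, vec in folder_vecs.items():
        score = cosine(file_vec, vec)
        if score > best_score:
            best_folder, best_score = folder, score
    return best_folder, best_score, best_score >= threshold


# Toy 3-dimensional "embeddings"; real ones have hundreds of dimensions.
folders = {"Physics": [0.9, 0.1, 0.0], "Finance": [0.0, 0.2, 0.9]}
folder, score, confident = classify([0.8, 0.2, 0.1], folders)
print(folder, round(score, 3), confident)
```

A file whose best score falls below the threshold is the "low-confidence" case that the later steps deal with.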

2️⃣ The RL Agent (Epsilon-Greedy)

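Before any API call is made, the Epsilon-Greedy Agent evaluates the state and decides whether to rely on the local search result or to allow a generative call (the "Search OR Generate" branch of the pipeline).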

3️⃣ Generative Fallback

If the Agent permits, low-confidence files are sent to Google Gemini, which analyzes them and creates a specific folder category for them.
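For readers unfamiliar with the agent in step 2️⃣, here is a minimal epsilon-greedy bandit sketch. It is a generic textbook version, not the contents of rl_policy.py: the action names `use_search` and `call_gemini`, the epsilon value, and the fixed rewards are all invented for illustration, and the real reward system lives in rl_feedback.py.

```python
import random


class EpsilonGreedyBandit:
    """Generic epsilon-greedy agent with incremental mean value estimates."""

    def __init__(self, actions, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[a])

    def update(self, action, reward):
        # Running average: V <- V + (reward - V) / n
        self.counts[action] += 1
        n = self.counts[action]
        self.values[action] += (reward - self.values[action]) / n


# Made-up rewards: pretend the generative call pays off more for this kind of file.
HYPOTHETICAL_REWARD = {"use_search": 0.3, "call_gemini": 0.7}

agent = EpsilonGreedyBandit(HYPOTHETICAL_REWARD, epsilon=0.1, seed=0)
for _ in range(500):
    action = agent.select()
    agent.update(action, HYPOTHETICAL_REWARD[action])

print(agent.values, agent.counts)
```

With a small epsilon the agent mostly repeats the action it currently rates highest, while the occasional random pick keeps the other estimate from going stale.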

Project Structure

FileSense/
├── scripts/
│   ├── RL/                       # Reinforcement Learning
│   │   ├── rl_policy.py          # Epsilon-Greedy Agent
│   │   ├── rl_feedback.py        # Reward System
│   │   ├── rl_config.py          # Hyperparameters
│   │   └── rl_supabase.py        # Cloud Logs
│   ├── logger/                   # Logging System
│   │   ├── logger.py             # Main Logger
│   │   └── rl_logger.py          # RL Logger
│   ├── classify_process_file.py  # Core Logic
│   ├── generate_label.py         # Gemini Interface
│   ├── create_index.py           # FAISS Indexer
│   ├── extract_text.py           # OCR Engine
│   └── launcher.py               # GUI App
├── evaluation/                   # Metrics
├── landing/                      # Website
└── wiki/                         # Documentation

Core Features

Want to see the data?

I've documented every benchmark, failure, and architecture decision in the Wiki. Check out the Metrics, RL Analysis, and NL vs Keywords study.

📚 Explore the Wiki