Quick Overview
FileSense is a self-organizing file organizer that sorts documents by meaning, not just by filename. Unlike standard organizers, it uses SentenceTransformers embeddings and a FAISS index to match files by context.
🤖 Generative Labeling
Uses Google Gemini to analyze unknown files and auto-create specific folder categories.
🟣 Reinforcement Learning
An Epsilon-Greedy bandit agent learns a policy that balances speed against accuracy.
🧠 Semantic Search
Vector embeddings place "Newton" close to "Physics" without explicit rules.
⚡ Live Indexing
The FAISS index is rebuilt dynamically whenever new labels are generated.
Double-click FileSense_Launcher.bat to start the app instantly, without the command line!
Demo Video
A short walkthrough demo.
How It Works
The system follows a decision pipeline: Identify → RL Agent Decides → Search OR Generate → Move.
1️⃣ Semantic Classification
Files are read (as plain text or via OCR), encoded into vectors, and compared against the folder_embeddings.faiss index. High-similarity matches are sorted immediately.
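The matching step above can be sketched in a few lines. This is a minimal illustration, not the code in classify_process_file.py: it uses plain Python lists and hand-written 3-dimensional toy vectors in place of real SentenceTransformers embeddings, a linear scan in place of FAISS, and the names `cosine`, `best_folder`, `FOLDERS`, and the `threshold` value of 0.5 are all assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)

def best_folder(file_vec, folder_vecs, threshold=0.5):
    """Return (label, score); label is None when the best score is below threshold."""
    best_label, best_score = None, -1.0
    for label, vec in folder_vecs.items():
        score = cosine(file_vec, vec)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score  # low confidence: leave the decision to later stages
    return best_label, best_score

# Hypothetical toy embeddings; real ones would have hundreds of dimensions.
FOLDERS = {
    "Physics": [0.9, 0.1, 0.0],
    "Finance": [0.0, 0.2, 0.9],
}

label, score = best_folder([0.8, 0.2, 0.1], FOLDERS)
print(label, round(score, 3))
```

A `None` label is what marks a file as low-confidence in this sketch, which is the case the later pipeline stages are meant to handle.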
2️⃣ The RL Agent (Epsilon-Greedy)
Before calling any API, the Epsilon-Greedy Agent evaluates the state. It decides whether to:
- Exploit: Use the safest/cheapest known method (Vector Search).
- Explore: Attempt to find a better label using GenAI (if permitted by policy).
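For readers new to bandits, here is a rough sketch of a textbook epsilon-greedy agent. It is not the code in rl_policy.py: the action names `vector_search` and `generate_label`, the reward values, and the default `epsilon` are assumptions made for illustration. With probability epsilon the agent picks a random action (explore); otherwise it picks the action with the highest average reward so far (exploit).

```python
import random

class EpsilonGreedyAgent:
    """Textbook epsilon-greedy bandit over a fixed set of actions."""

    def __init__(self, actions, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in self.actions}    # times each action was rewarded
        self.values = {a: 0.0 for a in self.actions}  # running mean reward per action

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)  # explore: random action
        # exploit: best average reward so far (ties go to the first action listed)
        return max(self.actions, key=lambda a: self.values[a])

    def update(self, action, reward):
        """Fold one observed reward into the running mean for that action."""
        self.counts[action] += 1
        n = self.counts[action]
        self.values[action] += (reward - self.values[action]) / n

# Hypothetical usage: reward values here are made up for the example.
agent = EpsilonGreedyAgent(["vector_search", "generate_label"], epsilon=0.1, seed=42)
agent.update("vector_search", 1.0)   # e.g. fast and correct
agent.update("generate_label", 0.4)  # e.g. correct but slow and costly
print(agent.choose())
```

Keeping a running mean avoids storing the full reward history, so the agent's state stays small no matter how many files it has processed.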
3️⃣ Generative Fallback
If the Agent permits, low-confidence files are sent to Google Gemini to:
- Generate a broad Category Label.
- Create description keywords.
- Update folder_labels.json and rebuild the index.
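The last step of the list above might look roughly like this. The schema of folder_labels.json is not described here, so this sketch assumes a flat JSON object mapping each label to a comma-separated keyword description; the helper name `add_label` is also hypothetical. It returns `True` only when the file actually changed, which is the signal a caller could use to trigger an index rebuild.

```python
import json
import tempfile
from pathlib import Path

def add_label(labels_path, label, keywords):
    """Add or update a label; return True if the file changed (rebuild needed)."""
    path = Path(labels_path)
    # Assumed schema: {"Label": "keyword, keyword, ..."}
    labels = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    description = ", ".join(keywords)
    if labels.get(label) == description:
        return False  # nothing new, so the existing index is still valid
    labels[label] = description
    path.write_text(json.dumps(labels, indent=2, ensure_ascii=False), encoding="utf-8")
    return True

# Demo against a temporary file so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "folder_labels.json"
    first = add_label(demo_path, "Physics", ["newton", "mechanics"])
    second = add_label(demo_path, "Physics", ["newton", "mechanics"])
    saved = json.loads(demo_path.read_text(encoding="utf-8"))

print(first, second, saved)
```

Skipping the write when nothing changed matters because rebuilding an embedding index is usually the expensive part of this step.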
Project Structure
FileSense/
├── scripts/
│   ├── RL/                       # Reinforcement Learning
│   │   ├── rl_policy.py          # Epsilon-Greedy Agent
│   │   ├── rl_feedback.py        # Reward System
│   │   ├── rl_config.py          # Hyperparameters
│   │   └── rl_supabase.py        # Cloud Logs
│   ├── logger/                   # Logging System
│   │   ├── logger.py             # Main Logger
│   │   └── rl_logger.py          # RL Logger
│   ├── classify_process_file.py  # Core Logic
│   ├── generate_label.py         # Gemini Interface
│   ├── create_index.py           # FAISS Indexer
│   ├── extract_text.py           # OCR Engine
│   └── launcher.py               # GUI App
├── evaluation/                   # Metrics
├── landing/                      # Website
└── wiki/                         # Documentation
Core Features
- Semantic Sorting: Classifies documents by meaning using SentenceTransformers.
- RL-Optimized: Adapts to user files over time to minimize expensive API calls.
- AI-Powered Labeling: Creates new categories for unknown files automatically.
Want to see the data?
I've documented every benchmark, failure, and architecture decision in the Wiki. Check out the Metrics, RL Analysis, and NL vs Keywords study.
📚 Explore the Wiki