Invantia

Documentation

Complete Guide to Invantia Platform

📚 Documentation Sections

🚀 Getting Started

What is Invantia?

Invantia is a document intelligence platform that helps you analyze large sets of documents using AI. Unlike traditional search tools, Invantia uses intelligent corpus reduction to find precisely the content you need and package it for AI analysis.

Think of it as a smart document librarian that reads your documents, understands what you're looking for, and prepares perfectly organized summaries for AI assistants like ChatGPT, Claude, or Gemini.

5-Minute Quick Start

Step 1: Upload Documents

Go to the Desktop home page and upload PDF, DOCX, or TXT files. Files are processed entirely in your browser - nothing is uploaded to our servers.

Step 2: Select Documents or Collection

Choose which documents to search, or select a collection if you've organized related documents together.

Step 3: Choose Chat Window Size

Select "Standard" (works with all free AI accounts) or "Large" (for paid subscriptions with bigger paste limits).

Step 4: Ask Your Questions

Type natural language questions like "What are the key findings about climate change?" Invantia uses semantic search to find related content even if exact words don't match.

Step 5: Copy & Paste to AI

Click the "Copy Package" button and paste into ChatGPT, Claude, or any other AI assistant. The AI will analyze the pre-filtered, relevant content.

First-Time User Checklist

  • Upload 1-3 test documents to get familiar with the interface
  • Try a simple search to see how results are organized
  • Create your first collection to group related documents
  • Experiment with different question phrasings to see semantic matching
  • Review the security page to understand data privacy
  • Bookmark the Manage Documents page for easy access

🖥️ Desktop Edition User Guide

Document Management

Uploading Documents

Supported formats: PDF, DOCX, TXT

Upload methods:

  • Click "Choose Files" button
  • Drag and drop files onto the drop zone
  • Select multiple files at once (Ctrl+Click or Cmd+Click)

Processing: Files are parsed in your browser and stored in IndexedDB. Large files (>10MB) may take a minute to process.

Organizing with Collections

Collections let you group related documents:

  • Go to Manage Documents
  • Click "Create Collection"
  • Give it a descriptive name (e.g., "Q3 2024 Financial Reports")
  • Add documents to the collection
  • Use collections in searches to query multiple related documents at once

Document Types

Assign types to documents for better organization:

  • Contract: Legal agreements, terms of service
  • Report: Research reports, financial statements
  • Email: Email threads, correspondence
  • Memo: Internal memos, notes
  • Other: Anything that doesn't fit above categories

Building Queries

Understanding Query Topics

Each "topic" represents a question or area of interest. Invantia searches all documents for content related to each topic and packages the results together.

Writing Effective Questions

Good questions:

  • "What are the financial projections for 2025?"
  • "Summarize the key risks mentioned in the contracts"
  • "Find all references to data privacy requirements"
  • "What are the deliverables and deadlines?"

Less effective:

  • "Tell me about stuff" (too vague)
  • "revenue" (better: "What were the revenue figures?")
  • "Find page 17" (Invantia searches by content, not page numbers)

Semantic Search

Invantia uses semantic expansion to find related content even when exact words don't match:

  • Query: "financial performance" → Also finds: "revenue", "profit", "earnings"
  • Query: "deadlines" → Also finds: "due dates", "milestones", "completion dates"
  • Query: "risks" → Also finds: "concerns", "issues", "challenges", "threats"

Chat Packages

What is a Chat Package?

A chat package is a formatted text bundle containing:

  • Instructions for the AI on how to analyze the content
  • Your original questions
  • Relevant excerpts from your documents (super-chunks)
  • Document metadata (titles, sources)
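
The exact package text is generated for you, but as a rough illustration, assembling one might look like the sketch below. The section labels, delimiters, and field names here are assumptions for illustration, not Invantia's actual output format.

// Hypothetical sketch of package assembly; the real format may differ.
function buildChatPackage(topics, superChunks, charLimit) {
  const header =
    "You are analyzing excerpts from the user's documents.\n" +
    "Answer each question using only the excerpts below.\n\n";
  let body = "";
  for (const topic of topics) {
    body += `## QUESTION: ${topic.question}\n`;
    for (const chunk of superChunks.filter(c => c.topicId === topic.id)) {
      body += `--- Source: ${chunk.documentTitle} ---\n${chunk.text}\n\n`;
    }
  }
  // Truncation is a last resort; packaging normally stays within the tier limit.
  const pkg = header + body;
  return pkg.length <= charLimit ? pkg : pkg.slice(0, charLimit);
}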

Using Chat Packages

  1. Click "Copy Package" button after search completes
  2. Open your preferred AI assistant (ChatGPT, Claude, Gemini, etc.)
  3. Paste the entire package into the chat
  4. Wait for the AI to analyze and respond
  5. Ask follow-up questions to dig deeper

Account Tier Selection

This setting optimizes package size for your AI provider's paste limits:

  • Standard (30k characters): Works with free ChatGPT, Claude, Gemini accounts
  • Large (100k characters): For paid subscriptions (ChatGPT Plus, Claude Pro, Gemini Advanced)

Note: These limits are imposed by AI providers, not Invantia. Invantia Desktop is always free.

Data Backup & Restore

Creating Backups

  1. Go to Desktop home
  2. Look for "Your Library" sidebar
  3. Click "Backup Data" button
  4. A JSON file downloads with all your documents and metadata
  5. Store this file securely (it contains your document content)
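
Under the hood, an export like this needs only standard browser APIs. The sketch below is an illustration; the real button's logic and the backup file's exact schema may differ.

// Sketch: serialize library data and trigger a JSON download client-side.
async function downloadBackup(documents, collections) {
  const backup = {
    version: 1,                         // hypothetical schema version
    exported: new Date().toISOString(),
    documents,
    collections
  };
  const blob = new Blob([JSON.stringify(backup)], { type: "application/json" });
  const url = URL.createObjectURL(blob);
  const a = document.createElement("a");
  a.href = url;
  a.download = "invantia-backup.json";
  a.click();
  URL.revokeObjectURL(url);
}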

Restoring from Backup

  1. Go to Manage Documents
  2. Click "Import Backup" button
  3. Select your previously exported JSON file
  4. Wait for import to complete

⚠️ Important: Backups are stored locally on your device. If you clear browser data without a backup, your documents are permanently lost.

💡 Core Concepts

Intelligent Corpus Reduction

Traditional search returns individual results ranked by relevance. Invantia takes a different approach: corpus reduction.

Instead of showing you a list of search results, Invantia progressively filters your document set down to just the content relevant to your query, then packages it for AI consumption.

Traditional RAG vs. Invantia Super-Chunking

Traditional RAG:

  • Documents chunked into small fragments (512 tokens)
  • Embedding model finds "similar" chunks
  • Top-K chunks sent to LLM
  • ❌ Loses context and co-references
  • ❌ May miss relevant content in lower-ranked chunks

Invantia Super-Chunking:

  • Documents chunked into large sections (2000+ tokens)
  • Hybrid scoring (TF-IDF + semantic expansion)
  • All relevant chunks packaged together
  • ✓ Preserves context and relationships
  • ✓ User reviews what's included before sending to LLM

Hybrid Search Algorithm

Invantia combines multiple scoring mechanisms for accurate results:

  • Exact term matching (100 points): Original query terms found in text
  • Semantic expansion (30 points × similarity): Related terms from co-occurrence analysis
  • Proximity bonus (50 points): Terms appearing close together
  • Document frequency penalty: Common terms weighted less

This hybrid approach balances precision (finding exact matches) with recall (finding semantically similar content).

Collections & Document Types

Invantia provides two orthogonal organizational systems:

  • Collections: Many-to-many groupings of related documents
    Example: "Q3 2024 Board Meeting" collection contains reports, emails, and presentations
  • Document Types: Functional classification of what a document IS
    Example: Same document can be in multiple collections but has one type: "Report"

🔧 Technical Details

Client-Side Architecture

Technology Stack:

  • Storage: IndexedDB API for persistent browser storage
  • Document Parsing: PDF.js (PDF), Mammoth.js (DOCX), native APIs (TXT)
  • Vectorization: Term co-occurrence matrix with TF-IDF-style weighting, in pure JavaScript
  • Search Engine: Custom hybrid scoring algorithm
  • UI Framework: Vanilla JavaScript with Jinja2 templating
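
As a rough illustration, extracting raw text with these libraries might look like the sketch below (Invantia's actual parsing code may differ):

// Sketch: extract raw text from an uploaded File.
// Assumes pdfjsLib (PDF.js) and mammoth (Mammoth.js) are loaded on the page.
async function extractText(file) {
  const buffer = await file.arrayBuffer();
  if (file.name.toLowerCase().endsWith(".pdf")) {
    const pdf = await pdfjsLib.getDocument({ data: buffer }).promise;
    let text = "";
    for (let i = 1; i <= pdf.numPages; i++) {
      const page = await pdf.getPage(i);
      const content = await page.getTextContent();
      text += content.items.map(item => item.str).join(" ") + "\n";
    }
    return text;
  }
  if (file.name.toLowerCase().endsWith(".docx")) {
    const result = await mammoth.extractRawText({ arrayBuffer: buffer });
    return result.value;
  }
  return new TextDecoder().decode(buffer); // plain TXT
}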

IndexedDB Schema

Database: InvantiaDB

Object Stores:

  • documents: { id, filename, fileType, uploadDate, rawText, documentType }
  • chunks: { id, documentId, chunkIndex, text, position }
  • collections: { id, name, createdDate, documentIds[] }
  • vectors: { documentId, matrix: Map<term, Map<coTerm, count>>, termFrequencies, totalTerms, created }
  • metadata: { key, value } // App-level settings
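
A minimal sketch of opening this database with the stores above (key paths and version number are assumptions inferred from the schema):

// Sketch: open InvantiaDB and create object stores on first run.
const request = indexedDB.open("InvantiaDB", 1);
request.onupgradeneeded = (event) => {
  const db = event.target.result;
  db.createObjectStore("documents", { keyPath: "id", autoIncrement: true });
  db.createObjectStore("chunks", { keyPath: "id", autoIncrement: true });
  db.createObjectStore("collections", { keyPath: "id", autoIncrement: true });
  db.createObjectStore("vectors", { keyPath: "documentId" });
  db.createObjectStore("metadata", { keyPath: "key" });
};
request.onsuccess = (event) => {
  const db = event.target.result;
  // db is now ready for read/write transactions.
};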

Search Algorithm Details

Phase 1: Term Extraction & Expansion

  1. Parse user query into terms
  2. For each term, find semantic expansions using the precomputed co-occurrence matrix
  3. Weight original terms at 100 points, expansions at 30 points × similarity

Phase 2: Chunk Scoring

  1. For each chunk, calculate term frequency scores
  2. Apply inverse document frequency penalty
  3. Add proximity bonus for terms appearing within 200 characters
  4. Normalize scores by chunk length

Phase 3: Result Packaging

  1. Rank chunks by total score
  2. Group by document
  3. Format as chat package within account tier limits

Performance Considerations

  • Vectorization: Computed once per document at upload time
  • Search: O(n) where n = total chunks, typically <100ms for 1000 chunks
  • Storage: ~2MB per 100-page PDF document (includes text + vectors)
  • Memory: Processes documents in streaming fashion to handle large files

Browser Compatibility

  • Chrome / Edge: Version 90+ (Fully Supported)
  • Firefox: Version 88+ (Fully Supported)
  • Safari: Version 14+ (Fully Supported)
  • Opera: Version 76+ (Fully Supported)

🔍 Intelligent Corpus Reduction: Our Search Methodology

The Core Problem

Large Language Models face a fundamental constraint: context window limits. Even with massive 100k+ token windows, processing entire document collections becomes impractical. More critically, flooding an LLM with irrelevant content degrades response quality through what researchers call "lost in the middle" effects—the model struggles to identify and use truly relevant passages when buried in noise.

The solution? Intelligent corpus reduction: systematically reducing large document sets to precisely the content needed to answer specific queries. This isn't new thinking—it's a return to principles developed decades ago, adapted for the LLM era.

Classical Foundation: Vector Space Models (1970s-1980s)

Invantia's approach builds on the Vector Space Model (VSM), pioneered by Gerard Salton at Cornell in the 1970s for the SMART information retrieval system. The core insight was elegant: represent documents and queries as vectors in a high-dimensional space where each dimension corresponds to a term. Similarity becomes a geometric problem—documents "close" to the query vector are likely relevant.

Salton introduced term weighting schemes, most famously TF-IDF (Term Frequency-Inverse Document Frequency), which elevated important terms while downweighting common words. A term appearing frequently in one document but rarely across the collection must be significant for that document's topic. This simple heuristic proved remarkably effective and remains foundational to modern search.

Statistical Evolution: Co-occurrence and Context (1990s)

The next evolution recognized that terms don't exist in isolation—they appear in contexts. If "configure" and "GPS" frequently appear near each other across documents, they're semantically related. This insight led to co-occurrence analysis and techniques like Latent Semantic Analysis (LSA, 1990), which used singular value decomposition to discover latent semantic structures.

Invantia builds a simple but effective co-occurrence matrix: for each term, it tracks which other terms appear within a fixed window (±7 tokens). When a user searches for "configure GPS," the system expands the query with contextually related terms like "setup," "initialization," "navigation," and "positioning"—terms that frequently co-occur in the document corpus.

This query expansion dramatically improves recall without requiring neural networks or external embeddings.
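
A minimal sketch of building such a matrix, assuming a simple tokenizer and stopword list (the production tokenizer is surely more careful):

// Sketch: build a co-occurrence matrix with a ±7 token window.
function buildCooccurrenceMatrix(text, windowSize = 7) {
  const stopwords = new Set(["the", "a", "an", "to", "of", "in", "for", "and"]);
  const tokens = (text.toLowerCase().match(/[a-z0-9]+/g) || [])
    .filter(t => !stopwords.has(t));
  const matrix = new Map();
  const termFrequencies = new Map();
  tokens.forEach((term, i) => {
    termFrequencies.set(term, (termFrequencies.get(term) || 0) + 1);
    if (!matrix.has(term)) matrix.set(term, new Map());
    const row = matrix.get(term);
    // Count every neighbor within ±windowSize tokens.
    const start = Math.max(0, i - windowSize);
    const end = Math.min(tokens.length - 1, i + windowSize);
    for (let j = start; j <= end; j++) {
      if (j === i) continue;
      row.set(tokens[j], (row.get(tokens[j]) || 0) + 1);
    }
  });
  return { matrix, termFrequencies, totalTerms: tokens.length };
}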

Why Not Modern Embeddings?

One might ask: why use co-occurrence matrices when we have sophisticated transformer-based embeddings? The answer reveals our key design philosophy: privacy, transparency, and computational efficiency.

Modern embedding models require:

  • Sending documents to external APIs (privacy concern)
  • Large model downloads (computational overhead)
  • Black-box transformations (lack of auditability)

Invantia's co-occurrence approach runs entirely client-side in the browser, requires no external services, and produces explainable results. When "configure" expands to "setup," users can verify this relationship in their own documents. For legal and accounting firms—our target market—this transparency and privacy are non-negotiable.

Hybrid Scoring Algorithm

The core innovation isn't the individual techniques—it's their orchestration for corpus reduction. Invantia employs a hybrid scoring system that balances multiple signals:

Scoring Components:

  • Original Query Terms (100 points each): Exact matches to user-entered terms receive maximum weight. If someone asks about "GPS configuration," chunks containing both terms rank highest.
  • Semantically Expanded Terms (30 points × similarity): Co-occurrence-based expansions contribute proportionally to their similarity score. A term with 0.8 similarity contributes 24 points.
  • Proximity Bonus (up to 50 points): Terms appearing close together (within 200 characters) receive additional weight. This rewards passages where concepts are discussed together, not just mentioned separately.

This creates a ranking cascade: chunks with all original terms and tight clustering rank first (precision), while chunks with related terms still surface (recall). The minimum threshold (30 points) filters noise while preserving relevant content.
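
A sketch of this cascade in code. The weights are the ones listed above; the substring matching and flat proximity bonus are simplifications of whatever the real scorer does.

// Sketch: hybrid scoring for one chunk (terms assumed lowercase).
function scoreChunk(chunkText, originalTerms, expandedTerms, threshold = 30) {
  const text = chunkText.toLowerCase();
  let score = 0;
  const found = [];
  for (const term of originalTerms) {
    if (text.includes(term)) { score += 100; found.push(term); }
  }
  for (const { term, similarity } of expandedTerms) {
    if (text.includes(term)) score += 30 * similarity;
  }
  // Proximity bonus: all original terms within 200 characters of each other.
  if (found.length === originalTerms.length && originalTerms.length > 1) {
    const positions = found.map(t => text.indexOf(t));
    if (Math.max(...positions) - Math.min(...positions) <= 200) score += 50;
  }
  return score >= threshold ? score : 0; // below threshold → filtered out
}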

From Chunks to Super Chunks

After scoring and ranking, Invantia performs intelligent packaging: grouping the top-scored chunks into "super chunks" that fit within the target LLM's context window (30k for free accounts, 100k for paid). This respects the reality that users don't paste individual 2000-character chunks—they need coherent, sized payloads ready for their AI provider.

Critically, super chunks maintain topic boundaries. If a user asks multiple questions, results for each topic are grouped separately, creating a structured package that helps the LLM understand the organizational logic.
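
A simplified sketch of the packing step; the real packer also adds delimiters and metadata, and the names here are illustrative.

// Sketch: pack ranked chunks into super chunks under a character limit,
// keeping each topic's results grouped together.
function packSuperChunks(rankedChunksByTopic, charLimit) {
  const superChunks = [];
  let current = "";
  const flush = () => { if (current) { superChunks.push(current); current = ""; } };
  for (const [topic, chunks] of Object.entries(rankedChunksByTopic)) {
    const header = `=== TOPIC: ${topic} ===\n`;
    if (current.length + header.length > charLimit) flush();
    current += header;
    for (const chunk of chunks) {
      const piece = chunk.text + "\n\n";
      if (current.length + piece.length > charLimit) {
        flush();
        current = header; // repeat the topic header in the next super chunk
      }
      current += piece;
    }
  }
  flush();
  return superChunks;
}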

Deterministic vs. Black-Box Retrieval

Modern RAG (Retrieval-Augmented Generation) systems often use neural retrievers—embedding models that map queries and documents to dense vectors, then retrieve by cosine similarity. This works but has drawbacks:

Neural RAG Limitations:

  • Non-deterministic: Same query may return different results
  • Unauditable: Why did this chunk rank #3? Hard to explain
  • Resource-intensive: Requires GPU inference or API calls
  • Privacy-leaking: Documents leave the user's control

Invantia's Classical Approach:

  • Deterministic: Same query, same documents → same results
  • Auditable: Scoring is transparent—original terms, expanded terms, proximity
  • Lightweight: Pure JavaScript, runs in-browser
  • Privacy-preserving: Documents never leave the device

Search Pipeline Overview

Phase 1: Query Understanding

  1. Extract terms from user's natural language question
  2. Look up the co-occurrence matrix precomputed from the document corpus at upload time (±7 token window)
  3. Expand query terms with semantically related terms from matrix
  4. Weight original terms at 100 points, expansions at 30 points × similarity
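
A sketch of steps 3-4 above, computing similarity as co-occurrence count divided by the query term's total frequency, matching the worked example later on this page (function and parameter names are illustrative):

// Sketch: expand query terms from the co-occurrence matrix.
function expandQuery(queryTerms, matrix, termFrequencies, maxExpansions = 10) {
  const expanded = [];
  const seen = new Set(queryTerms);
  for (const term of queryTerms) {
    const row = matrix.get(term);
    const freq = termFrequencies.get(term);
    if (!row || !freq) continue;
    for (const [related, count] of row) {
      if (seen.has(related)) continue; // originals and earlier expansions win
      seen.add(related);
      expanded.push({ term: related, similarity: count / freq, source: term });
    }
  }
  expanded.sort((a, b) => b.similarity - a.similarity);
  return { originalTerms: queryTerms, expandedTerms: expanded.slice(0, maxExpansions) };
}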

Phase 2: Relevance Scoring

  1. Scan all chunks in selected documents/collections
  2. Calculate score for each chunk based on term matches
  3. Apply proximity bonus for co-located terms (within 200 chars)
  4. Filter chunks below minimum threshold (30 points)
  5. Rank remaining chunks by total score (descending)

Phase 3: Intelligent Packaging

  1. Group top-ranked chunks by topic
  2. Pack into super chunks respecting LLM context window limits
  3. Maintain topic boundaries across super chunks
  4. Format with clear delimiters for LLM consumption
  5. Present as ready-to-paste chat packages

Standing on the Shoulders of Giants

What Invantia demonstrates is that fundamental principles from the golden age of information retrieval (1970s-1990s) remain profoundly relevant. Salton's vector space model, TF-IDF weighting, co-occurrence analysis, query expansion—these weren't superseded by deep learning; they were validated.

The innovation is recognizing that for corpus reduction—the specific task of taking large document sets and reducing them to LLM-sized relevant subsets—you don't need the latest neural architecture. You need:

  1. Query understanding (semantic expansion via co-occurrence)
  2. Relevance ranking (hybrid scoring with multiple signals)
  3. Intelligent packaging (super chunks respecting LLM limits)

These are solved problems. The "new" part is applying them to the LLM workflow, creating a bridge between classical IR and modern AI chat interfaces.

Old Methods, New Context

Invantia's approach isn't revolutionary—it's evolutionary. It takes proven techniques from information retrieval's rich history and applies them to a new problem: preparing document corpora for LLM consumption.

The vector space model is 50 years old. Co-occurrence analysis is 35 years old. But for the task of intelligent corpus reduction—finding the needle in the haystack and presenting it to an AI in a digestible format—these classical methods remain remarkably effective.

As the saying goes: "There's nothing new under the sun." Invantia proves that in the age of transformer models and billion-parameter networks, sometimes the oldest ideas are still the best ones.

References & Further Reading

  • Salton, G., Wong, A., & Yang, C. S. (1975). "A vector space model for automatic indexing." Communications of the ACM, 18(11), 613-620.
  • Salton, G., & Buckley, C. (1988). "Term-weighting approaches in automatic text retrieval." Information Processing & Management, 24(5), 513-523.
  • Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). "Indexing by latent semantic analysis." Journal of the American Society for Information Science, 41(6), 391-407.
  • Church, K. W., & Hanks, P. (1990). "Word association norms, mutual information, and lexicography." Computational Linguistics, 16(1), 22-29.

Understanding Invantia's Co-occurrence Vectors: A Visual Example

Sample Document Text

Let's say you upload a GPS installation manual with this text:

To configure GPS, first access the navigation settings menu. 
The GPS configuration requires entering the coordinates manually.
For automatic positioning, enable the GPS receiver in setup mode.
The navigation system uses satellite signals for positioning accuracy.
Configure the antenna before testing GPS functionality.

What Gets Stored: The Co-occurrence Matrix

Invantia builds a co-occurrence matrix by tracking which words appear near each other (within ±7 tokens).

For the term "GPS":
vectors.matrix.get("gps") = Map {
  "configure"    => 3,  // appears near "gps" 3 times
  "navigation"   => 2,  // appears near "gps" 2 times
  "positioning"  => 1,  // appears near "gps" 1 time
  "settings"     => 1,
  "receiver"     => 1,
  "antenna"      => 1,
  "satellite"    => 0,  // never appears within 7 tokens of "gps"
  "functionality"=> 1
}
For the term "configure":
vectors.matrix.get("configure") = Map {
  "gps"          => 3,
  "navigation"   => 1,
  "settings"     => 1,
  "antenna"      => 1,
  "positioning"  => 0,
  "receiver"     => 0
}
For the term "navigation":
vectors.matrix.get("navigation") = Map {
  "gps"          => 2,
  "settings"     => 1,
  "system"       => 1,
  "positioning"  => 1,
  "satellite"    => 1,
  "configure"    => 1
}

Term Frequency Data

Invantia also tracks how often each term appears:

vectors.termFrequencies = Map {
  "gps"          => 5,  // appears 5 times total
  "configure"    => 3,
  "navigation"   => 2,
  "positioning"  => 2,
  "settings"     => 1,
  "receiver"     => 1,
  "antenna"      => 1,
  "satellite"    => 1,
  "functionality"=> 1,
  // ... etc
}

vectors.totalTerms = 87  // total meaningful terms in document

How Query Expansion Works

When you search for: "How do I configure GPS"

Step 1: Extract Query Terms
queryTerms = ["configure", "gps"]
Step 2: Find Related Terms Using Vectors

For "configure":

// Look up configure in the matrix
matrix.get("configure") = {
  "gps": 3,
  "navigation": 1,
  "settings": 1,
  "antenna": 1
}

// Calculate similarity scores (co-occurrence count / total appearances)
expandedTerms = [
  { term: "gps",        similarity: 3/3 = 1.0 },
  { term: "navigation", similarity: 1/3 = 0.33 },
  { term: "settings",   similarity: 1/3 = 0.33 },
  { term: "antenna",    similarity: 1/3 = 0.33 }
]

For "gps":

// Look up gps in the matrix
matrix.get("gps") = {
  "configure": 3,
  "navigation": 2,
  "positioning": 1,
  "settings": 1,
  "receiver": 1,
  "antenna": 1,
  "functionality": 1
}

// Calculate similarity scores
expandedTerms = [
  { term: "configure",     similarity: 3/5 = 0.60 },
  { term: "navigation",    similarity: 2/5 = 0.40 },
  { term: "positioning",   similarity: 1/5 = 0.20 },
  { term: "settings",      similarity: 1/5 = 0.20 },
  { term: "receiver",      similarity: 1/5 = 0.20 },
  { term: "antenna",       similarity: 1/5 = 0.20 },
  { term: "functionality", similarity: 1/5 = 0.20 }
]
Step 3: Combine & Deduplicate
finalExpandedQuery = {
  originalTerms: ["configure", "gps"],
  
  expandedTerms: [
    // From "configure"
    { term: "navigation", similarity: 0.33, source: "configure" },
    { term: "settings",   similarity: 0.33, source: "configure" },
    { term: "antenna",    similarity: 0.33, source: "configure" },
    
    // From "gps"
    { term: "positioning",   similarity: 0.20, source: "gps" },
    { term: "receiver",      similarity: 0.20, source: "gps" },
    { term: "functionality", similarity: 0.20, source: "gps" }
    
    // Note: "gps" and "configure" appear in each other's expansions
    // but are already in originalTerms, so not duplicated
  ]
}

Scoring Example: How Chunks Get Ranked

Let's score two chunks from the document:

Chunk 1:
"To configure GPS, first access the navigation settings menu. 
The GPS configuration requires entering the coordinates manually."

Score Calculation:

  • Original term "configure": 100 points
  • Original term "gps": 100 points (appears twice, counted once)
  • Expanded term "navigation" (0.33 similarity): 30 × 0.33 = 10 points
  • Expanded term "settings" (0.33 similarity): 30 × 0.33 = 10 points
  • Proximity bonus: "configure" and "GPS" within 200 chars: 50 points

Total Score: 270 points

Chunk 2:
"The navigation system uses satellite signals for positioning accuracy."

Score Calculation:

  • Original term "configure": 0 points (not present)
  • Original term "gps": 0 points (not present)
  • Expanded term "navigation" (0.33 similarity): 30 × 0.33 = 10 points
  • Expanded term "positioning" (0.20 similarity): 30 × 0.20 = 6 points
  • Proximity bonus: 0 points (original terms not present)

Total Score: 16 points (below the 30-point threshold, filtered out)

Actual Data Structure in IndexedDB

Here's what's literally stored in your browser's IndexedDB:

{
  documentId: 5,
  
  // The co-occurrence matrix (converted to plain object for storage)
  matrix: {
    "gps": {
      "configure": 3,
      "navigation": 2,
      "positioning": 1,
      "settings": 1,
      "receiver": 1,
      "antenna": 1,
      "functionality": 1
    },
    "configure": {
      "gps": 3,
      "navigation": 1,
      "settings": 1,
      "antenna": 1
    },
    "navigation": {
      "gps": 2,
      "settings": 1,
      "system": 1,
      "positioning": 1,
      "satellite": 1,
      "configure": 1
    },
    // ... hundreds more terms
  },
  
  // Term frequencies
  termFrequencies: {
    "gps": 5,
    "configure": 3,
    "navigation": 2,
    "positioning": 2,
    "settings": 1,
    "receiver": 1,
    // ... etc
  },
  
  totalTerms: 87,
  created: "2025-12-06T23:47:26.229Z"
}

Why This Works

Key Insight: Words that appear near each other frequently are semantically related.

  • "GPS" and "configure" appear together 3 times → strongly related
  • "GPS" and "navigation" appear together 2 times → moderately related
  • "GPS" and "satellite" appear together 0 times → not related in this document

When you search for "configure GPS", the system automatically knows to also look for:

  • navigation (0.33 similarity)
  • settings (0.33 similarity)
  • positioning (0.20 similarity)
  • receiver (0.20 similarity)

This catches relevant content even if it doesn't use your exact words!

Size & Performance

For a typical 100-page document (~50,000 words):

  • Unique terms: ~2,000-3,000
  • Matrix entries: ~10,000-20,000 term pairs
  • Storage size: ~500KB-1MB
  • Build time: 2-5 seconds during upload
  • Search time: <100ms to scan all chunks

Comparison: What You DON'T See

What Invantia DOESN'T store:
// NO dense neural embeddings like:
"gps" => [0.234, -0.891, 0.445, 0.123, ... 768 more numbers]

// NO external API calls
// NO transformer models
// NO GPU processing
What RAG systems typically store:
// Dense 768-dimensional vectors (much larger!)
"gps" => Float32Array[768] {
  0.23445, -0.89123, 0.44567, 0.12389, 
  -0.55234, 0.78901, -0.34567, 0.91234,
  // ... 760 more floating point numbers
}

Size comparison:

  • Invantia co-occurrence: ~1MB per 100-page doc
  • Neural embeddings: ~15MB per 100-page doc (15x larger!)

Summary

Invantia's vectors are sparse, interpretable, and lightweight:

  • ✅ Just counts of which words appear near each other
  • ✅ Human-readable (you can inspect the matrix)
  • ✅ Deterministic (same document → same vectors)
  • ✅ Privacy-preserving (computed locally)
  • ✅ Fast to build and search
  • ✅ Small storage footprint

Instead of asking what a neural network thinks "GPS" is similar to, we ask which words actually appear near "GPS" in YOUR documents. Simple, transparent, effective!

🔧 Troubleshooting

Common Issues

Documents won't upload

Symptoms: File picker closes but nothing happens, or upload progress stuck

Solutions:

  • Check browser console (F12) for JavaScript errors
  • Ensure file is under 50MB (very large files may timeout)
  • Try a different browser (Chrome recommended)
  • Verify file format is PDF, DOCX, or TXT
  • Check available disk space (IndexedDB requires free space)

Search returns no results

Symptoms: Search completes but shows "No results found"

Solutions:

  • Verify documents are selected in Step 1
  • Try broader search terms (e.g., "revenue" instead of "Q3 2024 revenue projections")
  • Check if documents actually contain the terms you're searching for
  • Try searching individual documents to narrow down the issue

Chat package too large for AI

Symptoms: AI service rejects paste, or paste truncated

Solutions:

  • Switch from "Large" to "Standard" account tier setting
  • Reduce number of query topics
  • Search fewer documents at once
  • Use more specific queries to reduce result size
  • Upgrade to a paid AI subscription for larger paste limits

Lost all my documents

Symptoms: Document library shows 0 documents

Solutions:

  • Check if you're in the same browser and profile as before
  • Look for backup files you may have created
  • Check browser settings to ensure IndexedDB wasn't cleared
  • If using private/incognito mode, data is cleared when window closes

⚠️ Prevention: Regularly export backups via "Backup Data" button

Slow performance with many documents

Symptoms: Search takes >5 seconds, UI laggy

Solutions:

  • Close other browser tabs to free up memory
  • Use collections to search subsets rather than all documents
  • Clear browser cache (not IndexedDB data)
  • Consider splitting into multiple collections for better organization

Browser-Specific Issues

Safari

  • IndexedDB storage may be limited to 50MB per origin
  • Private browsing mode has stricter limits
  • Solution: Use Chrome/Firefox for large document sets

Firefox

  • May prompt for storage permission on first upload
  • Containers isolate IndexedDB per container
  • Solution: Use same container consistently

Mobile Browsers

  • Limited memory may cause large document processing to fail
  • File picker behavior varies by OS
  • Solution: Use desktop browser for best experience

Getting Help

If you've tried the above solutions and still have issues:

  • Check GitHub issues: Report a bug
  • Include browser version, OS, and specific error messages
  • Attach browser console output if possible (F12 → Console tab)

📚 Additional Resources

GitHub Repository

Source code, issue tracking, and contribution guidelines

View on GitHub

About Invantia

Learn about the company, products, and founder

Read About Page

Security & Privacy

Detailed information on data security and privacy practices

Security Information

Desktop Application

Start using Invantia Desktop right now

Launch Desktop