
The Librarian: High Fidelity Data for High Integrity Learning

At CustomAILab, students don't just "chat" with a bot. We anchor Cal’s intelligence to your child’s specific academic reality. To ensure Cal remains an expert, we use a specialized system—The Librarian—to process every document through a deep-scan engine (Mathpix) that preserves complex mathematical LaTeX formulas and prevents "formula fracturing". This engine also turns the tables and columns commonly used in Social Studies documents into text that Cal can read. To maintain this level of precision, students have two distinct methods for providing information to Cal…

Method B: The Knowledge Base "Vault" (Strategic Context)

Best for: Long-term academic anchoring. These documents allow Cal to offer strategic advice on course completion and exam preparation. Students upload PDF files here by clicking the ‘Upload to KB’ button or by dragging the file into the window above it.

  • Target Documents: Course syllabi, unit outlines, summative assessment guidelines, exam formats, and review sheets.

  • The Vault: These files are stored in your child’s private Knowledge Base in our secure Montreal-based database.

  • Strategic Advantage: By "reading" the entire syllabus, Cal can help students manage their time, predict exam themes, and ensure no curriculum requirement is missed.

  • Availability: Exclusive to Undergrad and Graduate tiers.

Method A: The Chat Window (Active Work)

Best for: Current homework, specific assignment prompts, and tonight’s practice problems. Students can upload PDF files here by simply dragging them into the chat window.

  • Format: Digital PDF or high-quality image of handwritten work.

  • File Limit: 10MB maximum per file.

  • Token Efficiency: Every document uploaded directly into the chat increases the "token consumption" for that session. To maximize your monthly allowance, only upload the specific pages or questions Cal needs for the immediate task.

  • Availability: All tiers (Foundation, Undergrad, and Graduate).
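
For families who prepare files with a script, the 10MB file limit above can be checked before uploading. This is a minimal Python sketch, not part of any CustomAILab tool. It assumes "10MB" means 10 × 1024 × 1024 bytes (this page does not say whether the limit is counted in binary or decimal megabytes), and the name `fits_upload_limit` is hypothetical.

```python
import os

# Assumption: "10MB" is read as 10 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def fits_upload_limit(path, max_bytes=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is no larger than `max_bytes`."""
    return os.path.getsize(path) <= max_bytes
```

If a file fails this check, exporting only the pages Cal needs for the task is usually enough to bring it under the limit.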

Why the 10MB Limit? (Precision & Purity)

There is a 10MB limit for all uploads. This isn't a storage restriction; it is a fidelity requirement.

  1. Preventing "Data Pollution": Large, monolithic files (like a 400-page textbook) contain "noise"—repetitive headers, footers, and page numbers—that can confuse an AI’s retrieval process. By uploading smaller, focused sections (e.g., a single chapter), you ensure the AI's "search" remains high-signal and accurate.

  2. Avoiding KB Bloating: If a Knowledge Base becomes "bloated" with redundant or irrelevant data, the AI may begin to hallucinate or struggle to find the specific "needle" in the haystack. Smaller files allow for better "chunking," meaning Cal can retrieve exactly the right sentence for the right question. Cal already has access to a vast range of OpenStax textbooks in his own KB, so a textbook upload is only worthwhile when a teacher insists on a specific approach covered in that section of the textbook.
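
The two points above can be illustrated with a small Python sketch. It is not The Librarian's actual pipeline, only an illustration under stated assumptions: pages arrive as plain-text strings, a line counts as "noise" when it repeats on more than half of the pages, page numbers are lines made up only of digits, and chunks are fixed-size groups of words. The names `strip_repeated_lines` and `chunk_words` are hypothetical, and the sample pages are made up.

```python
from collections import Counter


def strip_repeated_lines(pages, threshold=0.5):
    """Drop lines that recur on more than `threshold` of the pages, plus bare page numbers."""
    # Count each distinct non-empty line once per page.
    counts = Counter()
    for page in pages:
        counts.update({line.strip() for line in page.splitlines() if line.strip()})
    limit = threshold * len(pages)
    cleaned = []
    for page in pages:
        kept = [
            line.strip()
            for line in page.splitlines()
            if line.strip()
            and counts[line.strip()] <= limit   # not a repeated header/footer
            and not line.strip().isdigit()      # not a bare page number
        ]
        cleaned.append("\n".join(kept))
    return cleaned


def chunk_words(text, size=50):
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


# Made-up sample: three pages sharing a running header and a page number.
pages = [
    "Grade 10 Math\nThe quadratic formula solves ax^2 + bx + c = 0.\n1",
    "Grade 10 Math\nThe discriminant is b^2 - 4ac.\n2",
    "Grade 10 Math\nA negative discriminant means no real roots.\n3",
]
clean = strip_repeated_lines(pages)
chunks = chunk_words(" ".join(clean), size=8)
```

After cleaning, the running header and page numbers are gone, so each chunk contains only course content. That is the sense in which a focused upload keeps retrieval "high-signal".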

Data Sovereignty: The Montreal Vault

Regardless of the upload method, your data never leaves the "Vault."

  • Security: All logs and documents are stored in AWS Canada (Central), in Montreal, where they are subject to Quebec’s Bill 25, widely regarded as among the most rigorous privacy laws in North America and often compared to Europe’s GDPR.

  • Privacy: Our third-party processing partners (OpenAI, Google, Mathpix, Tavily, and Cohere) are accessed through enterprise API keys and are strictly prohibited from using your private curriculum data to train their public models.

  • Auditing: As the Architect, Andy retains the ability to prune your student’s KB so that it stays optimized for search.