Large Language Models for Assisting American College Applications
TL;DR Highlight
A practical system-architecture paper on LLM assistance for US college applications, built on RAG and human-in-the-loop design.
Who Should Read
Backend/full-stack devs building LLM assistants in high-stakes domains (finance, healthcare, law). Especially useful if you're wrestling with source credibility management and hallucination prevention in RAG pipelines.
Core Mechanics
- Mapping-first paradigm: form questions are not answered directly; they are first mapped to canonical schema fields, and answers are generated from those fields, which keeps responses consistent across multiple application portals
- Source-separated RAG: official sites, curated FAQs, and community forums are kept in separate indexes, with tiered retrieval applied at search time in credibility order (official > FAQ > community)
- Strict human-in-the-loop: no auto-submission, ever. The AI only suggests; users must manually copy-paste to apply changes, friction that is intentional by design
- Agentic retrieval: rather than searching on every query, the LLM invokes the search_knowledge_base tool via function calling only when it judges a search necessary
- Structure-aware retrieval: instead of vector embeddings, the system parses the document tree (table of contents, section boundaries) and lets the LLM navigate it to extract relevant sections, so no embedding infrastructure is required
- Extraction-over-generation principle: the system refuses to generate essays; it only extracts and recombines information from uploaded documents, which both prevents AI misuse and preserves academic integrity
Evidence
- 84% fill rate on common applications (auto-generated answers for 42 out of 50 questions)
- 92% citation validity: generated answer citations have cosine similarity ≥ 0.7 with actual sources
- 82% of human evaluators rated answers as "useful" or "very useful" (4-5 on a 5-point scale)
- 89% of evaluators said citation mechanism increased trustworthiness; answer editing time 15-20s (vs 2-3 min writing from scratch)
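The 92% citation-validity metric can be reproduced in spirit with a similarity threshold check. The sketch below is an assumption-laden stand-in: the paper compares against actual sources with a 0.7 cosine threshold, presumably over embedding vectors, while this version uses a bag-of-words cosine so it runs without any embedding model.

```python
# Sketch of a citation-validity check in the style of the paper's metric:
# a citation counts as valid if its cosine similarity to the cited source
# passage meets the threshold (0.7 in the paper). A bag-of-words cosine
# stands in for real embedding similarity here.
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def citation_is_valid(cited_snippet: str, source_passage: str,
                      threshold: float = 0.7) -> bool:
    return cosine_similarity(cited_snippet, source_passage) >= threshold

source = "Applicants must submit official transcripts by January 15."
cite = "Official transcripts must be submitted by January 15."
print(citation_is_valid(cite, source))  # True (similarity 0.75)
```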
How to Apply
- When separating source indexing in a RAG system: split official docs / FAQ / community content into separate collections, then re-rank by source-type metadata at search time so that high-credibility sources dominate the answers
- When designing form/document automation pipelines: don't pipe questions directly into the LLM. First map them to canonical fields (e.g., user.academics.gpa) and construct queries with type/format constraints; this improves cross-form consistency and output-format compliance
- Reducing search costs in LLM agents: instead of running RAG on every query, register search_knowledge_base as a function tool so the LLM calls it only when needed, cutting both context consumption and latency
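The source-separated re-ranking in the first bullet can be sketched as a two-key sort: source tier first, relevance score second. The field names (source_type, score) are illustrative assumptions, not the paper's actual metadata schema.

```python
# Sketch of tiered re-ranking by source-type metadata: hits from
# higher-credibility collections rank first; the retrieval score only
# orders hits within the same tier.
SOURCE_PRIORITY = {"official": 0, "faq": 1, "community": 2}

def rerank_by_source(hits: list[dict]) -> list[dict]:
    # hits: [{"text": ..., "score": ..., "source_type": ...}, ...]
    return sorted(
        hits,
        key=lambda h: (SOURCE_PRIORITY.get(h["source_type"], 99), -h["score"]),
    )

hits = [
    {"text": "forum answer", "score": 0.95, "source_type": "community"},
    {"text": "faq entry", "score": 0.80, "source_type": "faq"},
    {"text": "official policy", "score": 0.70, "source_type": "official"},
]
ranked = rerank_by_source(hits)
print([h["source_type"] for h in ranked])  # ['official', 'faq', 'community']
```

Note that the lower-scoring official hit still outranks the higher-scoring community hit; that is the point of tiering rather than blending scores across sources.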
Code Example
# Agentic retrieval pattern example (OpenAI function calling)
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_knowledge_base",
            "description": "Search the indexed admissions documents or student profile for relevant information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query"},
                    "source_type": {
                        "type": "string",
                        "enum": ["official", "faq", "community", "personal"],
                        "description": "Preferred source type (official > faq > community)",
                    },
                    "top_k": {"type": "integer", "default": 10},
                },
                "required": ["query"],
            },
        },
    }
]

# The LLM calls the search tool only when needed
# (messages: conversation history built earlier)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
    tool_choice="auto",  # the LLM decides on its own whether a search is necessary
)

# If there is a tool call, execute the search and re-invoke the model
message = response.choices[0].message
if message.tool_calls:
    results = execute_search(message.tool_calls[0])  # user-defined search over the indexes
    messages.append(message)  # the assistant turn that requested the tool call
    messages.append({
        "role": "tool",
        "tool_call_id": message.tool_calls[0].id,
        "content": results,
    })
    final_response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
Original Abstract
American college applications require students to navigate fragmented admissions policies, repetitive and conditional forms, and ambiguous questions that often demand cross-referencing multiple sources. We present EZCollegeApp, a large language model (LLM)-powered system that assists high-school students by structuring application forms, grounding suggested answers in authoritative admissions documents, and maintaining full human control over final responses. The system introduces a mapping-first paradigm that separates form understanding from answer generation, enabling consistent reasoning across heterogeneous application portals. EZCollegeApp integrates document ingestion from official admissions websites, retrieval-augmented question answering, and a human-in-the-loop chatbot interface that presents suggestions alongside application fields without automated submission. We describe the system architecture, data pipeline, internal representations, security and privacy measures, and evaluation through automated testing and human quality assessment. Our source code is released on GitHub (https://github.com/ezcollegeapp-public/ezcollegeapp-public) to facilitate the broader impact of this work.