Duong Anh (Tom) Nguyen
CS student · LLMs · agents · Robotics · shipping AI systems end to end
About
I'm Duong Anh (Tom) Nguyen, a B.S. Computer Science student at San Francisco Bay University (GPA 3.88). I focus on LLMs, retrieval, multi-agent systems, and production deployments—not just demos.
Day to day I work in Python with LangChain-style patterns, vector indexes (e.g. FAISS), GCP managed services, FastAPI-style APIs, and observability so failures are visible—not silent.
I'm a founding member and AI engineer at Quiche, Inc., where I build MCP servers so features pull the right context on demand, multi-agent routing for our assistant (roles, orchestration, retrieval), and GCP infrastructure on Compute Engine, Cloud Run, and Cloud SQL so workloads can scale beyond my laptop.
I'm IT student staff at SFBU: Tier-1 diagnostics for hardware, software, and classroom-network issues—ticket triage, quick fixes, escalations when needed—and a digital inventory of IT assets so swaps, repairs, and deployments stay traceable.
At Sofeast Ltd. (LLM Research & Implementation, Feb–Apr 2025) I shipped a LangChain + FAISS RAG pipeline for document Q&A, benchmarked open-source vs. proprietary LLMs on latency and cost, and tightened prompts and retrieval settings to cut hallucinations before outputs reached users.
Outside class I'm building Desir (voice-first assistant on OpenAI Realtime API, orchestrator plus specialist agents, React/Vite + Python WebSockets + Pydantic AI) and TTorch (C++/pybind11 tensor + autograd core with a pip-installable Python surface).
I hold the NVIDIA Certified Professional: Agentic AI credential. Recent placings include Top 3 at the SFBU IEEE NVIDIA GTC Hackathon 2025 (BleuLaMuse) and 1st place at ThinkNext 2026 (Uniflow), plus other hackathon builds in security and disaster-response ML.
I learn by building, write to think clearly, and share what I figure out along the way.
Stay hungry, stay foolish — Steve Jobs
Experience
Sep 2025 — Present
IT Student Staff

San Francisco Bay University — IT Department
- Delivered Tier-1 diagnostic support for desktops, laptops, classroom AV, printers, Wi-Fi, and software accounts—logging tickets, reproducing issues, and resolving or escalating with clear notes.
- Supported faculty and staff during class hours: quick swaps for faulty gear, password resets, VPN and SSO troubleshooting, and coordination with vendors when hardware needed replacement.
- Maintained a digital inventory of IT assets—serial numbers, deployment locations, warranty status—so refreshes, audits, and maintenance cycles stay accountable.
- Helped track lifecycle events (deploy, retire, repair) so procurement and classroom readiness stay aligned.
IT support · Networking · Hardware · Asset management · Help desk
Nov 2025 — Apr 2026
Founding Member, AI Engineer

Quiche, Inc. (ADU Marketplace)
- Architected an MCP (Model Context Protocol) server so AI features request structured context from internal tools and data stores instead of hard-coding prompts.
- Owned integration work between MCP surfaces and product APIs—schemas for tools, auth boundaries, and failure modes when retrieval misses.
- Designed a multi-agent stack for a smart assistant: explicit agent roles, orchestration rules, and retrieval pipelines that feed grounded answers.
- Provisioned GCP infrastructure—Compute Engine for workloads that need persistent VMs, Cloud Run for HTTP services, Cloud SQL for relational state—with an eye toward cost and cold-start behavior.
- Iterated on chunking, indexing, and evaluation loops as new assistant capabilities shipped.
MCP · Python · Multi-agent AI · GCP · Compute Engine · Cloud Run · Cloud SQL · Pydantic AI · FastAPI
May 2025 — Aug 2025
LLM Research and Implementation Intern

Sofeast Ltd.
- Built a LangChain + FAISS RAG pipeline that ingests internal documents and answers BI-style questions with citations, replacing manual lookup.
- Benchmarked open-source LLMs (Llama 3, Mistral, Gemma, Qwen, DeepSeek) against proprietary APIs on latency, token cost, and answer quality for recurring query templates.
- Implemented evaluation passes—spot checks and scripted prompts—to compare models before promoting a configuration.
- Refined prompt templates and retrieval settings (top-k, filters, stop sequences) to reduce hallucinations and unstable formatting in production replies.
- Documented pipeline configuration and failure cases so full-time engineers could extend or debug the stack after the internship.
RAG · LangChain · FAISS · LLM · Python · Llama · Mistral · Prompt engineering
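A minimal sketch of the retrieval settings mentioned above (top-k plus a score filter), not the production pipeline. It assumes a toy bag-of-words similarity in place of FAISS and a learned embedding model; the documents, ids, `min_score` threshold, and prompt wording are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical documents, used only for illustration.
DOCS = [
    {"id": "doc-1", "text": "supplier audit checklist for factory inspections"},
    {"id": "doc-2", "text": "shipping schedule and freight costs"},
    {"id": "doc-3", "text": "quality inspection report template"},
]


def embed(text):
    # Toy bag-of-words vector; a real pipeline would use a learned embedding.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, top_k=2, min_score=0.1):
    # Score every document, drop weak matches, keep the best top_k.
    q = embed(query)
    scored = [(cosine(q, embed(d["text"])), d) for d in docs]
    scored = [(s, d) for s, d in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def build_prompt(query, hits):
    # With no supporting source, return None so the caller can say
    # "not found" instead of letting the model guess.
    if not hits:
        return None
    context = "\n".join(f"[{d['id']}] {d['text']}" for _, d in hits)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\nQuestion: {query}"
    )


hits = retrieve("factory audit checklist", DOCS)
prompt = build_prompt("factory audit checklist", hits)
```

Filtering on a minimum score before applying top-k is one way to keep unrelated chunks out of the prompt, which is where many hallucinated answers start.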
Education
Aug 2024 — Dec 2027
B.S. Computer Science

San Francisco Bay University · Fremont, CA
B.S. in Computer Science (expected Dec 2027). GPA: 3.85. Coursework spans Data Structures, AI, Algorithms, ML, Deep Learning, Multi-Agent Systems, Cloud Computing, Computer Vision, NLP, MCP, and FastAPI—applied in parallel through internships, hackathons, and personal builds.
Data Structures · Algorithms · AI · ML · Deep Learning · Multi-Agent Systems · Cloud Computing · Computer Vision · NLP · MCP · FastAPI
Sep 2021 — Jun 2024
High School

Le Hong Phong High School for the Gifted — Nam Dinh Province
Class of 2024 — concentration in Chemistry.
Chemistry
Projects
Desir
- Built a real-time, voice-driven assistant on the OpenAI Realtime API, streaming speech input with low-latency paths toward tool calls and UI updates.
- Designed an orchestrator that routes user intent to specialist sub-agents (email, calendar, web search, messaging, knowledge base) with explicit handoffs instead of one monolithic prompt.
- Implemented permission-style gates so sensitive actions (send mail, run OS automation) only fire once the user has confirmed them under the configured policy.
- Shipped a React/Vite SPA talking to a Python WebSocket backend; orchestration implemented with Pydantic AI for structured outputs and safer tool schemas.
- Integrated third-party APIs—Resend for mail, Serper for search—and macOS automation hooks where local actions are required.
- Added Logfire traces across orchestrator steps to debug multi-turn failures without guessing which agent dropped context.
OpenAI Realtime API · Pydantic AI · React · Vite · Python · WebSocket · Multi-agent · LLM
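A rough sketch of the orchestrator-plus-gate idea from the bullets above, assuming intent classification has already happened upstream (in the real system that would be a model call). The `Agent` class, `route` function, and handler strings are hypothetical and do not use the Pydantic AI API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Agent:
    name: str
    handler: Callable[[str], str]
    needs_confirmation: bool = False  # True for sensitive actions


def route(intent: str, text: str, agents: Dict[str, Agent],
          confirm: Callable[[str, str], bool]) -> str:
    # Explicit handoff: one specialist per intent, no monolithic prompt.
    agent = agents.get(intent)
    if agent is None:
        return "no specialist for this intent"
    # Permission gate: sensitive agents run only after the user agrees.
    if agent.needs_confirmation and not confirm(agent.name, text):
        return f"{agent.name}: cancelled by user"
    return agent.handler(text)


# Hypothetical specialists; real handlers would call external APIs.
AGENTS = {
    "email": Agent("email", lambda t: f"sent: {t}", needs_confirmation=True),
    "search": Agent("search", lambda t: f"results for: {t}"),
}

denied = route("email", "hello", AGENTS, confirm=lambda name, text: False)
allowed = route("email", "hello", AGENTS, confirm=lambda name, text: True)
```

Keeping the confirmation callback outside the agents means the UI decides how to ask the user, while the routing code only enforces that the question was asked.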
TTorch
- Implemented a tensor library and autograd engine in C++ with Python bindings, mirroring PyTorch-style tensors, views, and backward passes for educational clarity.
- Built a dynamic computation graph with modular gradient primitives (Add, Multiply, ReLU, Dot) so new ops plug in without rewriting the whole engine.
- Implemented reverse-mode autodiff with correct topological ordering for multi-use tensors.
- Packaged the project so it installs with `pip install ttorch`, layered as Python façade → pybind11 bindings → C++ kernels → autograd, which makes each layer easy to profile.
- Added correctness-focused tests comparing numerical gradients against analytic grads for core ops.
- Stretch goal: keep the codebase readable enough to study training mechanics without hiding everything behind CUDA kernels.
C++ · Python · pybind11 · Autograd · Deep Learning · PyPI
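A scalar-only Python sketch of the two ideas above: reverse-mode autodiff over a topologically ordered graph, and a numerical gradient check. This is an illustration under simplifying assumptions, not the TTorch source; the `Value` class and helper names are hypothetical.

```python
class Value:
    def __init__(self, data, parents=(), backward_fn=None):
        self.data = float(data)
        self.grad = 0.0
        self.parents = parents
        # backward_fn maps the upstream gradient to one gradient per parent.
        self.backward_fn = backward_fn

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), lambda g: (g, g))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     lambda g: (g * other.data, g * self.data))

    def relu(self):
        mask = 1.0 if self.data > 0 else 0.0
        return Value(self.data * mask, (self,), lambda g: (g * mask,))

    def backward(self):
        # Depth-first post-order gives a topological order of the graph.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        # Outputs first, so a node's grad is complete before it is propagated.
        for node in reversed(order):
            if node.backward_fn is None:
                continue  # leaf node
            for parent, g in zip(node.parents, node.backward_fn(node.grad)):
                parent.grad += g  # accumulate across multiple uses


def f(x_val, y_val):
    x, y = Value(x_val), Value(y_val)
    out = (x * y + x * x).relu()  # x feeds three inputs of the graph
    return x, y, out


def numerical_grad(fn, args, i, eps=1e-6):
    # Central difference on the i-th argument.
    hi = list(args)
    lo = list(args)
    hi[i] += eps
    lo[i] -= eps
    return (fn(*hi)[2].data - fn(*lo)[2].data) / (2 * eps)


x, y, out = f(3.0, -1.0)
out.backward()
```

At (3, -1) the analytic gradients are d/dx = y + 2x = 5 and d/dy = x = 3; `numerical_grad(f, (3.0, -1.0), 0)` should agree with `x.grad` to within the finite-difference error.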
- Two-stage CV pipeline: YOLOv11n finds faces in a webcam stream; a shallow classifier predicts seven emotion labels per crop.
- Used ViT-B/16 embeddings (768-D) as frozen features, feeding a custom logistic-regression head implemented in PyTorch and trained with L-BFGS.
- Built data loaders and label hygiene checks so train/val splits stay stratified across emotion classes.
- Implemented a real-time inference loop with frame skipping options for slower laptops and overlay rendering for qualitative debugging.
- Recorded offline metrics (accuracy, confusion matrix) alongside qualitative checks on challenging lighting.
Computer Vision · PyTorch · YOLO · ViT · Real-time
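A small sketch of a stratified train/val split like the one described above, assuming plain Python lists of labels. The emotion names, class counts, and 20% validation fraction are hypothetical placeholders, not the project's actual data.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    # Group sample indices by class so each class is split on its own.
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val = [], []
    for label in sorted(by_class):  # sorted for a reproducible order
        idxs = by_class[label]
        rng.shuffle(idxs)
        # At least one validation sample per class; note that a class
        # with a single sample would then have no training data.
        n_val = max(1, round(len(idxs) * val_fraction))
        val.extend(idxs[:n_val])
        train.extend(idxs[n_val:])
    return train, val


# Hypothetical, deliberately imbalanced label list.
LABELS = ["happy"] * 50 + ["sad"] * 30 + ["angry"] * 20
train_idx, val_idx = stratified_split(LABELS)
val_counts = Counter(LABELS[i] for i in val_idx)
```

Checking `val_counts` against the full class distribution is a cheap hygiene test: every class should appear in validation at roughly the requested fraction.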
- Interactive React/JavaScript front-end that renders mazes and animates search frontiers as algorithms expand nodes.
- Implemented BFS, flood fill, and a randomized search variant with identical interfaces so students can compare behavior apples-to-apples.
- Supports configurable grid sizes and procedural wall layouts with deterministic seeds for reproducible demos.
- Surfaces live stats (nodes expanded, path length, wall-clock time) so students can check asymptotic expectations against intuition.
- Python tooling for batch experiments where needed; UI emphasizes step-by-step pedagogy over black-box runs.
Algorithms · Python · React · JavaScript · Visualization
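For reference, a compact Python sketch of BFS on a seeded grid that also reports nodes expanded, in the spirit of the visualizer described above. The function names and the wall-probability layout are assumptions; the actual front-end is React/JavaScript.

```python
import random
from collections import deque


def make_grid(rows, cols, wall_prob, seed):
    # True marks a wall. A fixed seed reproduces the same layout.
    rng = random.Random(seed)
    grid = [[rng.random() < wall_prob for _ in range(cols)] for _ in range(rows)]
    grid[0][0] = grid[rows - 1][cols - 1] = False  # keep the corners open
    return grid


def bfs(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    expanded = 0
    while queue:
        cell = queue.popleft()
        expanded += 1
        if cell == goal:
            # Walk parent links back to the start to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], expanded
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not grid[nr][nc] and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None, expanded  # goal unreachable


open_grid = make_grid(5, 5, wall_prob=0.0, seed=7)
path, expanded = bfs(open_grid, (0, 0), (4, 4))
```

On the open 5x5 grid the path has 9 cells, and BFS expands all 25 cells because the far corner is the last one it reaches.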
Writing
Draft / ongoing. Compares fully automated, hybrid (AI plus human review), and non-AI restoration for damaged or souvenir war images. We connect capability to authenticity, trust, and collective memory; surface risks such as biased training data and invented detail; and argue—treating photos as historical evidence—that ethical practice needs transparency, oversight, and institutional guardrails. Includes a checklist for spotting historical hallucination in outputs and early results from a 50-person vignette study on trust and acceptability among archivists, families, and online audiences.
Draft / ongoing. Explores Rapidly-Exploring Random Trees (RRT) for maze navigation when state and controls are represented in a vector form suited to continuous spaces and simple collision checks. Motivation, algorithm sketch, and how the vectorized setup affects sampling, expansion, and path quality compared to a grid-first mindset.
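A bare-bones RRT sketch in the unit square, to make the sampling and expansion steps concrete. It is not the formulation from the draft: the goal bias, step size, disc obstacle, and point-only collision check (edges are not checked) are all simplifying assumptions.

```python
import math
import random


def rrt(start, goal, is_free, step=0.05, goal_tol=0.05,
        max_iters=3000, seed=0):
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Sampling: mostly uniform, occasionally the goal itself.
        sample = goal if rng.random() < 0.1 else (rng.random(), rng.random())
        # Nearest tree node to the sample (linear scan for simplicity).
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        # Expansion: move at most `step` from the nearest node.
        t = min(step, d) / d
        new = (near[0] + t * (sample[0] - near[0]),
               near[1] + t * (sample[1] - near[1]))
        if not is_free(new):
            continue  # simple collision check on the new point only
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        if math.dist(new, goal) <= goal_tol:
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None  # no path found within the iteration budget


def outside_disc(p):
    # Hypothetical obstacle: a disc of radius 0.2 centered in the square.
    return math.dist(p, (0.5, 0.5)) > 0.2


path = rrt((0.1, 0.1), (0.9, 0.9), outside_disc)
```

Unlike a grid search, the resulting path is a chain of continuous points, so its quality depends on the step size and sampling rather than on cell resolution.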
Reading
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al.) [Reading]
The BERT paper. Key insight: pre-train a deep bidirectional encoder on a large unlabeled text corpus with masked language modeling, then fine-tune it on a specific task.
- GRE Prep Plus 2024 (ETS) [Reading]
Currently working through GRE Prep Plus 2024. Excellent reference for GRE preparation.
- Attention Is All You Need (Vaswani et al.)
The foundational transformer paper. Key insight: self-attention can replace recurrence entirely.
Other
Apr 2026 — Mar 2028
NVIDIA Certified Professional: Agentic AI

NVIDIA
- Earned NVIDIA Certified Professional — Agentic AI: multi-agent design patterns, tool use, orchestration, evaluation, and responsible deployment concepts aligned with NVIDIA’s agentic AI curriculum.
- Completed hands-on assessments covering agent lifecycle design, retrieval-augmented behaviors, and operational monitoring considerations.
- Credential valid through Mar 2028; complements hands-on work shipping MCP and multi-agent stacks in production.
NVIDIA · Agentic AI · Professional · Certification
Apr 2026
3rd Place — Wildfire detection & response
Innovate Bay (BayHawk 2.0)
- Weekend build at the Verizon Innovation Lab: fused live camera smoke/flame detection (YOLOv8) with NASA FIRMS thermal hotspots and local weather feeds so alerts could be corroborated or dismissed.
- Added LLM reasoning layers to summarize evidence, assign severity (low → critical), and propose concrete response steps instead of raw scores only.
- Backend built with FastAPI, async task orchestration, and PydanticAI agents coordinating specialized prompts.
- Rest of the stack: Vite + Tailwind on the frontend for rapid UI iteration; PostgreSQL for incident logs; Dockerized services for reproducible deploys; CI/CD toward Vercel-hosted assets.
- Demonstrated end-to-end flow from sensor ingestion to operator-facing dashboard within hackathon time constraints.
Hackathon · Multi-agent AI · YOLOv8 · NASA FIRMS · FastAPI · PydanticAI · GPT-4o · Innovate Bay · BayHawk 2.0 · Verizon Innovation Lab · Python · Vite · Tailwind CSS · PostgreSQL · Docker · GitHub · Git · GitHub CI/CD · Vercel



Mar 2026
Hackathon — KryptosProof

HackHayward (Cal State East Bay)
- 24-hour security sprint: multi-agent “red team” that launches real scanners (sqlmap, Nuclei, FFUF), deduplicates noisy output, and focuses on reproducible vulns only.
- Generated remediation drafts automatically and ran verifier passes before writing findings into a SHA-256-signed, PDF-style report for authenticity.
- Docker sandbox isolated risky tooling from host machines; live dashboard streamed job progress for judges.
- Kept the narrative lean: fewer buzzwords and more evidence chains, suited to a demo for engineers rather than slide-only fluff.
Hackathon · Security · Docker · Nuclei · Multi-agent · HackHayward · KryptosProof
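A hedged sketch of the dedupe-then-seal flow described above, using only the Python standard library. It assumes findings are plain dicts and that "signed" means a keyed HMAC-SHA256 over the serialized report; the field names, demo key, and sample findings are hypothetical and not KryptosProof's actual scheme.

```python
import hashlib
import hmac
import json


def dedupe(findings):
    # Scanners often report the same issue twice; keep the first of each.
    seen, unique = set(), []
    for finding in findings:
        key = (finding["url"], finding["issue"])
        if key not in seen:
            seen.add(key)
            unique.append(finding)
    return unique


def seal_report(findings, secret_key):
    # Canonical JSON so the same findings always hash to the same bytes.
    body = json.dumps(findings, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")
    return {
        "body": body,
        # Plain digest: detects accidental or casual changes.
        "sha256": hashlib.sha256(body).hexdigest(),
        # Keyed HMAC: only holders of the key can produce a valid tag.
        "hmac_sha256": hmac.new(secret_key, body, hashlib.sha256).hexdigest(),
    }


def verify_report(sealed, secret_key):
    expected = hmac.new(secret_key, sealed["body"], hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sealed["hmac_sha256"])


# Hypothetical scanner output with one duplicate.
FINDINGS = [
    {"tool": "sqlmap", "url": "/login", "issue": "sql-injection"},
    {"tool": "nuclei", "url": "/login", "issue": "sql-injection"},
    {"tool": "ffuf", "url": "/admin", "issue": "exposed-path"},
]

unique = dedupe(FINDINGS)
sealed = seal_report(unique, b"demo-key")
```

A bare SHA-256 digest only proves integrity if the digest itself is stored somewhere trusted, which is why the sketch pairs it with a keyed tag.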


Mar 2026
1st Place — Uniflow (ThinkNext 2026 · SFBU IEEE)

ThinkNext 2026
- Career copilot for SFBU students: intake agents capture goals and constraints; critique agents workshop resumes with structured feedback loops.
- RAG stack over course catalogs and campus events: content is chunked, embedded, and retrieved via vector search so answers cite real offerings.
- Mock interview lane combined OpenAI TTS/STT with STAR-style scoring prompts so users can rehearse aloud or by typing.
- Streamlit shell for fast iteration during judging; modular agents implemented with PydanticAI orchestration patterns.
- Pitch emphasized responsible personalization—clear logging when suggestions touched sensitive academic data.
Hackathon · PydanticAI · RAG · OpenAI · TTS · STT · Streamlit · SFBU · Career

Oct 2025
2nd Place — Bit for Bit (receipts → inventory)
Innovate Bay (BayHawk 1.0)
- BayHack 2025 business-action track: mobile receipt capture → OCR extraction → structured MongoDB records → conversational assistant for reconciling stock counts.
- Designed prompts + retrieval guardrails so SMB owners could query inventory without exposing unrelated receipts.
- Full stack blended Node.js services, Python utilities for OCR glue, FastAPI gateways, and OpenAI APIs for natural-language reconciliation.
- Live pitch + demo with teammates showcasing inventory deltas versus spreadsheet chaos.
- Lessons logged on schema discipline—inventory reconciliation fails quietly when receipt categories drift.
Hackathon · BayHack · OpenAI API · MongoDB · FastAPI · Node.js · Python · OCR · Inventory
Now
Drafting two papers: AI restoration of war photographs (ethics + vignette study) and RRT with a vector maze formulation. Public repos in motion: TTorch (C++/Python autograd), Desir (agentic assistant), a real-time emotion CV stack, and a maze-solving visualizer. Tightening this minimal site to match reality.
Third place at Innovate Bay (BayHawk 2.0) for a wildfire stack: YOLOv8 + NASA FIRMS + weather context, fused with a multi-agent LLM pipeline (PydanticAI, GPT-4o, FastAPI) for severity and response planning.
HackHayward weekend: shipped KryptosProof — multi-agent, tool-using red team (sqlmap / Nuclei / FFUF), Docker sandbox, live dashboard, and signed reports for the vibe-coding security angle.
Won ThinkNext 2026 with Uniflow — a PydanticAI-orchestrated career copilot for SFBU: goal + résumé agents, RAG over courses/events, and a voice/text mock interview path with STAR-style feedback (Streamlit demo).
Joined Quiche (ADU marketplace) as a founding member — AI / DevOps focus: MCP server for contextual retrieval, multi-agent assistant design, and GCP (Compute Engine, Cloud Run, Cloud SQL) for production-shaped deploys.
Contact
Feel free to reach out. I'm always happy to chat about deep learning, open source, or interesting problems.