PAUL IACOBUCCI

CS student @ Cornell, 3x SWE/ML Intern @ L3Harris.

I'm interested in AI infrastructure, ML systems, and applied AI.

At L3Harris, I was responsible for multiple first-responder applications spanning the entire stack. I built UIs, maintained a multi-threaded Golang backend, integrated networking and video streaming, worked with edge ML, built custom Linux OS distros, and worked with embedded systems and hardware.

I've also built a Cornell-backed web/mobile app serving 45,000+ athlete alumni.

On campus, I research AI infrastructure in the Zhang lab. In Spring '26 I worked on a project studying Mixture-of-Experts (MoE) models, and I'll continue that work over the summer. Next semester I'm looking to expand to other projects.


Experience

Cornell Zhang Research Group
Research Assistant: SP26 - Profiled Mixture-of-Experts inference on 8×H100 GPUs for an industry client.
L3Harris Technologies
Software Engineering Intern: Three internships across software, embedded, and FPGA. Returning summer 2026 for ML inference on Qualcomm SoCs.
RapStudy
Software Engineering Intern: Full-stack development on a DoED-backed EdTech platform with a 3-engineer Cornell team.

Projects

Scout (Backed by Cornell)
A platform helping Cornell student-athletes network with and track relationships across 45,000+ Cornell athletic alumni. Solo full-stack build (Next.js + Expo/React Native) on a multi-tenant Postgres backend, shipped to 200+ active users.

Lion AI Detection Suite
Trained a CNN on librosa features and ElevenLabs deepfakes; deployed via ONNX to mobile, Chrome extension, and desktop. Real-time audio capture + sliding-window inference + user alerts, shipped to 20+ users. Built with PyTorch and React Native.

HFT Mixture-of-Experts FPGA Engine
An FPGA trading pipeline that runs end-to-end in 444 ns at 83.3M messages/sec. Register-partitioned limit order book, sparse MoE router pipelined to one trade per cycle, bit-exact RTL/C++ verification in Verilator.

Mini-TensorRT: DL Graph Compiler
A C++ deep-learning graph compiler with a CUDA backend. Conv-ReLU-Add fusion cuts DRAM traffic for a 25% end-to-end speedup; paged-attention KV cache doubles batch concurrency.

Triton GPU Performance Kernels
Fused Triton kernels on H100. LayerNorm runs 45.7% faster with symmetric FP8 quantization. Scaled FlashAttention to 16K context by tiling for SRAM and computing softmax online in one pass.

Digital Level & Impact Monitor
An interrupt-driven tilt sensor and impact monitor on the FRDM-KL46Z (Cortex-M0+). Sleeps in __WFI between PIT timer wakeups; ARM assembly for the trig math; I2C accelerometer reads and UART alerting on impact.


Hackathons

Point72 Cubist Hackathon
Built a modular, AI-orchestrated system for evaluating chess engines. Used Claude via an MCP server to autonomously test, benchmark, and compare diverse AI-generated chess engines using SPRT and perft.

UC Berkeley AI Hackathon
Vocera: Biometric authentication and synthetic voice detection system built with FastAPI, SpeechBrain, and OpenAI Whisper.

AppDev Hack Challenge FA24
LockedIn: Professional networking application. Awarded Best UI.


Some other things about me

I'm chasing a 315 lb (3 plate) bench press; my current PR is 305 lb.
I love golf. Someday I hope to be scratch, but I'm usually around a 10-12 handicap.
I help my cousin run a real estate business on the side. Check it out.