Kareem Ibrahim

Ph.D. Candidate, Computer Engineering, University of Toronto

I research efficient AI systems, with a focus on compression, quantization, data movement, and hardware/software co-design for bandwidth- and memory-constrained deployment.

About

I am a Ph.D. candidate in Computer Engineering at the University of Toronto, advised by Prof. Andreas Moshovos. My work sits at the intersection of machine learning systems, computer architecture, and practical AI deployment. Recent projects span tensor compression, low-precision representations, training-efficiency evaluation, computational graphics systems, and responsible human-in-the-loop AI workflows.

Research

Efficient ML Systems, Quantization & Compression

Compression and quantization-aware methods for ML workloads where data movement is a bottleneck, with emphasis on practical representations that reduce bandwidth and memory pressure without changing model behavior.

  • Low-precision and entropy-aware representations for ML tensors
  • Compact codec state and high-throughput compression

Representative: Shannonic; Atalanta; Every Bit Matters

Hardware/Software Co‑Design for AI Data Movement

Systems work that connects compression algorithms and quantized data formats to real implementation constraints, including codec state, cache behavior, throughput, hardware cost, and deployment across memory, network, and edge–cloud links.

  • Codec-aware software and hardware implementations
  • Data movement reduction across system boundaries

Representative: Shannonic; Atalanta

Evaluation Methodology for Efficient Training

Evaluation methods for efficient training systems, with emphasis on fair comparisons, reproducible baselines, and metrics that reflect practical end-to-end training behavior.

  • Wall-clock and time-to-target evaluation
  • Benchmarking methodology for training acceleration

Representative: Selective Workload Skipping (IHIET‑AI 2025)

Computationally Efficient Graphics

Systems work on efficient graphics pipelines, with emphasis on compression, latency, quality, and resource-aware deployment for emerging visual representations.

  • Computationally optimal Gaussian splatting
  • Rate–distortion–latency trade-offs

Representative: COGS; Gaussian splatting systems

AI-Assisted Educational Systems

Human-in-the-loop AI-assisted grading workflows for large programming courses, focusing on rubric alignment, grading consistency, human oversight, and responsible deployment.

  • Rubric-grounded feedback and grade suggestions
  • Instructor- and TA-controlled grading workflows

Current work: AI-assisted grading for large programming courses

Selected Prior & Collaborative Work

Broader systems work targeting on-device neural interfaces, with emphasis on reducing resource requirements while preserving application-level quality.

  • Scalable spike sorting for untethered brain-machine interfaces
  • Energy- and throughput-aware deployment

Representative: Marple (ASPLOS 2024)

News

Publications

2026

Under Submission

2025

2024

Earlier Publications

Awards

Teaching