My research interests include Systems for ML, HW/ML Model Co-Design, Hardware-Aware ML, and, more recently, Reinforcement Learning.
My research focuses on designing scalable and efficient systems and ML models through HW/ML model co-design, aiming to achieve the best of both worlds. I am currently working on co-exploration of quantization-aware DNN accelerators and models via architecture-level modeling and efficient design space exploration. More recently, I have also been working on scalable and efficient reinforcement learning training on CPU-GPU systems. In earlier work, I explored how emerging non-volatile memories can be utilized in GPU architectures for DL workloads.