AI for Cybersecurity
I navigate the cyber landscape with a passion for creating tools that shield people from constant online threats. Currently a PhD Candidate in Information Studies at McGill University, I focus on the intersection of Machine Learning and System Security.
As a determined, fast learner, I deeply value diversity, open communication, and interdisciplinary collaboration. My work ranges from developing novel embeddings for binary code using foundation models to building intelligent fuzzing tools for Web Application Firewalls. Outside the lab, you can find me enjoying music, technology, and bicycling.
McGill University, Canada
Jan 2022 - Present | GPA: 4.00/4.00
Focus: Code Understanding, LLM-based Embeddings, Compiler Behavior Prediction.
Shahrood University of Technology
2018 - 2021 | GPA: 3.54/4.00
Thesis: Intelligent fuzzing for WAF vulnerability detection.
Shahrood University of Technology
2014 - 2018
Université de Montréal
June 2024
Canadian Centre for Cyber Security (CCCS)
July 2023
McGill University
2022 - Present
Teaching Python Programming at the School of Information Studies, McGill
Designed practical SQL injection workshops (OWASP ZAP, sqlmap) and taught programming to graduate students.
My current research focuses on bridging the gap between binary analysis and modern AI, making low-level code understandable and secure.
The Challenge: Different compilers optimize code differently, so the same source code can look completely different once compiled to binary.
The Solution: I developed a model using Llama 3 that generates unified embeddings for binary functions, neutralizing compiler variations to enable accurate cross-compiler similarity detection.
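As a rough illustration of what similarity detection over function embeddings can look like, here is a minimal sketch. It assumes embeddings are plain float vectors compared with cosine similarity; the metric, the helper names, and the toy vectors are all assumptions for illustration and are not taken from the actual model.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query, candidates):
    """Return the name of the candidate embedding closest to the query.

    `candidates` maps a (hypothetical) function name to its embedding vector.
    """
    return max(candidates, key=lambda name: cosine_similarity(query, candidates[name]))


# Toy vectors standing in for embeddings of functions built with different compilers.
query_embedding = [0.9, 0.1, 0.0]
candidate_embeddings = {
    "parse_header_clang": [0.8, 0.2, 0.1],
    "checksum_clang": [0.0, 0.1, 0.9],
}
match = most_similar(query_embedding, candidate_embeddings)
```

If the embeddings really do neutralize compiler variation, two builds of the same function should land close together under a metric like this, while unrelated functions stay far apart.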
The Challenge: Function inlining (where a compiler replaces a function call with the function body) destroys the structural signature of code, confusing similarity detectors.
The Solution: FIN predicts inlining decisions and normalizes the binary representation. This allows security tools to match code even when heavy optimization has altered its structure.
The Goal: To help reverse engineers understand legacy or malware code faster.
The Contribution: We created a dataset and a model that generate human-readable natural language descriptions from raw assembly code, effectively "documenting" binaries automatically.
The Problem: Web Application Firewalls (WAFs) are often bypassed by novel payloads.
The Solution: I proposed RAT, a Reinforcement Learning agent that adaptively learns to bypass WAFs. It uses an epsilon-greedy policy to mutate payloads, discovering vulnerabilities that static scanners miss.
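For readers unfamiliar with epsilon-greedy policies, the sketch below shows the general idea on a simulated problem. It is not RAT's implementation: the arms stand in for abstract mutation operators, the reward is a simulated success probability rather than a real WAF response, and every name and parameter here is hypothetical.

```python
import random


def epsilon_greedy_select(q_values, epsilon, rng):
    """Explore a random action with probability epsilon, otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return q_values.index(max(q_values))


def run_bandit(reward_probs, episodes=2000, epsilon=0.1, seed=0):
    """Estimate the value of each action against a simulated environment.

    `reward_probs[i]` is the (assumed) chance that action i succeeds.
    Returns the learned value estimates and how often each action was tried.
    """
    rng = random.Random(seed)
    q_values = [0.0] * len(reward_probs)
    counts = [0] * len(reward_probs)
    for _ in range(episodes):
        action = epsilon_greedy_select(q_values, epsilon, rng)
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental mean: move the estimate toward the observed reward.
        q_values[action] += (reward - q_values[action]) / counts[action]
    return q_values, counts


# Three abstract mutation operators with made-up success rates.
estimates, tries = run_bandit([0.1, 0.7, 0.3])
```

The small exploration rate is the design choice that matters: most steps reuse the operator that has worked best so far, while occasional random picks keep the agent from locking onto an early, unlucky estimate.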
Methods and Systems for Generating Description for Assembly Functions
Yuki, J.Q., Amouei, M. and Fung, B.C.M., BlackBerry Limited, 2024. U.S. Patent Application 18/339,139.
Open source contributions and tools developed during my research.
Tool to analyze ELF binaries and identify/normalize inlined functions to create ground truth datasets.
Resources and scripts related to intelligent fuzzing for web application firewalls (based on MSc work).
Learning Embeddings of Neutralized Assembly using Llama with Self-supervision.