Rizq Khateeb - Machine Learning & Software Engineer

Projects

Business Listings Web Scraper

July 2025 | Independent Project

- Built a full-stack web scraper in 3 days to extract and structure business data, integrating GPT-4 via Stagehand to parse inconsistently formatted listings and display results

- Integrated Supabase for persistent storage by defining schema, storing scraper results via backend API, and dynamically displaying data in a Next.js frontend table

Skills: Next.js · Application Programming Interfaces (APIs) · Web Scraping

Agentic LLM for Medical Coding Exams

July 2025 | Independent Project

- Built a PDF-to-JSON pipeline (Python) with an agentic LLM (GPT-4) interface that selectively retrieved and applied medical code references using structured prompts and RAG-style logic

- Improved accuracy on automated medical coding exams from 84% to 96% by refining retrieval quality and grounding model outputs in curated reference data

GitHub

Skills: Retrieval-Augmented Generation (RAG) · Large Language Models (LLMs) · Medical Coding · Medical Technology

Bedtime Story Generator with AI Critique Loop

May 2025 | Independent Project

- Developed an LLM-driven feedback loop to generate and iteratively refine children's bedtime stories through structured storytelling, evaluation, and revision phases

- Designed and implemented prompt strategies in Python using OpenAI GPT-3.5-turbo to ensure coherence, creativity, and age-appropriate content

GitHub

Skills: Prompt Engineering · Large Language Models (LLMs)

Anomaly Detection for Radio Telescope Data

April 2025 | Independent Project

- Designed and implemented a custom ConvNet model in Python with PyTorch to classify nine distinct anomaly types in LOFAR ROAD radio telescope data, using on-the-fly lazy loading of HDF5 files for efficient memory management and scalable data handling

- Optimized model performance through grid search-based hyperparameter tuning and completed the full development lifecycle (preprocessing, training, and evaluation) within 4 days

GitHub

Skills: Convolutional Neural Networks (CNN) · PyTorch · Memory Management · Data Modeling

LLM Distillation For Financial Reports

August 2024 - December 2024 | Collaborative Project Associated with University of Southern California

- Applied step-by-step distillation techniques to reduce complexity of thousands of financial report entries, improving model interpretability and accessibility for financial forecasting

- Benchmarked large models (LLaMA 3, Claude 3.5, FinGPT) and distilled their reasoning into lightweight models (T5, GPT-2), achieving 99% faster inference and a 13% F1 score improvement on domain-specific tasks

Skills: Deep Learning · Large Language Models (LLMs) · Python (Programming Language)

Emotion Transition And Paraphrasing Using LLMs

January 2024 - May 2024 | Collaborative Project Associated with University of Southern California

- Curated training data from existing datasets and adapted well-known LLMs (GPT-2, BART, and T5) to paraphrasing and emotion transition using zero-shot, few-shot, and supervised fine-tuning methods

- Evaluated the models on six metrics (three custom metrics for emotion transition and three standard paraphrasing metrics) to identify which model best performed emotion transition while retaining sentence meaning

Skills: Large Language Models (LLMs) · Natural Language Processing (NLP) · Fine Tuning

Implementation of Abbasi et al. LMBiS-Net

August 2023 - December 2023 | Collaborative Project Associated with University of Southern California

- Implemented the first open-source version of LMBiS-Net, a ConvNet model for medical image segmentation

- Compared performance against the original research on 4+ datasets, focusing on accuracy and computational efficiency

Skills: Neural Networks · Convolutional Neural Networks (CNN) · Applied Machine Learning

COVID vs. Pneumonia vs. Normal Lung X-Rays

March 2022 - June 2022 | Collaborative Project Associated with University of California San Diego

- Developed a ConvNet-based diagnostic model to classify lung X-rays as COVID, pneumonia, or normal, using image masks for preprocessing and raising accuracy above 80% through data augmentation and hyperparameter tuning

- Tested the model on dozens of X-ray images to measure its performance on automated medical image classification

Skills: Machine Learning Algorithms · Image Masking · CNNs · Image Classification

Unsupervised ML Techniques on CIFAR-10

January 2022 - March 2022 | Collaborative Project Associated with University of California San Diego

- Compared K-Means clustering against a combination of PCA and Gaussian Mixture Models to determine which gave better clustering performance

- Tested methods on CIFAR-10, a publicly available dataset of 60,000 32x32 color images across 10 classes

Skills: Principal Component Analysis · Gaussian Mixture Models · Automated Clustering · Unsupervised Learning

Barcode Scanner for Nutritional Information

March 2021 - June 2021 | Independent Project Associated with University of California San Diego

- Implemented a webcam-based barcode scanner that fetches nutritional data from an online database with thousands of entries

- Used OpenCV (cv2) for image capture and ZBar (pyzbar) for barcode decoding

Skills: Python (Programming Language) · Computer Vision · Applied Computer Science

CS:GO Winner Predictions

March 2021 - June 2021 | Collaborative Project Associated with University of California San Diego

- Predicted winners of esports tournaments using logistic regression on predictive features

- Improved accuracy by choosing the best features with subset selection and validating models with K-fold cross-validation

Skills: Logistic Regression · Subset Selection · Cross Validation · Statistical Data Analysis