Sajid Hussain

Ex-ML Intern,

MS Data Science (AI/ML Specialization)
New Jersey Institute of Technology (NJIT)






Sajid has a proven track record of applying Python, Machine Learning (GitHub), and Deep Learning. His research interests lie at the intersection of Natural Language Processing, Generative AI, and data science.

Before joining NJIT, he qualified for CBT-2 State Railways as part of his civil services preparation. He is an ardent reader who also follows the economy and global affairs.


Generative AI: Working with Large Language Models

AWS Machine Learning Specialty (P)

Introduction to Large Language Models (Google Cloud)

Generative AI Fundamentals (GCP)

AWS Certified Cloud Practitioner (P)

Deep Learning Specialization

Natural Language Processing Specialization

ChatGPT Prompt Engineering for Developers

Natural Language Processing
Deep Learning
Machine Learning
Applied Statistics
Data Mining
Language Models
Capstone Project
Big Data
Data Analytics with R Programming
Data Visualization


Deployed and configured Llama 2 (a large language model) on AWS SageMaker, and optimized it for efficient resource utilization and model performance.

Diagnosing and Mitigating Bias in Large Language Models
• Evaluated biases in large language models (LLMs) such as BERT and GPT-2, addressing gender, race, and cultural biases. [BLOG]

Speaker Classification using Transformers
• Applied self-attention mechanisms and tuned transformer parameters on a speech dataset of 600 speakers, reducing training time by 20% while reaching a classification accuracy of 96.85%.

Adversarial Attack
• Implemented non-targeted iterative FGSM (I-FGSM) from scratch in PyTorch on the CIFAR-10 dataset to generate adversarial samples that reduced the model's accuracy from 95% to 0%.
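
For readers unfamiliar with the attack, the update rule behind non-targeted I-FGSM can be sketched without any deep learning framework. This is a minimal illustration only, not the PyTorch/CIFAR-10 implementation described above: the toy logistic model, its weights, and the values of `eps`, `alpha`, and `steps` are all assumptions chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic model (hypothetical stand-in for a CNN)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def input_gradient(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict_proba(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def ifgsm(w, b, x, y, eps=0.1, alpha=0.02, steps=10):
    """Non-targeted iterative FGSM: repeatedly step along the sign of the loss
    gradient, then project back into the L-infinity ball of radius eps around x."""
    x_adv = list(x)
    for _ in range(steps):
        grad = input_gradient(w, b, x_adv, y)
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Keep each feature within eps of the original and inside the valid range [0, 1].
        x_adv = [min(max(xa, x0 - eps, 0.0), x0 + eps, 1.0)
                 for xa, x0 in zip(x_adv, x)]
    return x_adv

# Assumed toy values: a 3-feature input that the model classifies correctly as class 1.
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.6, 0.4, 0.5], 1
x_adv = ifgsm(w, b, x, y)
print(predict_proba(w, b, x), predict_proba(w, b, x_adv))
```

In this toy case the clean input scores above 0.5 for the true class and the perturbed input scores below it, even though no feature moves by more than `eps`.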

Mastering Lunar Landing: Cutting-edge AI with OpenAI's Gym
• Applied a policy gradient algorithm (deep reinforcement learning) to guide the lunar lander to an accurate landing.
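
One building block common to many policy gradient methods is the discounted return that weights each action's log-probability. The sketch below assumes a REINFORCE-style setup, which the project description does not confirm; the function names and the value of `gamma` are illustrative.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every step t."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns

def normalize(values):
    """Standardize returns to zero mean and unit variance, a common variance-reduction step."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / (std + 1e-8) for v in values]

episode_rewards = [1.0, 0.0, 2.0]  # assumed toy episode
returns = discounted_returns(episode_rewards, gamma=0.5)
print(returns)             # [1.5, 1.0, 2.0]
print(normalize(returns))
```

In a full training loop, the policy loss would typically be the negative sum of each action's log-probability multiplied by its normalized return.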

Unveiling Culinary Delights with Image Classification Technology
• Trained a neural network model on a dataset of 9,867 JPG images, followed by hyperparameter tuning, optimizer selection, exploration of more complex architectures suited to image classification, and consideration of cross-validation techniques to improve model generalization.

Revolutionizing Wireless Communication: User Localization Through Self-Supervised Learning in IoT Environments
• Explored the efficacy of self-supervised learning for user localization based on Channel State Information (CSI), aiming to reduce reliance on labeled data and accommodate changing radio environments.

• Utilized a dataset obtained through a massive MIMO channel sounder, comprising labeled samples with real and imaginary parts of estimated channel matrices, signal-to-noise ratios (SNR), and ground truth positions of transmitters, as well as unlabeled and test data.


Case Study: Quora Question pair similarity problem
Identified which questions asked on Quora are duplicates of questions that have already been asked.

Real world problem: Predict rating given product reviews on Amazon
Used Natural Language Processing (NLP) and machine learning to extract valuable insights from Amazon product reviews. Predicted rating values based on customer reviews, providing a valuable tool for businesses to understand customer sentiment and make data-driven decisions.

Text Summarization & Keyword Extraction
Extracted key information from lengthy texts using the NLTK library and experimented with TF-IDF vectors. Can be extended to support efficient studying and lesson preparation from textbooks.
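
As a rough illustration of TF-IDF keyword extraction, the sketch below ranks the terms of each document by term frequency times inverse document frequency. It uses only the standard library rather than NLTK, and the regex tokenizer, the unsmoothed `log(N / df)` weighting, and the sample sentences are assumptions made for the example; library implementations usually apply smoothing and normalization.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep alphabetic runs (a simplification of real tokenizers)."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores = {term: (count / total) * math.log(n_docs / doc_freq[term])
                  for term, count in counts.items()}
        # Sort by descending score, breaking ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:top_k])
    return results

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang",
]
keywords = tfidf_keywords(docs)
print(keywords[0])  # ['mat', 'on', 'sat']
```

Terms that occur in every document, such as "the" here, receive a score of zero and fall to the bottom of the ranking.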

HR Dashboard
Designed and developed an HR Dashboard using Tableau.

Data Science Kaggle Challenge
Achieved a SMAPE score of 57.3 in the “AMP®-Parkinson’s Disease Progression Prediction” competition hosted by Kaggle, using protein and peptide measurements from patients.
• Experimenting to discover which molecules change as Parkinson’s disease progresses.
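
For context, SMAPE (symmetric mean absolute percentage error) in its common form ranges from 0 to 200, with lower values being better. The sketch below implements that common definition; the competition's exact scoring variant is not stated here, so treat the function and its sample numbers as illustrative assumptions.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (0 is perfect, 200 is worst)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        if denom == 0.0:
            continue  # both values are zero: treat the term as zero error
        total += abs(f - a) / denom
    return 100.0 * total / len(actual)

# Assumed toy values, not competition data.
print(round(smape([100, 200], [110, 180]), 3))  # 10.025
```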

Recommendation system
Performed exploratory data analysis on 100 million ratings from 480,000 randomly chosen, anonymous Netflix customers over 17,000 movie titles and computed similarity matrices.
• Presently building a recommendation system for a fast-growing startup with more than 200,000 page visits.

Web-Scraped data
Scraped and extracted the following fields from the Chem-Bio Informatics Journal: Title, Authors, Author Affiliations, Correspondence Author, Correspondence Author's Email, Publish Date, Abstract, Keywords, and Full Paper (text format).