About Me
I currently work as a Senior Applied Scientist on Amazon's Artificial General Intelligence (AGI) - Customizations team. In this role, I lead the development and launch of Amazon Nova customization features, with a focus on advanced fine-tuning capabilities that let customers tailor foundation models to their specific needs. My expertise spans reinforcement learning techniques including Direct Preference Optimization (DPO), Proximal Policy Optimization (PPO), and Group Relative Policy Optimization (GRPO). I have architected and deployed scalable reinforcement learning from human feedback (RLHF) pipelines that improve model alignment and performance for enterprise applications. My work contributes directly to Amazon Nova's customization infrastructure, enabling customers across industries to fine-tune large language models for their own domains and use cases. Through close collaboration with cross-functional teams, I integrate state-of-the-art machine learning techniques into production systems that serve AWS's global customer base.
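To make the preference-optimization work above concrete, here is a minimal sketch of the DPO objective in PyTorch. This is illustrative only, not Amazon Nova's implementation; the function name, tensor shapes, and the default beta are my own assumptions.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss computed from per-sequence log-probabilities.

    Each argument is a 1-D tensor of shape (batch,): the summed log-prob a
    model assigns to the chosen or rejected completion of each prompt.
    beta controls how strongly the policy is kept close to the reference.
    """
    # Implicit rewards: policy-to-reference log-ratios, scaled by beta.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

In practice the log-probabilities come from a forward pass of the policy and a frozen reference model over the same preference pairs; only the policy receives gradients.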
Research Background
My graduate research at UMass Amherst with Prof. Andrew McCallum centered on data- and compute-efficient NLP via self-supervision, meta-learning, and domain-specific pre-training. I helped design diverse distributions of self-supervised tasks for meta-learning, introducing frequency- and cluster-aware sampling, sentence-level task construction, and an easy-to-hard curriculum, which improved few-shot transfer (e.g., up to +4.2 average points over prior unsupervised baselines and competitive 5-shot results on FewRel 2.0). In parallel, I developed an unsupervised pre-training objective for biomedical QA that denoises corrupted entity mentions to teach span-level reasoning from unlabeled text, significantly boosting BioBERT and outperforming the previous best system on BioASQ 7b Phase B. Collectively, this work advanced curriculum-driven task design and domain-aware pre-training, and it shaped my later focus on alignment and preference-optimization methods (e.g., RLHF, DPO, PPO, GRPO) for efficient LLM adaptation under limited supervision.
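The entity-denoising idea can be pictured with a toy example. The sketch below illustrates the general recipe under my own simplifying assumptions (function name, data layout, and the example sentence are hypothetical, not the paper's exact corruption procedure): an entity mention in an unlabeled sentence is swapped for a distractor, and the model is trained to recover the original span from context.

```python
import random

def make_denoising_example(tokens, mention_span, distractor_mentions, rng=random):
    """Toy illustration: corrupt one entity mention and keep the original
    text as the span a QA-style model must recover from context.

    tokens: list[str]                     -- tokenized sentence from unlabeled text
    mention_span: (start, end)            -- token indices of an entity mention
    distractor_mentions: list[list[str]]  -- mentions mined elsewhere in the corpus
    """
    start, end = mention_span
    answer = " ".join(tokens[start:end])          # gold span to predict
    distractor = rng.choice(distractor_mentions)  # corruption: swap in another entity
    corrupted = tokens[:start] + list(distractor) + tokens[end:]
    return {"context": " ".join(corrupted), "answer": answer}

# Example: "aspirin" is corrupted; the model must recover it from context.
example = make_denoising_example(
    ["aspirin", "inhibits", "platelet", "aggregation"],
    (0, 1),
    [["warfarin"], ["ibuprofen"]],
)
```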
Interests
- Reinforcement Learning from Human Feedback (RLHF)
- Direct Preference Optimization (DPO) and Policy Optimization
- Proximal Policy Optimization (PPO) for Model Alignment
- Group Relative Policy Optimization (GRPO) (see the sketch after this list)
- Foundation Model Customization and Fine-tuning
- Large Language Model Optimization
- Natural Language Processing and Deep Learning
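For the GRPO item above, a minimal sketch of the group-relative advantage that GRPO substitutes for a learned value baseline, before it enters a PPO-style clipped update. The function name and tensor layout are my own illustrative choices.

```python
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Standardize rewards within each group of completions sampled from the
    same prompt; the group statistics replace a learned value baseline.

    rewards: shape (num_prompts, group_size), one scalar reward per completion.
    """
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 sampled completions each.
adv = group_relative_advantages(torch.tensor([[1.0, 0.0, 0.5, 0.2],
                                              [0.9, 0.9, 0.1, 0.3]]))
```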