I am an incoming graduate student in Data Science at the Luddy School of Informatics, Computing, and Engineering at Indiana University Bloomington, starting this fall. My interests span several domains in which I have previously worked, including, but not limited to, Network Analysis, Semantic Parsing, Speech Processing, and Natural Language Processing. In my senior year, I interned as a Data Scientist on the corporate strategy team of Tata Communications, where I built an AI-powered Network Expansion Planning tool. Previously, I worked on code-mixed, low-resource NLP (Dravidian language family) under the supervision of Prof. Bharathi Raja Chakravarthi. I am deeply fascinated by Natural Language Processing and its applications in multimodal learning and related interdisciplinary fields.
I recently completed my bachelor's degree in Computer Science and Engineering at the Indian Institute of Information Technology Tiruchirappalli.
Projects:
1. AI-based Network Expansion Planning -- (Mar 2022 - Jul 2022)
2. AI-Powered real-time Fraud Prevention as a Service (FPaaS) -- (Dec 2021 - Feb 2022)
Developed a multi-task learning framework for Sentiment Analysis and Offensive Language Identification in Dravidian languages. Constructed code-mixed Kannada-English datasets scraped from YouTube.
Developed a deep-learning architecture to detect diabetic retinopathy at two levels: binary classification and fine-grained classification. Used attention-based vision models along with several data preprocessing techniques to achieve the best results.
- This project seeks to identify potential high-bandwidth zones across cities of all tiers (T1, T2, T3, etc.) in India, where Tata Communications can expand its network presence to cater to customers' business needs.
- Implemented Diameter Clustering, a cutting-edge clustering algorithm
- Created a Streamlit-based web application and deployed it with Docker for use by Tata Communications' Network Expansion team
- The application pinpoints potential places for network expansion based on feasibility requests and gives the coordinates of each cluster
Developed multi-task learning frameworks to address the lack of annotated data for low-resource languages. Experimented with several loss functions and benchmarked the models on datasets for three low-resource languages. Experimental results indicate that multi-task learning is effective on very closely related tasks and that its effectiveness depends on the loss function assigned to each task. (Source code)
Constructed a corpus of ~7K comments for sequence classification. Used the Google Translate API to translate all the code-mixed sentences into English. Computed a weighted sum of the outputs from the two datasets to improve the performance of the models. (Source code)