ShellSquare Softwares (P) Ltd

1st Floor, Periyar Building, Technopark Phase I, Kerala, 695581

Data Scientist

Closing Date: 22 Sept 2025
Job Published: 22 Aug 2025

Brief Description

About the Role: 

We are looking for a Generative AI Expert with strong knowledge of Retrieval-Augmented
Generation (RAG) and machine learning/deep learning (ML/DL). You will work on building intelligent
systems that combine large language models (LLMs) with document retrieval to generate accurate
and context-aware responses.

Your role will involve developing and improving ML/DL models, fine-tuning LLMs, and integrating
retrieval systems using vector databases. You’ll collaborate with cross-functional teams to build
real-world AI solutions that make use of both unstructured data (like PDFs and web pages) and
structured sources.
 
Key Responsibilities: 

• Design, build, and optimize RAG pipelines for document-level and multi-turn QA systems.
• Fine-tune or prompt-tune foundation models (LLMs) for domain-specific tasks.
• Develop and deploy ML/DL models to support NLP/NLU tasks like summarization,
classification, and retrieval scoring.
• Integrate vector databases, semantic search tools, and embedding models for
high-performance document retrieval.
• Work with unstructured and semi-structured data sources (PDFs, HTML, JSON, SQL, etc.).
• Collaborate with data engineers, ML engineers, and product teams to build end-to-end
generative AI solutions.
• Monitor performance, latency, and relevance metrics; iterate on retrieval and generation
models.
• Implement prompt engineering strategies and hybrid approaches (rule-based + neural) to
enhance model reliability.
• Contribute to research and innovation in applied generative AI, and stay up-to-date with the
latest in LLM, RAG, and MLOps ecosystems.

Key Skills Required: 

• Strong experience with RAG architectures and hybrid retrieval systems.
• Solid hands-on knowledge of LLMs (e.g., GPT, Mistral, LLaMA, Claude, DeepSeek) and
embedding models (e.g., SBERT, OpenAI, Hugging Face models).
• Proficiency in machine learning / deep learning using PyTorch, TensorFlow, Hugging Face
Transformers, etc.
• Experience with vector databases (e.g., FAISS, Weaviate, Pinecone, Qdrant).
• Experience in text chunking, retrieval scoring, prompt tuning, or LoRA/PEFT methods.
• Strong background in NLP, information retrieval, and knowledge graphs is a plus.
• Comfortable with Python and associated data science stacks (Pandas, NumPy, Scikit-learn).
• Experience working with real-world messy data (PDF parsing, OCR, HTML scraping, etc.).

Preferred Skills

LLM, RAG, ML/DL