We are looking for passionate and autonomous PhD students, within one or two years of their graduation date, to contribute to our innovative projects on multimodal foundation models for language, image, and clinical data analysis. Students will work with a dynamic team that is pushing the boundaries of AI and machine learning in the healthcare domain and building real-world products. They will research best practices for data curation, data preparation, and training of our state-of-the-art foundation models, and tackle some of the hardest problems in computer science. Our internships are designed as either a summer or a yearlong program, depending on the company’s needs and the candidate’s preferences. Join us and build a rewarding career while making a significant impact on the future of healthcare and patient well-being.
Requirements
- PhD students within one or two years of their graduation date, with a research focus on multimodal (language and image) foundation models.
- Excellent programming skills in Python and C/C++.
- Proficiency with the PyTorch framework and common NLP libraries (NLTK, spaCy, scikit-learn, etc.).
- Experience with state-of-the-art deep learning architectures (ViT, Swin Transformer, GPT, CLIP, etc.).
- Experience with model compression and large-scale distributed training techniques (data-parallel and model-parallel).
- An outstanding track record of publications (NeurIPS, CVPR, ICML, AAAI, etc.) and contributions to the machine learning community (Kaggle, Hugging Face, etc.).
- Hands-on experience with parameter-efficient fine-tuning techniques (e.g., QLoRA) and the LangChain framework is a plus.
- Experience with prompt engineering and fine-tuning Llama 2 or PaLM 2 on domain-specific data is a plus.
- Experience with CUDA programming is a plus.
- Experience with image processing libraries (OpenCV, VTK, ITK, DCMTK, Albumentations) is a plus.
- Experience with vector databases (Chroma, Pinecone, Milvus, Redis, etc.) is a plus.
- Experience with human-in-the-loop workflows, Reinforcement Learning from Human Feedback (RLHF), and continuous online training is a plus.
- Knowledge of Stable Diffusion, OpenJourney, or DeepFloyd IF is a plus.