Bioinformatics Support Specialist

Location: Pasadena, CALIFORNIA

Caltech is a world-renowned science and engineering institute that marshals some of the world's brightest minds and most innovative tools to address fundamental scientific questions. We thrive on finding and cultivating talented people who are passionate about what they do. Join us and be a part of the diverse Caltech community.


Job Summary

The Ismagilov Lab at Caltech is hiring a full-time Bioinformatics Support Specialist for a 2-year term position. Reporting to Professor Rustem Ismagilov, the successful candidate will contribute to the lab’s key projects and collaborations in human health, which include: developing novel tools that improve biological AI models by orders of magnitude; developing closed-loop systems that combine robotics and AI for rapid autonomous experimentation; discovering novel bacteria, fungi, and viruses in human tissues (and the corresponding genes and pathways) responsible for health and disease; developing novel diagnostic approaches; and understanding the microbial origins of human diseases. Additional details about the lab can be found at http://ismagilovlab.caltech.edu/


Essential Job Duties
  • Assist graduate students, postdocs, and research staff with bioinformatics/computational components of all ongoing projects.
  • Contribute to manuscript preparation and grant proposals, data visualization, and scientific presentations.
  • A highly motivated and qualified candidate may additionally develop their own independent aspects of ongoing projects.

Bioinformatic Analyses
  • Learn and apply state-of-the-art statistical and computational methods for analyzing datasets from shotgun metagenomics, amplicon sequencing, and RNA-sequencing experiments.
  • Assist in the design, implementation, and benchmarking of in-silico experiments.
  • Generate computational results and publication-quality figures for clear communication of results to lab members and external collaborators.

Scripting and Automation
  • Collaborate with lab members and external colleagues to learn and implement best practice methods and packages for data analysis and processing in the lab.
  • Work with lab members to understand needed bioinformatic capabilities and develop tools to accomplish them.
  • Work with lab members and external collaborators to help them develop, execute, and maintain state-of-the-art data processing pipelines on Caltech high-performance computing (HPC) clusters.
  • Develop and maintain scripts for routine analysis tasks such as qPCR quantification.
  • Develop and maintain scripts for automated robotic platforms to perform laboratory procedures.
  • Help lab members test and integrate LLM-based AI tools to improve efficiency of research tasks such as experimental design and technical writing.

Bioinformatic Maintenance
  • Maintain well-organized computational workspaces, including data storage, version-control systems, containerization, and documentation in a neat, detailed electronic lab notebook.
  • Maintain reference databases of microbial genomes, gene clusters, antimicrobial resistance markers, and metabolic pathways.
  • Maintain organized and version-controlled repositories and containerized computational environments (Docker/Conda), using Python and/or R.
  • Assist with data management on remote computing clusters and back-up servers, including organizing raw and processed data, maintaining version control, and ensuring data integrity and reproducibility.

Basic Qualifications
  • Experience with programming languages for scripting and data analysis: Python, R, and Bash required.
  • Bachelor’s degree in science with educational emphasis in computational sciences, bioinformatics, biology, microbiology, computer science, computational biology, data science, or a related field required at time of application.
  • Familiarity with basic molecular biology topics.
  • Experience (including coursework or internships) in data analysis using at least one scripting or programming language (e.g., Python, MATLAB, R).
  • Experience working with next-generation sequencing data and/or big data (e.g., genomics, RNA-seq, single-cell RNA-seq, epigenomics, proteomics, metabolomics).
  • Experience working in Linux/Unix environments.
  • Strong drive to learn and to make a positive impact on the lab and its mission.
  • Ability and desire to collaborate and support collaborative projects. 
  • Highly detail-oriented; must have examples of well-documented code (e.g., Git repositories) involving complex technical tasks or data analyses available upon request.
  • Ability to rapidly learn technical skills, particularly in computational methods and bioinformatic workflows.
  • Strong organizational, time-management, and project-management skills; ability to professionally and continuously communicate priorities, workload, and current progress to all lab members.
  • Must be able to use sound judgment to handle multiple simultaneous tasks, prioritize effectively, and troubleshoot problems independently.
  • Must be able to work independently as well as part of a collaborative team in a dynamic, fast-paced environment.
  • Ability to write clean, well-documented, and reproducible code for data processing and analysis.
  • Ambition to develop research skills and advance scientific understanding through high-quality computational analyses and interpretation of biological data. 

Preferred Qualifications
  • Experience with cloud computing environments (e.g., AWS, Google Cloud, or institutional equivalents) and using High-Performance Computing (HPC) clusters with the Slurm workload manager.
  • Experience performing bioinformatic analyses involving shotgun metagenomics, RNA sequencing, or 16S rRNA gene sequencing.
  • Experience with workflow management systems for pipeline automation (e.g., Snakemake, Nextflow).
  • Proficiency with Git for collaborative coding, project tracking, and version control.
  • Experience performing statistical analyses or modeling of complex biological datasets.
  • Experience with containerization and environment management (Docker, Singularity, Conda).
  • Experience creating visualizations or interactive notebooks (e.g., Jupyter, R Markdown, Dash).
  • Experience fine-tuning LLMs for analysis or writing tasks.
  • Co-authorship on a scientific publication.
  • Solid understanding of statistical analysis methods (null hypothesis significance testing, probability distributions, etc.).
  • Ambition to pursue advanced training or a career in computational biology or bioinformatics, or to enter a top medical or graduate school or a similarly high-performance, high-impact environment in industry or academia.

Required Documents
  • Resume (describe relevant experiences and accomplishments).
  • Cover letter with specific examples demonstrating a) the candidate’s alignment with the lab’s mission; b) the candidate’s level of ambition, motivation, and drive; c) the candidate’s attention to detail and ability to produce data others can trust; d) the candidate’s level of technical mastery of the three most relevant experimental skills described in the resume.
  • GitHub (preferred) or code examples.

