Data Scientist

Scipher Medicine

Data Science
Waltham, MA, USA
Posted on Monday, May 27, 2024

About Us:

Scipher Medicine is a precision immunology company that uses AI and network science to match patients with the most effective therapies. Its flagship product, PrismRA, is a blood-based test that predicts whether a patient with rheumatoid arthritis (RA) is likely to respond to anti-TNF therapy. Scipher Medicine is also developing the Spectra platform, which can be used to discover and validate new drug targets for a variety of autoimmune diseases. The company has made significant progress in a short time: clinicians in the US and Europe already use the PrismRA test, and the Spectra platform has the potential to revolutionize drug development for autoimmune diseases.

Job Description:

We're seeking a proactive and highly skilled Data Scientist with a background in Computer Science, Machine Learning (ML), or other computational sciences, coupled with expertise in Google Cloud Platform (GCP). As our Data Scientist, you will be involved in the design, development, and maintenance of our data ingestion and ML pipelines. This role demands a deep understanding of data automation, ETL operations, and data orchestration, particularly within the GCP environment. Additionally, you'll be responsible for model development across various domains, leveraging techniques like neural networks (NN), graph convolutional networks (GCN), random forests (RF), optimization, and feature selection methods.

Summary:

We're seeking a Data Scientist experienced in data ingestion for ML pipelines and proficient in leading model development across a range of techniques, including NN, GCN, RF, optimization, and feature selection methods. The ideal candidate will possess strong data engineering skills and hands-on experience with GCP services such as Cloud Storage, BigQuery, and Dataflow.

Key Responsibilities:

Develop and deploy advanced machine learning models on the GCP platform.

Collaborate closely with data scientists to translate Python-based data processing and analysis pipelines into scalable PySpark ETL processes suitable for deployment on cloud platforms such as AWS or GCP.

Conduct thorough data analysis and preprocessing to ensure data quality and reliability.

Implement a wide range of models, including but not limited to Neural Networks (NN), Graph Convolutional Networks (GCN), Random Forests (RF), and optimization algorithms.

Perform feature selection and engineering to enhance model performance and interpretability.

Utilize best practices in model development and optimization to achieve desired outcomes.

Evaluate and benchmark model performance using appropriate metrics and validation techniques.

Stay updated with the latest advancements in machine learning research and technologies.

Requirements:

PhD in Computer Science, Statistics, Mathematics, or a related quantitative field.

2+ years of experience in data engineering (including academic experience).

Proven experience in developing and deploying machine learning models, preferably in a commercial setting.

Strong understanding of various machine learning algorithms and techniques, such as Neural Networks, Graph Convolutional Networks, Random Forests, and optimization methods.

Excellent programming skills in languages such as Python or R.

Ability to work effectively in a collaborative, team-oriented environment.

Exceptional problem-solving and analytical abilities.

Excellent communication and presentation skills, with the ability to convey complex concepts to non-technical stakeholders.

Skills:

Machine Learning: Neural Networks, Graph Convolutional Networks, Random Forests, optimization algorithms.

Proficiency in data modeling and database design.

Strong programming skills, especially in languages such as Python.

Knowledge of statistical analysis, data preprocessing and feature engineering, and model evaluation and validation.

Strong analytical and problem-solving abilities.

Excellent communication and collaboration skills to work effectively with cross-functional teams.

To Apply:

If you are a highly motivated and results-driven data scientist with an eagerness to work with cutting-edge GCP services, we encourage you to apply. Join us in shaping the future by enabling data-driven decisions through our ML pipeline.

Additional Information:

As we embark on this exciting journey, we are committed to fostering an inclusive and diverse work environment. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. Join us in this transformative project where you can contribute to cutting-edge data solutions while being a part of a culture that values and celebrates diversity.