Sr. Data Engineer
Scipher Medicine
Scipher Medicine is a precision immunology company that uses AI and network science to match patients with the most effective therapies. Its flagship product, PrismRA, is a blood-based test predicting whether a patient with RA will likely respond to anti-TNF therapy. Scipher Medicine is also developing the Spectra platform, which can be used to discover and validate new drug targets for various autoimmune diseases. The company has made significant progress quickly, with its PrismRA test already being used by clinicians in the US and Europe, and its Spectra platform has the potential to revolutionize drug development for autoimmune diseases.
Overview:
We are seeking a skilled and autonomous Data Engineer to join our team. We are interested in candidates with experience designing, developing, and maintaining data ingestion processes into data warehouses, as well as other data pipelines. This is a full-time remote position with minimal required travel in Massachusetts. As a key member of our team, you'll be at the forefront of our data infrastructure transformation. Our dynamic environment and cutting-edge technology offer a unique chance to bring your innovative data pipeline concepts to life rapidly.
Role Requirements:
Bachelor's degree in Science or Engineering; 5-7 years of experience required.
Healthcare Data Wrangling: Design, develop, and deploy complex data pipelines for ingesting, processing, normalizing and transforming healthcare data from various sources, including electronic health records (EHRs), claims data, labs, genomics, clinical trials, and other sources. Ensure data pipelines are optimized for performance and scalability, utilizing cloud-based technologies and big data tools.
Ability to translate business requirements into effective data models. Build and maintain data warehouses and data lakes to support analytics, reporting, and customer delivery needs.
Autonomous learning. We need folks who enjoy parsing large swaths of software documentation and can quickly gain enough proficiency to build POCs that are viable candidates for production-ready processes and pipelines.
Advanced SQL skills, including schema design, CTEs, complex stored procedures, tasks, etc. You’re familiar with standard data types but also variant types (JSON, XML). Extra points if you’re familiar with Snowflake-flavored syntax and features (stages, external functions, Snowpipe, Python UDFs, external tables, etc.).
Strong Python OOP: We're seeking individuals well-versed in class utilization, inheritance, and functional programming. Live debugging and Python package creation should be second nature to you. Your code is clean, pythonic, and readable, and you appreciate a great formatter.
Strong understanding of healthcare data standards and terminologies (HL7, FHIR, ICD-10, LOINC, CPT, medical claims 837/835, etc.). Bonus points for JavaScript, Mirth Connect, or Rhino JavaScript XML experience.
Data Security: Familiarity with healthcare data security and privacy regulations (HIPAA). Bonus points for experience with Datavant or other de-identification, tokenization, and data linking technologies.
Constant Learner: Stay abreast of the latest technologies and trends in healthcare data engineering and big data.
You’re great at creating and learning from internal documentation, and we’re great at documenting. You understand that playbooks (as we call our documentation) are living documents, not historical relics, and you update the playbook whenever an issue arises so we’re better prepared for next time.
Nice to have – in order of importance:
Databricks Experience – we work with Snowflake, Athena, and Databricks. A lot of our future work will be Databricks-centric, and you’ll need to build ways to move data between Snowflake, AWS S3, and Databricks.
AWS Experience – Extra points if you can speak to expertise in SAM, Lambda, S3, boto3, API Gateway, or anything else you see as obviously missing from this list and can explain to us why.
DevOps and CI/CD experience, including automating the promotion of code through dev/test/prod workflows. Extra points for Terraform experience.
Experience standing up an orchestration tool. We’ve got our eye on Prefect.