Embark on a Rewarding Career in Data Curation at GSK: Shape the Future of R&D Analysis
Are you a technically adept professional passionate about transforming raw data into high-quality, actionable assets for groundbreaking R&D? GSK is seeking talented individuals for the Data Curation Developer role, a pivotal position designed to fuel scientific discovery and drive effective decision-making across therapeutic areas.
This opportunity offers a unique chance to leverage your expertise in data pre-processing, harmonization, wrangling, and contextualization. By making data analysis-ready, you will directly contribute to GSK's ambitious Disease Area Strategies and other critical R&D priorities. Depending on your experience, you could be considered for either a G6 or G7 level position, providing a clear path for growth.
At GSK, we are committed to fostering an environment where you can truly thrive. We believe in creating a space where you feel welcome, valued, and included, and where you can be your best, feel safe, and continue to grow. Beyond a competitive salary and performance-based bonus, we offer a comprehensive benefits package including:
- Robust healthcare and wellbeing programs
- Pension plan membership
- Shares and savings opportunities
Embracing modern work practices, our Performance with Choice program provides a hybrid working model, empowering you to achieve an optimal work-life balance. Discover more about the extensive company-wide benefits and life at GSK on our dedicated webpage.
Key Responsibilities and Advantages for Your Career Path:
- Lead Business Requirements: Collaborate with R&D business and data platform teams to define and develop crucial data curation requirements. This is an excellent opportunity to hone your stakeholder management and strategic thinking skills.
- Seamless Integration: Maintain strong connections with analytical groups and R&D Data Platform teams, ensuring smooth data integration and utilization. This fosters a collaborative environment and expands your network within the organization.
- Deliver High-Impact Datasets: Produce pre-packaged, curated datasets that align with business needs for analytics. You will be responsible for documenting data specifications and ensuring provenance, lineage, and privacy. This hands-on experience in delivering critical data assets is invaluable for career advancement.
- Unify Diverse Data: Integrate varied datasets, including clinical trials, real-world data, and omics, into a unified format. This experience will deepen your understanding of complex data ecosystems and enhance your data architecture skills.
- Champion Data Quality & Privacy: Ensure all datasets meet analysis-ready and privacy requirements, including anonymization. You will play a vital role in upholding industry best practices and data governance standards.
- Mentor and Guide: Provide coaching and peer review to your colleagues, ensuring adherence to industry best practices in data curation, privacy, and anonymization. This leadership aspect is crucial for developing your mentoring and team-building capabilities.
- Ensure Compliance: Process datasets to meet specific conditions outlined in approved data re-use requests, demonstrating your attention to detail and commitment to regulatory compliance.
- Write Clean, Efficient Code: Develop well-documented, high-quality code that meets industry standards. This focus on code quality will benefit your long-term software development career.
- Facilitate Production Pipelines: Ensure deliverables are thoroughly quality controlled and documented, with the potential for handover to the R&D Tech team for production pipeline implementation. This provides exposure to operationalizing data solutions.
What GSK is Looking For (Your Strengths & Opportunities):
We are seeking professionals with a strong foundation in:
- Educational Background: A BSc/MSc/PhD (or equivalent) in Computer Science, Mathematics, Statistics, or a related field.
- Scientific Data Handling: Proven experience managing diverse scientific and clinical data, including clinical trial data (with biomarkers), real-world data (RWD), and omics. This experience is a significant asset in the biopharmaceutical industry.
- Technical Proficiency: Expertise in Python, Databricks, Delta Lake, PySpark, Pandas, and other data engineering frameworks for creating datasets that comply with industry standards. Mastery of these tools is highly sought after.
- Data Processing Prowess: A strong ability to efficiently handle and process large structured, semi-structured, and unstructured datasets.
- Communication & Translation: Excellent communication skills to translate complex business needs into clear technical data requirements and processes. This skill is fundamental for bridging the gap between business and technology.
- Impact Assessment: The ability to quantify and articulate the business impact and value creation derived from data curation activities, showcasing your strategic contribution.
- Industry Data Standards: Experience with at least one major industry data standard such as CDISC (ODM, CDASH, SDTM, ADaM), HL7 FHIR, or OMOP (CDM). Familiarity with these standards is a key differentiator.
Preferred Skills (Opportunities for Growth):
While not mandatory, experience in the following areas would be a significant advantage and an excellent opportunity for further development:
- Experience in R.
- An agile mindset with the ability to deliver prototypes quickly and iterate improvements based on stakeholder feedback.
- Experience with digital clinical trial protocols and the Unified Study Definitions Model (USDM).
- Experience in data modeling.
This role is an exceptional opportunity to contribute to meaningful scientific advancements while developing your career in a dynamic and supportive environment. If you are driven by impact, possess strong technical acumen, and are eager to be part of a company that is uniting science, technology, and talent to get ahead of disease, we encourage you to apply.