Data Scientist
Ref No.: 18-04071
Location: South San Francisco, California
  • Genentech's Early Clinical Development (ECD) department is seeking a Data Scientist reporting to the Genentech Predictive Analytics (gPA) Team Lead.
  • The gPA group supports clinical development in the area of trial design, study planning and the creation and application of predictive algorithms.
  • The role will require cross-functional interactions with Clinical Science and Clinical Operations.
  • The Data Scientist will act as a consultant and be required to both establish and apply new machine learning methods to specific problem domains within those functions.
  • The Data Scientist will primarily be responsible for the development and deployment of machine learning methods in early clinical development, with a particular focus on protocol writing, clinical trial planning/forecasting, and study execution. In addition, the Data Scientist may apply methods from the ECD AI (Artificial Intelligence) team to specific problem domains.
  • Lead key machine learning pilots to ensure successful completion and develop plans to integrate capabilities into the normal early clinical development process
  • Access, process, and manipulate large volumes of data from diverse sources (e.g. Oracle Clinical, Teradata, Medidata RAVE)
  • Develop new and Client machine learning methods for predictive and simulation-based modeling
  • Project and predict business processes including clinical site activation, patient screening, and enrollment
  • Detect trends and anomalies in data and analyses
  • Conduct ad hoc requests to support clinical development business needs
  • Possess in-depth knowledge of multiple real-world data assets and share information about these data with business partners and the ECDi team.
  • Proficiency in Python
  • Demonstrated proficiency using Apache Spark and SparkML
  • Demonstrated proficiency using TensorFlow and/or PyTorch
  • Demonstrated proficiency using Keras
  • Desirable: experience using a probabilistic programming library or language such as PyMC3, Edward, or Figaro
  • Strong analytical and problem-solving skills
  • Excellent oral and written communication skills
  • Ability to lead cross-functional multi-disciplinary teams
  • Able to work in teams and collaborate with others to solve the most challenging business problems
  • Degree in a relevant technical field such as Statistics, Mathematics, or Computer Science
  • Master's or Ph.D. is a plus
  • 2+ years of experience applying statistics, mathematics, or computer science (1+ years with a Ph.D.)
  • 3+ years of experience in the clinical development, pharmaceutical, or healthcare industry