Michael Flashman
Data Scientist at Sound Agriculture
Bio
Experience
-
Sound Agriculture
-
United States
-
Agriculture, Construction, Mining Machinery Manufacturing
-
100 - 200 Employees
-
Data Scientist
-
Dec 2022 - Present
Lead architect and software engineer for the Research Department, working to integrate a third-party LIMS (Sapio) with lab operations. In addition, I improved the Data Science team’s engineering practices through trainings, mentorship, process improvements, and good example. Key contributions include:
- working with a lab process engineer to implement templated lab data workflows and supporting data models within the LIMS framework
- integrating automation equipment, e.g. label printers and a plate reader, with the LIMS by leveraging adjunct software systems, e.g. BarTender, custom file parsers, and AWS
- creating a scalable webhooks service and endpoints for advanced customization of the LIMS
- building a Streamlit web app for prototyping and distributing custom design and analysis tools
- developing an AWS-backed data warehouse with a Metabase front-end to serve curated genomic data, LIMS process data, and analysis results in an integrated and queryable fashion
All coding work was done independently, mostly in Python, Dockerized, deployed to AWS (e.g. S3, EC2, ECR, ECS, ELB, Glue, Athena), and generally managed with Terraform.
-
-
-
-
Data Science Consultant
-
Aug 2022 - Nov 2022
Advised the Sound Agriculture Research Department on how to best integrate a recently purchased third-party LIMS (Sapio) with their existing spreadsheet-backed lab operations, and to identify gaps in the LIMS capabilities relative to their operational requirements. In addition, I provided basic training in agile development, version control, data modeling, and software design.
-
-
-
Zymergen
-
United States
-
Biotechnology Research
-
100 - 200 Employees
-
Data Scientist
-
Jul 2016 - Oct 2021
Worked with a team of data scientists, biologists, software engineers, and higher-ups to design and implement machine-learning-based approaches for microbe and protein optimization:
- developed supervised and unsupervised machine learning systems to automate the design of microbial and protein DNA. Models were implemented in Python using standard scientific libraries, e.g. sklearn, statsmodels, and gensim. Modeling challenges included high-dimensional sparse co-linear features, noisy non-i.i.d. data, non-linear effects, multiple objectives, physical design constraints, long experimental cycle times, and cross-domain collaboration.
- developed automated workflows to execute machine learning systems in production environments. Workflows were implemented on Airflow, Kubernetes, and/or Jupyter frameworks.
- developed data management systems, e.g. SQL data models, REST APIs, REST clients, CLIs, ETL processes, and web front-ends, to track model performance over time.
- developed bioinformatic workflows to identify and visualize promising proteins from large metagenomics databases for downstream experimental investigation. Workflows used standard bioinformatics tools, e.g. hmmer, mafft, and cd-hit, with custom Python wrapping, were executed on a combination of AWS Batch and Kubernetes, and were orchestrated using a combination of Airflow, AWS Glue, and an in-house job execution framework.
- mentored junior staff on the day-to-day nitty-gritty of data science, software engineering, legacy software systems, and company culture.
- communicated planning, progress, and results of engineering efforts to collaborators, colleagues, higher-ups, and the company at large, through casual conversation, in-depth discussions, design docs, formal presentations, white papers, and patents.
-
-
Software Engineer
-
Jul 2014 - Jul 2016
First software engineer hired to help build a first-of-its-kind automated laboratory for optimizing microbes for production of novel molecules. Worked closely with a team of biologists, automation engineers, and software engineers to develop numerous core components of the Lab Information Management System using a variety of languages, e.g. Python, Ruby, Java, and JavaScript, a variety of scientific libraries, e.g. scipy and biopython, and a variety of frameworks, e.g. Django, Flask, Celery, Rails, and Dropwizard, backed by SQL and S3 data storage. Notable contributions included:
- Python and Ruby REST clients for the RESTful LIMS data store.
- file parsers (and some supporting Celery boilerplate) to ingest outputs from a diverse collection of laboratory devices.
- automated DNA design and QC system to support high-throughput strain engineering. This involved implementation of published methods as well as development of novel approaches.
- Rails front-end to support high- and low-throughput laboratory data visualization and management (experiment tracking, sample tracking, data visualization, etc.) by lab techs and scientists.
- first full-stack integration test environment and test suites.
-
-
-
Cornell University
-
United States
-
Higher Education
-
700 & Above Employees
-
Research Assistant
-
Jan 2014 - May 2014
Assisted in the development of a distributed Twitter crawler, hosted on AWS. Developed command line tools to efficiently aggregate terabytes of data gathered by the crawler. Implemented distributed algorithms via MapReduce to infer user locations. Worked with a team of three other research assistants, under the direction of Prof. Michael Macy.
-
-
Teaching Assistant
-
Aug 2011 - May 2014
Served as lead teaching assistant for a large (200+ student) introductory programming course taught using MATLAB. Held weekly discussion sections, created grading rubrics and test scripts for student submissions, and supervised 20 undergraduate graders. Spring 2012: grader for an introductory course in real analysis.
-
-
Research Assistant
-
Jun 2012 - Aug 2012
Investigated entropic methods for detecting novel temporal trends of word (n-gram) usage in scientific literature from the arXiv preprint database. Developed a custom Python program and optimized C extensions to perform the analysis. Worked under the direction of Prof. Paul Ginsparg.
-
-
-
Intersect
-
United States
-
Technology, Information and Internet
-
Junior Software Developer
-
Sep 2009 - May 2011
Developed features and oversaw testing for Intersect.com, a Rails + PostgreSQL web application for community storytelling. Designed and implemented spectral clustering of geo-temporal data and a PageRank-based content ranking system for search. Worked closely with a team of six other developers and a Pulitzer Prize-winning journalist.
-
-
-
Reed College
-
United States
-
Higher Education
-
500 - 600 Employees
-
Fabrication Assistant
-
Aug 2008 - May 2009
Assisted Prof. Ondrizek in the fabrication and installation of "Cellular" (PDX Gallery, 2008) and "Sound Wall" (Western Washington University, 2009). Responsibilities included making paper, steel and aluminum fabrication, general assembly, and installation.
-
-
Editorial Associate
-
Jun 2008 - Aug 2008
Working under Dr. Joel Franklin, performed comprehensive editing, for accuracy, style, and content, of his undergraduate physics text, Advanced Mechanics: An Introduction to General Relativity (Cambridge University Press, Aug 2010). Wrote, edited, and typeset portions of the solution manual.
-
-
-
Santa Fe Institute
-
United States
-
Research Services
-
1 - 100 Employees
-
REU Fellow
-
Jun 2007 - Aug 2007
Conducted independent research under Dr. Eric Smith concerning the plausibility of a metabolism-first scenario for the origin of life and graph-theoretic formalisms for describing chemical reactions. Research focused on developing a graph-theoretic formalism for describing relevant chemical systems.
-
-
-
Reed College
-
United States
-
Higher Education
-
500 - 600 Employees
-
Nuclear Reactor Operator
-
Aug 2005 - May 2006
Responsible for reactor operation, sample preparation, operator training, and emergency response.
-
-
Education
-
Cornell University
Master of Science (MS), Applied Mathematics
Reed College
Bachelor of Arts (BA), Physics