Data Scientist - MORSCO Inc.
- Research and utilize Machine Learning (ML) and NLP frameworks and libraries to train NLP models with internal and external datasets
- Prototype and develop NLP-enabled services and components
- Build and deploy solutions using deep learning and NLP technologies
- Demo new product features to stakeholders
- Understand business requirements, collaborate with the team, contribute ideas, and clearly present new findings and solutions
- Build analytics solutions from scratch, including data exploration, extraction, cleaning, transformation, modeling, testing, and implementation
- Work with teams to design and build cloud-hosted, automated pipelines that run, monitor, and retrain ML models for business applications
- Optimize and refactor development code so that it can be moved to production
- Build ETL pipelines for new and existing models
- Requisition cloud infrastructure for model and pipeline development environments
- Relate business use cases to the technologies required to implement them
- Bachelor's degree from an accredited university in Computer Science, Statistics, Applied Mathematics or related field
- 2+ years of work experience in NLP projects
- 2+ years of work experience in software development
- Strong knowledge of text representation techniques (such as n-grams, bag of words, sentiment analysis, etc.), statistics, and classification algorithms
- Must have a clear understanding of, and experience implementing, machine learning algorithms and methods such as logistic regression, decision trees, SVM, Naïve Bayes, KNN, neural networks, gradient descent, random forest, ensemble gradient boosting, etc.
- Programming background and experience in Python and its libraries is required
- Relevant Python packages and tools: spaCy, scikit-learn, NLTK, NumPy, pandas, Jupyter Notebook
- Command of deep learning frameworks (TensorFlow, Keras, PyTorch, etc.) and large-scale data processing using big-data technologies (Apache Spark, Apache Beam, Flink, etc.)
- Experience using AWS technologies: SageMaker, Glue, EMR, S3, Lambda, Redshift, and Athena
- Experience working with natural language processing and text mining
- Experience working with a variety of relational SQL and NoSQL databases
- Experience with deployment technologies in one or more cloud providers (preferably AWS or GCP)
- Experience scheduling cloud-hosted workflows using tools like cron or Apache Airflow
- Hands-on experience containerizing code and environments with Docker and/or Kubernetes
- Experience building automated model training/retraining and validation pipelines
- M.S. degree in Computer Science or a related advanced degree preferred
- Experience with Microsoft Azure
- Knowledge of streaming technologies, especially Kafka, is a bonus
- Experience specifying infrastructure to be built using tools such as Terraform or Jenkins
MORSCO is a leading U.S. distributor of commercial and residential plumbing, waterworks, and HVAC products, with showrooms across the country. Since our inception in November 2011, we've grown rapidly through a series of acquisitions and store openings. MORSCO's family of brands consists of Morrison Supply, DeVore & Johnson, Murray Supply, Wholesale Specialties, Express Pipe & Supply, Farnsworth Wholesale, and Fortiline Waterworks. In 2018, MORSCO was acquired by The Reece Group, Australia's leading provider of plumbing, HVAC, and waterworks products.
MORSCO is an EEO/AA/Disability/Vets Employer