Cannabis Data Science

An open-source project that applies data science and machine learning to collecting, parsing, cleaning, and analyzing data from the cannabis industry.

Enabled price-per-quantity analysis by using NLP to extract product quantities from free-text product descriptions.
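As a rough illustration of the quantity-mining step, the sketch below pulls a weight out of a product description and converts it to grams so a price per gram can be computed. It is a simplified stand-in that uses a regular expression rather than the spaCy pipeline the project lists. The unit table, the function names `extract_grams` and `price_per_gram`, and the sample descriptions are all hypothetical and not taken from the project.

```python
import re

# Hypothetical conversion table (grams per unit); not taken from the project.
GRAMS_PER_UNIT = {
    "mg": 0.001,
    "g": 1.0,
    "gram": 1.0,
    "grams": 1.0,
    "oz": 28.3495,
    "ounce": 28.3495,
}

# A number followed by a known unit, e.g. "3.5g", "100 mg", "1 gram".
# Longer unit names come first so "grams" is not cut short at "g".
QUANTITY_RE = re.compile(
    r"(\d+(?:\.\d+)?)\s*(mg|grams|gram|g|ounce|oz)\b",
    re.IGNORECASE,
)


def extract_grams(description):
    """Return the first quantity found in the description, in grams, or None."""
    match = QUANTITY_RE.search(description)
    if match is None:
        return None
    value = float(match.group(1))
    unit = match.group(2).lower()
    return value * GRAMS_PER_UNIT[unit]


def price_per_gram(price, description):
    """Return price divided by the extracted quantity, or None if none is found."""
    grams = extract_grams(description)
    if not grams:
        return None
    return price / grams
```

For example, `price_per_gram(35.0, "Blue Dream Flower 3.5g")` divides the price by the 3.5 grams found in the text. Descriptions with no recognizable quantity return `None` so they can be filtered out or reviewed by hand.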

Performed a waste generation analysis for a company that wanted to convert cannabis production waste into fertilizer.

Derived a method to separate current data from incorrectly dated historical data carried over from the previous tracking system.
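The project summary does not say how the separation works, so the following is only one plausible heuristic, not the project's method. It assumes that records migrated from the old system were stamped with a small number of bulk-import dates, which would show up as days with far more records than usual. The field name `recorded_date`, the function `split_legacy_records`, and the `spike_factor` threshold are hypothetical.

```python
from collections import Counter
from statistics import median


def split_legacy_records(records, date_key="recorded_date", spike_factor=10):
    """Split records into (suspected_legacy, current).

    Assumption: legacy records cluster on a few bulk-import dates, so any
    date with more than spike_factor times the median daily record count
    is treated as a suspected migration date.
    """
    counts = Counter(record[date_key] for record in records)
    if not counts:
        return [], []

    typical_daily_count = median(counts.values())
    spike_dates = {
        date
        for date, count in counts.items()
        if count > spike_factor * typical_daily_count
    }

    legacy = [r for r in records if r[date_key] in spike_dates]
    current = [r for r in records if r[date_key] not in spike_dates]
    return legacy, current
```

A threshold relative to the median keeps the rule insensitive to the spike itself. In practice the flagged dates would still need to be checked against what is known about the system changeover.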


  • Pandas
  • Python
  • spaCy
  • Jupyter Notebook
  • Scikit-Learn
  • NumPy
  • CatBoost
  • Optuna