We have developed a streamlined process to create a data lake for Lottery West winning numbers, leveraging a straightforward architecture powered by Python. Our approach involves extracting data from the Lottery West website using Python, and then storing the data in a MySQL database for further processing and analysis, following a delta load methodology.

To ensure the data is up-to-date, we execute a Python script through a scheduled cronjob approximately 12 hours after the numbers are drawn. The script compares the drawn numbers with the maximum draw number already stored in the database. If no maximum draw number is found, a full load is performed. Otherwise, only data with draw numbers greater than the maximum draw number already stored in the database is loaded via delta load.

Next, we conduct statistical analysis on the data using Python libraries such as Pandas and Numpy, and leverage the MySQL Python Connector along with MySQL 8 running on Ubuntu 20 for database operations. We then use Power BI to visualize the results of the statistical analysis.

We are continuously working to enhance the visualization of the results and will share the progress here. Our ultimate goal is to maintain a simple architecture that does not compromise the integrity of the process or its objectives.