First steps building a deep learning model

Inspired by last week's presentation on AI, I decided to start working with the material myself.

  • Installed Anaconda and set up my work environment (Jupyter Notebook/Lab) with Keras, TensorFlow, scikit-learn and SciKeras
  • Followed DigitalOcean's guide to predict P(employee leaving) based on this HR dataset on Kaggle

It basically feels like playing with Lego blocks: you build your model by stacking layers. It didn't take me long to get my first locally trained model: 1 hidden layer, activation function: ReLU.
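For reference, a minimal sketch of what such a model looks like in Keras (the feature count, layer width and training settings here are hypothetical placeholders, not the exact values from my notebook):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the HR dataset: 9 numeric features, binary label
# (1 = employee left). The real notebook loads a CSV with pandas;
# these random values are just placeholders so the snippet runs.
X = np.random.rand(1000, 9).astype("float32")
y = np.random.randint(0, 2, size=(1000,))

# One hidden ReLU layer, sigmoid output giving P(employee leaving)
model = keras.Sequential([
    keras.Input(shape=(9,)),
    layers.Dense(12, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)
```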

Github link to the full ipynb notebook: https://github.com/TinkerFrank/AI_Project_0/blob/7ccf2b97ee22a2aa9fb9b4cde9d5810252d955e1/JupyterLabTest.ipynb

Also started working on the Kaggle competition Titanic – Machine Learning from Disaster, which really got me thinking about:

  • prepping the dataset (how to deal with incomplete data) and making as few assumptions as possible (backing most of them up with data analysis) – see the sketch after this list
  • spending more time doing data science on the test data, validating assumptions and so on
  • being able to reason about/explain which features contributed most to the result (to check whether it makes sense) – a sketch of this follows the notebook link below
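On the incompleteness point, here is a minimal pandas sketch of the kind of choices involved (assuming the competition's train.csv is available locally; the exact strategies in my notebook may differ):

```python
import pandas as pd

# Assumes the Kaggle Titanic train.csv has been downloaded locally
df = pd.read_csv("train.csv")

# Age is missing for many passengers: impute with the median so we
# keep those rows instead of throwing them away
df["Age"] = df["Age"].fillna(df["Age"].median())

# Embarked is missing for only a couple of rows: fill with the mode
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# Cabin is mostly missing: rather than guessing values, reduce it to
# a simple "cabin was recorded" flag and drop the raw column
df["HasCabin"] = df["Cabin"].notna().astype(int)
df = df.drop(columns=["Cabin"])

print(df.isna().sum())  # verify which missing values remain
```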

https://www.kaggle.com/code/frankpieterse/beginner-titanic-analysis/notebook
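And on the explainability point, one quick way to get a feel for which features matter is a random forest's built-in feature importances. A sketch with a hypothetical minimal feature set, chosen purely for illustration and not necessarily what the notebook uses:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("train.csv")

# Minimal numeric feature set; encode Sex as 0/1 and impute Age
df["Sex"] = (df["Sex"] == "female").astype(int)
df["Age"] = df["Age"].fillna(df["Age"].median())
features = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare"]
X, y = df[features], df["Survived"]

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X, y)

# Rank features by how much they contributed to the model's splits,
# as a sanity check that the model's reasoning makes sense
importances = pd.Series(model.feature_importances_, index=features)
print(importances.sort_values(ascending=False))
```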

The exponential rise of AI

With the news all over ChatGPT and Lensa last month, I decided to dive deeper into artificial intelligence by following the Elements of AI course online from the University of Helsinki over the weekend. It gives a broad and balanced overview of the current technical state of AI – machine learning – deep learning, while also highlighting the ethical aspects. And you get a cute certificate ^^

I did notice during the course that my Python skills had gotten a bit rusty, so I'm currently doing a basic refresher course on Sololearn 👌

My main takeaways were:

  • It's not about the coolest models or the latest technology; it's about finding the real problems, identifying the ones that are worth solving, and figuring out the most suitable and fastest ways to solve them.
  • Another key challenge is explaining what's possible with AI methods to non-data scientists. This is a crucial skill, because it's the non-data scientists who often know which challenges need to be solved/optimized (business needs) → I think my entrepreneurial background helps here, with its focus on certain KPIs like cash flow and customer satisfaction
  • Large, high-quality datasets are going to be the equivalent of oil, not so much the models. So think early about the data you want to collect and set up the collection mechanisms early; if you wait until you need the data, it will be too late.
  • There is a risk of a digital divide in generative AI between companies that have access to large, high-quality datasets and can afford the computing power it takes to generate new IP, and those that don't.