Get started with Machine Learning Part 1

Alexander Daniel Pratama
Oct 13, 2021
Photo by Kevin Ku on Unsplash

I have spent many months trying to understand machine learning. It has been an agonizing experience, but also a thrilling one. I have no background in IT or advanced calculus. I did study calculus in college, but memory fades when you do not use something every day, doesn't it?

So I had to start from the basics of machine learning. At first I did not join any paid training courses; I learned from whatever free sources I could find. Frankly, that did not work well for me, because sometimes I need guidance. So I asked my manager to approve paid machine learning training. After it was approved, I joined the program. In short, I came away with a better understanding, but I still lacked hands-on experience in processing data.

Later, I happened to come across MIT's machine learning material, MIT Deep Learning 6.S191 (introtodeeplearning.com). The lecturers are Alexander Amini and Ava Soleimany. The lectures are excellent, as expected from MIT. I learned some of the fundamental calculus as well as how it is applied in machine learning.

Then I watched the TensorFlow YouTube channel, which is also great for beginners. The series is titled Intro to Machine Learning (ML Zero to Hero). It has 4 parts, and the presenter is easy to follow.

https://www.youtube.com/watch?v=KNAWp2S3w94

After watching these and learning from other sources as well, I tried my first analysis on the Iris dataset, which you can download here: https://www.kaggle.com/arshid/iris-flower-dataset/version/1. In my opinion, it is the best dataset for getting started with machine learning outside of the datasets that ship with TensorFlow.

What you need to know in machine learning is the workflow. It goes like this:

  1. What do you want from the data? Classification? Prediction? Something else?
  2. Understand the data. What type of data is it?
    If it is tabular, what fields does it have, and are there null values? If it is images, are they black and white or RGB? Are there any corrupt files?
  3. For tabular data, if you find nulls, do you want to delete them or interpolate the values yourself? Is the field important or not? Does your data contain a time series? Time-series data needs a different approach, so be careful.
  4. Clean and preprocess the data.
  5. Split the data into training and testing sets.
  6. Build a proven model; you can find one by searching online.
  7. Create your own model. This is the tedious part, because it is trial and error.
  8. Plot your loss and val_loss.
  9. Run predictions on the testing data.
  10. Plot your prediction data and training data on the same graph. The purpose is to check whether the results meet your requirements.
  11. Save the model if you are confident in it. If not, repeat from step 7.
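
To make a few of these steps concrete, here is a minimal sketch in plain Python. It is not the TensorFlow workflow itself. The data is made up (two invented classes with two features each, loosely shaped like a small table such as Iris), and the "model" is a simple nearest-centroid classifier that I picked only as a stand-in because it needs no libraries. All function names are my own, chosen for illustration. It covers deleting rows with nulls (steps 3 and 4), splitting (step 5), fitting a simple model (step 6), and predicting on the testing data (step 9).

```python
import random
from statistics import mean


def make_toy_rows(n_per_class=30, seed=0):
    """Build made-up rows of the form (feature_1, feature_2, label)."""
    rng = random.Random(seed)
    centers = {"class_a": (1.0, 1.0), "class_b": (4.0, 4.0)}
    rows = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            rows.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5), label))
    # Blank out a few values to imitate null entries in a real table.
    for i in (3, 17, 42):
        _, y, label = rows[i]
        rows[i] = (None, y, label)
    return rows


def drop_missing(rows):
    """Steps 3-4: the simplest cleaning choice, delete rows with a null."""
    return [r for r in rows if all(v is not None for v in r)]


def split_rows(rows, test_fraction=0.2, seed=0):
    """Step 5: shuffle, then hold out a fraction of the rows for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # train, test


def fit_centroids(train):
    """Step 6: a stand-in model that stores the mean point of each class."""
    by_label = {}
    for x, y, label in train:
        by_label.setdefault(label, []).append((x, y))
    return {
        label: (mean(p[0] for p in pts), mean(p[1] for p in pts))
        for label, pts in by_label.items()
    }


def predict(centroids, x, y):
    """Step 9: pick the class whose centroid is closest to the point."""
    return min(
        centroids,
        key=lambda lb: (x - centroids[lb][0]) ** 2 + (y - centroids[lb][1]) ** 2,
    )


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(centroids, x, y) == label for x, y, label in rows)
    return hits / len(rows)


rows = make_toy_rows()
clean = drop_missing(rows)
train, test = split_rows(clean)
model = fit_centroids(train)
print(f"rows: {len(rows)}, after cleaning: {len(clean)}")
print(f"train: {len(train)}, test: {len(test)}")
print(f"test accuracy: {accuracy(model, test):.2f}")
```

Deleting rows is only one option. If the field is important, interpolation may be the better choice, and for time-series data a random shuffle before splitting is usually not appropriate.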

This summary is based on my experience after 3 months of studying. Of course, there is still a lot of room for improvement. Feel free to message me at alexanderdan.pratama@gmail.com if this article needs to be edited or updated. The next part covers how to process the Iris dataset using TensorFlow. Happy learning, all.

Alexander Daniel Pratama

GIS Specialist, Data Engineer, and a proud Geodetic Engineer