Project Journal, Week Nine

Machine Learning Update

With my newfound power of CVAT, I am able to label without any hardware issues, and it’s amazing! So now I have two different datasets I’m going to train on. The first one labels all the fish, which is a big project, and the other one labels and identifies just where the fish is.

My PFR supervisor and I decided to make a box identifier, built onto the OpenCV system, to auto-crop the images for the employees and simplify the process. So after annotating all the images, it was time to send them off to be trained!

A couple of days have passed and these are the best results I have ever had! The training loss had dropped to 0.01, which is amazing. It must work, right?

After running it through my small test environment to see if it could predict where the bin is, I was met with this small line…

As we can see from above, there is nothing inside the list, so it didn’t end up finding or predicting anything. But how? The loss is at 0.01!

Overfitting

Overfitting is a term used in statistics and machine learning. It is where the model fits the data it knows too closely, so when new data is added, it doesn’t know what to do with it because everything it learned is tied so tightly to the original data. Within machine learning in particular, the machine begins to memorize the training data instead of actually learning patterns that carry over to new data.
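To make the idea a bit more concrete, here is a small toy sketch in plain Python. It is not my actual detection model: the data, the `Memorizer` and `LineFit` classes, and the numbers are all made up for illustration. One model memorizes every training example, the other learns a simple straight line. The memorizer gets a perfect score on the data it has seen and falls apart on new data, which is roughly the situation I ran into.

```python
import random


def make_data(n, seed):
    """Toy data (hypothetical): y = 2x + 1 plus a little noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


class Memorizer:
    """Stores every training pair and knows nothing about unseen inputs."""

    def fit(self, xs, ys):
        self.table = dict(zip(xs, ys))
        return self

    def predict(self, x):
        # Unseen input: fall back to 0.0, because nothing general was learned.
        return self.table.get(x, 0.0)


class LineFit:
    """Ordinary least-squares straight line, a model that has to generalize."""

    def fit(self, xs, ys):
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, x):
        return self.slope * x + self.intercept


def mse(model, xs, ys):
    """Mean squared error of a model on a dataset."""
    return sum((model.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


train_x, train_y = make_data(50, seed=0)
test_x, test_y = make_data(50, seed=1)  # new data the models never saw

memorizer = Memorizer().fit(train_x, train_y)
line = LineFit().fit(train_x, train_y)

mem_train = mse(memorizer, train_x, train_y)
mem_test = mse(memorizer, test_x, test_y)
line_train = mse(line, train_x, train_y)
line_test = mse(line, test_x, test_y)

print(f"memorizer  train={mem_train:.3f}  test={mem_test:.3f}")
print(f"line fit   train={line_train:.3f}  test={line_test:.3f}")
```

The takeaway is that a tiny error on the training data says nothing by itself. The number worth watching is the error on data the model has never seen.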

How do we fix it?

Well, there are many different solutions, and some that I will be trying over the week are increasing the dataset size and applying higher dropout rates. The dropout rate is the fraction of the network’s units that get randomly switched off at each training step, so the machine can’t just remember the data and has to actually learn. A good article by Ashish Patel explains this problem in better detail and is worth a read if you are interested.
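For anyone curious what that random dropping looks like, below is a minimal sketch of the usual "inverted dropout" trick in plain Python. It is an illustration only, not the code my training setup runs: the `dropout` function and its parameters are names I made up, and real frameworks ship their own dropout layers. During training each value is zeroed with probability `rate`, and the survivors are scaled up so the overall average stays about the same. At prediction time nothing is dropped.

```python
import random


def dropout(activations, rate, training=True, rng=random):
    """Inverted dropout on a list of activation values (illustrative sketch).

    rate is the probability that any single value is zeroed out.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At prediction time the values pass through untouched.
        return list(activations)
    keep = 1.0 - rate
    # Keep each value with probability `keep`, scaling it by 1/keep so the
    # expected sum matches the input; otherwise replace it with zero.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Ten activations of 1.0 with a 50% dropout rate: each becomes 0.0 or 2.0.
sample = dropout([1.0] * 10, rate=0.5, rng=random.Random(0))
print(sample)

# With training=False the input comes back unchanged.
print(dropout([1.0, 2.0, 3.0], rate=0.5, training=False))
```

Because a different random set of values disappears on every step, the network cannot lean on any single one of them, which makes plain memorization harder.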

ITP Events

Last week I mentioned going to a small conference about data visualization. It was my first time going to an ITP event, so I wasn’t sure what to expect, but it was extremely interesting to hear about what professionals in our community are building and how data visualization is helping them achieve that.

On the topic of ITP, I was recently given the chance to attend ITx Rutherford 2019 for free! I am extremely excited to go, as I have been reading about it and saving up for it since it was first announced at the start of the year. ITx Rutherford 2019 is a combination of ITP and CITRENZ events over three full days. The event focuses on innovation, technology, and education, and on bringing hundreds of IT professionals together.
