On the first day I quickly realised that scientists have spent years of research getting to the point where they can design and train AI models successfully, so I can’t realistically expect to do the same in 100 days!

I need a different strategy if I want this endeavour to succeed. Today I switched from a bottom-up to a top-down approach: I started browsing the internet for pre-trained models and for ways to customize them to solve new, more specific problems.

First, I decided to give PyTorch a try. I created my first model in TensorFlow, but I read that PyTorch is supposed to be simpler for beginners. I’ll tell you whether this is true after a few days of use. Second, I found Hugging Face, a hub where data scientists can find, train and share models.

I started replicating another tutorial, which trains a T5 model on the new task of generating user questions from a given product description.
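To make the task more concrete, here is a minimal sketch of how one training pair could be prepared for a T5-style model. This is not the tutorial’s code: the `generate question: ` prefix, the `t5-small` checkpoint, the length limits and the helper names are all my own assumptions for illustration. The loading helper needs the `transformers` and `sentencepiece` packages installed.

```python
from dataclasses import dataclass

# Hypothetical task prefix; T5 is usually given a short text prefix naming the task.
TASK_PREFIX = "generate question: "


@dataclass
class Example:
    source: str  # text fed to the model
    target: str  # text the model should learn to produce


def build_example(description: str, question: str) -> Example:
    """Turn a (product description, user question) pair into a text-to-text example."""
    return Example(source=TASK_PREFIX + description.strip(), target=question.strip())


def tokenize_example(example: Example, tokenizer, max_source_len: int = 256, max_target_len: int = 32):
    """Tokenize one example; the target token ids become the labels."""
    model_inputs = tokenizer(example.source, max_length=max_source_len, truncation=True)
    labels = tokenizer(text_target=example.target, max_length=max_target_len, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs


def load_t5(checkpoint: str = "t5-small"):
    """Download a pre-trained T5 checkpoint and its tokenizer (assumed checkpoint name)."""
    # Imported lazily so the data helpers above work without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    return tokenizer, model


# Made-up sample pair, only to show the shape of the data.
sample = build_example("Stainless steel water bottle, 750 ml.", "Is it dishwasher safe?")
print(sample.source)
```

Calling `load_t5()` and then `tokenize_example(sample, tokenizer)` would give the `input_ids`, `attention_mask` and `labels` that a training loop expects for each row of the training set.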

So far it seems I’ve been able to run a training job successfully on a small training set of 300 lines. However, I kicked off the full training 12 hours ago and so far it has only completed 2% of the task! Shocking!
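A quick back-of-the-envelope calculation shows how bad this is. The sketch below assumes progress stays linear, which real training jobs don’t guarantee, and the function name is just something I made up.

```python
def estimate_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Extrapolate total run time, assuming progress continues at the same rate."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


elapsed = 12    # hours since I started the full training
done = 0.02     # 2% completed so far

total = estimate_total_hours(elapsed, done)
remaining = total - elapsed
print(f"Total: {total:.0f} h (about {total / 24:.0f} days), remaining: {remaining:.0f} h")
```

At this rate the full run would take roughly 600 hours, or about 25 days, on my current machine.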

Unfortunately, since the job hasn’t finished yet, I can’t share any results today. I’ll update the post as soon as I test the model.

Lessons Learned

  • There’s no need to reinvent the wheel. It’s far more efficient to build on a pre-trained model than to create a new one from scratch every time.
  • I need to get my hands on a GPU, or else I will need a new computer before the end of the 100 days!
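On the GPU point, this is a small sketch of how I could check what hardware PyTorch can actually see before starting a long run. The `pick_device` helper is my own name, and it falls back to the CPU if PyTorch isn’t installed at all.

```python
def pick_device() -> str:
    """Return the best available device name: 'cuda', 'mps' or 'cpu'."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed, so there is nothing to accelerate.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU

    # Apple Silicon backend; older PyTorch versions don't have it.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


device = pick_device()
print(f"Training would run on: {device}")
```

With a GPU available, moving the model and each batch onto it with `.to(device)` is what lets the training actually use it.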