Week 100

What I pre-planned

  • bhai
    • work on novel approaches for the lid (language identification) model
    • share and think about solving functional privacy
  • play good amount of flute
  • something to do for the 100th weekly blog which is this

What new did I learn

  • tried multiple approaches for lid model
    • standard MFCC, attention-based transformers
    • stumbled across phonemes, identified a potential solution
    • broke down the problem systematically and optimized every step of the process
    • learnt about hard-negative fine-tuning, dictionaries and IPA
  • there is a big opportunity to solve niche use cases for enterprises and people across Indian languages and domains
  • put it out straight for future shilling
  • midjourney body scanning machine is sick, high chance it’ll be big if true
  • played a lot of flute and realized my straight ass thumbs are not meant for holding a flute well so it pains after a while, and no position is suitable for ideal finger placement

Where/How did I implement it

  • mlid model
    • implemented dictionary-based pretraining for the broken-down classification problem
    • completely bypassed the voice and speaker dependency using the power of phonemes
    • solved overfitting and validation inaccuracies with manual labelling plus noise and drop resistance
    • used a bunch of youtube videos from news channels to get a varied multi-speaker audio dataset
    • got a really good 7MB model that runs at 30x realtime and fits easily in front of a multi-model transcription pipeline
  • discussed career growth with skip at work, clarified expectations
  • i’ll just write a blog about my weekly blogs to celebrate a hundred of them, there is a lot more to celebrate in life ahead!
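
a rough sketch of what "dictionary based" language identification over phonemes can look like, for anyone curious. this is not the actual mlid model: the bigram dictionaries, the overlap scoring, the function names (ngrams, build_dictionaries, identify) and the toy phoneme sequences are all assumptions made up for illustration. the idea it shows is that once audio is reduced to phonemes, the speaker's voice drops out and the language can be guessed from which phoneme patterns show up.

```python
from collections import Counter


def ngrams(phonemes, n=2):
    """Return the phoneme n-grams (as tuples) found in a sequence."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def build_dictionaries(samples, n=2):
    """Map each language to a Counter of the n-grams seen in its samples.

    samples: {language: [phoneme sequence, ...]} (hypothetical input format)
    """
    return {
        lang: Counter(g for seq in seqs for g in ngrams(seq, n))
        for lang, seqs in samples.items()
    }


def identify(phonemes, dictionaries, n=2):
    """Pick the language whose dictionary contains the most input n-grams."""
    grams = ngrams(phonemes, n)
    scores = {
        lang: sum(1 for g in grams if g in table)
        for lang, table in dictionaries.items()
    }
    return max(scores, key=scores.get), scores


# Toy, made-up phoneme sequences. Not real data and not real languages.
toy_samples = {
    "lang_a": [["k", "a", "t", "a"], ["t", "a", "k", "a"]],
    "lang_b": [["s", "i", "m", "i"], ["m", "i", "s", "i"]],
}
dictionaries = build_dictionaries(toy_samples)
best, scores = identify(["k", "a", "t", "a", "k"], dictionaries)
print(best, scores)
```

a real setup would need a phoneme recognizer in front of this and a learned classifier instead of raw overlap counts, but the speaker-independent part of the idea is the same.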

Any memorable moments of the week

  • started and finished severance s1 and s2, liked it
  • good weekend, stuffed all meals, good discussion with viji, brijesh, gautam and sis
  • played a lot of Peak, started a supermarket in Supermarket Together, which was also quite fun
  • achieved “trust me bro” sota level tiny model for multilingual language identification for three languages
  • i’ve been saying this is the hundredth blog but I started with 0 obviously so you’ve been scammed haha
This post is licensed under CC BY 4.0 by the author.