My end of year lists 2022

Every year I look forward to sharing the content that consumed me and that I would love to pass on to others.

**Here are the top books I read in 2022.**

  • The Gardener and the Carpenter by Alison Gopnik
  • Hunt, Gather, Parent by Michaeleen Doucleff
  • Atomic Habits by James Clear
  • Principles for Dealing with the Changing World Order by Ray Dalio

**Here are my top podcasts of 2022.**

  • The Seen and the Unseen by Amit Varma
  • Lex Fridman Podcast
  • Masters of Scale with Reid Hoffman

**Career-focused content, as a new DS manager leading MLOps**

  • HBR’s 10 Must Reads for New Managers
  • mlops.community
  • Machine Learning Systems Design by Chip Huyen

Do share your recommendations from this year!

Happy New Year 2022!

Clickbait Detector

Clickbait

Detect clickbait with Machine Learning - http://clickbait.pythonanywhere.com/

What is Clickbait?

  • Clickbait is a catchy headline written to grab the attention of a generation that wants instant gratification.
  • Clickbait is an acknowledgement that anything that moves the revenue, ad, or CTR needle forward will be done.
  • Here is an interesting blog on why clickbait is so popular - link
  • And another that talks about the psychology of clickbait - link
  • My personal opinion is that this form of writing, like many other bubbles, is just a trend and will die down soon. Until then, we will keep trying to find a way to separate genuine news from baity articles. :)
  • Facebook’s effort to curb clickbait - News Feed FYI:FB

Science

  • The demo is a text classifier that determines whether a headline is clickbait.
  • The clickbait corpus consists of article headlines from ‘BuzzFeed’, ‘Upworthy’, ‘ViralNova’, ‘Thatscoop’, ‘Scoopwhoop’ and ‘ViralStories’.
  • The non-clickbait article headlines are collected from ‘WikiNews’, ‘New York Times’, ‘The Guardian’, and ‘The Hindu’.
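
To make the idea of a headline classifier concrete, here is a minimal sketch of one common approach, a bag-of-words multinomial Naive Bayes model. This is an assumption for illustration, not necessarily the model behind the demo. The training headlines are invented toy examples rather than the corpus described above, and the names `TRAIN`, `tokenize`, `train`, and `predict` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy headlines, invented for illustration only.
# Label 1 = clickbait, label 0 = not clickbait.
TRAIN = [
    ("you won't believe what this dog did next", 1),
    ("21 things only coffee lovers will understand", 1),
    ("this one weird trick will change your life", 1),
    ("10 photos that will make you cry", 1),
    ("parliament passes annual budget bill", 0),
    ("central bank raises interest rates", 0),
    ("earthquake strikes coastal region", 0),
    ("court upholds ruling in election dispute", 0),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(examples):
    """Count words per class and documents per class."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the class with the higher log-probability under Naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        score = math.log(doc_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(predict(model, "you won't believe this trick"))
print(predict(model, "central bank raises budget"))
```

With only eight training headlines this is a toy; a real classifier would need thousands of headlines per class and a held-out set to measure accuracy.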

Web Service

  • The idea is to have a simple interface that makes people aware of, and perhaps surprised by, how many of the news articles they read in a day are built to appeal to their dopamine. Link to the demo again - clickbait detector
  • This is a hobby project to get the idea out there. Someday I will work on a Chrome plugin that highlights each news article one reads as baity or safe.

Future

  • If the tool reaches a significant audience, I would be happy to wrap it in an AWS Lambda function and call it from a Chrome extension that looks for articles on Google News, Facebook, etc., for more significant usage.
  • If you think so too - let me know at @shubhamkalra27
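
As a rough sketch of what the Lambda wrapper could look like: the handler below assumes the function is invoked with an API Gateway proxy-style event whose `body` is a JSON string such as `{"headline": "..."}`. The `is_clickbait` function is a hypothetical keyword heuristic standing in for the real model, and `make_response` and the JSON field names are likewise assumptions made for illustration.

```python
import json
import re

# Placeholder for the real model: a hypothetical keyword heuristic, used only
# so that this sketch runs on its own.
BAIT_WORDS = {"believe", "shocking", "weird", "trick", "secret"}


def is_clickbait(headline):
    """Stand-in classifier: flags headlines containing a bait word."""
    tokens = re.findall(r"[a-z']+", headline.lower())
    return any(token in BAIT_WORDS for token in tokens)


def make_response(status, payload):
    """Build a proxy-style HTTP response with a JSON body."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def lambda_handler(event, context):
    """Entry point: read a headline from the request body and score it."""
    try:
        payload = json.loads(event.get("body") or "{}")
    except (TypeError, ValueError):
        return make_response(400, {"error": "body must be valid JSON"})
    headline = payload.get("headline") if isinstance(payload, dict) else None
    if not isinstance(headline, str) or not headline.strip():
        return make_response(400, {"error": "missing 'headline' field"})
    return make_response(
        200, {"headline": headline, "clickbait": is_clickbait(headline)}
    )


# Local usage example; no AWS account is needed to try the handler.
event = {"body": json.dumps({"headline": "This weird trick saves money"})}
print(lambda_handler(event, None))
```

A browser extension would then send each headline it finds to the function's HTTP endpoint and style the article based on the `clickbait` field.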

My thanks to

  • Training data comes from this study - data posted here
  • The good people at pythonanywhere.com
  • Machine learning tutorials by Jose Marcial Portilla on Udemy.

Visualizing Random Walk

My probability teacher opened our class on Markov chain models with the drunk man hypothesis: a drunk man will find his way home. We all had a laugh, but I had an urge to try this out every time I was in those shoes. What better way than to simulate the experiment?

Link to fiddle - http://jsfiddle.net/27thmartian/y16sqcb0/embedded/result/

Walker

Visualisation Notes:

  • The walker starts from the origin and walks randomly in unit steps.
  • We want to see whether the walker comes back to the origin.
  • A random generator decides whether the walker moves along the north-south or the west-east axis.
  • A second randomizer moves the walker forward or backward along the selected axis.
  • The chart is re-rendered after a given interval, which defaults to 50 ms.
  • A walker may or may not hit the origin. Increase the number of steps to see if it returns to the origin.
  • At the end of the program, we see the average number of steps taken to return to the origin.

Assumptions

  • Each step takes 1 unit of time
  • The walker takes one of four equally likely paths: north, south, east, or west
  • Each change in direction is exactly at right angles
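
The notes and assumptions above can be sketched without the chart. The fiddle itself is JavaScript; this is a Python approximation of the same logic, and the function names `steps_to_return` and `average_return_steps` are hypothetical. It uses two random draws per step, one for the axis and one for the direction, so each of the four directions has probability 1/4.

```python
import random


def steps_to_return(max_steps, rng):
    """Walk from the origin; return the step at which the walker first
    comes back, or None if it does not return within max_steps."""
    x = y = 0
    for step in range(1, max_steps + 1):
        move_north_south = rng.random() < 0.5     # first randomizer: axis
        delta = 1 if rng.random() < 0.5 else -1   # second randomizer: direction
        if move_north_south:
            y += delta
        else:
            x += delta
        if x == 0 and y == 0:
            return step
    return None


def average_return_steps(walkers, max_steps, seed=None):
    """Run several walkers and average the return time of those that
    made it back within the step budget."""
    rng = random.Random(seed)
    results = [steps_to_return(max_steps, rng) for _ in range(walkers)]
    returned = [steps for steps in results if steps is not None]
    average = sum(returned) / len(returned) if returned else None
    return average, len(returned)


average, returned = average_return_steps(walkers=1000, max_steps=10_000, seed=42)
print(f"{returned} of 1000 walkers returned; average steps among them: {average}")
```

The average covers only walkers that returned within the step budget, so it grows as `max_steps` is raised. A return can also only happen after an even number of steps, since every step away has to be undone.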

Conclusion

If you let the walker walk long enough, it will come back to the origin (with probability 1 for a two-dimensional walk like this one).

Or, put differently, a drunk person who can follow the easy assumptions of

  • walking steps of exactly equal size
  • turning at exact right angles

will eventually reach home.


Hotdog not Hotdog!

Youtube Video

Demo - http://nothotdog.pythonanywhere.com/

Inspired by the app built by Jian-Yang, the Entrepreneur in Residence on the TV show Silicon Valley.

Here is an image classifier that reads an uploaded image and classifies it as hotdog or not hotdog.

Working

  • Thanks to Google’s TensorFlow codelab demos for image classification - link.

  • I fetched images for two categories:
    • hotdog
    • not hotdog, which included various other images such as sandwiches, pizzas, salads, pasta, movie covers, and wallpapers, to cover a wide variety of images.
  • TensorFlow is used to retrain MobileNet using a concept called transfer learning.
    • MobileNets are optimized to be small and efficient, at the cost of some accuracy, compared to other pre-trained models.
    • Transfer learning means starting with a model that has already been trained on another problem. Deep learning from scratch can take days, but transfer learning can be done in short order.
  • Once the model is ready, Google has tricks to reduce its size.
    • TensorFlow includes a tool called optimize_for_inference that removes all nodes that aren’t needed for a given set of inputs and outputs.
    • The script also does a few other optimizations that help speed up the model, such as merging explicit batch normalization operations into the convolutional weights to reduce the number of calculations.
    • A second script, quantize_graph, is available for optimization; it quantizes the weights of the network allowing
  • List of all pre-trained models one can use to build an image classifier, depending on usage and available compute

  • Demo hosted on PythonAnywhere using Flask
    • Images extracted from Google Images using Fatkun Batch Download Image
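
To give a feel for the weight quantization mentioned above, here is a conceptual sketch of uniform quantization in plain Python. It is not the quantize_graph implementation: it assumes weights are rounded to 256 evenly spaced levels between the smallest and largest value, and the function names `quantize` and `dequantize` are hypothetical.

```python
def quantize(weights, levels=256):
    """Map each float weight to an integer code in [0, levels - 1]."""
    low, high = min(weights), max(weights)
    # Fall back to a scale of 1.0 when all weights are equal.
    scale = (high - low) / (levels - 1) or 1.0
    codes = [round((w - low) / scale) for w in weights]
    return codes, low, scale


def dequantize(codes, low, scale):
    """Recover approximate float weights from the integer codes."""
    return [low + code * scale for code in codes]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, low, scale = quantize(weights)
restored = dequantize(codes, low, scale)
print(codes)
print(restored)
```

Each code fits in one byte instead of a four-byte float, and the many repeated values compress well. The price is a rounding error of at most half a quantization step per weight.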

Potential

  • Product #1: With a large enough training set and enough compute, anyone can extend this to create the See-food app, a Shazam for food
  • Product #3: The app can flag food with possible allergens

Learning to use Jekyll-now!

This is a test blog to see -

  • Time taken by Jekyll and gh-pages to refresh content
  • Formatting and layout of the theme
  • URL mapping
  • Blogs and Disqus comments

Notes to self

  • Remember to add a layout snippet at the top of each blog page. Those variables at the top of a blog post are called front matter.
  • Markdown Editor