Technology used:
- Python
- OpenAI API
- Flask
- Google Cloud Run
I created a Python-based web application that extracts the transcript of a specified cooking video, translates it into English if necessary, then runs it through generative AI to output a recipe.
This summer, I decided to play around with the OpenAI API. One day, I read an article by Zach Seward on creating structure with generative AI, in which he explained how he used the OpenAI API to organize transcripts of city council meetings with key points and chapters. The key idea of the article is that generative AI is actually better at reordering and manipulating existing data than at creating new data.
I got to thinking about how I could apply this idea to a real project. In August 2024, while I was working as a web dev intern at New Circle Consulting, we talked about a YouTube channel that posts cooking videos every week (I unbiasedly recommend subscribing if you want to learn some Chinese recipes or how to best utilize Costco food products).
The channel owner wanted the descriptions of his videos to include text versions of the recipes he talks through on camera, preferably without much extra manual work. So I set out to create something that could do the recipe-generating for him in a matter of seconds. I planned to use generative AI to form recipes from video transcripts.
One helpful thing I learned was that the Pydantic library can be used to define structured data. So, for my recipe-generating script, I could specify that the output of the recipe-generating function include various fields, like a string with the description “The name of the dish” and an integer with the description “The number of servings that the recipe makes.”
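To make the idea concrete, here is a minimal sketch of what such a Pydantic class might look like. The class name `Recipe`, the `ingredients` and `steps` fields, and the sample data are assumptions for illustration; only the two descriptions quoted above come from my actual script. The JSON string stands in for a response from the model.

```python
import json
from typing import List

from pydantic import BaseModel, Field, ValidationError


class Recipe(BaseModel):
    """Illustrative recipe schema; names other than the two quoted descriptions are assumed."""

    name: str = Field(description="The name of the dish")
    servings: int = Field(description="The number of servings that the recipe makes")
    ingredients: List[str] = Field(description="Ingredients with quantities, one per item")
    steps: List[str] = Field(description="Cooking steps, in order")


# Stand-in for the JSON a model might return (made-up data, not real output).
raw = json.dumps(
    {
        "name": "Mapo Tofu",
        "servings": 4,
        "ingredients": ["1 block soft tofu", "2 tbsp doubanjiang"],
        "steps": ["Cube the tofu.", "Simmer the tofu in the sauce."],
    }
)

# Pydantic checks that every field is present and has the declared type.
recipe = Recipe(**json.loads(raw))

# A response that does not fit the schema is rejected instead of slipping through.
try:
    Recipe(name="Broken", servings="many", ingredients=[], steps=[])
    rejected = False
except ValidationError:
    rejected = True
```

The appeal of this approach is that a malformed response fails loudly at validation time rather than producing a half-formed recipe later on.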
Yes, I would now need to format this output into text that could be copy-pasted into video descriptions, and a simpler solution might’ve been to use a single prompt asking for a text recipe with all of the specified information. But I wanted to experiment with Pydantic classes, and hopefully this would lead to more consistent results.
Then, I added some code that formats the class output of the recipe-generating function into copy-pastable text, and some code that gets a video’s transcript from its URL. Now the script could return a text version of a recipe video from its link. Pretty neat. But this was all still happening in a Google Colab notebook, and given that the script needed a minute to install a list of libraries before running, it was still far from a convenient tool that the channel owner could feasibly use.
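Here is a rough sketch of those two helper steps, using only the standard library. It is not my exact code: the function names, the text layout, and the `Recipe` dataclass (a stand-in for the Pydantic class) are assumptions, and the URL helper assumes YouTube-style links. Fetching the transcript itself is left out, since that depends on the transcript library used.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import parse_qs, urlparse


@dataclass
class Recipe:
    """Stand-in for the structured recipe object (hypothetical fields)."""

    name: str
    servings: int
    ingredients: List[str]
    steps: List[str]


def extract_video_id(url: str) -> str:
    """Pull the video ID out of a youtube.com/watch?v=... or youtu.be/... link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        video_id = parsed.path.lstrip("/")
    elif host.endswith("youtube.com"):
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"Could not find a video ID in {url!r}")
    return video_id


def format_recipe(recipe: Recipe) -> str:
    """Turn the structured recipe into plain text for a video description."""
    lines = [f"{recipe.name} (serves {recipe.servings})", "", "Ingredients:"]
    lines += [f"- {item}" for item in recipe.ingredients]
    lines += ["", "Steps:"]
    lines += [f"{number}. {step}" for number, step in enumerate(recipe.steps, start=1)]
    return "\n".join(lines)


# Made-up example data.
example = Recipe(
    name="Mapo Tofu",
    servings=4,
    ingredients=["1 block soft tofu", "2 tbsp doubanjiang"],
    steps=["Cube the tofu.", "Simmer the tofu in the sauce."],
)
text = format_recipe(example)
```

Keeping the formatting in its own function means the layout of the description can change without touching the prompt or the schema.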
After some research, I decided to use Flask to create the web application and to deploy it with Google Cloud Run. I spent some time learning how to containerize my Python script and creating a simple HTML page with form submission, and voilà!
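For a sense of how small the Flask side can be, here is a minimal sketch of a one-page app with a form. The route, the inline template, and `generate_recipe_text` (a placeholder for the transcript-to-recipe pipeline) are assumptions for illustration, not my deployed code. The `main` function reads the `PORT` environment variable because Cloud Run tells the container which port to listen on that way.

```python
import os

from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline template to keep the sketch in one file; a real app would likely use a template file.
PAGE = """
<!doctype html>
<title>Recipe generator</title>
<form method="post">
  <input name="url" placeholder="Video URL" required>
  <button type="submit">Get recipe</button>
</form>
{% if recipe %}<pre>{{ recipe }}</pre>{% endif %}
"""


def generate_recipe_text(url: str) -> str:
    """Hypothetical placeholder for the transcript -> translation -> recipe pipeline."""
    return f"(recipe for {url} would appear here)"


@app.route("/", methods=["GET", "POST"])
def index():
    recipe = None
    if request.method == "POST":
        # The form field name "url" matches the <input name="url"> above.
        recipe = generate_recipe_text(request.form["url"])
    return render_template_string(PAGE, recipe=recipe)


def main() -> None:
    # Cloud Run passes the listening port in PORT; 8080 is the usual default.
    port = int(os.environ.get("PORT", "8080"))
    app.run(host="0.0.0.0", port=port)
```

Calling `main()` starts Flask's built-in server, which is fine for trying things out locally; inside a container, a production WSGI server is the more common choice.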