YouTube Mashup – Data Collection (Part 1)

What I’m trying to achieve

I’m going to start by writing a script that builds a list of mashups and extracts the song names from each mashup title.

Once I’ve verified it works, I’ll leave it to download the videos in a folder structure that looks like:

  • Mashup Folder
    • Songs Folder
    • Mashup Folder
    • json of links
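Here is a minimal sketch of how that layout could be created, so the download step has somewhere to put things. The helper names (`safe_name`, `make_mashup_folder`), the subfolder names (`songs`, `mashup`), the file name `links.json`, and the shape of the links dictionary are all assumptions for illustration, not something I have settled on.

```python
import json
import re
from pathlib import Path


def safe_name(name):
    """Replace characters that are awkward in folder names (an assumed rule)."""
    return re.sub(r'[\\/:*?"<>|]', "_", name).strip()


def make_mashup_folder(root, mashup_title, links):
    """Create <root>/<mashup>/songs, <root>/<mashup>/mashup and links.json.

    `links` is a hypothetical dictionary of URLs; its keys are not fixed yet.
    """
    mashup_dir = Path(root) / safe_name(mashup_title)
    (mashup_dir / "songs").mkdir(parents=True, exist_ok=True)
    (mashup_dir / "mashup").mkdir(parents=True, exist_ok=True)
    # Store the links next to the downloads so each folder is self-describing.
    (mashup_dir / "links.json").write_text(
        json.dumps(links, indent=2), encoding="utf-8"
    )
    return mashup_dir
```

Calling `make_mashup_folder("data", "Song A x Song B", {...})` would give one top-level folder per mashup, which should make it easy to check a download by eye.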

I think I can find several mashups built from the same few songs, which would give me more mashup examples for any given song selection.

Steps

  1. Update dependencies
  2. Make the Script
    1. Find list of mashups
    2. Find song titles from mashup titles
    3. Download Mashup and related songs in a correct folder structure
    4. Download more mashups of related songs

Update Dependencies

Make sure youtube-dl is installed and up to date.

brew install youtube-dl

Writing the Code

So the plan is to write modular functions I can test on their own, because you should always TEST YOUR CODE 🙂

Functions I need

  • QueryYoutube: Runs each query, collects the list of results, and writes it to a text file
    • Input = the queries to use & the number of results to keep for each query
  • GetSongNamesFromMashup: Takes a text file, goes through each line (which is a mashup name), and generates a JSON file whose dictionary maps each mashup name to its song names (mashup name: songName1, songName2, …)
    • Possible issue where a mashup has a ton of song names, like those 50-song mashups
    • Will probably use a regex
  • DownloadAll: Takes a JSON file and downloads all the songs & mashup videos into the folder structure mentioned above. It also saves the links in a JSON file inside that folder structure.
    • Test by downloading two sets of mashups
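To make the GetSongNamesFromMashup idea concrete, here is a rough sketch of the regex approach. It assumes mashup titles separate songs with " x ", " vs ", " vs. " or " / " and carry tags like "(Mashup)" in brackets; real titles will be messier, so treat the patterns as a starting point. The function names and the `max_songs` cutoff (my guess at a way to skip the 50-song mashups) are hypothetical.

```python
import json
import re

# Assumed separators between song names: " x ", " vs ", " vs. ", " / "
SEPARATOR = re.compile(r"\s+(?:x|vs\.?|/)\s+", re.IGNORECASE)
# Bracketed tags such as "(Mashup)" or "[HD]"
BRACKETED = re.compile(r"[\(\[][^\)\]]*[\)\]]")
# The bare word "mashup" (or "mash-up") left outside brackets
MASHUP_WORD = re.compile(r"\bmash-?up\b", re.IGNORECASE)


def song_names_from_title(title):
    """Split one mashup title into a list of guessed song names."""
    cleaned = BRACKETED.sub(" ", title)
    cleaned = MASHUP_WORD.sub(" ", cleaned)
    parts = (part.strip(" -|:") for part in SEPARATOR.split(cleaned))
    return [part for part in parts if part]


def build_mapping(titles, max_songs=4):
    """Map each mashup title to its song names.

    Titles that yield fewer than two names or more than `max_songs`
    are skipped; the cutoff is an assumption, not a tested value.
    """
    mapping = {}
    for line in titles:
        title = line.strip()
        if not title:
            continue
        songs = song_names_from_title(title)
        if 2 <= len(songs) <= max_songs:
            mapping[title] = songs
    return mapping


def titles_file_to_json(titles_path, json_path, max_songs=4):
    """Read one mashup title per line and write the mapping as JSON."""
    with open(titles_path, encoding="utf-8") as handle:
        mapping = build_mapping(handle, max_songs)
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(mapping, handle, indent=2, ensure_ascii=False)
    return mapping
```

Keeping `song_names_from_title` separate from the file handling means I can test the regex on a handful of titles before pointing it at a full list. One known weakness: a song or artist name that itself contains " x " or " / " will be split in the wrong place.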
