About
This is a small project to test some self-learning concepts I’ve read about, and to have some fun 😀
What I’m going to do
I’m going to train a neural network to take in 2 songs and generate an audio mashup from them.
I’m going to compare that to training a neural network to take in 2 videos and generate a video mashup from them. The problem with this is that the data would be a bit of a mess, because mashup videos on YouTube seem to take footage from other sources instead of the songs’ own videos.
If I use transfer learning from the audio mashup NN to the video mashup NN, that should be more effective, right? But the audio and video should be correlated…
Steps
- Data Collection & Storage : I’ll write a script with youtube_dl to download mashups, then parse each mashup’s title to get the names of the 2 songs and download them too. I’ll use a folder structure of mashupName > mashup folder + songs folder.
- Data Cleaning : Going over the data to make sure the mashups are actually mashups of the songs I’ve collected
- Exploration / Transformation : Figure out how I want to represent the songs as input to the neural network. The score for the neural network’s output should represent its similarity to the original video (learn-to-hash?).
- Training : I currently want to test out self-learning (GAN style). I’ll train a discriminator on samples from previous generations of the NN and on the actual videos, so that it scores whether something is an actual good mashup, and then train the two together like a generative adversarial network.
- Testing : Once the GAN is pretty good, I’ll test against mashups it has never heard before.
- Try out video NNs
- Implementation + new avenue to explore : I’ll post some mashups to YouTube and see the number of likes and dislikes each video gets per view -> train the network to produce mashups that are more liked per view?
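To make the data collection step a bit more concrete, here is a rough sketch of the title parsing and folder layout. It assumes mashup titles look something like "Song A vs. Song B (Mashup)", which is a guess on my part and won't hold for every video. The function names (`parse_mashup_title`, `layout_for`) and the separator list are made up for illustration, and the actual downloading with youtube_dl is left out.

```python
import re
from pathlib import Path

# ASSUMPTION: the two song names are joined by "vs", "x", "+" or "&".
# Real titles are messier, so unmatched titles are skipped, not guessed.
_SEPARATORS = re.compile(r"\s+(?:vs\.?|x|\+|&)\s+", flags=re.IGNORECASE)

# Strip "(Mashup)" / "[Remix]" style tags and a bare "mashup" word.
_NOISE = re.compile(
    r"[\(\[][^\)\]]*(?:mashup|mash-up|remix)[^\)\]]*[\)\]]|\bmash-?up\b",
    flags=re.IGNORECASE,
)


def parse_mashup_title(title):
    """Return (song_a, song_b), or None if the title doesn't fit the assumed pattern."""
    cleaned = _NOISE.sub("", title).strip(" -|")
    parts = _SEPARATORS.split(cleaned)
    if len(parts) != 2:
        return None
    song_a, song_b = (part.strip() for part in parts)
    if not song_a or not song_b:
        return None
    return song_a, song_b


def layout_for(root, mashup_name):
    """Build the mashupName > mashup folder + songs folder paths (nothing is created on disk)."""
    # Replace path separators so the title stays a single folder name.
    safe_name = re.sub(r"[\\/]", "_", mashup_name)
    base = Path(root) / safe_name
    return {"mashup": base / "mashup", "songs": base / "songs"}


if __name__ == "__main__":
    title = "Billie Jean vs. Smells Like Teen Spirit (Mashup)"
    print(parse_mashup_title(title))
    print(layout_for("data", title))
```

Titles that return None would go into a pile to label by hand, which fits with the data cleaning step anyway.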
Follow-up Posts
- Data Collection (Part 1)