Mini-Project – Switching Vocals

This project is related to my YouTube mashups project, but it should be easier to train because a switching-vocals track is essentially a pitch shift of a single song.

I want to train a neural network (NN) to generate songs like:

From the base song.

Later maybe it can generate videos?

Getting Data

Format for Audio

I’m downloading the audio in WAV format and keeping the video, using the download script below.

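WAV can cover the full range of frequencies the human ear is able to hear, provided the sample rate is high enough. An MP3 file is compressed and loses quality, whereas a WAV file is typically uncompressed and lossless. Note that the download script resamples to 16 kHz (-ar 16000), which keeps only frequencies up to 8 kHz.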

Artisound.io
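As a rough check of what a given WAV file can actually represent: the highest frequency it can hold is half its sample rate (the Nyquist limit). Below is a minimal sketch using Python's standard wave module. The helper names and the file name demo.wav are hypothetical and not part of the project code.

```python
import math
import struct
import wave


def write_test_tone(path, sample_rate=16000, seconds=1.0, freq=440.0):
    """Write a mono 16-bit sine tone so the example has a file to inspect."""
    n_frames = int(sample_rate * seconds)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        frames = b"".join(
            struct.pack("<h", int(16000 * math.sin(2 * math.pi * freq * i / sample_rate)))
            for i in range(n_frames)
        )
        wav.writeframes(frames)


def describe_wav(path):
    """Return the sample rate, Nyquist frequency and duration of a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        n_frames = wav.getnframes()
    return {"sample_rate": rate, "nyquist_hz": rate / 2, "seconds": n_frames / rate}
```

Calling describe_wav on a file written at 16000 Hz reports a Nyquist frequency of 8000 Hz, well below the roughly 20 kHz upper limit of human hearing, so the sample rate matters as much as the container format.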

Download Script

I reject megamixes (mashups with 20+ songs) and pick only switching-vocals videos, using the regex title matching that youtube_dl supports.

from __future__ import unicode_literals
import youtube_dl
import os
from pathlib import Path

rootdir = str(Path().absolute())

def QueryYoutube(QueryList, toSkip = True):
	""" Download the results for each query (or, if toSkip is True, only write their descriptions)"""
	ydl_opts = {
		# "outtmpl": "%(title)s.%(ext)s", #file name is song name
		"outtmpl": os.path.join(rootdir,"%(title)s/SV.%(ext)s"), #folder name is song name, file is SV
		"ignoreerrors": True, #Do not stop on download errors.
		"nooverwrites": True, #Prevent overwriting files.
		"matchtitle": "switching vocals", #Download only titles matching this regex
		"writedescription": True, #Write the video description to a .description file
		"skip_download": toSkip, #don't actually download the video
		"min_views": 100, #skip videos with fewer than 100 views
		"download_archive": "alreadyListedFiles.txt", #File name of a file where all downloads are recorded. Videos already present in the file are not downloaded again.
		"default_search": "auto", #Prepend this string if an input url is not valid. 'auto' for elaborate guessing
		'format': 'bestaudio/best',
	    'postprocessors': [{
	        'key': 'FFmpegExtractAudio',
	        'preferredcodec': 'wav',
	        'preferredquality': '192'
	    }],
	    'postprocessor_args': [
	        '-ar', '16000'
	    ],
	    'prefer_ffmpeg': True,
	    'keepvideo': True
		}
	with youtube_dl.YoutubeDL(ydl_opts) as ydl:
	    ydl.download(QueryList)


def test():
	"""Test by downloading from two switching vocals (SV) channels"""
	# queriesL = ["nightcore mashups", "bts mashups", "ytuser:https://www.youtube.com/channel/UC5XWNylwy4efFufjMYqcglw"]
	# queriesL = ["ytuser:https://www.youtube.com/channel/UC5XWNylwy4efFufjMYqcglw", "ytuser:"]

	#nightcore switching vocals
	queriesL = ["https://www.youtube.com/channel/UCPtWGnX3cr6fLLB1AAohynw", 
				"https://www.youtube.com/channel/UCPMhsGX1A6aPmpFPRWJUkag"
				]
	# QueryYoutube(queriesL, True) #dry run: skip the download, only write the descriptions
	QueryYoutube(queriesL, False) #should download those channels

def run():
	##### DOWNLOADING
	#nightcore switching vocals channels
	queriesL = ["https://www.youtube.com/channel/UCPtWGnX3cr6fLLB1AAohynw", 
				"https://www.youtube.com/channel/UCPMhsGX1A6aPmpFPRWJUkag", 
				"https://www.youtube.com/channel/UCl2fdq_CzdrDhauV85aXQDQ",
				"https://www.youtube.com/channel/UC8Y2KrSAhAl1-1hqBGLBdzA",
				"https://www.youtube.com/channel/UCJsX7vcaCUdPOcooysql1Uw",
				"https://www.youtube.com/channel/UCtY3IhWM6UOlMBoUG-cNQyQ",
				"https://www.youtube.com/channel/UCNOymlVIxfFW0mVmZiNq6DA"
				]
	QueryYoutube(queriesL, False)

if __name__ == "__main__":
    run()
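
The title filtering described before the script can be sketched in plain Python. youtube_dl documents a rejecttitle option alongside matchtitle; the snippet below only mimics that kind of match/reject logic and is not the library's own code. The reject pattern and the sample titles are assumptions made for illustration.

```python
import re

MATCH_TITLE = r"switching vocals"
# Assumption: megamixes carry "megamix" or "mega mix" somewhere in the title.
REJECT_TITLE = r"mega\s*mix"


def keep_title(title, match=MATCH_TITLE, reject=REJECT_TITLE):
    """Return True when the title matches `match` and does not match `reject`."""
    if match and not re.search(match, title, re.IGNORECASE):
        return False
    if reject and re.search(reject, title, re.IGNORECASE):
        return False
    return True


# Hypothetical titles, not taken from the real channels.
titles = [
    "Nightcore - Song A x Song B (Switching Vocals)",
    "Nightcore MEGAMIX (Switching Vocals) 25 songs",
    "Nightcore - Song C (Lyrics)",
]
kept = [t for t in titles if keep_title(t)]
```

Only the first title survives: the second is rejected as a megamix and the third never mentions switching vocals.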
    

Data Cleaning & Automating Download of Original Songs

I’m taking the title of each YouTube switching-vocals video and using regex to find the names of the original songs.

Regex Crafting

After converting the Unicode punctuation to regular punctuation, I used a regex tester to zero in on the key words. I still have to remove some strings that were included accidentally, because I wanted to make sure the artist names stayed in the matched groups.
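
A minimal sketch of that two-step process: normalise the punctuation, then pull the song names out of the title. The title layout assumed here ("Nightcore - Song A x Song B (Switching Vocals)") and the separator list are guesses for illustration; the real expression in the repository was tuned against actual titles and may differ.

```python
import re
import unicodedata

# Map common typographic characters to plain ASCII equivalents.
PUNCT = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2716": "x", "\u00d7": "x",   # multiplication signs used as "x"
})

TITLE_RE = re.compile(
    r"""^\s*(?:nightcore\s*[-:~|]\s*)?   # optional leading genre tag
        (?P<songs>.+?)                   # song names, possibly with artists
        \s*[\(\[]?\s*switching\s+vocals  # marker phrase, optionally bracketed
    """,
    re.IGNORECASE | re.VERBOSE,
)
SPLIT_RE = re.compile(r"\s+(?:x|vs\.?|/|&)\s+", re.IGNORECASE)


def normalize_title(title):
    """Fold full-width forms (NFKC) and replace typographic punctuation."""
    return unicodedata.normalize("NFKC", title).translate(PUNCT)


def extract_songs(title):
    """Return the list of song names found in a switching-vocals title."""
    match = TITLE_RE.search(normalize_title(title))
    if not match:
        return []
    return [s.strip() for s in SPLIT_RE.split(match.group("songs")) if s.strip()]
```

For example, extract_songs("Nightcore \u2013 Song A \u2716 Song B (Switching Vocals)") yields the two names "Song A" and "Song B", while a title without the marker phrase yields an empty list.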

Get the proper song names

I queried YouTube as described in the youtube_dl documentation. Dedicated song APIs might not have the song I’m looking for, and since the downloads come from YouTube anyway, it makes sense to search there.
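
One way to run such a search is youtube_dl's "ytsearchN:" prefix together with extract_info(..., download=False), which returns metadata without downloading anything. This is a sketch under those assumptions rather than the exact code in the repository; the helper names are hypothetical, and first_hit needs network access and an installed youtube_dl.

```python
def build_search_query(song, n_results=1):
    """Build a youtube_dl search string such as 'ytsearch1:Song A'."""
    return "ytsearch{}:{}".format(n_results, song)


def first_hit(song):
    """Return the metadata dict of the top search hit, or None if nothing is found."""
    import youtube_dl  # imported lazily so build_search_query works without it

    opts = {"quiet": True, "skip_download": True}
    with youtube_dl.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(build_search_query(song), download=False)
    entries = (info or {}).get("entries") or []
    return entries[0] if entries else None
```

The returned dictionary carries fields such as the title, uploader and duration, which is the JSON metadata inspected in the next step.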

To clean up the data and prevent duplicate downloads of the same original song, I strip artist names and tags from each name, so that a song which has already been noted down is recognised and skipped.
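
A minimal sketch of that clean-up, assuming tags appear in round or square brackets and that a list of known artist names is available. Both are assumptions; the actual script may use different rules.

```python
import re

# Bracketed tags such as "(Lyrics)" or "[HD]".
TAG_RE = re.compile(r"[\(\[][^\)\]]*[\)\]]")


def song_key(name, artists=()):
    """Reduce a raw song string to a key used only for comparison."""
    key = TAG_RE.sub(" ", name.lower())
    for artist in artists:
        key = key.replace(artist.lower(), " ")
    key = re.sub(r"[\W_]+", " ", key)  # drop punctuation, keep letters and digits
    return " ".join(key.split())


def dedupe(names, artists=()):
    """Keep the first occurrence of each song, comparing by song_key."""
    seen, unique = set(), []
    for name in names:
        key = song_key(name, artists)
        if key and key not in seen:
            seen.add(key)
            unique.append(name)
    return unique
```

With artists=["Artist One"], the names "Artist One - Song A (Lyrics)" and "song a" collapse to the same key, so only the first one is kept.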

To look through the JSON metadata for each YouTube video, I used an online JSON viewer:

Looking through JSON data

I made some test cases to check whether it works. I’m not using assert here because I don’t want the program to stop whenever a unit test fails.
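
A check helper along these lines records failures instead of raising, so every case runs even when an earlier one fails. It is a sketch with hypothetical names, not the code from the repository.

```python
failures = []


def check(label, actual, expected):
    """Record a mismatch instead of raising, so later checks still run."""
    ok = actual == expected
    if not ok:
        failures.append((label, actual, expected))
    print("{}: {}".format("PASS" if ok else "FAIL", label))
    return ok


check("upper-casing works", "abc".upper(), "ABC")
check("deliberate failure", 1 + 1, 3)
check("still reached after a failure", len("sv"), 2)
```

After the run, failures holds one entry per failed case, which can be printed as a summary at the end.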

The code is posted on GitHub in the Download Videos folder.

Current Folder Structure

Example of folder structure

Notes:

  • Problems
    • Some original videos were not downloaded
    • Some wrong videos were downloaded (different artist)
  • Solutions (tackled in script C_FixMissedDownloads.py)
    • Check the number of original videos downloaded (should correspond to the number of original songs)
    • Filter out folders with an incorrect number of original songs (e.g. remove folders not containing an “Original_*” video)
    • Filter out videos that are too long (possibly not a song)
    • TODO: check the similarity of the original song videos against the switching vocals (they should have some similar parts)
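
The folder checks listed above could look roughly like the sketch below. It is not the contents of C_FixMissedDownloads.py: the helper names, the 10-minute limit and the idea of passing in expected counts per folder are assumptions, as is the detail that each download may leave several files (audio, video, description) sharing one "Original_*" stem.

```python
from pathlib import Path

# Assumption: anything over 10 minutes is probably not a single song.
MAX_SONG_SECONDS = 10 * 60


def count_originals(folder):
    """Count distinct Original_* downloads, ignoring .description sidecar files."""
    stems = {
        p.stem
        for p in Path(folder).glob("Original_*")
        if p.is_file() and p.suffix != ".description"
    }
    return len(stems)


def find_incomplete(root, expected=None):
    """List (folder, found, wanted) for song folders missing original videos."""
    expected = expected or {}
    incomplete = []
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue
        found = count_originals(folder)
        wanted = expected.get(folder.name, 1)  # default: at least one original
        if found < wanted:
            incomplete.append((folder.name, found, wanted))
    return incomplete


def is_too_long(duration_seconds, limit=MAX_SONG_SECONDS):
    """Flag videos whose duration suggests a compilation rather than a song."""
    return duration_seconds > limit
```

The folders returned by find_incomplete are candidates for a retry or for removal, and is_too_long can be applied to the duration taken from the video metadata.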

TODO:

  • Sort folders based on whether multiple songs contribute to the final video
  • Build a model that learns the switching-vocals audio transformation of the original song

Parts left:

  • Exploration / Transformation: Figure out how I want to represent the songs as input to the neural network. The score for the network’s output should represent its similarity to the original video; learn-to-hash?
  • Training: I currently want to try self-learning (GAN style). I’ll train a discriminator on samples from previous generations of the NN and on the actual videos, so that it scores whether something is an actual good mashup, and train the whole thing like a generative adversarial network.
  • Testing: Once the GAN is pretty good, I’ll test it on mashups it has never heard before.
  • Try out video NNs
  • Implementation + new avenue to explore: I’ll post some mashups to YouTube and track the number of likes and dislikes each video gets per view, then train the network to produce mashups that are more liked per view?

Things to improve efficiency:

  • Memory storage
    • Prevent duplicate video files by storing all video files in a common folder and using the file path as a reference to the video.
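
One possible shape for that common folder is sketched below. Files are keyed by a hash of their contents, which is a choice made for this example (the YouTube video id would work as a key too), and the refs.json file name is hypothetical.

```python
import hashlib
import json
import shutil
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large videos are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_once(src, store_dir):
    """Copy src into store_dir unless an identical file is already there."""
    src, store_dir = Path(src), Path(store_dir)
    store_dir.mkdir(parents=True, exist_ok=True)
    target = store_dir / (file_digest(src) + src.suffix)
    if not target.exists():
        shutil.copy2(src, target)
    return target


def write_reference(song_folder, name, stored_path):
    """Record the stored path under `name` in refs.json inside the song folder."""
    song_folder = Path(song_folder)
    song_folder.mkdir(parents=True, exist_ok=True)
    ref_file = song_folder / "refs.json"
    refs = json.loads(ref_file.read_text()) if ref_file.exists() else {}
    refs[name] = str(stored_path)
    ref_file.write_text(json.dumps(refs, indent=2))
    return refs
```

Each song folder then holds only a small refs.json pointing into the shared store, so an original song used by several switching-vocals videos is kept on disk once.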
