[From the sandbox] MIDI reconstruction from Synthesia video clips (and similar ones)

One day, while browsing YouTube for interesting melodies to learn from, I came across some Synthesia videos. I really liked a few of them and decided I'd download them and learn to play... =) But alas: the videos were there, yet nobody was in any hurry to upload the MIDI files. =(


Sitting down with Google, I checked whether there were ready-made solutions that would suit me, but all I found were audio-to-MIDI converters, which was a little disappointing... Without thinking too long, I decided it should be enough to reconstruct the MIDI frame by frame from the video, and set out to implement it.


I didn't want to write everything from scratch, so I decided to build on the ready-made components shipped with Debian GNU/Linux, and Python turned out to be the best fit.


At first I planned to work from ready-made pictures (frames pulled out of the videos), but after the first extraction runs I realized there was no point: it was very slow and also ate up a lot of disk space. So I decided to try something new to me, OpenCV (I had wanted to get my hands on it for a long time). It turned out that OpenCV works very well with a video stream and provides all the functions I need: reading pixels, displaying frames, and rendering text.


For example, opening a video file and grabbing a single frame takes just a couple of lines:


  import cv2

  vidcap = cv2.VideoCapture('test.mp4')
  success, image = vidcap.read()

And if you like, you can dump frames straight to disk:


  cv2.imwrite("/tmp/frame%d.jpg" % frame, image)
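
A frame comes back as a NumPy array indexed as [row, column], with channels stored in BGR order rather than RGB, so probing the color at a key's position is a one-liner (the coordinates below are just placeholders):

  b, g, r = image[200, 320]  # BGR values of the pixel at x=320, y=200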

After a while I had written a function that generates the positions of the virtual keyboard's keys and draws them (as rectangles) over the frames of the stream; dumping a frame produced the following picture:
[image: generated key rectangles drawn over a video frame]
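
The generator itself isn't shown in the post; as a rough sketch of the idea, assuming the keyboard occupies a known horizontal band of the frame and the white keys are evenly spaced (the function name, key count, and coordinates below are all illustrative assumptions, and black keys are ignored for brevity):

  # Hypothetical layout: 52 evenly spaced white keys across the frame width.
  def white_key_rects(frame_width, n_keys=52, top=400, bottom=470):
      key_w = frame_width / float(n_keys)
      return [(int(i * key_w), top, int((i + 1) * key_w), bottom)
              for i in range(n_keys)]

  for x1, y1, x2, y2 in white_key_rects(image.shape[1]):
      cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 1)
  cv2.imwrite('/tmp/keys.jpg', image)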


So the scheme I settled on was this: while reading the image from the video stream frame by frame, I read the active notes off the virtual key positions (a note counts as active when the pixels at its key's position match the reference color, or lie close to it) and send them on to MIDI. Notes couldn't simply be registered the way they would be on an ordinary MIDI keyboard, though the situation was similar, only a little simpler...
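
The detection code itself isn't given in the post; here is a minimal sketch of how the per-frame color matching could look (the reference color, the tolerance, and all the names below are my assumptions, not the original implementation; sample_points holds one probe pixel per key, e.g. from the generator sketched above):

  import numpy as np

  REF_COLOR = np.array([40, 200, 90])  # assumed BGR color of a lit key
  TOLERANCE = 60                       # allowed distance from the reference

  def active_keys(image, sample_points):
      """Indices of keys whose probe pixel lies close to the reference color."""
      return [i for i, (x, y) in enumerate(sample_points)
              if np.linalg.norm(image[y, x].astype(int) - REF_COLOR) < TOLERANCE]

  events, started, frame = [], {}, 0   # events: (key, start_frame, end_frame)
  while True:
      success, image = vidcap.read()
      if not success:
          break
      now = set(active_keys(image, sample_points))
      for k in now - set(started):     # key just lit up: note-on
          started[k] = frame
      for k in set(started) - now:     # key went dark: note-off
          events.append((k, started.pop(k), frame))
      frame += 1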

I tried it on a video and watched how many notes it picked up (and there were quite a few). Not bad, I thought; all that remained was to figure out how to write the notes to a file. After a little searching I found the excellent python-midiutil package, and some time later I was able to record notes into a MIDI file. As it turned out, python-midiutil is a very simple and easy-to-use package. For example, creating a file and adding notes takes just a couple of lines:

  from midiutil.MidiFile import MIDIFile

  mf = MIDIFile(1)                 # one track is enough here
  track, channel, time = 0, 0, 0

  mf.addTrackName(track, time, "Sample Track")
  mf.addTempo(track, time, 60)

  # pitch, keytime, duration and volume come from the detection step
  mf.addNote(track, channel, pitch, keytime, duration, volume)

  with open(outputmid, 'wb') as outf:
      mf.writeFile(outf)
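
One detail worth spelling out: midiutil measures time in beats, and since the tempo above is set to 60 BPM, one beat equals one second. So a note's time and duration come straight from the frame numbers and the video's frame rate (variable names here are illustrative):

  fps = vidcap.get(cv2.CAP_PROP_FPS)          # source video frame rate
  keytime = start_frame / fps                 # at 60 BPM, beats == seconds
  duration = (end_frame - start_frame) / fps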

Loading the resulting MIDI into LMMS went quite smoothly; the first thing I did was restore a couple of favorite tunes. Then it became clear that the key-position generator function wasn't all that convenient: the keyboard's placement changed from video to video. So I decided to write a GUI. I made a simple one, but with a key-placement function:


[image: the key-placement GUI]
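
The GUI code isn't shown in the post; purely to illustrate the idea, here is a minimal sketch of manual key placement using nothing more than OpenCV's own mouse callback (the window name, the callback, and the click-per-key workflow are all my assumptions):

  sample_points = []   # collected (x, y) probe points, one per key

  def on_mouse(event, x, y, flags, param):
      # record a probe point on every left click
      if event == cv2.EVENT_LBUTTONDOWN:
          sample_points.append((x, y))

  cv2.namedWindow('placement')
  cv2.setMouseCallback('placement', on_mouse)
  cv2.imshow('placement', image)
  cv2.waitKey(0)       # click each key in order, then press any key to finish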


I think this program can be useful to many people, so I've put it all up on GitHub.
