TL;DR: I am working on a large-scale machine learning project to distinguish music from random noise using MIDI. I need to separate percussion tracks from non-percussion tracks.
Details:
I have a massive MIDI database of ~130,000 songs - downloadable here if you want to follow along.
Each of these MIDI files contains an arbitrary number of tracks. Some of these tracks are instrumental (non-percussion); others are percussion. For my purposes, I want to take the MIDI tracks in a given song and write all NON-PERCUSSION tracks to a single instrumental output track.
I have no idea how to differentiate, in an automated manner, between instrumental tracks (those which sound a distinguishable pitch, like Piano or Trumpet, but also Vibraphone and the like) and percussion tracks (those which sound no distinguishable pitch, like snare or bass drum).
Going through tracks manually is not an option due to sheer volume. Furthermore, some tracks in the database are erroneously titled (sample titles: 'Violin - Solo', but also stuff like 'track 17' and even glitchy stuff like '* Music Energy II GM Data, Music Channel BBS'). Due to the way my MIDI library works, some of these tracks do not contain any notes at all. This all implies that whatever I implement will have to distinguish tracks by the actual notes they contain, not by their titles.
Clearly Sibelius can perform this differentiation, because whenever I open a MIDI file with it the percussion tracks are correctly mapped to percussion sounds, instead of default instruments.
What am I missing here?
Technical Details:
I am using the Python MIDI library mido to traverse the database and read the MIDI files, and the Python MIDI library MIDIUtil to write MIDI files.
I can post my code in the comments if requested.
The General MIDI System Level 1 specification says:
Key-based Percussion is always on channel 10.
(Please note that the tenth channel has the number 9.)
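As a rough sketch of how that rule could be applied with mido: the helpers below label a track by the channels its notes use. The names `classify_track` and `classify_file` and the four labels are made up for this illustration, and the whole thing assumes the files really follow General MIDI.

```python
GM_PERCUSSION_CHANNEL = 9  # channel 10 when counting from 1


def note_channels(track):
    """Return the zero-based channels that carry sounding notes in a track."""
    return {
        msg.channel
        for msg in track
        # a note_on with velocity 0 acts as a note-off, so it is ignored here
        if msg.type == "note_on" and msg.velocity > 0
    }


def classify_track(track):
    """Label a track 'empty', 'percussion', 'mixed' or 'pitched' (hypothetical labels)."""
    channels = note_channels(track)
    if not channels:
        return "empty"
    if channels == {GM_PERCUSSION_CHANNEL}:
        return "percussion"
    if GM_PERCUSSION_CHANNEL in channels:
        return "mixed"  # one track carrying both drum and pitched notes
    return "pitched"


def classify_file(path):
    """Classify every track of the MIDI file at `path`."""
    import mido  # imported here so the helpers above work on any message-like objects

    return [classify_track(track) for track in mido.MidiFile(path).tracks]
```

Tracks that contain no notes at all come back as 'empty', which also covers the note-less tracks mentioned in the question.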
Yes, it is usual to place percussion onto Channel 10 (counting channels as 1 - 16, which in terms of MIDI data would be 0 - 15, so Channel 10 is stored as 9). However, note that it would also be usual to assign a 'Program Change' instruction on that channel to set a particular 'patch', or 'Drum Kit', containing the required percussion sounds, although some older devices may have 1 kit only, or some devices might default to a particular kit on Channel 10.
Also, I assume that Channel 10 COULD be used for any program sound if a Program Change setting is made, so you should not assume that Channel 10 MUST be percussion.
Similarly, you could split percussion between Ch.10 and one or more other channels.
If the MIDI files you have are older files, then you could go a long way by assuming percussion is on Channel 10, AND also by assuming that a channel without ANY Program Change command could be percussion. This will NOT be reliable for any files designed for more modern modules which have multiple Drum Kits selected via Program Change instructions.
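A minimal sketch of that two-part heuristic, written against mido's message attributes (`type`, `channel`, `velocity`). The function name `percussion_candidates` and the shape of its result are invented for this example, and as said above it should only be treated as a hint for older files.

```python
GM_PERCUSSION_CHANNEL = 9  # channel 10 when counting from 1


def percussion_candidates(tracks):
    """Scan all tracks of one file and report channels that might be percussion.

    Returns a dict with:
      'channel_10'        -> True if channel 10 (stored as 9) carries any notes
      'no_program_change' -> sorted channels that have notes but no Program Change
    """
    with_notes = set()
    with_program = set()
    for track in tracks:
        for msg in track:
            if msg.type == "note_on" and msg.velocity > 0:
                with_notes.add(msg.channel)
            elif msg.type == "program_change":
                with_program.add(msg.channel)
    return {
        "channel_10": GM_PERCUSSION_CHANNEL in with_notes,
        "no_program_change": sorted(with_notes - with_program),
    }
```

With mido this could be called as `percussion_candidates(mido.MidiFile(path).tracks)`. A channel showing up under 'no_program_change' is only a candidate, not proof.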
Geoff
Are you still wanting to do this? I'm not clear WHY you wish to remove the percussion, or how vital it is to remove ALL percussion.
I can make some suggestions.
I have written a number of small progs (usually in C) to manipulate MIDI files in one way or another, and I also have some utility progs from other sources.
It may be possible to achieve a lot of what you want using a combination of such progs, if that would help.
I have a prog of my own which reads through a MIDI file and reports on the details of tracks/channels, the number of note events and MIDI events, and any text items. The present versions of this prog are intended to process a single file, but the prog could be altered to process a mass of files automatically, and save selected data in a form that another prog could use.
The 'other prog' I have in mind is part of a package called MidiTools. One of the utilities allows the processing of a file to remove specified tracks/channels; another utility will take an existing MIDI file (say using format 1) and re-write it as a format 0 (one track only) file.
Such a combination of utility progs could be combined within a batch file to automate the whole process.
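Since the question uses Python, here is a hedged sketch of the same two steps (drop a channel, then flatten to one track) in that language. It is not how MidiTools works internally, which I have not shown here; `merge_without_channel` is a made-up name. The only mido-specific assumption is that each message stores its delta time in ticks in `msg.time`.

```python
def merge_without_channel(tracks, drop_channel=9):
    """Merge tracks into one list of (delta_ticks, msg), skipping one channel.

    Delta times are converted to absolute ticks first, so removing a message
    does not shift the timing of the messages that follow it.
    """
    timed = []
    for track in tracks:
        now = 0
        for msg in track:
            now += msg.time  # accumulate even for messages that get dropped
            if getattr(msg, "channel", None) == drop_channel:
                continue  # meta messages have no channel and are kept
            if msg.type == "end_of_track":
                continue  # the merged track needs a single end marker at most
            timed.append((now, msg))

    # sorted() is stable, so messages on the same tick keep their relative order
    timed.sort(key=lambda pair: pair[0])

    merged = []
    previous = 0
    for absolute, msg in timed:
        merged.append((absolute - previous, msg))
        previous = absolute
    return merged
```

With real mido messages, each pair could be turned back into a message with `msg.copy(time=delta)` and appended to a new `mido.MidiTrack`. An end_of_track message may need to be appended at the end if the writer does not add one itself.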
I cannot say that such a process would provide 100% reliability regarding your intentions, but it could run fairly close.
One other factor to use could be the fact that the percussion track will usually contain a much larger number of note events than any other track, and likely fewer MIDI (command) events than usual. Such info may be helpful in confirming a track as percussion.
All the progs I'm referring to are DOS based utilities, and they may not run on newer machines (Windows 7 onwards). I have only ever used them on DOS based machines, specifically the Pentium with my Roland LAPC-I card installed, in turn connected to various other MIDI devices incl a Yamaha MU90R, Korg NS5R, Yamaha TQ5, etc, all linked up to a Fostex MC102 keyboard mixer.
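To illustrate the kind of counts I mean, here is a small Python sketch. The names `track_stats` and `busiest_track` are invented, and which message types count as 'command' events is my own assumption. The type names are the ones mido uses.

```python
# Channel messages treated as "command" events in this sketch (an assumption)
COMMAND_TYPES = {
    "control_change",
    "program_change",
    "pitchwheel",
    "aftertouch",
    "polytouch",
}


def track_stats(track):
    """Count sounding notes and command events in one track."""
    notes = sum(1 for msg in track if msg.type == "note_on" and msg.velocity > 0)
    commands = sum(1 for msg in track if msg.type in COMMAND_TYPES)
    return {"notes": notes, "commands": commands}


def busiest_track(tracks):
    """Return the index of the track with the most notes, or None if no track has any."""
    stats = [track_stats(track) for track in tracks]
    if not any(entry["notes"] for entry in stats):
        return None
    return max(range(len(stats)), key=lambda index: stats[index]["notes"])
```

The busiest track is only a candidate for the percussion track. The counts are better used to back up the channel check than to replace it.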
Geoff