Overview
Recently, I became frustrated with my Creative AWE64 card and MIDI scoring software. I was disappointed by the huge difference between human and (untweaked) computer performance, especially for breath and bowed instruments. So I looked into ways to spice things up. I experimented with various controllers, including breath and pitch wheel. I also delved into software techniques involving Cakewalk CAL scripts, CSound programming, and Superconductor. However, I found all of these approaches quite limiting: they are either too difficult or time consuming to use effectively, or they do not deliver the quality I am after. Eventually, I became aware that the subtle nuance I wanted when realizing my compositions could be attained (easily) only through recording live human performance. And it could only be captured to MIDI through the use of sophisticated (and thus expensive) MIDI-enabled instruments. But this was directly at odds with my whole reason for choosing to compose and realize music with a personal computer, a cheap sound card, and scoring software: I wanted an inexpensive yet powerful system for making and distributing music. And I wanted one that did not require massive digital storage, organizing a group of musicians, maintaining a practice space, or any of the other real-world inconveniences that such an endeavor typically imposes.
Then I did a bit more thinking, and I had a wonderful insight: vocal performance is the least expensive, most natural, and most direct way of expressing desired performance parameters. After that, I started to track down software that could translate audio, such as a voice recording, to MIDI performance data. This article describes the three best audio-to-MIDI offerings I have found to date.
The Digital Ear: "Best Results; A Little Slow But Easiest To Use."
Of the three, I like Digital Ear the best. This program comes closest to capturing the full expression found in singing, wind, string and other continuously variable pitch instruments. The product takes an original approach to conversion in that it captures the initial note of each unbroken phrase and uses MIDI pitch wheel data to capture the rest of the pitch, pitch slur, and vibrato elements for that phrase. Marvelously, it also captures the continuous volume of the performance on the MIDI Sound Volume control. It also has the option of capturing the brightness of the performance on any controller number of your choosing, defaulting to #74 (MIDI Sound Brightness). Unfortunately, Digital Ear does not currently do polyphonic capture. But for the moment this is not something that really matters to me: I am most concerned with adding life to the AWE64's blown and bowed instruments, which does not really require polyphonic capture.
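To make the "initial note plus pitch wheel" idea concrete, here is a minimal Python sketch. It is not Digital Ear's actual algorithm, which is not described here. It assumes a pitch tracker has already produced one frequency per analysis frame, and it assumes the synthesizer's pitch wheel range is set to plus or minus 2 semitones (a common default, but a setting you would need to check). The function names are made up for illustration.

```python
import math

BEND_CENTER = 8192   # 14-bit pitch wheel value meaning "no bend"
BEND_MAX = 16383     # largest 14-bit pitch wheel value


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def phrase_to_events(freqs_hz, bend_range_semitones=2.0):
    """Encode one unbroken phrase as a single note plus pitch wheel values.

    freqs_hz: detected frequencies, one per analysis frame (assumed input).
    bend_range_semitones: assumed pitch wheel range of the target synth.
    Returns (base_note, bends), where bends holds one 14-bit value per frame.
    """
    # The first frame decides which note is actually triggered.
    base_note = round(hz_to_midi(freqs_hz[0]))
    bends = []
    for f in freqs_hz:
        # Distance of this frame from the triggered note, in semitones.
        offset = hz_to_midi(f) - base_note
        value = BEND_CENTER + round(offset / bend_range_semitones * 8192)
        # Clamp: pitches outside the wheel range cannot be represented.
        bends.append(max(0, min(BEND_MAX, value)))
    return base_note, bends


if __name__ == "__main__":
    # A slow slur from A4 up toward B-flat 4.
    note, bends = phrase_to_events([440.0, 447.0, 455.0, 466.16])
    print(note, bends)
```

The clamp shows the main limitation of this scheme: a slur wider than the wheel range cannot be followed within one note, so a real converter would have to start a new note or use a wider bend range.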
Conclusion: The Digital Ear software, in my opinion, produces fantastic results that are well worth the $80 price tag. The only things I would like to see improved or added are:
Although it is probably asking too much to do this with no feedback delay as would be required to use the tool in live performance, even a 40-80 ms delay would be very tolerable for the purposes of studio recording and live tuning of the various capture parameters for optimum performance. Unfortunately, the demo version of the Digital Ear only does about 2 seconds of capture on each .wav file you submit. The people at Digital Ear must know they have a good product. However, the included demo MIDIs are really cool and I have proven to myself from several two-second conversions that the product really does work as advertised.
Intelliscore: "Second Best Results; Hardest And Slowest To Use"
Intelliscore (at http://www.intelliscore.net/) is the most sophisticated of the three in that it can do polyphony. This sounded promising at first, but I found that it was quite difficult to get good results. There are quite a few parameters that one needs to specify, and it seems that one must set them all "just right" for the software to do a good job. The fact that these settings must be changed, based on the instrument being captured and the material being performed, does not help much either. In fact, I played with the settings provided for converting each demo sample, and even slight modifications resulted in completely unusable output. Add the fact that the software only understands fixed pitch instruments, and this is a product that I find too limited and difficult for regular use. Also, the software takes a long time to do the conversion (even on a 500 MHz machine). So in the trial and error process of finding settings that work for a particular sample, you can end up spending quite a bit of time using this program before you start to get even slightly usable results.
Conclusion: Despite its sophisticated ability to do polyphonic audio-to-MIDI conversion, this software is not what I am after at the moment. If I want to capture polyphonic percussion-class instrument performance, I get better results from a good MIDI keyboard. And if I want to capture continuously variable pitch instruments, I am S.O.L. Plus, I lose any dynamics that are not related to initial velocity. And timbre is just plain ignored. Of course, the product is OK for someone who has no ear but wants to capture recorded piano work to MIDI. And it might be good for a more advanced musician who wants to capture recorded keyboard work that is too difficult to transcribe by other methods. Anyway, the idea is great and I look forward to more work and refinement in this area from the people at Intelliscore.
Inst2Midi: "Worst Results; Best Integration With Other MIDI Products, Fastest Results"
In my opinion, Inst2Midi currently has the most potential of the three. Disappointingly, though, it currently yields the worst conversion results. I feel Inst2Midi has the most potential because it does real-time conversion and acts as a live MIDI input source for recording software. Add to that the fact that the "Nerds" have ambitions to provide real-time polyphonic capture of live performance to MIDI and it could spell "Killer App." Unfortunately, I have had little success with getting good results. Apparently the software can do an "acceptable" job if set up properly. At least this is indicated by listening to the provided demo audio and corresponding MIDI files. The biggest problem I have with this product is that it uses the standard keyboard-biased approach of extracting only the start time, quantized pitch, velocity, and duration information that any cheap MIDI keyboard can produce. This produces very irritating results because slurs and vibrato tend to turn into messy sounding gliss, trill or grace-note ornamentation, or worse...much worse. Add to this the poor choice of hysteresis settings (note locking stability) that the program exhibits, and you are very likely to get noise rather than music in all but the most subdued and controlled performances.
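For readers unfamiliar with hysteresis in this context, the following Python sketch shows the general idea of note locking. It is only an illustration, not Inst2Midi's implementation, whose internals are not described here. It assumes the detected pitch arrives as fractional MIDI note numbers, one per frame, and the function name and the `margin` parameter are hypothetical.

```python
def lock_notes(pitches, margin=0.2):
    """Quantize a pitch track to MIDI notes with hysteresis.

    pitches: fractional MIDI note numbers, one per analysis frame (assumed input).
    margin: extra distance in semitones, beyond the usual half-semitone
            boundary, that the pitch must travel before the locked note changes.
    Returns the locked note for each frame.
    """
    locked = []
    current = None
    for p in pitches:
        # Stay on the current note until the pitch has clearly left it.
        if current is None or abs(p - current) > 0.5 + margin:
            current = round(p)
        locked.append(current)
    return locked


if __name__ == "__main__":
    # A note with vibrato hovering near the boundary between 60 and 61.
    track = [60.0, 60.45, 60.55, 60.45, 60.6, 60.8, 61.0]
    print(lock_notes(track, margin=0.0))  # no hysteresis: flips back and forth
    print(lock_notes(track, margin=0.2))  # with hysteresis: one clean change
```

With `margin=0.0` the vibrato near the boundary makes the output jump between two notes, which is the trill-like mess described above. A larger margin trades that instability for slower response to genuine note changes.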
Conclusion: Once again: why not just perform the same stuff from a MIDI-enabled keyboard? To me, the whole idea of converting audio to MIDI is to capture as much of the great expressive quality that only continuously variable pitch, volume and timbre instruments offer. I can see this software as an interesting tool for guitarists or single voice instrumentalists who want to capture simple solo-melody performances or control synthesizers in a live setting, but they had better be pretty careful: it does not take too much exuberance in one's playing to start getting lots of wrong MIDI notes.
Conclusion: The results of Digital Ear are really quite impressive. A little more work and the results could be absolutely stunning. The other products offer tantalizing features, but they have a long time to spend at the drawing board before they will find a space on my "regularly used" software list.
In Pursuit of Realistic Timbres
On a related line of thought, it is amazing what one can actually do with an AWE64 card. Of course, this requires one to dig down into the technical specs and to spend considerable time and effort adding pitch, expression controller, velocity and other MIDI data. Also, avoiding the AWE64's internal reverb and chorus effects and then post-processing with a good software or hardware reverb can really help the final realization. I did dig into the specs, spent the time, and produced 40 bars of highly tweaked MIDI data. And I was quite pleased with the initial result. But I was also quite disappointed to discover that the AWE64 has no implementation of brightness response to volume, expression, or aftertouch data, rendering performance on a touch-sensitive keyboard all but useless without some complicated scheme to convert these control streams to some combination of NRPN control data. Now I have to figure out some scripts to do all the stuff I did by hand. Of course, it should not be too difficult, really. Most of the tweaks were performed using the simple rules that: