The Note Consolidation Interface is a sophisticated suite for editing
note capture data with an interactive musical staff. The main
difference between this type of interface and other music-editing software
is that the origin here is waveform audio, while other software can typically
generate content only from other metadata or direct user input.
The timeline of the song interpreted from the waveform audio is shown as a
horizontally scrolling image in the center of the window. For those familiar
with musical staff notation, little is left to the imagination. Time
signature, key signature, clefs, measure boundaries, notes, and rests all
appear as they would in sheet music.
Because of the highly subjective nature of note capture output, the more
nuanced notation usually written in sheet music does not appear in this
interface. All notes appear as singles, all notes at or above middle C
appear in the treble clef, and all notes below middle C appear in the bass
clef. Aesthetic decisions about how to orient note stems in close proximity
are not configurable; the tool does its best to guess what looks like
a reasonable layout on the page.
ZZT Sound Plus also supports its own unique notation. Measure
numbers and section numbers are enumerated at the base of the image.
Section boundaries are shown within measures using dotted lines. There
are also several postfixes that the user can choose to show next to each
note: volume numbers, decibel numbers, and voice numbers.
It's easy to be overwhelmed by the interface at first glance. With practice,
though, the Note Consolidation Interface is a very powerful tool for
a sound engineer to have at his or her disposal.
If one wants to generate quick consolidated output, one can select the
Auto-Consolidate button, which merges adjacent notes at similar
volumes and assigns preliminary voice numbers. The note output after
auto-consolidation yields much better-quality results in The Hall of Music
or other destinations than the "rough cut" output would yield.
Data Sources for Notes
If the data for the notes displayed in this interface originated from a .WAV
file, most of the tempo, signature, and spacing idiosyncrasies will not be
known at first. The user will need to experiment with the settings within
the interface in order to get a feel for which settings work best with the
song characterized.
If the data for the notes originated from a .MID file, most of the song
information will be known, and ZZT Sound Plus will fill in this
information automatically if the MIDI file contains it.
MIDI note metadata is characterizable in most circumstances, although not all
nuances present in the original MIDI file are used here. Note on/off events
will be used, but program changes, control changes, aftertouches, and pitch
bends will not be reflected in this interface.
ZZT Sound Plus remembers interface settings between sessions. For each
file loaded in the interface, the application saves the last settings chosen
for that song. This is to make it easier to model songs multiple times, so
that one does not need to store metadata in a separate location. Saved
settings can be removed by deleting entries from the ZZT_SP.INI file
in the user's roaming profile.
Pointer Mode
All notes in the timeline are selectable with the mouse. The default pointer
mode is section selection mode, accessible by pressing F1.
A single click on a note will select it and show information about it in the
area just below the mode selection buttons.
One can select multiple notes by holding down the Control key and
clicking several notes in succession. One can also select entire sections
at once (which includes all notes per section) by holding the Shift key
and clicking a different section from the last section clicked.
A related mode is box selection mode, accessible by pressing F1
a second time. This mode differs from section selection mode in that one can
drag a "box" with a vertical component to select notes at specific X/Y locations
in the staff, as opposed to whole sections at once.
Add note mode changes the behavior to add new notes to the staff at
the click location. This mode is chosen by pressing F2.
The Active Update Info located to the right of the
information text indicates the volume, duration, and
voice number chosen for new notes added to the staff.
Move note mode changes the behavior to allow the user to drag a note from
one spot on the staff to another. This mode is chosen by pressing F2
twice. A note's pitch can be modified by dragging it up or down the staff,
and a note's starting location can be modified by dragging it to a different
section.
Insert sections mode changes the behavior to allow the user to insert
sections into the timeline. This mode is chosen by pressing F3.
A section's duration is the shortest playback interval chosen for the note
capture operation. One can "bump forward" the timeline with empty sections
with this mode.
Set measure start mode lets the user pick a specific section to act as
the start of a measure boundary. This mode is chosen by pressing F3
twice. Since measures are periodic, the chosen section number is reduced
modulo the number of sections per measure, as needed. This mode can be useful when
aligning measure intervals to audio that might not have started on a
"complete" measure boundary.
Show/Hide Options
At the lower-left portion of the window, there are several options for
configuring what is shown in the staff display.
The staff only shows a limited subset of the total notes captured. The
notes shown are filtered by the F-Num Show Range and the Volume
Show Minimum. Only those notes within the frequency range are displayed
on the staff, and only those notes greater than or equal to the volume level
minimum are displayed on the staff.
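The two filters can be pictured as a single pass over the captured notes. The Python sketch below is only an illustration: the Note fields, the function name, and the assumption that both ends of the F-Num range are inclusive are hypothetical, not taken from ZZT Sound Plus.

```python
from typing import NamedTuple


class Note(NamedTuple):
    f_num: int   # frequency number of the note (hypothetical field name)
    volume: int  # captured volume level


def visible_notes(notes, f_min, f_max, vol_min):
    """Keep notes inside the F-Num range whose volume meets the minimum."""
    return [
        n for n in notes
        if f_min <= n.f_num <= f_max and n.volume >= vol_min
    ]
```

Raising vol_min in such a filter hides the quieter artifacts first, which is the effect the Volume Show Minimum is meant to have.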
One can fine-adjust the volume show minimum with the keys
Ctrl+Plus and Ctrl+Minus.
These filter options are useful when distinguishing notes by relevance.
The fact that most waveforms are not perfect sines means that artifacts,
in the form of shorter, quieter notes than the ones desired, will appear
at various locations in the staff. While the option exists to display all
of the notes from the capture operation, the user generally only wants to
view the notes most relevant for editing.
The act of filtering the displayed notes limits what is selectable on the
staff. An excluded note cannot be selected with the mouse (or even the
Select All feature), and cannot be copied to the clipboard. Most
operations will skip over excluded notes in favor of those visible.
The checkboxes below the filter options control the postfixes and other
notations shown on the staff. One can show or hide the volume number,
decibel number, and voice number next to each note. The
section numbers and measure numbers, drawn at the base of the image, can
also be shown or hidden.
It is not normally necessary to show voice numbers because
ZZT Sound Plus color-codes the notes by their assigned voice numbers.
See the Voices buttons for the color key. For those without good
color vision, though, it can be helpful to enumerate voice numbers.
The duration extenders are optional dashed lines that can be turned
on or off. These lines extend from each note as a way to highlight how many
trailing sections a note "occupies." Duration extenders are useful in
identifying how note durations characterize the entire affected portion
of the timeline, instead of just where the note begins.
Global Settings
Global settings control high-level options that are not strictly
related to note capture data or the consolidation process, but can be
useful in characterizing the output.
The Tempo setting, in quarter notes per minute, defaults to the
setting used in the original note capture operation. However, it can be
changed so that output formats assume this new tempo specification. In
copied clipboard data, this is a "Unnn" code. In MIDI, this is a tempo
meta-command.
It is possible to "guess" what the song tempo should be with the Infer
Tempo button. This button analyzes a user-determined selection to
come up with a more accurate assessment of the tempo than what had been
selected in the note capture operation. While not perfect, it can be
helpful to use these suggestions for a second note capture operation later.
The way ZZT Sound Plus infers tempo is by interpreting the user's
selection of multiple sections as a "complete" measure. The tool posts
several suggestions for what the tempo would be based on a re-interpreted
timeline and the original note capture tempo setting. The suggestions,
of course, are only as accurate as the user's selection would indicate.
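The arithmetic behind such a suggestion might look like the following Python sketch. It assumes that each section lasts a fixed fraction of a quarter note at the original capture tempo, and that the selection is meant to hold a known number of quarter notes. The function names, the parameters, and the list of candidate measure lengths are hypothetical; the actual suggestions posted by the tool may be derived differently.

```python
def infer_tempo(orig_tempo, sections_per_quarter, selected_sections,
                quarters_per_measure):
    """Tempo (quarter notes per minute) at which the selected sections
    would last exactly one measure."""
    # Seconds occupied by one section at the original capture tempo.
    section_seconds = 60.0 / (orig_tempo * sections_per_quarter)
    measure_seconds = selected_sections * section_seconds
    return 60.0 * quarters_per_measure / measure_seconds


def tempo_suggestions(orig_tempo, sections_per_quarter, selected_sections):
    """One suggestion per assumed measure length (in quarter notes)."""
    return {
        q: infer_tempo(orig_tempo, sections_per_quarter, selected_sections, q)
        for q in (2, 3, 4, 6)  # assumed candidate measure lengths
    }
```

Under these assumptions, selecting 20 sixteenth-note sections captured at 120 and treating them as one 4/4 measure suggests a slower tempo than the original, because the measure takes longer than the 16 sections the original tempo implied.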
The Time Signature sets the measure boundaries via the numerator
and denominator numbers of the signature. After changing the time
signature, the staff is updated. A time signature is also set as a MIDI
meta-command when MIDI output is generated.
The Key Signature sets the key signature on the staff, and also
sets the corresponding MIDI meta-command when MIDI output is generated.
The key signature modifies how notes appear on the staff, manipulating
both the vertical positions for some notes and the appearance of accidental
sharps, flats, and naturals for others.
The Section Spacing slider adjusts the normal section width. Sections
have a default length based on the number of postfixes that need to be
displayed next to notes within a section. This slider can grow or shrink
the standard size.
Measures are not necessarily spaced evenly in ZZT Sound Plus. If there
is no need to display a section for lack of any visible notes present, it
is skipped in favor of a rest of appropriate duration. This means that
less complicated music (more quarter, half, and whole notes) has smaller
measure widths than more complicated music (more sixteenth, thirty-second,
and sixty-fourth notes). The user must decide the best way to trade off
readability with functionality.
ZZT Sound Plus remembers global settings between individual
sessions, and which settings one had chosen for specific files. MIDI files
populate many of the global settings automatically, but the user will need
to pick his or her own ideal settings for other types of files.
The Chord and Keyboard Gauges
As notes are selected, they are highlighted using pink arrows and reticles.
Whole sections, when selected, are also tinged pink, indicating selection.
There are two additional selection indicators that one can use to analyze
the selected notes: the Chord Gauge and the Keyboard Gauge.
The Keyboard Gauge is a straightforward representation of how a
piano keyboard would look if the selected notes were depressed. White keys
are shown with a pink highlight, while black keys are shown with a red
highlight. This gives a good indication of where a person's fingering would
need to be as the notes are played. If the configuration would appear
impossible, some editing of the notes might be warranted.
The notes on the keyboard gauge can be toggled on and off with the mouse.
Clicking on a key will select or deselect all notes that match the chosen
frequency. This is a useful way to identify notes within a specific range
of a broader selection.
The Chord Gauge provides a rough representation of the energy
spectral density for all twelve of the half-steps in the music scale. As
notes are selected, the number for each note in the scale increases based
on the sum of products of duration and volume level at all octaves:
[C level] = [2-C dur]*[2-C vol] + [3-C dur]*[3-C vol] + [4-C dur]*[4-C vol] + [5-C dur]*[5-C vol] + ...
[C# level] = [2-C# dur]*[2-C# vol] + [3-C# dur]*[3-C# vol] + [4-C# dur]*[4-C# vol] + [5-C# dur]*[5-C# vol] + ...
[D level] = [2-D dur]*[2-D vol] + [3-D dur]*[3-D vol] + [4-D dur]*[4-D vol] + [5-D dur]*[5-D vol] + ...
[D# level] = [2-D# dur]*[2-D# vol] + [3-D# dur]*[3-D# vol] + [4-D# dur]*[4-D# vol] + [5-D# dur]*[5-D# vol] + ...
[E level] = [2-E dur]*[2-E vol] + [3-E dur]*[3-E vol] + [4-E dur]*[4-E vol] + [5-E dur]*[5-E vol] + ...
[F level] = [2-F dur]*[2-F vol] + [3-F dur]*[3-F vol] + [4-F dur]*[4-F vol] + [5-F dur]*[5-F vol] + ...
[F# level] = [2-F# dur]*[2-F# vol] + [3-F# dur]*[3-F# vol] + [4-F# dur]*[4-F# vol] + [5-F# dur]*[5-F# vol] + ...
[G level] = [2-G dur]*[2-G vol] + [3-G dur]*[3-G vol] + [4-G dur]*[4-G vol] + [5-G dur]*[5-G vol] + ...
[G# level] = [2-G# dur]*[2-G# vol] + [3-G# dur]*[3-G# vol] + [4-G# dur]*[4-G# vol] + [5-G# dur]*[5-G# vol] + ...
[A level] = [2-A dur]*[2-A vol] + [3-A dur]*[3-A vol] + [4-A dur]*[4-A vol] + [5-A dur]*[5-A vol] + ...
[A# level] = [2-A# dur]*[2-A# vol] + [3-A# dur]*[3-A# vol] + [4-A# dur]*[4-A# vol] + [5-A# dur]*[5-A# vol] + ...
[B level] = [2-B dur]*[2-B vol] + [3-B dur]*[3-B vol] + [4-B dur]*[4-B vol] + [5-B dur]*[5-B vol] + ...
By selecting a portion of the timeline, one can determine what the "chord"
is based on the frequencies that stand out. The above example shows that
the most prominent notes are D# (222), G (267), and A (430).
This gives a good idea of the underlying chord at that portion of the song.
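The sum of products listed above can be sketched in a few lines of Python. The sketch assumes that each selected note is available as a (pitch, duration, volume) tuple and that the pitch is a MIDI-style number where every multiple of 12 is a C; both are assumptions for illustration, not the tool's internal representation.

```python
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chord_gauge(notes):
    """Sum duration * volume per half-step, folding all octaves together.

    notes: iterable of (pitch, duration, volume) tuples.
    """
    levels = dict.fromkeys(NAMES, 0)
    for pitch, duration, volume in notes:
        levels[NAMES[pitch % 12]] += duration * volume
    return levels


def prominent(levels, count=3):
    """Names of the half-steps with the highest levels, strongest first."""
    return sorted(levels, key=lambda name: -levels[name])[:count]
```

The names returned by prominent() correspond to the half-steps one would read off the gauge as the likely chord.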
The chord gauge offers a handy mechanism for removing artifacts that do not
match the chord:
- Select a portion of the timeline.
- Click the chord gauge for the notes with the highest energy spectral
density levels. This deselects them from the staff.
- Press the Delete key.
This has the effect of leaving the selected sections with only those frequencies
that are part of the chord.
Selection Editing
A person can select, copy, cut, paste, and delete notes on the staff. The
typical hotkeys for doing so are implemented (Ctrl+C, Ctrl+X,
Ctrl+V, Ctrl+A, Delete).
The clipboard format for notes copied from the staff is the same format used
in the ZZT Ultra #PLAY string syntax. When copied, the notes can
be pasted into The Hall of Music or a text editor. Similarly,
these note codes can be copied from external locations and pasted into
the staff at any selected section.
The Ctrl+Delete operation works differently from Delete. This
operation removes selected sections from the timeline altogether, moving later
sections into the deleted-section gap.
The Ctrl+R operation crops the selected sections to the notes visible
with the existing show/hide filters. Notes removed in this manner cannot be
selected or edited afterwards, even if the filters are changed to allow for
notes within the range.
The Ctrl+Z "undo" operation reverses the last selection editing or
modification event. Any operation can be undone except for
auto-consolidate and Reassign Voices by Spectral Density.
While the goals of editing the notes on the staff are entirely up to the
user to decide, perhaps the most important goal should be noise removal.
Out-of-place notes of short duration (usually in the extreme high or low
octaves) are likely artifacts from the note capture operation, and their
removal (using Delete, Crop to Visible, etc.) will help round
out the results of the captured notes significantly.
Changing Selected Note Attributes
The Active Update Info is used in many of the modification operations.
Direct adjustment of the volume, duration, and voice number is possible with
the mouse. There also exist several key commands for
adjusting these quantities.
The Set Volume button, or V key, sets all selected notes to the volume
level shown in the Active Update Info. This is handy for smoothing
out fluctuating volume from multiple notes captured. One can make fine
adjustments to this volume level with the +/- keys.
The Set Duration button, or D key, sets all selected notes to the duration
shown in the Active Update Info. This is handy for lengthening,
shortening, or otherwise making uniform one or more selected notes.
Merging of multiple notes at the same frequency will occur if a note is
lengthened to the point where it would "bleed into" notes in adjacent sections.
One can quickly set the active duration with the keys W, H, Q, I, E, S, T, J,
and period.
The Change Voice button, or C key, sets the voice number for
all selected notes to the voice shown in the Active Update Info.
Setting the voice number for a note will recolor the note to reflect the
color key shown in the voice buttons.
Voice numbers aren't altogether pertinent to musical staff notation, because
it is usually at the musician's discretion how the note should be played.
The reason for selecting voice numbers is output metadata generation
(clipboard format or MIDI file).
Until voice numbers are assigned to individual notes, notes will have no
default voice number. Copying the notes without voice numbers to the clipboard
or creating a MIDI file will assign voice numbers in a subjective fashion.
Selection Modification
The user can perform more involved modification to the selected notes or
sections. The buttons in the Modify Selection box in the upper right
portion of the window operate on the selected notes or sections. If no notes
are selected, the operation applies to all notes in the capture output.
The Merge button merges selected adjacent notes unconditionally, and does
not assign voices. For notes to be merged, they must have the same frequency,
and the end of the duration of the earlier note must exactly touch the start of
the later note.
If merged notes have different volumes, the highest of the volumes is picked.
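A minimal Python sketch of this merge rule follows. The Note fields and the use of section counts for start and duration are assumptions made for the example.

```python
from typing import NamedTuple


class Note(NamedTuple):
    pitch: int  # frequency, as a note number
    start: int  # starting section
    dur: int    # duration in sections
    vol: int    # volume level


def merge(notes):
    """Merge notes of equal pitch whose spans touch exactly."""
    merged = []
    for n in sorted(notes, key=lambda n: (n.pitch, n.start)):
        last = merged[-1] if merged else None
        if last and last.pitch == n.pitch and last.start + last.dur == n.start:
            # Extend the earlier note and keep the highest volume.
            merged[-1] = last._replace(dur=last.dur + n.dur,
                                       vol=max(last.vol, n.vol))
        else:
            merged.append(n)
    return merged
```

Notes separated by even one empty section are left alone, since the earlier note no longer touches the later one.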
The Break Apart button separates notes into shorter durations based on
the following criteria.
- Note spans measure boundary: Remove the tie; make two unique notes
split at the measure boundary.
- Note is dotted: Remove the dot; make two unique notes with the
"dotted" portion becoming its own note.
- Note is undotted: Split the note evenly; create two unique notes
at the next-shorter denomination.
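The three criteria can be sketched as follows in Python. The sketch assumes durations are counted in sections, that a dotted note is three times a power of two, that an undotted note is a power of two, and that the criteria are tried in the order listed; none of this is confirmed by the tool itself.

```python
def is_pow2(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def break_apart(start, dur, sections_per_measure):
    """Return the (start, dur) pieces a note is broken into."""
    # First section of the measure following the one the note starts in.
    boundary = start - start % sections_per_measure + sections_per_measure
    if start + dur > boundary:
        first = boundary - start        # split at the measure boundary
    elif dur % 3 == 0 and is_pow2(dur // 3):
        first = dur * 2 // 3            # dotted: peel off the dot
    elif dur > 1 and is_pow2(dur):
        first = dur // 2                # undotted: split evenly
    else:
        return [(start, dur)]           # nothing sensible to split
    return [(start, first), (start + first, dur - first)]
```

With 64 sections per measure, for instance, a dotted quarter note (24 sections) would become a quarter note followed by an eighth note.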
The Auto-Consolidate button merges adjacent notes at the same frequency
and assigns voice numbers based on a frequency-affinity system. Each voice
number has a "preference" for a specific frequency, which means that notes
on a specific part of the staff (treble clef or bass clef) will tend to select
voices closest to that affinity if they are available. The resulting
coloration also helps to distinguish overall pitch patterns.
When auto-consolidating notes, adjacent notes are only merged if the notes
are (A) directly next to each other, section-wise, and (B) within a volume
tolerance of +/- 3 levels. When merged, the composite note will have the
maximum volume of all merged notes.
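The merge test used during auto-consolidation might be expressed as the Python sketch below. Notes are plain dictionaries with hypothetical keys, and the sketch assumes that "directly next to each other" means the earlier note ends in the section where the later note begins, and that the tolerance is compared between the two notes being joined.

```python
def can_auto_merge(earlier, later, tolerance=3):
    """True if two notes qualify for merging during auto-consolidation."""
    same_pitch = earlier["pitch"] == later["pitch"]
    touching = earlier["start"] + earlier["dur"] == later["start"]
    similar = abs(earlier["vol"] - later["vol"]) <= tolerance
    return same_pitch and touching and similar


def auto_merge_pair(earlier, later):
    """Composite note: combined duration, maximum volume."""
    return {
        "pitch": earlier["pitch"],
        "start": earlier["start"],
        "dur": earlier["dur"] + later["dur"],
        "vol": max(earlier["vol"], later["vol"]),
    }
```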
Important: Auto-consolidation cannot be undone.
The Reassign Voices by Spectral Density button reassigns note voice
numbers based on the need to set a single voice per frequency for each measure.
After this change has taken place, one can guarantee that a note frequency will
have a voice "to itself" if there are sufficient voices available.
The maximum number of voices selected in this fashion is set by the Limit
combo box to the right. For example, if this is set to 10, it means that only
10 voices will have unique numbers assigned; all the rest will be assigned voice
zero.
The idea behind this operation is that frequencies with the highest energy spectral
density (see notes above on the Chord Gauge)
will have the lower voices, starting at 1, and those frequencies with lower density
will have higher voices, ticking upwards until the maximum limit is reached. Those
frequencies with the least density are then assigned voice zero, which can be easily
removed if desired.
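The ranking described above can be sketched in Python for the notes of a single measure. The sketch assumes that "frequency" means the exact note number (not folded across octaves), that energy is the sum of duration times volume, and that ties are broken in favor of the lower note; these choices and the function name are illustrative only.

```python
from collections import defaultdict


def voices_by_density(measure_notes, limit):
    """Map each frequency in one measure to a voice number.

    measure_notes: iterable of (pitch, duration, volume) tuples.
    Voices 1..limit go to the strongest frequencies; the rest get voice 0.
    """
    energy = defaultdict(int)
    for pitch, duration, volume in measure_notes:
        energy[pitch] += duration * volume
    ranked = sorted(energy, key=lambda p: (-energy[p], p))
    return {p: (rank + 1 if rank < limit else 0)
            for rank, p in enumerate(ranked)}
```

Selecting voice zero afterwards and deleting it removes the weakest frequencies in one step.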
Important: Spectral voice reassignment cannot be undone.
The Round to Duration button "rounds" the selected notes to a duration
pattern resembling the duration from the Active Update Info. Selected
notes will be set only if the selected note has a duration less than or equal
to one denomination above the duration specified in the active update info.
In other words, an eighth note setting rounds anything eighth-note-dot or
shorter to eighth-note duration, an eighth-note-dot setting rounds anything
quarter note or shorter to eighth-note-dot duration, and a quarter note
setting rounds anything quarter-note-dot or shorter to quarter-note duration.
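The rule can be sketched in Python with a ladder of denominations. The ladder below counts durations in sixty-fourth notes (so an eighth note is 8 and a quarter-note-dot is 24); this unit, the ladder contents, and the handling of the longest denomination are assumptions for illustration.

```python
# Denominations in sixty-fourth-note units, shortest to longest.
LADDER = [1, 2, 3, 4, 6, 8, 12, 16, 24, 32, 48, 64]


def round_to_duration(dur, target):
    """Return the duration a note would have after rounding to target."""
    i = LADDER.index(target)
    # One denomination above the target (the top of the ladder has none).
    ceiling = LADDER[min(i + 1, len(LADDER) - 1)]
    return target if dur <= ceiling else dur
```

A quarter note (16) therefore survives rounding to an eighth note (8), because it is longer than an eighth-note-dot (12), but it is shortened when rounding to an eighth-note-dot.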
The Round to Start Time button rounds note starting sections to "whole"
locations based on the existing note duration. The time signature is used
to help judge where these positions should be. For example, for 4:4, a whole
note is rounded to the first section of a measure, a half note is rounded to
the first section or halfway-point of a measure, and quarter notes are rounded
to sections offset from the start of a measure by +0, +1/4, +1/2, and +3/4.
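A hedged Python sketch of this snapping follows. It assumes undotted durations that divide the measure evenly, rounding to the nearest grid position with ties going to the later position, and zero-based section numbers; the actual tool may round differently.

```python
def round_start(start, dur, sections_per_measure):
    """Snap a note's starting section to a multiple of its own duration,
    counted from the start of the measure it begins in."""
    grid = min(dur, sections_per_measure)
    measure_start = start - start % sections_per_measure
    offset = start - measure_start
    # Integer rounding to the nearest grid line (ties round up).
    snapped = (offset + grid // 2) // grid * grid
    return measure_start + snapped
```

With 16 sections per 4:4 measure, a quarter note (4 sections) can land only on offsets 0, 4, 8, and 12, which matches the +0, +1/4, +1/2, and +3/4 positions described above.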
The Shift Key button shifts the selected notes up or down by the
specified number of half-steps. Shifting frequencies in this manner has the
effect of changing the key in which the song is played back. Keep in mind
that this action does not modify the key signature.
Voice-by-voice Selection Toggle
The colored Voice Buttons toggle specific voices in the selection on or
off. This is a useful way of isolating the scope of multiple selected notes to
just one or more voices out of many, so that an entire set of voices can be
subjected to common changes, while ignoring other voices.
Ctrl+(Number) will toggle the selection status on or off for the
appropriate voice in the previously selected sections. Note that the hot key
combinations only work for voices 0 through 9; voices above 9 do not have hot
keys.
Holding Shift while clicking a voice button toggles that voice in
an "exclusive" fashion: all other voices are deselected,
and only the chosen voice is toggled on or off.
Context Menu Actions
Right-click the staff to open a context menu. The actions in the context menu
apply to the current selection.
Many of the actions are also available as buttons and/or keyboard shortcuts,
including Merge, Break Apart, Copy, Cut,
Paste, Delete, Delete Section, and Crop to Visible.
There are also some actions that are specific to workflow and are accessible
only via the context menu.
Select Entire Measure expands the selected portion or last clicked
location in the staff to cover one or more measures. This is a quick way
to perform operations that apply to only one measure in the song.
Select Active Duration Interval expands the selected portion or last
clicked location in the staff to cover the length of the interval set by the
active update info. For example, if the active update duration is set to
quarter note, the interval selected will be the length of a quarter note, and
rounded to the start of the measure using quarter-note multiples (4/4 time
signature will have exactly four different possible selection starting points
per measure).
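One way to compute such an interval is sketched below in Python. It assumes zero-based section numbers, a duration that divides the measure evenly, and that the interval chosen is the one containing the clicked section; the function name is hypothetical.

```python
def duration_interval(clicked, dur, sections_per_measure):
    """Return (first, end) sections of the interval containing `clicked`,
    aligned to multiples of `dur` from the measure start. `end` is exclusive."""
    measure_start = clicked - clicked % sections_per_measure
    offset = clicked - measure_start
    first = measure_start + offset - offset % dur
    return first, first + dur
```

With 16 sections per measure and a quarter-note duration of 4 sections, every click in a measure resolves to one of exactly four intervals.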
Delete Notes in Measure Here and Above clears the measure of all
notes at or above the selected note's frequency. This can be useful when
removing artifacts.
Delete Notes in Measure Here and Below clears the measure of all
notes at or below the selected note's frequency. This can be useful when
removing artifacts.
Move Notes 1 Section Left/Right performs a quick cut-and-paste
operation with the selected notes, moving them one section left or right.
It is handy to use this feature when cleaning up a song with notes that
are slightly "off" their preferred starting time.
Shift Octave +/-1 performs a quick octave shift of the selected
notes. This is generally faster than using the Shift Key feature
when a note only needs to be modified by a single octave.
Consolidate towards Middle Octave performs a special "chord
consolidation" within the selected sections. Selected notes provide the
basis for the frequencies that will be consolidated towards middle C.
If a note exists closer to middle C (lower than self for treble clef;
higher than self for bass clef) whose frequency is only different by octave
multiples, such notes are "absorbed" into the notes closer to middle C.
If the removed notes are longer and/or louder than the retained notes, the
retained notes assume these longer and/or louder characteristics.
One would consolidate notes up or down an octave if it is known that high
or low harmonics will appear as a result of instrument aliasing, and such
harmonics are not considered necessary for playback.
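The absorption rule might be approximated by the Python sketch below. It assumes MIDI-style note numbers with middle C at 60, that only notes starting in the same section are compared, and that treble and bass notes are consolidated separately; the real operation may match notes more loosely.

```python
from typing import NamedTuple

MIDDLE_C = 60  # assumed note number for middle C


class Note(NamedTuple):
    pitch: int
    start: int
    dur: int
    vol: int


def consolidate(notes):
    """Absorb octave duplicates into the note nearest middle C."""
    groups = {}
    for n in notes:
        # Same section, same half-step name, same side of middle C.
        key = (n.start, n.pitch % 12, n.pitch >= MIDDLE_C)
        groups.setdefault(key, []).append(n)
    kept = []
    for group in groups.values():
        nearest = min(group, key=lambda n: abs(n.pitch - MIDDLE_C))
        # The retained note takes on the longest duration and loudest volume.
        kept.append(nearest._replace(dur=max(n.dur for n in group),
                                     vol=max(n.vol for n in group)))
    return sorted(kept, key=lambda n: (n.start, n.pitch))
```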
Subtract Clipboard from Selection performs removal of notes in the
selected sections that match equivalent note content found in the clipboard.
This somewhat unusual operation can be useful for getting rid of notes that
are seen as repeated across multiple measures, and one wants to purge the
"regular" notes to get a better view of the "solo" notes.
Other Actions
The Play button plays the selected sections through the speakers, using
perfect sines as the waveform for each note. There are brief attack and
release portions of the envelope for the notes played, but for most of the
duration, a note is just played as a sine wave sustained at the volume level
specified.
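A note synthesized this way can be sketched in Python using only the standard library. The sample rate, the ramp length, and the linear shape of the attack and release are assumptions chosen for the example, not values taken from ZZT Sound Plus.

```python
import math


def sine_note(freq, seconds, volume, rate=8000, ramp=0.01):
    """Return samples of a sine wave with a short linear attack and release.

    freq: frequency in Hz; volume: peak amplitude (0.0 to 1.0);
    rate: samples per second; ramp: attack/release length in seconds.
    """
    n = int(seconds * rate)
    ramp_n = max(1, int(ramp * rate))
    samples = []
    for i in range(n):
        # Envelope rises from 0, holds at 1, and falls back to 0.
        envelope = min(1.0, i / ramp_n, (n - 1 - i) / ramp_n)
        samples.append(volume * envelope
                       * math.sin(2 * math.pi * freq * i / rate))
    return samples
```

Notes sounding at the same time could then be mixed by adding their sample lists element by element.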
Playback is of limited quality because sine functions are used to synthesize
the audio, resulting in a relatively cheap waveform profile. The audio
will sound as if it were subjected to a low-pass filter of about 4000 Hz.
Note that The Hall of Music
delivers somewhat better output, albeit with little control over instrumentation.
Playback can be limited to only selected sections, or it can continue until
the end of the timeline. If only a single section is selected, you will be
prompted to confirm playback of the remainder of the song.
MIDI output translates the notes to MIDI format and saves the file to
the disk. A MIDI song generated from ZZT Sound Plus only contains
note F-numbers, note durations, and note volume levels (implemented as
velocity) plus the appropriate signatures and other general metadata. Voice
numbers set for notes translate to channel numbers. The tool does not save
instrument names, program or control changes, aftertouches, or other types
of metadata.
PNG output translates the notes to a series of sheet music image files.
You can number the individual pages of the file by placing "[]" brackets in
the filename; the brackets are replaced by a number (1, 2, 3, 4, and so on) as more
pages are generated. Sheet music has a lot of nuance to it; an additional
dialog box opens for configuring the output before it is saved.
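The bracket substitution can be pictured with the short Python sketch below. The function name and the example filename are hypothetical, and the sketch assumes that only the first "[]" in the name is replaced.

```python
def page_filename(template, page):
    """Replace the first "[]" in the template with the page number."""
    return template.replace("[]", str(page), 1)


# Example: names for the first three pages of a song.
pages = [page_filename("mysong_[].png", p) for p in (1, 2, 3)]
```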
The Sheet Music Generation Settings are in an early prototype stage.
There are settings to control the standard, maximum, and minimum percentages
of page width that the user wishes to allocate to measures, as well as the
limits on measures per row and rows per page.
The resolution of the generated images is determined by the width,
height, and margin settings. The greater the number of pixels, the finer
the notes will appear when the image is printed out.
Because sheet music can be quite subjective when displayed, it may be
worthwhile to edit the images after they are generated. ZZT Sound Plus
attempts to display notes in a very basic fashion; it cannot possibly
know the "right" way to organize them in every context.