LATEST NEWS from my Prolatio and music21 blogs:
[August 8, 2015 17:04 pm] « » [prolatio]
[Update: 12 August 2015: it is now possible to do this much more easily in the Beta version of GitHub Desktop]

For years I’ve been trying to get student assistants to use GitHub more effectively to work on larger projects.  One of the main problems, though, has been that the process of using forks + pull requests to submit their code to the main project has always required going back to the terminal for one key step: pulling and merging others’ changes from the upstream branches.  Today the terminal/command prompt is a bit of a mystery even to many seasoned programmers, let alone to students.  Thus it would be great if the GitHub graphical client made this simple or at least possible.

The most recent versions of the GitHub client (at least on Mac; untested on Windows; I’m on v. 208) don’t exactly make the process simple, but at least they make it possible.

Open the GitHub client and if you haven’t worked on the project in a while, hit Sync. You should see your own fork; I’ll assume that you are working on the “master” branch and want to merge changes from the upstream (main project) master branch into your own branch.

Your screen should look something like this.  (I’ll be demoing on the amazing Latin dictionary program “whitakers-words”[*], which I do not have commit access to, so it’s like what contributors to my projects would see.)  All I’ve done is create a little demo text file.
[Screenshot]
Next I’ll pull down the tab on “master” and switch to the main developer’s master, “mk270/master” (third from bottom).
[Two screenshots]
Click “Sync” in the upper left hand corner to copy it down.  The sync button will turn into a progress bar:
[Screenshot]
This actually creates a temporary branch confusingly called mscuthbert/mk270/master (or YOURNAME/THEIRNAME/BRANCH) but will just be displayed as mk270/master on the GitHub client.  No matter.  Now click the button next to the progress bar to create a pull request.  You will want to pull from this branch to your master branch (the one marked default branch):
[Two screenshots]
You can leave the description blank since you’re just making a pull request to yourself. Go ahead and click “Send Pull Request”.
[Screenshot]
Click the link below the “Good work!” button to open up GitHub in your browser.  You should see something like this:
[Screenshot]
Scroll down to the bottom and you’ll see this.  Go ahead and click “Merge pull request” then “Confirm merge”.
[Two screenshots]
Now you’ll see the option to delete this branch.  Go ahead and do it.  You are actually deleting “mscuthbert/mk270/master” not “mk270/master” — I hope it won’t let you delete the upstream master!
[Screenshot]
After doing so you’ll see this confirmation.
[Screenshot]
Now return to the GitHub client and switch back to “master” if it hasn’t already put you back there.
[Screenshot]
Go ahead and click “Sync” again for good measure.  Then you can check your History and see that you have all the most recent upstream commits:
[Screenshot]
Ta-da! Well, assuming that there aren’t any conflicts or anything of that sort.  In a case like that you’ll probably still need to get out the ol’ command-line tools, so you will still need to be a bit familiar with them (or have a friend who can help you out). Hopefully the next version of GitHub for Mac/Windows will make this much easier. But for day-to-day work, it’s now possible for people who use the graphical interface tools almost exclusively to stay in sync with the main repository on a more regular basis.
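For anyone curious what the terminal step looks like, here is a minimal sketch in Python that just builds (and by default only prints) the usual git commands for merging an upstream branch into a fork. The helper names `upstream_sync_commands` and `run_all` are made up for illustration, and the repository URL is an assumed example; swap in your own project's address.

```python
import subprocess


def upstream_sync_commands(upstream_url, branch="master", add_remote=True):
    """Build the git commands that merge the main project's branch into your fork."""
    commands = []
    if add_remote:
        # One-time setup: give the main project's repository the name "upstream".
        commands.append(["git", "remote", "add", "upstream", upstream_url])
    commands += [
        ["git", "fetch", "upstream"],            # download the upstream commits
        ["git", "checkout", branch],             # switch to your own branch
        ["git", "merge", "upstream/" + branch],  # merge upstream into it
        ["git", "push", "origin", branch],       # publish the result to your fork
    ]
    return commands


def run_all(commands, cwd):
    """Run each command inside the local clone at `cwd`, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)


# Print the commands rather than running them; the URL is an assumed example.
for cmd in upstream_sync_commands("https://github.com/mk270/whitakers-words.git"):
    print(" ".join(cmd))
```

If the merge reports conflicts, git stops at that step and the remaining commands should not be run until the conflicts are resolved by hand.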

[*] edit August 10: EEK! Autocorrect originally changed “Whitaker’s Words” to “Whiskers-Words” — Fixed!  That app only does Cat-latin (catin?): "maumo, maumare, maumavi, maumatus  V (1st)     1 1 [GXXEK]   — meow;” not what we want!
[August 6, 2015 23:13 pm] « » [prolatio]


Often while marveling at Rickey Henderson’s amazing stats, I wondered how much greater a leadoff hitter he would have been if he had spent his whole career in the National League.  He had 11,180 plate appearances in the AL but only 2,166 in the NL. In both leagues, the leadoff hitter leads off the first inning, but is not guaranteed to bat leadoff in any following inning. However, I figured that in the National League, batting after the pitcher, it’d be substantially more common that the person batting first in the order would get to lead off. The pitcher almost always makes an out, so I figured it’d be pretty common for him to make the third out (and because of situations where the eighth batter is walked to get to the pitcher, probably more common than one in three).  The ninth batter isn’t that strong in the AL, but a lot stronger than almost any NL pitcher.

I’ve been working off and on over the past two years (more off before getting tenure, more on after getting tenure) on an extremely flexible Python toolkit for examining baseball games, and it finally reached the state of development where I could test my hypothesis.  I’m not ready to release the toolkit yet (it needs to be polished enough that I’m proud of it), but here’s the code I used:

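  # `games` is the (not yet released) toolkit module described above
  gc = games.GameCollection()
  gc.yearStart = 2000
  gc.yearEnd = 2014
  gc.usesDH = True  # set to False for the games without a DH
  allGames = gc.parse()

  totalPAs = 0       # plate appearances by the #1 batter
  totalLeadOffs = 0  # ...of which were the first PA of a half inning
  for g in allGames:
      for halfInning in g.halfInnings:
          for p in halfInning.plateAppearances:
              if p.battingOrder == 1:
                  totalPAs += 1
                  if p.plateAppearanceInInning == 1:
                      totalLeadOffs += 1

  # 100.0 forces float division under Python 2 as well as Python 3
  print(totalPAs, totalLeadOffs, totalLeadOffs * 100.0 / totalPAs)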


It gets a collection of games where the DH is used or not used, looks at each game, then at each half inning, then at each plate appearance. If the batter is #1, then it checks whether it’s the first appearance in the inning, then prints out the percentage of all batter #1 plate appearances which are leadoffs.  The results were surprising to me. 
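Since the toolkit itself isn't released, here is a self-contained sketch of the same counting logic using stand-in classes. The `PlateAppearance`, `HalfInning`, and `Game` dataclasses and the `leadoff_share` function are hypothetical; they only mimic the attribute names used in the loop, and the tiny game is made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlateAppearance:
    battingOrder: int             # 1-9, the batter's slot in the lineup
    plateAppearanceInInning: int  # 1 for the first batter of the half inning


@dataclass
class HalfInning:
    plateAppearances: List[PlateAppearance] = field(default_factory=list)


@dataclass
class Game:
    halfInnings: List[HalfInning] = field(default_factory=list)


def leadoff_share(games):
    """Return (PAs by the #1 batter, how many led off a half inning, percent)."""
    totalPAs = 0
    totalLeadOffs = 0
    for g in games:
        for halfInning in g.halfInnings:
            for p in halfInning.plateAppearances:
                if p.battingOrder == 1:
                    totalPAs += 1
                    if p.plateAppearanceInInning == 1:
                        totalLeadOffs += 1
    percent = 100.0 * totalLeadOffs / totalPAs if totalPAs else 0.0
    return totalPAs, totalLeadOffs, percent


# One made-up game: batter #1 leads off the first half inning,
# then comes up third in a later one.
first = HalfInning([PlateAppearance(1, 1), PlateAppearance(2, 2), PlateAppearance(3, 3)])
later = HalfInning([PlateAppearance(8, 1), PlateAppearance(9, 2), PlateAppearance(1, 3)])
demo = [Game([first, later])]
print(leadoff_share(demo))
```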

           PAs      Leadoff  %
No DH      183,033  75,364   41.175
With DH    163,178  63,451   38.885

The average difference in the percentage of leadoff plate appearances between the two leagues (accounting for interleague games) is only about 2.5 percentage points. This works out to a difference of about 15 PAs a year for Rickey in his prime. So one hypothesis down, but many more to be investigated soon.
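The raw table figures can be checked with a few lines of Python. This is only a sketch: it takes the With DH leadoff count as 63,451 (the count that matches the 38.885% shown), assumes a round 700 plate appearances per season for a full-time leadoff hitter, and does not apply the interleague adjustment mentioned above, so it will not land exactly on the rounder figures in the text.

```python
def leadoff_pct(pas, leadoffs):
    """Percentage of the #1 batter's plate appearances that led off a half inning."""
    return 100.0 * leadoffs / pas


no_dh = leadoff_pct(183033, 75364)
with_dh = leadoff_pct(163178, 63451)  # leadoff count taken as 63,451 (assumption)
gap = no_dh - with_dh                 # difference in percentage points

PAS_PER_SEASON = 700  # assumed round figure, not from the data
extra_leadoffs = PAS_PER_SEASON * gap / 100.0

print(round(no_dh, 3), round(with_dh, 3), round(gap, 2), round(extra_leadoffs, 1))
```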

[August 1, 2015 18:38 pm] « » [prolatio]

[June 16, 2015 14:30 pm] « » [music21]

First we start the cluster system with ipcluster start, which on this six-core Mac Pro gives me 12 threads. Then I'll start IPython notebook with ipython notebook.

In [1]:
from __future__ import print_function
from IPython import parallel
clients = parallel.Client()
clients.block = True
print(clients.ids)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

Now I'll create a view that can balance the load automatically.

In [2]:
view = clients.load_balanced_view()

Next let me get a list of all the Bach chorales' filenames inside music21:

In [3]:
from music21 import *
chorales = list(corpus.chorales.Iterator(returnType = 'filename'))
chorales[0:5]

['bach/bwv269', 'bach/bwv347', 'bach/bwv153.1', 'bach/bwv86.6', 'bach/bwv267']

Now I can use view.map to automatically run a function, in this case corpus.parse, on each element of the chorales list.

In [4]:
view.map(corpus.parse, chorales[0:4])

[<music21.stream.Score 4467044944>,
<music21.stream.Score 4467216976>,
<music21.stream.Score 4465996368>,
<music21.stream.Score 4465734224>]

Note though that the overhead of returning a complete music21 Score from each processor is high enough that we don't get much of a savings, if any, from parsing on each core and returning the Score object:

In [5]:
import time
t = time.time()
x = view.map(corpus.parse, chorales[0:30])
print("Multiprocessed", time.time() - t)
t = time.time()
x = [corpus.parse(y) for y in chorales[0:30]]
print("Single processed", time.time() - t)

Multiprocessed 1.7093911171
Single processed 2.04412794113
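A rough way to see where that overhead comes from, assuming results travel back from each engine as pickled bytes: compare the serialized size of a whole object with the size of a single number. The `fake_score` structure below is a made-up stand-in, far simpler than a real music21 Score, so the real gap is likely larger.

```python
import pickle

# Stand-in for a parsed score: 4 parts x 60 notes, each note a small dict.
# A real music21 Score is far richer; this only illustrates the size gap.
fake_score = [
    [{"pitch": 60 + (i % 12), "quarterLength": 1.0, "measure": i // 4 + 1}
     for i in range(60)]
    for _part in range(4)
]

# What a worker would send back if it returned the whole object...
whole_object = pickle.dumps(fake_score)
# ...versus what it sends back if it returns just one integer.
just_a_count = pickle.dumps(sum(len(part) for part in fake_score))

print(len(whole_object), "bytes vs", len(just_a_count), "bytes")
```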

But let's instead just return the length of each chorale, so we don't need to pass much information back to the main server. First we need to import music21 on each client:

In [6]:
clients[:].execute('from music21 import *')

<AsyncResult: finished>

Now, we'll define a function that parses the chorale and returns how many pitches are in the Chorale:

In [7]:
def parseLength(fn):
    c = corpus.parse(fn)
    return len(c.flat.pitches)

Now we're going to see a big difference:

In [8]:
t = time.time()
x = view.map(parseLength, chorales[0:30])
print("Multiprocessed", time.time() - t)
t = time.time()
x = [parseLength(y) for y in chorales[0:30]]
print("Single processed", time.time() - t)

Multiprocessed 0.59440112114
Single processed 2.97019314766

In fact, we can do the entire chorale dataset in about the same amount of time as it takes to do just the first 30 on a single core:

In [9]:
t = time.time()
x = view.map(parseLength, chorales)
print(len(chorales), 'chorales in', time.time() - t, 'seconds')

347 chorales in 5.31799721718 seconds

I hope that this example gives some sense of what might be done with a cluster setup in music21. If you can't afford your own Mac Pro or you need even more power, it's possible to rent an hour of cluster computing time at Amazon Web Services for just a few bucks.

[June 16, 2015 13:38 pm] « » [music21]

The newest version of the beta 2.0 track of music21 has been released. A reminder that the 2.0 track involves potentially incompatible changes with 1.X, so upgrade slowly and carefully if you need existing programs to work. Changes are being made to simplify and speed up usage and make the system more expandable for the future.

Download at or with PyPI.

Major Changes

  • Complete rewrite of TinyNotation. TinyNotation was one of the oldest modules in music21 and it showed: I was still learning Python when I wrote it. It offers a simple way of getting notation into music21 via a lily-like text interface. It was designed to be subclassable to make it work on whatever notation you wanted to use. And technically it was, but it was so difficult to do as to be nearly impossible. Now you’ll find it much simpler to subclass. Demos of subclassing are included in the code (esp. HarmonyNotation, and trecento.notation); a tutorial to come soon.
  • backwards incompatible changes: (1) you used to be able to specify an initial time signature to TinyNotation as converter.parse("tinynotation: c4 d e f", "4/4"); now you must put the time signature string into the text itself, as converter.parse("tinynotation: 4/4 c4 d e f"). “cut” and “c” time signatures are no longer supported; use 2/2 and 4/4 instead. (2) calling tinyNotation.TinyNotationStream() directly doesn’t work any more. Use the converter.parse interface, either with the "tinynotation:" header or format="tinynotation", instead. If you must use the guts, try tinyNotation.Converter("4/4 c4 d e f").parse().stream. (3) TinyNotation used to return its own “TinyNotationStream” class, which was basically incompatible with everything. Now it returns a standard stream.Part(). (4) TinyNotation used to leave notes outside of measures, etc., so you needed to call .makeMeasures() afterwards; now it makes the notation for you. If you need the older behavior, use converter.parse('tinynotation: 4/4 c2 d', makeNotation=False).
  • MuseScore works as a PNG/PDF format. First run us = environment.UserSettings(); us['musescoreDirectPNGPath'] = '/Applications/MuseScore' (or wherever you have it). Then try calling .show('musicxml.png') and watch the image arrive about 100x faster than it would in Lilypond. Thanks, MuseScore folks! This is now the default format for .show() in IPython notebook. Examples using lily.png and lily.pdf will migrate to this format, so that Lilypond can be moved to deprecated-but-not-to-be-removed status. (I just don’t have time to keep up.)
  • demos/gatherAccidentals : a good first test programming assignment for students. I use it a lot in teaching.
  • musicxml parses clefs mid-measure (thanks fzalkow)
  • installer.command updated for OS X (thanks Andrew Hankinson); let me know if this causes a problem.
  • postTonalTools demo in usersGuide.
  • DataSet feature extractor gets a .failFast = False option for debugging.

Under the hood / contributors

  • music21 now uses coverage checking. We are at 91.5% code coverage, meaning that when the test suite is run, 91.5% of all the lines of code are tested. Aiming for 95% (100% is impossible). Adding coverage checking let me find a lot of places that weren’t being tested that, lo and behold!, had bugs. What it means for contributors: any commit that is longer than 20 lines of code needs to improve the coverage percentage and help us get to 95%. So make sure that at least 92% (better 99%) of your code is covered by tests.
  • the romanText.objects module has been renamed romanText.rtObjects to not conflict with external libraries. It’s an implementation detail.
  • added demo of how to subclass SubConverter.
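To see why the coverage rule in the first bullet above asks for new code to be tested at a higher rate than the project as a whole, here is a small arithmetic sketch. The project size of 100,000 lines is hypothetical (only the 91.5% starting point comes from the notes), and `coverage_after_commit` is a made-up helper.

```python
def coverage_after_commit(total_lines, covered_lines, new_lines, new_covered):
    """Overall coverage percentage once a commit's lines are added to the project."""
    return 100.0 * (covered_lines + new_covered) / (total_lines + new_lines)


# Hypothetical project size; 91,500 of 100,000 lines gives the 91.5% starting point.
TOTAL = 100000
COVERED = 91500

# A 200-line commit with 99% of its lines tested nudges the overall figure up...
well_tested = coverage_after_commit(TOTAL, COVERED, new_lines=200, new_covered=198)
# ...while the same commit with only half its lines tested pulls it down.
poorly_tested = coverage_after_commit(TOTAL, COVERED, new_lines=200, new_covered=100)

print(round(well_tested, 3), round(poorly_tested, 3))
```

Any commit whose own coverage is above the current project-wide figure raises the total, and anything below lowers it.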

Minor Changes

  • measure number suffixes in musicxml output, not just input.
  • language detector can detect Latin and Dutch language texts now.
  • fix pitch class errors in microtones.
  • midi files with negative durations no longer crash the system.
  • bugs fixed in tonalCertainty. You can be more certain that it works.
  • cPickle is used in Python3 now. Faster.
  • midi parsing can specify quantization levels.
  • music21.__version__ gives the version (maxalbert did a lot this commit; forgot to shout out before!)
  • better detection of lilypond binaries.
  • certain Sibelius MusicXML files with UTF-16BOMs can now be read.
  • rests imported from MusicXML would not have expressions attached to them — fermatas, etc. fixed
  • serial.ToneRow() now has the notes each as quarter notes rather than as zero-length notes; it makes .show() possible; backwards incompatible for the small number of people using it.
  • colored notation now works better and in more places.
  • better docs.
  • about a trillion tiny bugs and untested pieces of code identified and fixed by glasperfan (Hugh Z.)


Looking forward to the 2.1 release!

[April 11, 2015 13:47 pm] « » [music21]
The RILM blog, Bibliolore, recently published a post about the early history of computational musicology:

In the 1940s Bertrand Harris Bronson became one of the first scholars to use computers for musicological work.
For one of his projects he encoded melodic characteristics of hundreds of tunes collected for the traditional ballad Barbara Allen on punch cards, so a computer could ferret out similarities. His project resulted in four groups of tunes, members of which came from both sides of the Atlantic with varying frequency.
Read more at their site.  The RILM blog is amazing in any case.

For older stories visit the Prolatio (general items) or music21 (computational musicology) blogs.

Michael Scott Cuthbert (cuthbert [at] is Associate Professor of Music and Homer A. Burnell Career Development Professor at M.I.T.

Cuthbert received his A.B. summa cum laude, A.M. and Ph.D. degrees from Harvard University. He spent 2004-05 at the American Academy as a Rome Prize winner in Medieval Studies, 2009-10 as a Fellow at Harvard's Villa I Tatti Center for Italian Renaissance Studies in Florence, and 2012-13 as a Fellow at the Radcliffe Institute. Prior to coming to MIT, Cuthbert was Visiting Assistant Professor on the faculties of Smith and Mount Holyoke Colleges. His teaching includes early music, music since 1900, computational musicology, and music theory.

Cuthbert has worked extensively on computer-aided musical analysis, fourteenth-century music, and the music of the past forty years. He is creator and principal investigator of the music21 project. He has lectured and published on fragments and palimpsests of the late Middle Ages, set analysis of Sub-Saharan African Rhythm, Minimalism, and the music of John Zorn.

Cuthbert is writing a book on Italian sacred music from the arrival of the Black Death to the end of the Great Schism.

Download what is almost certainly an out-of-date C.V. here (last modified June 2012)

Changing Musical Time in the Renaissance (and Today), for Festschrift Joseph Connors (forthcoming)

Bologna Q15: the making and remaking of a musical manuscript, review for Notes 66.3 (March), pp. 656-60.

Ars Nova: French and Italian Music in the Fourteenth Century, edited volume with John L. Nádas (Music in the Medieval World Reference Series vol. 6). London: Ashgate. Reviewed by Gary Towne, The Medieval Review, February 2010.

"Palimpsests, Sketches, and Extracts: The Organization and Compositions of Seville 5-2-25," L’Ars Nova Italiana del Trecento 7, pp. 57–78.

Der Mensural Codex St. Emmeram: Faksimile der Handschrift Clm 14274 der Bayerischen Staatsbibliothek München, review for Notes 65.4 (June), pp. 252–4.

"A New Trecento Source of a French Ballade (Je voy mon cuer)," in Golden Muse: The Loeb Music Library at 50. Harvard Library Bulletin, new series 18, pp. 77–81.

"Esperance and the French Song in Foreign Sources," Studi Musicali 36.1, pp. 1–19.

"Trecento Fragments and Polyphony Beyond the Codex", Ph.D. Dissertation, Harvard University (unpublished).

"Generalized Set Analysis and Sub-Saharan African Rhythm? Evaluating and Expanding the Theories of Willie Anku," Journal of New Music Research (formerly Interface) 35.3, pp. 211–19. [.pdf]

"Zacara’s D’amor Languire and Strategies for Borrowing in the Early Fifteenth-Century Italian Mass," in Antonio Zacara da Teramo e il suo tempo, edited by Francesco Zimei. Lucca: LIM, pp. 337–57 and plates 10–13.

"Free Improvisation: John Zorn and the Construction of Jewish Identity through Music," in Studies in Jewish Musical Traditions, edited by Kay Kaufman Shelemay (Cambridge, Mass.: Harvard College Library). pp. 1-31. [.pdf]

Unless otherwise mentioned, the writings, compositions and recordings on this site are licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.

Copyright 2010-11, Michael Scott Cuthbert. Web design by M.S.C.