[mb-style] CSG compromise?

Mike Morrison mikemorr at umich.edu
Sat Mar 1 18:47:42 UTC 2008


On Sat, 1 Mar 2008, Leiv Hellebo wrote:
> The last month my focus has not been on fixing the CSG, but rather to
> say that I do not want us to standardise on one way of formatting one
> piece of classical (movement, lied etc.) in the CSGS pages and make this
> mandatory for how we should deal with that piece.

I think each user can get exactly what he or she wants with NGS. Here's 
how:

The list of work titles currently known as CSGS, rather than being a 
prescriptive list of the "one true way" to format each work's title, could 
be structured as a database of all known classical works, with each work 
title broken up into many subfields, like all the boxes and sub-boxes on

http://wiki.musicbrainz.org/BrianFreud/sandbox#CSGworkstructure

or

http://wiki.musicbrainz.org/BrianFreud/ClassicalOntology

Rather than "CSGS", I will call it the WorksDatabase.

So a given composition would have a "work type" field, a "key" field, a 
"catalog number" field, etc. We could have multiple "common name" fields 
for works which have more than one common name. Some or all of the fields 
would have pointers to alternate translations into many languages, for 
example "E minor" vs. "mi mineur".
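
To make that concrete, here is a minimal Python sketch of what one such 
record might look like. The class name Work, the field names and the 
text() helper are all my own assumptions for illustration, not an agreed 
schema, and the values are placeholders rather than a real catalog entry:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical record layout; none of these names are an agreed schema.
Translations = Dict[str, str]  # language code -> text in that language


@dataclass
class Work:
    work_type: Translations
    key: Translations
    catalog_number: str
    common_names: List[Translations] = field(default_factory=list)

    def text(self, field_name: str, lang: str, fallback: str = "en") -> str:
        """Return a translated field, falling back to another language."""
        translations = getattr(self, field_name)
        return translations.get(lang, translations.get(fallback, ""))


# Placeholder values for illustration only, not a real catalog entry.
example = Work(
    work_type={"en": "Sonata", "fr": "Sonate"},
    key={"en": "E minor", "fr": "mi mineur"},
    catalog_number="Op. 1",
)
print(example.text("key", "fr"))  # mi mineur
```

Keeping each translatable field as a language-to-text mapping means a new 
translation can be added without touching the rest of the record.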

> I've been pushing the idea that the track listing from the liners is a 
> better common ground for solving edit wars than the CSGS pages.

That's great! Every track on every release can have its own track title or 
"TrackCoverText", which can be as exact a copy of the text on the liner 
as desired, and does not need to be consistent among different recordings 
of the same composition.

However, every track on every release would also have an AR to the 
WorksDatabase: "This track is a recording of the work Concerto for Piano 
No. 1 in F major, K. 37: I. Allegro". This AR would be the same for every 
recording of the work, although each recording's TrackCoverText can be 
formatted differently.

Then, with the right software, individual users could tell their taggers 
to concatenate only the fields of their choice, in the language of their 
choice, in the order of their choice, and get their tracks tagged in their 
preferred title style.
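
As a rough sketch of that concatenation step (the function name 
build_title, the field names and the French wording are assumptions made 
up for this example, not anything a real tagger provides):

```python
def build_title(work, fields, lang, separator=" "):
    """Join the chosen fields, in the chosen order and language.

    `work` maps a hypothetical field name either to plain text or to a
    dict of language code -> translated text.
    """
    parts = []
    for name in fields:
        value = work.get(name)
        if isinstance(value, dict):
            # Prefer the requested language, else fall back to English.
            value = value.get(lang) or value.get("en")
        if value:
            parts.append(value)
    return separator.join(parts)


# Illustrative data; the translations are my own, not database content.
work = {
    "work_type": {"en": "Concerto for Piano", "fr": "Concerto pour piano"},
    "number": {"en": "No. 1", "fr": "n° 1"},
    "key": {"en": "in F major", "fr": "en fa majeur"},
    "catalog": "K. 37",
    "movement": "I. Allegro",
}
print(build_title(work, ["work_type", "number", "key"], "en"))
# Concerto for Piano No. 1 in F major
```

Two users with different field lists or languages would get different 
titles from the same underlying record.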

What if users just want their tracks to look like the cover, and not 
necessarily be consistent among different recordings of the same 
composition?

Then they tell their tagger to tag from the track title (TrackCoverText), 
rather than the WorksDatabase.

What if they want their tracks to look generally like the liner notes, but 
when there is important information missing, to add that information, 
while keeping the general form of the liner notes? (Note: what is 
considered "important information" can be different for each user)

Then they tell their tagger to use the TrackCoverText, but if certain 
information (such as a catalog number) is missing on a given track, to go 
fetch that information from the WorksDatabase and insert it as part of the 
tag string.
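
A very simplified sketch of that fallback, assuming a plain substring 
check is enough to decide whether the information is already on the cover 
(the function name tag_title and the "catalog" field are hypothetical, and 
a real tagger would likely need something smarter than appending):

```python
def tag_title(cover_text, work, required=("catalog",)):
    """Start from the cover text and append any required field it lacks."""
    title = cover_text
    for name in required:
        value = work.get(name)
        # Only add the value if it is known and not already in the title.
        if value and value.lower() not in title.lower():
            title = f"{title}, {value}"
    return title


work = {"catalog": "BWV 1002"}
print(tag_title("Bach: Sarabande from Partita No. 1", work))
# Bach: Sarabande from Partita No. 1, BWV 1002
```

Each user could pass a different `required` list, since what counts as 
important information differs from user to user.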

This might require semantically marking up the TrackCoverText so that the 
tagger application can parse it the same way as the WorksDatabase. We can 
do this with hidden markup tags, without changing the appearance of the 
track cover text. So if the cover says (actual example):

"Chaconne from Partita BWV 1004
D minor/d-moll/re mineur"

(there should be an accent over "re" but my email client has encoding 
problems)

this could be marked up as (example only, details could differ):

<dancetype lang="fr">Chaconne</dancetype>
<fromword lang="en">from</fromword>
<worktype lang="it">Partita</worktype>
<catalogidentifier lang="de">BWV</catalogidentifier>
<catalognumber lang="">1004</catalognumber>
<br />
<key lang="en">D</key>
<scale lang="en">minor</scale>/
<key lang="de">d</key>
<scale lang="de">-moll</scale>/
<key lang="fr">re</key>
<scale lang="fr">mineur</scale>

This semantic markup does not have to be done by newbie editors submitting 
releases. It can be done by more experienced editors after the release is 
submitted, or maybe by a semi-automated script.
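
To show that markup like this is easy for a tagger to read, here is a 
small Python sketch using the standard xml.etree.ElementTree module. 
Wrapping the fragment in a <title> root element is my own assumption 
(the fragment has no single root), and the function names are made up:

```python
import xml.etree.ElementTree as ET

# First line of the example markup, using the same illustrative tag names.
MARKUP = """\
<dancetype lang="fr">Chaconne</dancetype>
<fromword lang="en">from</fromword>
<worktype lang="it">Partita</worktype>
<catalogidentifier lang="de">BWV</catalogidentifier>
<catalognumber lang="">1004</catalognumber>
"""


def parse_cover_markup(markup):
    """Return (tag, lang, text) for each marked-up piece of the title."""
    root = ET.fromstring(f"<title>{markup}</title>")
    return [(el.tag, el.get("lang", ""), (el.text or "").strip())
            for el in root]


def visible_text(markup):
    """Strip the hidden tags and return the text as shown on the cover."""
    root = ET.fromstring(f"<title>{markup}</title>")
    return " ".join("".join(root.itertext()).split())


print(visible_text(MARKUP))  # Chaconne from Partita BWV 1004
```

The same parsed pieces could then be compared field by field against the 
WorksDatabase entry for the work.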

Leiv, I think you've raised good questions. Here are my more specific 
responses to some of those questions:

> How do we resolve disagreements on what to put in the CSGS pages?

We can put all the information we know about a composition into the 
structured WorksDatabase. Users can set their taggers to pick and choose 
which of that information goes into their local track titles.

> you probably will need to answer "how different does works have to be in 
> order to be regarded as separate works?" (there might be small changes 
> in the scoring)

If a consensus can't be reached on whether Op. 1a and Op. 1b are the same 
or different works, let's enter them as separate works in the 
WorksDatabase, and within the WorksDatabase let's link each of them to a 
"parent" work Op. 1. Editors who consider them the same work can set a 
profile preference, specific to that work, that makes their tagger treat 
recordings of Op. 1a and Op. 1b as recordings of Op. 1. Editors who 
consider them different works can set their preference to tag Op. 1a and 
Op. 1b separately.
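
Sketched in Python, assuming the parent links are a simple lookup table 
and the user preference is a set of parent works to merge (both of these 
structures, and the name resolve_work, are hypothetical):

```python
# Hypothetical parent links stored in the WorksDatabase.
PARENTS = {"Op. 1a": "Op. 1", "Op. 1b": "Op. 1"}


def resolve_work(work_id, merged_parents, parents=PARENTS):
    """Return the parent work if the user merges it, else the work itself."""
    parent = parents.get(work_id)
    return parent if parent in merged_parents else work_id


# An editor who treats both versions as the same work:
print(resolve_work("Op. 1a", {"Op. 1"}))  # Op. 1
# An editor who keeps them apart:
print(resolve_work("Op. 1a", set()))      # Op. 1a
```

The database itself stays neutral; only the per-user set changes.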

> you will maybe need a notion on utterly boring questions such as "How 
> common does a common name have to be to be included"

Let's put every known common name in the WorksDatabase, with a "frequency 
score" for each name. This score could be calculated, for example, by a 
script which looks for the common name in the TrackCoverText for all 
instances of the work in the MusicBrainz database. Then users can set 
their taggers to require a minimum score for inclusion of a common name in 
their track titles.
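
One possible way to compute such a score, as a sketch only: take the 
fraction of cover texts for the work that mention the name. The function 
names and the sample cover texts below are invented for illustration:

```python
def frequency_score(common_name, cover_texts):
    """Fraction of cover texts that contain the common name (0.0 to 1.0)."""
    if not cover_texts:
        return 0.0
    needle = common_name.lower()
    hits = sum(1 for text in cover_texts if needle in text.lower())
    return hits / len(cover_texts)


def names_to_include(scores, minimum):
    """Keep only the names whose score meets the user's minimum."""
    return [name for name, score in scores.items() if score >= minimum]


# Invented cover texts for one work, not taken from the database.
covers = [
    'Piano Sonata No. 14 "Moonlight"',
    "Moonlight Sonata",
    "Sonata quasi una fantasia, Op. 27 No. 2",
    "Piano Sonata No. 14 in C-sharp minor",
]
scores = {name: frequency_score(name, covers)
          for name in ("Moonlight", "quasi una fantasia")}
print(names_to_include(scores, 0.5))  # ['Moonlight']
```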

> "when do we translate them?"

Let's put every known translation of every work type, common name, 
key/scale, etc. in the WorksDatabase, with a language code for each. Users 
can set their tagger preferences to use their own language, the composer's 
native language, the language of the ReleaseCountry, or whatever they 
want.
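
A sketch of how a tagger might apply such a preference, assuming the user 
supplies an ordered list of language codes (the function name 
pick_translation and the fall-back-to-first-available rule are my own 
assumptions):

```python
def pick_translation(translations, preferred_langs):
    """Return the first available translation in the user's preferred order.

    If none of the preferred languages is available, fall back to
    whichever translation was entered first.
    """
    for lang in preferred_langs:
        if lang in translations:
            return translations[lang]
    return next(iter(translations.values()), "")


key = {"en": "D minor", "de": "d-moll", "fr": "ré mineur"}
print(pick_translation(key, ["it", "de"]))  # d-moll
```

The preferred list could be filled from the user's own language, the 
composer's native language, or the language of the ReleaseCountry.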

> There's been some claims that quite a few CDs are lacking details. 
> Further, some say labels produce "wacky" listings, others that 
> rereleases very often have differing track lists. None of these claims 
> have been substantiated AFAICS, nor has anyone tried to fill in the 
> details on the significance of these differences.

I agree that we could benefit from some cover scans of wacky releases. 
Next person who finds a "Classical Chillout" release with ambiguous track 
titles, please scan it for us :)

> Now for all I know, you may on a "Classical Chillout" release find
> "Bach: Sarabande", "Bach: Sarabande from Partita", "Bach: Sarabande from
> Partita No. 1", "Bach: Sarabande from Partita, BWV 1002" or something else.
>
> The three first does lack information necessary to determine which piece
> of music is included. You and I might prefer to have this information
> included, thereby finding the BWV 1002 quite important, but this does
> not mean that the person buying this release finds it at all helpful.

I agree. You and I can set our taggers to get "BWV 1002" from the 
WorksDatabase. The person buying the release can set their tagger to read 
straight from the TrackCoverText if they want.

> MusicBrainz is not a pedagogical site, and the track listings are not
> places where MB should educate the masses.

While I do use MB to learn about music and I hope it can become ever more 
useful for that purpose, I agree with your sentiment here. If the masses 
want to use MB to learn about classical compositions, they should browse 
the WorksDatabase, not the TrackCoverText.

> What should we standardise on then? "Partita for Solo Violin, BWV 1002: 
> Sarabande", perhaps. No movement numbers?  Bach didn't number them

When it comes to users' tags, all of this could be determined by each 
user's tagger preferences. Maybe the Bach movement numbers would have a 
low "authenticity score" in the WorksDatabase, indicating that they are 
not the composer's own work.

When it comes to the TrackCoverText, let's go with more or less what's on 
the cover.

> and because different recordings often include the "Double" following 
> the "Sarabande", so you also get
>
> "...BWV 1002: Sarabande - Double"

Each track can be related to multiple movements in the WorksDatabase. The 
WorksDatabase should allow subdivision of movements into submovements as 
required by the releases we encounter.
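
For instance, assuming each track simply carries a list of the movements 
it contains (the function name and separator are hypothetical):

```python
def joined_title(work_title, movements, separator=" - "):
    """Build one title for a track that spans several movements."""
    return f"{work_title}: {separator.join(movements)}"


# One track related to two movements of the same work.
print(joined_title("Partita for Solo Violin, BWV 1002",
                   ["Sarabande", "Double"]))
# Partita for Solo Violin, BWV 1002: Sarabande - Double
```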

> Further, some of Bach's music seem nearly infinitely malleable, and has 
> been transcribed for many different instruments

Indeed. We will probably need ways to handle this, both in the 
WorksDatabase and in the semantic markup of the TrackCoverText, such as 
<originalinstrument> and <transcribedinstrument> markup tags.

> I best shut up before people stop reading

I'd better do the same :)

> so let me sum up by saying:
> If I buy "Classical Chillout", I don't want one track title to read
>
> "Partita for Solo Violin in B minor, BWV 1002, Transposed to A minor for 
> Lute: Sarabande"

And I really do want my track title to say all of that. No kidding! But I 
really think we can all get what we want with a 
TrackCoverText/WorksDatabase/TaggerPreferences system like the one I'm 
describing.

Cheers,

Mike


