[mb-style] RFC: Works lists (and other related changes then implied)
Brian Schweitzer
brian.brianschweitzer at gmail.com
Sun Mar 23 07:08:08 UTC 2008
On Sun, Mar 23, 2008 at 2:23 AM, David K. Gasaway <dave at gasaway.org> wrote:
> On 22 Mar 2008 at 18:12, Brian Schweitzer wrote:
>
> > Considering that folksonomy tags have no single purpose, and are
> > intended to be a multi-use field, I don't know what you're objecting to
> > here.
>
> They are multi-purpose but they are not all-purpose. They are
> primarily used for social functions. They are not designed to handle
> those problems best solved by a structured database.
>
>
> > Please point me to the email which said all this.
>
> Let me be more clear: The links to the sample data I posted
> were a test for the ideas (suggested by Aaron) that the wiki pages
> could be used to address questions raised by me regarding translation,
> organization, and the like. This was a test, nothing more. I will
> note, however, that no one seemed to disagree at the time.
>
>
> > But you're
> > adding in a LOT of functionality to the wiki, and ignoring the point
> > that the wiki server can not handle these pages, long term.
>
> Adding a single link for each movement constitutes "a LOT of
> functionality"? If the wiki server cannot handle the load (not that I
> see that there would be all that much), then the frequent editors can
> keep local copies. I don't know a lot about MoinMoin or MB's
> particular implementation of it, but most wiki softwares have caching
> features to help reduce server load. Or are you concerned about
> bandwidth? Hopefully, this will not be required in the long term.
>
>
> > * organization: the wiki format limits this
> > to strictly a single organization order; all other possible methods then
> > still miss out (sorting by different catalogs, etc).
>
> This is all true. But the wiki at least can have TOCs (in multiple
> orders), and cross references for the catalogs. The work list, on the
> other hand, will seemingly be a flat, unindexed list with no well-
> defined order (or, at least, an order that is not configurable). I
> can't see how folksonomy tags could even solve this problem. If you
> have ideas, I'd like to hear them.
>
>
> > * translations:
> > Again, simply too much data to be well organized in a basic table, or to
> > be useable. Could you not also store these in work annotations?
>
> Possibly, but the idea was to take a CSGS page and translate it in
> whole to create another page complete with all the indexing features.
>
>
> > * browseability: Loading a 2 meg (or 3 or 4, potentially, if you're also
> > doing translations) page that consists almost entirely of HTML tables,
> > vs loading a works list and navigating either by movements or work tags?
> > Neither is perfect, but the former is simply too much, for either the
> > server or the browser - some browsers already find the Mozart page too
> > large to load, and it can take my own machine 30 seconds to even render.
>
> If loading a CSGS page is really that burdensome on the browser, we can
> divide it into multiple pages. Will we have that option for the works
> list page? How does executing tag searches on the server help vs.
> loading a work list or wiki page once? However the idea behind
> "browseability" was to have the indexes available to navigate (great
> for more casual users) rather than resorting to searches.
>
>
> > * store miscellaneous notes: Sounds perfect for a work annotation, much
> > less so (I know from experience trying to cleanly integrate them on the
> > Mozart page) mixed into a table format in the wiki.
>
> I don't mean to suggest that the wiki is any better suited to this
> function. However, if we do keep the wiki pages, they do seem to be a
> logical place to keep this information.
>
>
> > Anyhow, even when making the first CSGS page, I recognized it was a
> > problematic solution to use the wiki.
>
> I recognize that it is problematic, but I also recognize that this is
> only temporary until a better solution comes along. For the quick
> test, I found the process bearable. It may be too awkward to use to
> convert existing entries en masse, this is true. That goodness may have
> to wait? Actually, I'm sure some enterprising folk (such as yourself)
> can find a way to make batch edits using a cached wiki page.
>
>
> > And we will want a
> > script to do so, whoever ends up writing it; doing track-track ARs is
> > otherwise so very very slow that most people won't ever bother to link
> > to works.
>
> Somehow, I don't believe that the folksonomy tag solution as presented
> (which seems designed for elite classical editors) will magically cause
> most people to bother. ;)
>
>
> > You may never have heard of anyone wanting to write such a
> > script, for wiki or otherwise, but it was discussed on the page notes
> > for the Bach page way back when that page was first started and it was
> > discussed in at least one of the CSG threads here.
>
> I'm very sorry I missed it. Was it brought up in the context of this
> RFC?
>
>
> > At least once a week
> > someone sends me an email asking for me to write a script to pull CSGS/M
> > track titles from the wiki to make adding Mozart releases faster. I'm
> > not willing - it would cripple the wiki server.
>
> If you're talking strictly about custom client-side scripts, wouldn't
> it be reasonable to have the scripts operate on a local copy of the
> wiki pages?

That might be logical, except that GreaseMonkey security (and, I assume,
the Trident-GM emulators and UserJS as well?) specifically prevents
access to local files. It also assumes the script isn't implemented as
a server-side function; it would seem rather strange for a
server-based, non-GM script to require the user to have copies of
wiki pages saved locally. So you might suggest we instead cache the
lookups within the script itself as composer-work arrays. That would
be fine, except they too would be massive; essentially you'd be
loading a meg or more of data into memory on each and every page to
which you attached such a script. Either "solution" also assumes that
the lists are done, or that the script author is willing to
continually update the array, or that users are going to be willing to
frequently save new copies of the pages or install new versions of the
script with its array. Neither is really likely, nor would I personally
be willing to code such a GM script. Requiring that much array or local
file parsing (assuming the latter were even possible) that frequently
is simply a no-win in my opinion.

Conversely, an XMLHttpRequest (or something similar - JSON, etc) to
pull one small amount of data and load it into a select box, almost
exactly like the current "relate to..." one does: that's both
lightweight to the server and lightweight to implement, either as a
server or a client side script. It's also pretty much something that,
once written, doesn't really need any continuing support to keep it up
to date.
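
To make the size difference concrete, here is a rough sketch of that kind of lookup, written in Python purely for illustration. The work records, the tag format ("k550") and the function name lookup_works are all hypothetical; none of this is MusicBrainz's actual schema or API. The point is only that the server filters by one grouping tag and sends back the handful of matching works, rather than the whole list:

```python
import json

# Hypothetical sample data: work entries carrying folksonomy-style grouping tags.
WORKS = [
    {"id": 1, "title": "Symphony No. 40 in G minor, K. 550: I. Molto allegro", "tags": ["k550"]},
    {"id": 2, "title": "Symphony No. 40 in G minor, K. 550: II. Andante", "tags": ["k550"]},
    {"id": 3, "title": "Symphony No. 41 in C major, K. 551: I. Allegro vivace", "tags": ["k551"]},
]

def lookup_works(tag, works=WORKS):
    """Return a JSON string holding only the works that carry the given tag."""
    wanted = tag.strip().lower()
    matches = [{"id": w["id"], "title": w["title"]} for w in works if wanted in w["tags"]]
    return json.dumps(matches)

# What the client would receive and turn into <option> entries of a select box.
payload = lookup_works("K550")
options = json.loads(payload)
```

Each request carries a couple of titles, not a megabyte of tables, and nothing in the client needs updating when new works are added.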
You say:
> They are multi-purpose but they are not all-purpose. They are
> primarily used for social functions. They are not designed to handle
> those problems best solved by a structured database.

In whose book? My understanding was when tagging was added, and when
we talked about any potential guidelines for tags, the intent and the
decision here was that tags would be ok in any format, for just about
any use anyone decided to use them for. Hence our using them for such
things as series identification, disc identification within box sets,
"seen live", "I want", "I have", and anything else. *You* may be only
using tags for "social" functions (whatever those are), but they very
clearly were left open to any possible use we might find for them.

It's not like such tags would even be "cluttering up" existing tag
info for whatever purposes someone might be using them for; with a few
very rare exceptions (SMP), works typically only have a handful of
movements, so having a tag that's used on several works wouldn't push
any one work tag into the "top tags". They'd be attached to works,
not tracks/artists/albums/etc, so per-type tag reporting wouldn't be
affected.

Any way we do it is a hack, something to allow us to do it with basic
work functionality until we have that full work functionality. But
there are good ways to do it and bad ways. I see overloading the
wiki server, expecting users to maintain static local copies of
wiki pages, or loading everything into an array in a script all as
*very* bad solutions, in the amount of stress they put on the servers
and/or the amount they expect of users (or users' computers). A
tag+annotation-based solution, using tags to group works and
annotations for all the work notes you want to store, I see as very
doable and very lightweight for both the servers and the user.

You also say that you see a tag solution as something only for "elite
classical editors" - how so? What I suggested, yes, would involve
some structure to work-grouping tags, but that's the grouping of the
works, not the later use of those groupings. We've discussed using
"full CSG" to title works, but that doesn't mean anyone matching a
track to a work therefore has to be an "elite classical editor", only
that the person who originally created the work entry has to know
what they're doing; the later editor just has to know enough to be
able to identify which of the works to link to. Same for
work-tag-grouping; the editor setting the tag has to know what tag
structure to use, but the user who is then linking to a work via a
dynamically populated dropdown doesn't have to do anything more than,
say, type in a catalog number, wait a half second, then click on the
right work in the dropdown. In other words, use != implementation,
and the knowledge required for such use doesn't have to be the
same as the knowledge required for implementation.
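
As a sketch of how little the linking user would need to know, the script could turn whatever catalog number they type into the tag form itself. This is Python for illustration only; the function name catalog_to_tag and the "k550"-style tag convention are assumptions of mine, not an agreed tag structure:

```python
import re

def catalog_to_tag(text):
    """Turn free-form input such as 'K. 550' or 'bwv 1001' into a tag like 'k550'.

    Returns None when the input does not look like a catalog number.
    """
    match = re.fullmatch(r"\s*([A-Za-z]+)\s*\.?\s*(\d+[A-Za-z]?)\s*", text)
    if match is None:
        return None
    prefix, number = match.groups()
    return prefix.lower() + number.lower()
```

The editor who sets the tag has to follow the convention; the user who types "K. 550" into the box does not.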
In fact, I think a wiki-based solution is asking *more* of the editor
than a tag based solution. With tags, the editor creates a work, tags
that work, and anyone immediately can use it. However you implement
it, a wiki solution then requires:
1) Add the work
2) Go to the wiki, find the page for that composer, and add the work
(again). Then create a link to the work in the wiki.
3) Expect the user who's adding classical - especially a VA release,
and especially the casual editor - to go and find the wiki page for
each composer and save a copy somewhere locally, or have a script
which has each of those composers in some array. (note that already
we're introducing lag here - I now can't add a work and immediately
link to it via any script).
4) Either in the array based script, or via saved copies of pages, now
continually update your local copies to the most recent version, so
the script knows about recent additions to the works list.
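
The lag in steps 3 and 4 can be shown with a toy example (Python, illustration only; the dictionaries below merely stand in for the live work list and for a saved wiki page or in-script array, and the tag names are made up):

```python
# Hypothetical live work list, keyed by grouping tag.
live_works = {"k550": ["Symphony No. 40 in G minor, K. 550"]}

# A local snapshot (saved wiki page or array baked into a script),
# copied at the moment the user saved the page or installed the script.
snapshot = {tag: list(titles) for tag, titles in live_works.items()}

def find(source, tag):
    """Look up the works grouped under a tag; empty list if the tag is unknown."""
    return source.get(tag, [])

# An editor adds and tags a new work after the snapshot was taken.
live_works["k551"] = ["Symphony No. 41 in C major, K. 551"]

live_result = find(live_works, "k551")   # the live lookup sees the new work at once
stale_result = find(snapshot, "k551")    # the snapshot misses it until refreshed
```

A lookup against the live data sees the new work immediately; the snapshot stays blind to it until someone refreshes the copy by hand.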
In other words, if we do want a script to do this (and I think
expecting people to really do work-track ARs without one is an
untenable expectation), then if we do it using the wiki, the concept
simply breaks right from the start. The only editors who would ever
perhaps have up-to-date local copies of files would be the very editors
you're calling the "elite". You've also then doubled the amount of work
for a classical editor creating a work listing; tagging is a 3 second
solution, while duplicate-adding to a wiki is much more time intensive
(and much more prone to error, if the editor adding it varies from the
expected wiki format in any way). And if you choose to leave it to an
array maintained by a script author, that script writer then has to
continually stay on top of all additions to all work lists, and
users of the script have to continually install new versions.

In short, it'd be a massive amount of manual work to keep it
going, and no one in their right mind, I think, is going to be willing
to even bother to try to do it. Either someone will write a script
and it'll bit-rot, or (more likely) no one who knows how to write that
script is ever going to do it, as they'll recognize exactly these same
problems with any script so implemented.

Brian