[mb-users] Policy regarding making automated edits (bots)
Lauri Watts
krazykiwi at gmail.com
Fri Dec 1 15:53:30 UTC 2006
On 12/1/06, Thomas Tholén <badnewsbnw at gmail.com> wrote:
> No, he asked for the policy regarding making automated edits via
> scripts/bots.
> //[bnw]
To which there really isn't an answer, so here's my view:
a) We've had bots run before. Rob ran one a few times to hook up the
"feat...." info to identifiable artists. While that's left quite a
lot of work to do cleaning up the few that it got wrong, it's also
been immensely beneficial, not just because doing those initial
featured-artist ARs by hand would have been an enormous job, but
because the process of cleaning up the misidentified feats has been a
great way to find (and eventually fix) duplicated artists and artists
that are sharing a page and need splitting.
b) We've also had people suggest not precisely automated edits, but
'mass edits' based on simplistic criteria that were perhaps not
thought through: for example, removing all dates before 1982 for
anything with disc IDs, on the grounds that CDs weren't invented until
then (which misses the fact that very many CDs are simple transfers
of LPs onto CD). That, done on a mass automated basis, would have
been terrible.
So my point of view: If the basic premise of the bot is sound, I doubt
there's any reason to object.
Adding 'Latin' based on the script in the tracklistings can hardly go
wrong: it's either Latin or it's not. It's very binary.
There's already a report in the 'suggested edits' that lists albums
with language but no script set, and vice versa. It's usually pretty
short and lots of people monitor it regularly, but it's not uncommon
to go to an artist's page and see dozens of albums with neither set, so
setting the script would be a start.
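To make the "it's either Latin or it's not" check concrete, here is a minimal sketch of how a bot might decide it from the track titles alone. The function name album_is_latin is hypothetical, and this is not how any existing MusicBrainz bot works; it assumes only that the titles are available as Unicode strings, and it relies on Python's standard unicodedata module.

```python
import unicodedata


def album_is_latin(track_titles):
    """Return True only if every letter in every track title is a Latin letter.

    Hypothetical helper for illustration. Digits, spaces and punctuation are
    ignored; an album with no letters at all is left undecided (False).
    """
    letters = [ch for title in track_titles for ch in title if ch.isalpha()]
    if not letters:
        # Nothing to judge by (e.g. titles like "1999" or "..."), so don't guess.
        return False
    # Unicode character names for Latin letters start with "LATIN",
    # including accented ones such as "LATIN SMALL LETTER E WITH ACUTE".
    return all(unicodedata.name(ch, "").startswith("LATIN") for ch in letters)
```

The check is deliberately conservative: a single non-Latin letter anywhere in the tracklisting means the bot sets nothing and leaves the album for a human editor.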
Guessing language is another can of worms though. While the language
guesser is pretty good, I think we'd need to see some numbers: run it
across a few hundred already manually identified albums, in various
languages, and see how the numbers really stack up. I know it has a
tendency, for instance, to misidentify Swedish as a Romance language if
something like the accented e in bnw's surname is in there (and that's
not uncommon in 'slang' Swedish, or in names).
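The kind of numbers I mean could be gathered with something like the sketch below. It is only an illustration: guess_language stands in for whatever the real guesser is (its interface here is an assumption), and labelled_albums is assumed to be a list of (known language, track titles) pairs taken from albums that editors have already identified by hand.

```python
from collections import Counter, defaultdict


def evaluate_guesser(guess_language, labelled_albums):
    """Tally guessed languages against manually identified ones.

    guess_language: callable taking a list of track titles and returning a
        language name (a stand-in for the real guesser).
    labelled_albums: iterable of (known_language, track_titles) pairs.
    Returns {known_language: Counter({guessed_language: count})}.
    """
    confusion = defaultdict(Counter)
    for known, titles in labelled_albums:
        confusion[known][guess_language(titles)] += 1
    return dict(confusion)


def accuracy_by_language(confusion):
    """Fraction of albums, per known language, that were guessed correctly."""
    return {
        lang: counts[lang] / sum(counts.values())
        for lang, counts in confusion.items()
    }
```

The per-language tally matters more than one overall figure: a guesser can look fine on average while still sending a good share of Swedish albums to a Romance language, and that would show up directly in the "Swedish" row.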
--
Regards,
Lauri Watts