[mb-devel] Using NGS live feed in CTDB
Lukáš Lalinský
lalinsky at gmail.com
Fri May 20 20:14:17 UTC 2011
On Fri, May 20, 2011 at 9:55 PM, Gregory Chudov <gchudov at gmail.com> wrote:
> Greetings all, and congratulations on the successful launch of NGS. I've
> been waiting a long time for this.
>
> I'm a maintainer of CTDB and developer of CUETools/CUERipper.
> CTDB is a fast-growing online CD rip accuracy database, like AccurateRip,
> but a bit more advanced.
> The next version of CTDB will use the MusicBrainz Live Feed to provide
> metadata together with CD accuracy data.
>
> The MusicBrainz XML service is great, but a bit slow, and I want to reduce
> the number of servers that the ripper has to contact and the number of
> protocols it has to support, so CTDB will combine information from multiple
> sources and provide it to the client in a single request.
>
> Here is an example:
> http://dbnew.cuetools.net/lookup2.php?toc=32:19720:37477:63232:81375:100780:129857:148577:167265:189560:210052:232170:250767:268662:285200:303867:323640:341842&musicbrainz=1
>
> I have several questions.
> The first thing I noticed after setting up a database is that release
> lookup by cdtoc was slow.
> Using the 'EXPLAIN ANALYZE' command I found out that this was because an
> index was missing on the medium_cdtoc table.
> I fixed it with the following: 'CREATE INDEX medium_cdtoc_idx_cdtoc ON
> medium_cdtoc (cdtoc)'
> Is the missing index intentional, or was it created manually on the main
> server and forgotten in admin/sql/CreateIndexes.sql?
> Isn't the main MusicBrainz server slowed down by this?
MusicBrainz never uses the CD TOC directly for searching. It either
uses the DiscID for exact matches (which is really just a SHA-1 hash
of the CD TOC) or it uses the tracklist_index table which contains a
multi-dimensional index that can be used for fuzzy searches:
https://github.com/metabrainz/musicbrainz-server/blob/master/lib/MusicBrainz/Server/Data/DurationLookup.pm#L46
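
To make the two lookup paths concrete, here is a rough Python sketch. The
disc_id function follows the publicly documented MusicBrainz Disc ID
calculation (SHA-1 over a hex-formatted TOC, then a URL-safe base64 variant).
The functions track_durations_ms and duration_distance are hypothetical
helpers that only illustrate the idea of comparing track durations as points
in a multi-dimensional space; they are not the server's implementation, and
the Euclidean distance is an assumption chosen for illustration. Offsets are
assumed to be in CD frames (75 per second).

```python
import base64
import hashlib
import math


def disc_id(first, last, leadout, offsets):
    """Compute a MusicBrainz-style Disc ID from TOC data.

    first/last are track numbers, leadout and offsets are frame offsets
    (offsets[0] belongs to track `first`).
    """
    frames = [0] * 100
    frames[0] = leadout                      # slot 0 holds the lead-out
    for i, off in enumerate(offsets):
        frames[first + i] = off              # slots 1..99 hold track offsets
    text = "%02X%02X" % (first, last)
    text += "".join("%08X" % f for f in frames)
    digest = hashlib.sha1(text.encode("ascii")).digest()
    b64 = base64.b64encode(digest).decode("ascii")
    # Disc IDs replace the three characters that are awkward in URLs.
    return b64.translate(str.maketrans("+/=", "._-"))


def track_durations_ms(leadout, offsets):
    """Hypothetical helper: per-track durations in milliseconds."""
    bounds = list(offsets) + [leadout]
    return [(b - a) * 1000 // 75 for a, b in zip(bounds, bounds[1:])]


def duration_distance(a, b):
    """Hypothetical helper: Euclidean distance between duration vectors."""
    if len(a) != len(b):
        raise ValueError("track counts differ")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    offsets = [150, 15363, 32314, 46592, 63414, 80489]
    print(disc_id(1, 6, 95462, offsets))
    print(track_durations_ms(95462, offsets))
```

An exact match compares Disc IDs as strings, while a fuzzy match would
accept candidates whose duration vector lies within some small distance of
the query.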
> The second question is about the release_coverart table. It's empty in my
> replicated database.
> Did I forget to import something, or should I just run
> admin/sql/updates/ngs-fast-rebuild-coverart.sql? Won't this break
> replication?
> Should I rerun it regularly if replication packets don't update this table?
Unfortunately, MusicBrainz is not allowed to redistribute that data. I
think the script to run is ./admin/RebuildCoverArtUrls.pl, but it will
take a long time and you need an AWS API key.
> The third question is about one oddity in NGS.
> The release_label table lets one release have several catalog numbers.
> I've noticed that this is usually used for multi-disc sets, listing a
> catalog number for every disc in the release.
> Shouldn't the catalog number be attached to the medium instead of the
> release, then? Or maybe the catalog number should just support a wildcard
> to stand for the disc number?
Linking catalog numbers to mediums is a planned feature, but not
implemented yet.
Lukas