[mb-devel] HOWTO - Creating just the database in a VMware box with Debian

Arcade Arcade at dotcomrefugee.org
Sun Jun 5 16:54:03 UTC 2005


Thanks go to Dave for helping me with this.  It took me a while, but I
finally got it!  I'm posting my HOWTO for anyone who wants to do it
themselves ....

- Chris.
Arcade [-at-] dotcomrefugee [-dot-] org

===========================================================================
===========================================================================

HOWTO:
This how-to walks step by step through creating a VMware Workstation
virtual machine that contains just the MusicBrainz database (no web
server or interface).  We will download all the necessary packages and
the MusicBrainz database along the way.  I tested this successfully
during the first week of June 2005.

Prerequisites:
- High-speed Internet connection
- VMware Workstation 5
- 1 or 2 hours in front of the computer to install and set up everything.
- 2 to 5 hours to import the database.  It runs by itself.
- About 20 GB of hard drive space (at least I think you need that much)
- The latest Debian minimal net install CD:
http://cdimage.debian.org/pub/cdimage-testing/sarge_d-i/i386/rc3/sarge-i386-businesscard.iso

===========================================================================
===========================================================================

(INSTALLING DEBIAN INTO A VMWARE WORKSTATION)
1.  Create a VMWare machine with a 15-20 gig hard drive.

2.  Install Debian ...

2a. Download the latest snapshot installer from Debian.org (the ISO
listed in the prerequisites above) and boot the VMware machine from it.
Important installation notes:
- I let Debian use all the defaults for the installation process.
- When asked if I wanted to install any additional packages (like "X
Desktop", "SQL Server" or "File server"), I chose none of them.  We'll
install just the ones we need later.

---------------------------------------------------------------------------

(INSTALLING REQUIRED PACKAGES FOR MUSICBRAINZ SERVER)
3.  When you're at the prompt, log in as root and run "apt-get update"
to make sure that you have the latest package list from Debian.

4.  Run "apt-get install postgresql" to install Postgres.
Important notes: When asked which locale you want, select "C".  By
default it selects "en_us" which does not seem to work.

5.  Run "apt-get install postgresql-dev".  It is required by a Perl
module that will later be installed.

6.  Run "apt-get install cvs".  Required to download the mb_server via
CVS.

7.  Run "apt-get install apache-perl".  Required include files for
mb_server's Perl scripts that we'll download later.

8.  Run "apt-get install bzip2".  Required to uncompress the full
exports.
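
Steps 3 through 8 can also be collected into one small script.  The
following is only a sketch: the DRY_RUN switch and the run and
install_packages names are made up for this example and are not part of
Debian.  By default it only prints the commands so they can be reviewed
first.

```shell
#!/bin/sh
# Sketch only: runs the apt-get commands from steps 3 through 8 in order.
# DRY_RUN, run and install_packages are illustrative names, not Debian tools.

# Packages named in steps 4 through 8.
PACKAGES="postgresql postgresql-dev cvs apache-perl bzip2"

# Default to printing the commands; set DRY_RUN=0 (as root) to really run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_packages() {
    run apt-get update || return 1
    for pkg in $PACKAGES; do
        # One package per call, so a failure is easy to attribute.
        run apt-get install "$pkg" || return 1
    done
}

install_packages
```

Run it as "DRY_RUN=0 sh scriptname" as root to perform the installs.
Expect it to stop for the locale question from step 4.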

---------------------------------------------------------------------------

(INSTALLING PERL MODULES REQUIRED TO IMPORT MUSICBRAINZ DATABASE)
9.  Run "cpan" to start the CPAN shell.
Important notes:  Because this is the first time we've run CPAN, it will
ask if we want to manually configure CPAN.  I answered "no".

10.  Type "install DBI" in the CPAN shell to install DBI.

11.  Type "install DBD::Pg" in the CPAN shell to install DBD::Pg.  
Important notes: If it asks to follow other dependencies, say "yes" and
let it download and install those too.

12.  Type "install Text::Unaccent".  
Important notes: If it asks to follow other dependencies, say "yes" and
let it download and install those too.

13.  Type "install Date::Calc"
Important notes: If it asks to follow other dependencies, say "yes" and
let it download and install those too.

14.  Type "install String::ShellQuote"
Important notes: If it asks to follow other dependencies, say "yes" and
let it download and install those too.
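
Once you have left the CPAN shell, a quick check that each module
actually loads can save a failed import later.  This is a minimal
sketch, assuming "perl" is on the PATH; check_module is a made-up helper
name.  It relies only on "perl -MName -e 1", which exits non-zero when
the module cannot be loaded.

```shell
#!/bin/sh
# Sketch only: confirm the modules from steps 10 through 14 can be loaded.
# check_module is an illustrative helper name, not part of CPAN or mb_server.

MODULES="DBI DBD::Pg Text::Unaccent Date::Calc String::ShellQuote"

# "perl -MName -e 1" loads the module and exits non-zero if that fails.
check_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

for mod in $MODULES; do
    if check_module "$mod"; then
        echo "OK       $mod"
    else
        echo "MISSING  $mod"
    fi
done
```

Any module reported as MISSING can be installed again from the CPAN
shell as described in the steps above.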

---------------------------------------------------------------------------

(GETTING THE MB_SERVER FROM MUSICBRAINZ CVS)
15.  Cut and paste the following lines directly into the command prompt.
This will set up the environment variables necessary to download
"mb_server" from the CVS over an SSH connection.

CVSROOT=:ext:cvs@cvs.musicbrainz.org:/var/cvs
export CVSROOT
wget ftp://ftp.musicbrainz.org/pub/musicbrainz/misc/musicbrainz_anoncvs
chmod 700 musicbrainz_anoncvs
mv musicbrainz_anoncvs ~/
CVS_RSH=~/musicbrainz_anoncvs
export CVS_RSH

17.  Type the following into the command prompt.  This will create the
default directories found in the mb_server Perl scripts so that you
won't have to change any lines in the Musicbrainz code.

17a. mkdir /home/httpd
17b. mkdir /home/httpd/musicbrainz
17c. cd /home/httpd/musicbrainz

Important note:  From this point on, we're going to be working in the
"/home/httpd/musicbrainz" directory.

18.  Run "cvs co -r RELEASE_20050527-BRANCH mb_server" to begin
downloading the mb_server branch.  This was the most current mb_server
branch as of this writing.  I would advise you to check
"http://musicbrainz.org/CVS/Tag" to see which branch is the most
current.

19.  Run "ln -s /usr/bin/perl mb_server/cgi-bin/perl" to symbolically
link Perl.  Some of the mb_server scripts look for Perl in a different
place than it's really installed on Debian.  This will fix that issue.
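
Before moving on, it may be worth confirming that the checkout and the
symlink from steps 18 and 19 are in place.  A minimal sketch follows;
check_checkout and MB_BASE are names invented for this example, and the
only paths it looks at are the ones this how-to itself uses.

```shell
#!/bin/sh
# Sketch only: sanity-check the result of steps 15 through 19.
# check_checkout is an illustrative helper; pass it the directory that
# contains mb_server (this how-to uses /home/httpd/musicbrainz).

check_checkout() {
    base="$1"
    status=0
    # The import script used in step 23 should exist after the checkout.
    if [ ! -f "$base/mb_server/admin/InitDb.pl" ]; then
        echo "missing: $base/mb_server/admin/InitDb.pl"
        status=1
    fi
    # The symlink from step 19 should exist and point at a real file.
    if [ ! -L "$base/mb_server/cgi-bin/perl" ] ||
       [ ! -e "$base/mb_server/cgi-bin/perl" ]; then
        echo "missing or broken symlink: $base/mb_server/cgi-bin/perl"
        status=1
    fi
    return "$status"
}

if check_checkout "${MB_BASE:-/home/httpd/musicbrainz}"; then
    echo "checkout looks complete"
fi
```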

---------------------------------------------------------------------------

(CHANGE POSTGRES'S PERMISSIONS TO ALLOW EVERYONE)
20.  Edit "/etc/postgresql/pg_hba.conf" so that we open Postgres's
security up wide.  If you are familiar with Postgres's security, then
disregard steps #20 and #21 and change them to suit your own needs.  For
me, it's running entirely local, so security is not a concern.

20a.  Uncomment line 60 to read:
"local all all trust"

20b.  Uncomment line 62 to read:
"host all all 127.0.0.1 255.255.255.255 trust"

20c.  Change line 86 to read:
"local all postgres password"

20d.  Change line 99 to read:
"host all all 0.0.0.0 0.0.0.0 password"
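
The line numbers above depend on the exact pg_hba.conf that shipped with
the package, so it can help to list only the active rules after editing.
PostgreSQL uses the first rule that matches a connection, so the order
in which they appear matters.  A minimal sketch, where active_rules and
HBA are names invented for this example:

```shell
#!/bin/sh
# Sketch only: print the active (non-comment, non-blank) lines of
# pg_hba.conf with their line numbers.  active_rules is an illustrative
# helper name, not a PostgreSQL tool.

active_rules() {
    # "grep -n ''" prefixes every line with its number; the second grep
    # drops lines that are blank or start with optional whitespace and "#".
    grep -n '' "$1" | grep -v -E '^[0-9]+:[[:space:]]*(#|$)'
}

HBA="${HBA:-/etc/postgresql/pg_hba.conf}"
if [ -r "$HBA" ]; then
    active_rules "$HBA"
else
    echo "cannot read $HBA"
fi
```

After steps 20a through 20d, the output should show the four rules
quoted above.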

21.  Change the password for the PostgreSQL user "postgres" ...

21a.  Run "su postgres" to switch to the postgres system user.

21b.  Run "psql template1" to start the PostgreSQL prompt.

21c.  Type "alter user postgres with password 'postgres';" at the prompt
to change the password.

21d.  Type "\q" at the prompt to quit PostgreSQL.

21e.  Type "exit" at the shell prompt to log out of the postgres user
and return to root.

21f.  Type "/etc/init.d/postgresql restart" to restart Postgres and make
the changes effective.

---------------------------------------------------------------------------

(DOWNLOADING THE MUSICBRAINZ DATABASE DUMPS)
22.  If you haven't done it before starting this how-to, download all 5
of the MusicBrainz database dumps.  For this how-to, I suggest putting
them into "/home/httpd/musicbrainz".  Download the latest release of the
database.  To check, go to
"http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/"

22a.  You can use "wget -c" to download them like this:
"wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump.tar.bz2"

"wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-derived.tar.bz2"

"wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-moderation.tar.bz2"

"wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-closedmoderation.tar.bz2"

"wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-artistrelation.tar.bz2"
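
The five downloads differ only in the file name, so they can be fetched
in a loop.  This is a sketch rather than a tested recipe: fetch_dumps,
EXPORT and DRY_RUN are names made up for this example, and the export
directory is the one used in this how-to, so check the fullexport
listing for a newer one.  By default it only prints the wget commands.

```shell
#!/bin/sh
# Sketch only: fetch the five dump files from step 22a in a loop.
# DRY_RUN, EXPORT and fetch_dumps are illustrative names.

BASE_URL="http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport"
# Export directory used in this how-to; newer exports have other names.
EXPORT="${EXPORT:-20050528-074029}"
FILES="mbdump mbdump-derived mbdump-moderation mbdump-closedmoderation mbdump-artistrelation"

# Default to printing the commands; set DRY_RUN=0 to download.
DRY_RUN="${DRY_RUN:-1}"

fetch_dumps() {
    for name in $FILES; do
        url="$BASE_URL/$EXPORT/$name.tar.bz2"
        if [ "$DRY_RUN" = "1" ]; then
            echo "wget -c $url"
        else
            # -c resumes a partial download instead of starting over.
            wget -c "$url" || return 1
        fi
    done
}

fetch_dumps
```

Run it from "/home/httpd/musicbrainz" with DRY_RUN=0 so the files land
where step 23 expects them.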

---------------------------------------------------------------------------

(IMPORTING THE DATABASE INTO POSTGRES)
23.  Run the following to begin importing the MusicBrainz database:
"mb_server/admin/InitDb.pl --createdb --echo --import mb*.tar.bz2"

Important note:  This process took my computer (a 1.0 GHz P3 with 1 GB
of RAM, running VMware Workstation 5 with 384 MB of RAM assigned to the
VMware machine) 4.5 hours to complete!  Be patient.

Troubleshooting:
- If you see something similar to this:
"Schema sequence mismatch - codebase is 6, snapshot files are 7."
It means that the Perl scripts that you downloaded in steps #15 thru #19
do not match the database you downloaded in step #22.  Make sure that
you are using the most current branch as noted in step #18 and that you
have the latest MusicBrainz databases in step #22!

- If you see something similar to this:
"Error loading /tmp/MBImport-hYEom30W/mbdump/moderation_closed: Error
loading data at
/drive2/home/httpd/musicbrainz/mb_server/admin/MBImport.pl line 259.

$sql->Rollback called without $sql->Begin at
/drive2/home/httpd/musicbrainz/mb_server/admin/MBImport.pl line 270"
It means that you ran out of hard drive space.  It's a VMware machine,
so make sure that you create the virtual hard drive with a large enough
maximum size.  20 GB worked for me.  Thanks to Dave for both tips.
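
Since running out of space only shows up hours into the import, a quick
look at the free space beforehand may help.  The sketch below checks
/tmp (where the error above shows the import unpacking its files) and
the working directory.  free_kb, MB_BASE and MIN_FREE_KB are names
invented for this example, and the 10 GB threshold is an arbitrary
value, not a figure from MusicBrainz.

```shell
#!/bin/sh
# Sketch only: report free space before starting the import.
# free_kb, MB_BASE and MIN_FREE_KB are illustrative; the threshold is an
# assumption, so pick one that suits the size of your dumps.

# Print the free space, in kilobytes, of the filesystem holding "$1".
free_kb() {
    # -P forces the one-line-per-filesystem POSIX format and -k reports
    # 1K blocks; column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

MIN_FREE_KB="${MIN_FREE_KB:-10485760}"   # 10 GB, an arbitrary example

for dir in /tmp "${MB_BASE:-/home/httpd/musicbrainz}"; do
    [ -d "$dir" ] || continue
    avail=$(free_kb "$dir")
    if [ "$avail" -lt "$MIN_FREE_KB" ]; then
        echo "warning: only ${avail} KB free under $dir"
    else
        echo "ok: ${avail} KB free under $dir"
    fi
done
```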

That's all!

---------------------------------------------------------------------------

Final notes:

When you're done, you can download the Postgres database client "pgAdmin
III" for Windows from "http://www.pgadmin.org/".  Connect it to your
VMware machine using the bridged network built into VMware, and you can
do whatever you want with the data.

Good luck.




