[mb-users] Why is robots.txt disallowing all of this?

Chad Wilson chad.wilson at gmx.net
Sun May 4 01:46:41 UTC 2008


Steve Wyles wrote:
> On Sat, 3 May 2008, Philip Jägenstedt wrote:
>
>> Wondering why Google doesn't index MusicBrainz very well I turned to
>> http://musicbrainz.org/robots.txt
>>
>> It seems most of everything is off limits to the search engines. Why?
>> Wouldn't search traffic generate more traffic to MusicBrainz, giving
>> us more new users?
>
> The load generated by search engines indexing the whole site could 
> make it unusable for users.
>
> Steve (inhouseuk)
>
I personally think we should be trying to find a way to 
accommodate this kind of load. One doesn't take over the world by 
shutting one's doors. If we want MB data to become a serious reference 
point for people, it at least needs to be searchable and available. 
Several sites that take feeds of our data manage to be indexed by Google, 
so it seems embarrassing for us not to be able to handle it.

Could we provide dedicated mirror hardware for search engines to use for 
the initial index? Or is the architecture/code fundamentally unable to 
scale to high volumes of traffic?

Chad / voice


