Song Fight! Data (for programmers)

Lunkhead
You're No Good
Posts: 8107
Joined: Sat Sep 25, 2004 12:14 pm
Instruments: many
Recording Method: cubase/mac/tascam4x4
Submitting as: Berkeley Social Scene, Merisan, Tiny Robots
Pronouns: he/him
Location: Berkeley, CA
Contact:

Post by Lunkhead »

This post is mostly for programmers out there. BLT, and other jokers, please don't post in here just to say that you don't know what I'm talking about, etc. Let's all just pretend that you've posted some lolz here for us all to enjoy then get on with our lives, thanks. :)

Anyway, I would like to suggest that any other crazy people like me and Manhattan Glutton who want to build Song Fight!-related apps consider using the data that is already available from my "Jukebox" site before they go off and write more scraping code. The jukebox is really two apps: a database of the fight-related data with a simple RESTful Web service on top that makes the data available in easy-to-consume formats, and a jukebox built on top of that database. I'd love to grow the database aspect, and maybe even fully split it out from the jukebox.

Getting the data out of my site is very easy. Some examples:

All the fight data for fights started after July 1st last year, sorted with most recent fights first, in JSON:

http://sfjukebox.org/fights.json?minSta ... ding=false

All the artist data (minus the extended profile info, which I don't have yet and may never import since it's mostly stale) for artists who first entered after July 1st last year, sorted by artist name, in JSON:

http://sfjukebox.org/artists.json?minFi ... nding=true

I've limited these examples to data within the last year, but it's possible to return the whole dataset by removing the restriction. It's similarly easy to get the individual artist and fight info out in easy to consume formats.
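For anyone who wants to try this from a script, here is a minimal Python sketch of pulling one of these JSON resources. The host and the `fights.json` / `artists.json` resource names come from the links above; the helper names `build_url` and `fetch` are made up for illustration, and since the query parameters are cut off in the links above, the sketch just passes through whatever parameters the caller supplies rather than guessing at them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://sfjukebox.org"  # host taken from the example links


def build_url(resource, params=None):
    """Build a URL such as http://sfjukebox.org/fights.json?name=value.

    `params` is an optional dict of query parameters. The real parameter
    names are not spelled out in this thread, so none are assumed here.
    """
    query = urlencode(params or {})
    url = "{}/{}.json".format(BASE, resource)
    return url + "?" + query if query else url


def fetch(resource, params=None):
    """Download a resource and decode the JSON body (makes a network call)."""
    with urlopen(build_url(resource, params)) as response:
        return json.load(response)


# With no parameters this asks for the unrestricted dataset.
print(build_url("fights"))
```

Calling `fetch("artists")` would then return whatever structure the service sends back; the shape of that structure is not described here, so inspect it before relying on any field names.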

I would love to work with anybody who wants to build a Song Fight! app to make the data available in whatever format works for them, and to work with people on trying to add data that is missing.

Maybe some day we could have something that was so good that we could then flip things around and make it the real system of record and drive songfight.org off of it and stop all the scraping and importing. ;)
fluffy
Eruption
Posts: 11029
Joined: Sat Sep 25, 2004 10:56 am
Instruments: sometimes
Recording Method: Logic Pro X
Submitting as: Sockpuppet
Pronouns: she/they
Location: Seattle-ish
Contact:

Re: Song Fight! Data (for programmers)

Post by fluffy »

That's beautiful. I hope I didn't break things too badly with the improved band key mapping. :) (I've also just fixed the way ampersands are handled, namely by removing some additional archive weirdness where some were stored as &amp; and some as a bare &.)

Post by Lunkhead »

I think the only way your changes would affect the jukebox is on the jukebox artist pages, where there is a link to the "official archive page". I had copied the old artist name to artist key mapping code that Spud sent me and pasted it into one of my Java classes and Java-fied it, to make those links. I think that's the only place where I used the artist key. I will update that to use the new code you posted in the other thread so those links work again. I would also happily link to the artist's wiki page, if MG can send me the code I need to convert an artist name into an artist wiki key (or give me a URL about how that works if it's some standard MediaWiki thing).

Post by Lunkhead »

Oh wait...

fluffy wrote: Well, another thing I did was make it so that you can use the plain artist name in the URL, like:

http://songfight.org/artistpage.php?key=Jon%20Eric

will map internally to the same key.

So probably what I should do is remove my copy of the old code and instead use the actual artist names in my links to the "official" archive pages? And this will work as long as I properly encode the funky characters in the artist names?

Post by fluffy »

I only put that in as a friendlier way of sanitizing the input, but yeah, I guess it's not a bad idea. I'd be concerned about some of the weirder interplays between non-ASCII characters and entities and whatever though.

Post by Lunkhead »

Oy, for example, ¡Juiceharp! is problematic. It doesn't work 100% right either on songfight.org or the jukebox. I'm probably just going to use the full artist names, since that simplifies things on my side and removes the need for me to maintain a copy of the artist name to artist key mapping code. I'll just live with the few edge cases that are broken for now.
Spud
Hot for Teacher
Posts: 4770
Joined: Fri Sep 24, 2004 10:25 am
Instruments: Bass, Keyboards, eHorn
Submitting as: Octothorpe
Location: Seattle
Contact:

Post by Spud »

fluffy, are you fucking with my code? just want to know, that's all.
"I only listen to good music. And Octothorpe." - Marcus Kellis
Song Fight! The Rockening

Post by fluffy »

Yes, I was. Because it was broken and I'd like it to not be.

Post by fluffy »

Oh, also, I found the bug with ¡Juiceharp! et al., but I got distracted by friends showing up before I had a chance to fix it. It should be fixed now. (Note, though, that the site expects non-ASCII characters in ISO-8859-1 and most browsers send UTF-8, so you still can't use a direct link like http://songfight.org/artistpage.php?key=¡Juiceharp! unless that link comes from an ISO-8859-1 page. And even then it's not guaranteed. So basically what I'm saying is that you should use the programmatically generated keys and not try to do anything fancy like using the plaintext name.)
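To make the charset mismatch concrete, here is a small Python sketch that percent-encodes a plaintext artist name for the `artistpage.php?key=` link under both encodings. The function name `artist_link` is made up for illustration, and this only shows what bytes end up in the URL; it says nothing about how the server will treat them.

```python
from urllib.parse import quote

ARTIST_PAGE = "http://songfight.org/artistpage.php?key="


def artist_link(name, charset="latin-1"):
    """Percent-encode `name` using `charset` and append it to the page URL.

    "latin-1" is Python's name for ISO-8859-1. A name containing characters
    outside the chosen charset raises UnicodeEncodeError.
    """
    return ARTIST_PAGE + quote(name, safe="", encoding=charset)


print(artist_link("Jon Eric"))
# http://songfight.org/artistpage.php?key=Jon%20Eric
print(artist_link("¡Juiceharp!"))
# http://songfight.org/artistpage.php?key=%A1Juiceharp%21
print(artist_link("¡Juiceharp!", charset="utf-8"))
# http://songfight.org/artistpage.php?key=%C2%A1Juiceharp%21
```

The same name yields one byte (%A1) for the inverted exclamation mark in ISO-8859-1 and two bytes (%C2%A1) in UTF-8, which is why a link generated in one encoding cannot be assumed to match a key lookup done in the other.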

If you want to do the mapping yourself, the current code is this:

function makeKey($aName)
{
  $aKey = strtolower($aName);

  # strip a leading "a" or "the" as a word
  $aKey = preg_replace('/^(a|the) /','',$aKey);
  # convert spaces to underscores
  $aKey = str_replace(' ','_',$aKey);
  # convert entities to plaintext
  $aKey = html_entity_decode($aKey);
  # convert non-URL-safe characters to %xx
  $aKey = urlencode($aKey);
  # convert all runs of non-allowed characters into a single _
  $aKey = preg_replace('/[^a-zA-Z0-9\-]+/','_',$aKey);

  return $aKey;
}
but it would be better to just store the URL you scraped the data from. If you're still scraping, I mean. If you're processing the actual archive data file then I guess you need the code.
jast
Ice Cream Man
Posts: 1325
Joined: Tue Jul 29, 2008 7:03 pm
Instruments: Vocals, guitar
Recording Method: Cubase, Steinberg UR44
Submitting as: Jan Krueger
Pronouns: .
Location: near Aachen, Germany
Contact:

Post by jast »

That will make it a lot easier to automatically update the wiki. Thanks for posting.
jb
Hot for Teacher
Posts: 4159
Joined: Sat Sep 25, 2004 10:12 am
Instruments: Guitar, Cello, Keys, Uke, Vox, Perc
Recording Method: Logic X
Submitting as: The John Benjamin Band
Pronouns: he/him
Location: WASHINGTON, DC
Contact:

Post by jb »

Crossposting from the Wiki thread.

A while ago I took a stab at creating a schema for a DB that would serve not only Song Fight, but Cover Fight, Nur Ein, and any other entity that wanted to do the traditional "song fight" thing. It's never been implemented, but here it is:

https://spreadsheets.google.com/spreads ... utput=html

I am not a professional database administrator or designer, so this design is not in any standard database design format, just a spreadsheet. A pair of spreadsheet rows defines a table, with the name of the table on top. Foreign keys are colored according to the table they come from. I did that to make sure I was linking everything correctly.

If I remember correctly, this is compliant with at least the second normal form (I didn't evaluate it against the third).

I am releasing this into the public domain! Use as you please.

JB
blippity blop ya don’t stop heyyyyyyyyy