Archive data fun

Lunkhead
Rosselli
Posts: 8567
Joined: Sat Sep 25, 2004 12:14 pm
Instruments: many
Recording Method: cubase/mac/tascam4x4
Submitting as: Berkeley Social Scene
Pronouns: he/him
Location: Central Oregon
Contact:

Archive data fun

Post by Lunkhead »

http://sfjukebox.org/artists?sort=fight ... ding=false

Top 10 most entries under a single name:
106 Ross Durand
104 WreckdoM
103 Steve Durand
102 Melvin
101 Paco del Stinko
101 The Weakest Suit
93 King Arthur
84 Johnny Cashpoint
79 Octothorpe
62 Hostess Mostess

Berkeley Social Scene (currently #14 with 51 entries) is gunning for the top 10, but we are a long way off, especially from the rarefied heights of the "100 club", now six members strong. Will King Arthur break 100 this year? Will Melvin come out of retirement to stay in the top 5? Will WreckdoM get back in the game to battle Ross for the top spot? Can anyone really compete with the unstoppable Durand and Stinko juggernauts? Tune in next fight.
Caravan Ray
bono
bono
Posts: 8745
Joined: Sat Sep 25, 2004 1:51 pm
Instruments: Penis
Recording Method: Garageband
Submitting as: Caravan Ray,G.O.R.T.E.C,Lyricburglar,The Thugs from the Scallop Industry
Location: Toowoomba, Queensland
Contact:

Re: Quantity and/or quality

Post by Caravan Ray »

Lunkhead wrote:http://sfjukebox.org/artists?sort=fight ... ding=false

Top 10 most entries under a single name:
106 Ross Durand
104 WreckdoM
103 Steve Durand
102 Melvin
101 Paco del Stinko
101 The Weakest Suit
93 King Arthur
84 Johnny Cashpoint
79 Octothorpe
62 Hostess Mostess
...and at number 11:

60 Caravan Ray

which will be 61 when my entry for this week is counted.
Your Top 10 days are numbered HoMo

Re: Archive data fun

Post by Lunkhead »

Sorry, CR, I didn't mean to discount your performance. You will undoubtedly ascend to the Top 10 soon. Nicely done.

I'm messing around with making some charts out of the archive data. I'm doing "number of votes per entry over time" for artists, as a test case. I like MG's chart on the sfbase.net artist pages showing entries over time. I could do something like that too, though I don't think I can actually plot the fight titles on the chart (I'm using the Google Chart API for now in case anybody knows anything about it).

I'm not very experienced with making charts, and I don't really have many good ideas for what would be interesting to visualize. Anybody got any ideas or suggestions for charts/visualizations they'd like to see?
JonPorobil
Ibárruri
Posts: 5682
Joined: Sat Sep 25, 2004 11:45 am
Instruments: Piano, Guitar, Harmonica, Mandolin, Accordion, Bass, lots of VSTs
Recording Method: Cubase 10.5
Submitting as: Jon Eric, Jon Porobil, others
Pronouns: He/Him
Location: Pittsburgh, PA
Contact:

Re: Archive data fun

Post by JonPorobil »

Perhaps some charts depicting the percent of vote share in each fight entered over a period of time. I would imagine that, as they enter more fights, many competitors have gotten better, some have been more or less level, and some peaked and declined. Frontalot's chart would be exponential if we had the vote data going back that far.

Oh wait, we don't have percentages for the multi-vote fights, do we? Dang.
"Warren Zevon would be proud." -Reve Mosquito

Stages, an album about dealing with loss, anxiety, and grieving a difficult year, now available on Bandcamp and all streaming platforms! https://jonporobil.bandcamp.com/album/stages

Re: Archive data fun

Post by Lunkhead »

Your wish is my kludge command. If you go to an artist page, say:

http://sfjukebox.org/artists/Jon+Eric

... there is a little chart icon to the right of the "Fight" and "Votes" column labels. If you click that, you should see a dialog with a chart of votes per fight over time, and a menu from which you can also select a chart that shows percent of votes per fight over time. (In both cases, where vote data is not available, I just used 0 as the data point for now.)

(For Internet Explorer users, I have not tested this on IE.)

Oh, also, for kicks I have it switching to a column chart if there are 10 or fewer fights to plot.
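
A minimal sketch of the two rules just described: 0 as a stand-in where vote data is missing, and a column chart when there are 10 or fewer fights. The function names and the sample data are made up for illustration; the site's actual code is not shown in this thread.

```python
from typing import List, Optional, Sequence

COLUMN_CHART_MAX_FIGHTS = 10  # threshold taken from the post above


def chart_points(votes: Sequence[Optional[int]]) -> List[int]:
    """Replace missing vote counts (None) with 0 as a stand-in data point."""
    return [0 if v is None else v for v in votes]


def chart_type(num_fights: int) -> str:
    """Pick a column chart for short histories and a line chart otherwise."""
    return "column" if num_fights <= COLUMN_CHART_MAX_FIGHTS else "line"


# Hypothetical artist history; None means votes were not recorded.
votes = [4, None, 7, 2]
print(chart_points(votes), chart_type(len(votes)))
```

One thing to keep in mind with the zero-fill is that a fight with no recorded votes looks the same on the chart as a song that really got 0 votes.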

Re: Archive data fun

Post by JonPorobil »

That's pretty sweet.

I love how there's a huge peak in the middle of my chart (for "What a Horrible Thing to Say") but it still wasn't a win. Oh well.

The only real trouble is that the percentages are all messed up post-June-2008 because of the switch to multi-votes. From that point on, the tallies all added up to greater than 100%. Often substantially greater. Can't have it all, can ya.
Paco Del Stinko
Roosevelt
Posts: 3550
Joined: Fri Apr 07, 2006 11:20 am
Instruments: Basic rock, at a basic level.
Recording Method: Roland 2480
Submitting as: Paco del Stinko
Location: Massachusetts. God save the Commonwealth!

Re: Archive data fun

Post by Paco Del Stinko »

Ha! I am out of the fight this week, but I have been thinking about the most submissions thing lately. Nice goal!
Bringin' the stink since 2006.

Re: Archive data fun

Post by Lunkhead »

Generic wrote:That's pretty sweet.

I love how there's a huge peak in the middle of my chart (for "What a Horrible Thing to Say") but it still wasn't a win. Oh well.

The only real trouble is that the percentages are all messed up post-June-2008 because of the switch to multi-votes. From that point on, the tallies all added up to greater than 100%. Often substantially greater. Can't have it all, can ya.
Thanks, it was fun to quickly hack in. I'm not sure I understand what you mean about the percentages, though. Can you include a URL of where the percentages aren't right? I don't know if you mean on sfjukebox.org, songfight.org, some other site, or just conceptually. Are you defining "one vote" as "one person's vote, which may be for multiple songs in multiple voting enabled fights" or something? I am probably being dense, sorry, had a long day.

Re: Archive data fun

Post by JonPorobil »

It's not a correctable error, because the data simply isn't being saved anywhere.

Back in the one-vote days, everyone's percentage of the vote added up to 100. If you got 25% of the vote, that means 25% of the people thought you had the best song. Now, you can have 25% of the vote, but since many of them would be people who also voted for someone else, that number doesn't mean as much as before.

It just means anyone who's been active since before the summer of 2008 will probably see a spike in their vote share after that cutoff time.

Re: Archive data fun

Post by Lunkhead »

I guess the votes might imply a different sentiment on the part of the voter now, since a voter can vote for more than one song in a fight. Maybe before people had to vote for only their "favorite" so the vote "meant" something "more". That's all very subjective though, and we can really only speculate about how people decided which song to vote for. Maybe lots of people just voted for themselves for fear of not getting any votes and now they can vote for others too? And then there's friend flooding, and a variety of other factors.

I don't think any of that changes much about calculating a percentage of votes. (Isn't it still the number of votes for a song divided by the total number of votes in the fight?) Also, looking at the graphs for people who started submitting before 2008 and who've submitted a bunch after that, I don't see any kind of clear trend like the one you're suggesting, but that's just anecdotal. Also anecdotally, looking at the graphs, and considering the above factors, the voting seems pretty random. I guess if I knew more about charts and graphs I would probably try to set all the graphs to the same scale or something ... ? I don't know... I'm tired...
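
To make the two readings of "percentage" concrete, here is a rough sketch with made-up ballots. The per-voter ballots are purely an assumption for illustration, since the archive reportedly does not store them. Dividing by the total number of votes always sums to 100%, while dividing by the number of voters can go well over 100% once people may vote for more than one song.

```python
from collections import Counter

# Hypothetical multi-vote fight: each inner list is one voter's ballot.
ballots = [["A", "B"], ["A"], ["A", "C"], ["B", "C"]]

votes = Counter(song for ballot in ballots for song in ballot)
total_votes = sum(votes.values())  # every vote cast, across all ballots
num_voters = len(ballots)          # number of people who voted

# Share of all votes cast: sums to 100% by construction.
pct_of_votes = {s: 100 * n / total_votes for s, n in votes.items()}

# Share of voters who picked each song: can sum to more than 100%.
pct_of_voters = {s: 100 * n / num_voters for s, n in votes.items()}

print(round(sum(pct_of_votes.values())))
print(round(sum(pct_of_voters.values())))
```

With only per-song vote totals available, the first calculation is the only one that can be done for the multi-vote fights.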
Manhattan Glutton
Niemöller
Posts: 1530
Joined: Tue Feb 15, 2005 12:10 pm
Instruments: Angst
Recording Method: REAPER
Location: Madison, WI
Contact:

Re: Archive data fun

Post by Manhattan Glutton »

Lunkhead - any chance you could make the graphs easy to link to from the wiki so I can do away with the ugly ones?

Also, I don't know much about math, and maybe this might be too meta, but perhaps it should be the ratio of % of votes to average % of votes? Or something of that sort? Standard deviation something something. Or maybe just rank and/or rank percentile - I was working on that for the wiki at some point.
If I had a dollar for every one of my songs j$ has called a 90s pastiche, I'd have $1 for every song I've written.

Nur Ein Archives | The New Ugly Podcast
king_arthur
Niemöller
Posts: 1763
Joined: Sun Sep 26, 2004 6:56 am
Instruments: guitar, vocals, bass, BIAB, keyboards (synth anything)
Recording Method: Tascam DP-24SD
Submitting as: King Arthur
Pronouns: he/him
Location: Phoenix, AZ
Contact:

Re: Archive data fun

Post by king_arthur »

Yeah, I was thinking about that multiple-votes thing too. Presumably there is no way to go back and determine how many different voters there were for the fights since 6/08.

Maybe the chart could just include a little vertical red line indicating where the multiple vote stuff started?

Since we're all just asking for stuff, how about something like "what percentage of the people who entered this fight did I beat, vote-wise?" Add a fudge factor so that if you won the fight it's always 100%; that way all the fight wins come out as high points, regardless of how many votes were cast or how many songs there were. If you were 5th out of 20 entries, that would show up higher on the chart than if you were 5th out of six entries. That might tend to hide the effects of multiple votes...

N = number of entries (should never be zero)
B = number of fighters who got fewer votes than I did

X = (B + 1) / N

so if I won the fight, B+1 = N, N/N = 100%

If I was third out of 5 entries, B=2; 3/5 = 60%

if I was third out of 10 entries, B=7; 8/10 = 80%

if I was third out of 20 entries, B=17; 18/20 = 90%

Something like that would give you more credit for finishing 3rd in a big fight than for finishing 3rd in a small fight. If I was tied for third out of 20 entries, I wouldn't get quite as high a score as if I was the only person in third for that fight, since I beat fewer people, which seems appropriate...
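
The idea above can be sketched roughly as follows, reading the formula as (B + 1) / N. The function name and the sample vote counts are invented for illustration, and the list of vote counts is assumed to include the artist's own entry.

```python
from typing import Sequence


def beat_share(my_votes: int, all_votes: Sequence[int]) -> float:
    """(B + 1) / N as a percentage; all_votes includes my own entry."""
    n = len(all_votes)
    if n == 0:
        raise ValueError("a fight needs at least one entry")
    # B: entries that got strictly fewer votes than mine.
    beaten = sum(1 for v in all_votes if v < my_votes)
    return 100 * (beaten + 1) / n


fight = [10, 8, 6, 4, 2]      # hypothetical vote counts for a 5-entry fight
print(beat_share(10, fight))  # the winner
print(beat_share(6, fight))   # third place
```

Note that with this reading a tie for first does not score 100%, because neither tied entry beat the other.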

Sorry, I get these ideas and tend to fly off with them...
"...one does not write in dactylic hexameter purely by accident..." - poetic designs

Re: Archive data fun

Post by Manhattan Glutton »

king_arthur wrote:X = ( B+1 / N )
It's more like this: http://en.wikipedia.org/wiki/Percentile_rank

(C + 0.5 * F) / N
Where C is the count ranked less, F is the count of equal rank, and N is the total count. I think.
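
A small sketch of that percentile rank formula, assuming F counts every entry with the same number of votes, including the artist's own entry. The function name and sample data are made up for illustration.

```python
from typing import Sequence


def percentile_rank(my_votes: int, all_votes: Sequence[int]) -> float:
    """(C + 0.5 * F) / N as a percentage; all_votes includes my own entry."""
    n = len(all_votes)
    if n == 0:
        raise ValueError("a fight needs at least one entry")
    below = sum(1 for v in all_votes if v < my_votes)   # C: ranked lower
    equal = sum(1 for v in all_votes if v == my_votes)  # F: same rank, me included
    return 100 * (below + 0.5 * equal) / n


fight = [10, 8, 6, 4, 2]           # hypothetical vote counts for a 5-entry fight
print(percentile_rank(10, fight))  # the winner
print(percentile_rank(6, fight))   # third place
```

Under this definition a sole winner lands below 100%, and gets closer to 100% as the fight gets bigger.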

Re: Archive data fun

Post by Lunkhead »

OK, percentile rank is now the default chart:

http://sfjukebox.org/artists/Manhattan%20Glutton

Re: Archive data fun

Post by Manhattan Glutton »

Nice! That seems a little more coherent.

Re: Archive data fun

Post by king_arthur »

yeah, thanks, this is getting interesting! a few surprises in my chart, although I think I tend to remember how I did in the reviews more than how I did in the voting, and there have been a few fights where the reviews and votes didn't seem to match up at all...

good stuff! thanks again!

Charles (KA)

Re: Archive data fun

Post by Lunkhead »

I also changed the vertical range for "% of votes" to try to make sure it goes from 0-100%. Maybe that will make that chart a little less spiky and random looking...? I am thinking I need to change the vertical range for # of votes, too, with the top of the range set a bit above the highest vote count any of the artist's songs has received. Depending on how big that range is and how the votes are distributed, I was thinking I might also make the # of votes vertical axis logarithmic. I have no idea what I'm doing though so maybe none of that makes sense. ???
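
One possible way to compute those axis settings, as a sketch only. The 10% headroom and the log10(v + 1) transform are assumptions chosen for illustration, not something decided in this thread, and the function names are made up.

```python
import math
from typing import List, Sequence


def votes_axis_max(votes: Sequence[int]) -> int:
    """Top of the '# of votes' axis: the highest count plus about 10% headroom."""
    highest = max(votes, default=0)
    # Integer arithmetic avoids float rounding; always leave at least 1 unit.
    return highest + max(1, highest // 10)


def log_scale(votes: Sequence[int]) -> List[float]:
    """log10(v + 1), so songs with 0 votes still map to a finite value (0.0)."""
    return [math.log10(v + 1) for v in votes]


votes = [3, 12, 40, 0]  # hypothetical vote counts for one artist
print(votes_axis_max(votes))
print(log_scale(votes))
```

The "+ 1" inside the logarithm matters because plenty of songs have 0 votes, and log10(0) is undefined.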

I still need to make the charts linkable. I think I'm going to be lazy and just kludge it so that if there is something in the URL like "?charts=true" then the chart dialog will pop up automatically when the page loads. Is that suitable? Ideally I would like to be able to provide the charts as embeddable widgets or something, but I won't have time for that for a while.
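
The "?charts=true" check could look something like the sketch below. On the real site this would presumably happen in the page's own script when it loads; Python is used here only to show the logic, and the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def should_open_charts(url: str) -> bool:
    """True when the URL's query string contains charts=true."""
    params = parse_qs(urlparse(url).query)
    # parse_qs returns a list of values per key; look at the first one.
    return params.get("charts", [""])[0].lower() == "true"


print(should_open_charts("http://sfjukebox.org/artists/Jon+Eric?charts=true"))
print(should_open_charts("http://sfjukebox.org/artists/Jon+Eric"))
```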

Re: Archive data fun

Post by Manhattan Glutton »

I'd really like to have those nicer graphs directly in the wiki, but the linking seemed like a better option. It seems like we kind of have somewhat overlapping goals now, and I'm trying to figure out how the wiki fits in and what direction to take it.
HeuristicsInc
Ibárruri
Posts: 5351
Joined: Sat Sep 25, 2004 6:14 pm
Instruments: Synths
Recording Method: Windows computer, Acid, Synths etc.
Submitting as: Heuristics Inc. (duh) + collabs
Pronouns: he/him
Location: Maryland USA
Contact:

Re: Archive data fun

Post by HeuristicsInc »

that's very cool. i appreciate the percentile rank chart.
152612141617123326211316121416172329292119162316331829382412351416132117152332252921
http://heuristicsinc.com
Liner Notes
SF Lyric Ideas

Re: Archive data fun

Post by Lunkhead »

Thanks, folks. I appreciate getting some suggestions on stuff to throw out there, it's fun.

MG, I don't have the easy link yet but I realized you could already just embed the charts in an iframe if you want. That is the hacky way that I'm including them at the moment, too. The URLs are like this:

http://sfjukebox.org/artists/chart/Lunkhead

Re: Archive data fun

Post by Lunkhead »

I whipped up a chart showing number of votes across the horizontal axis, and number of songs that received that number of votes on the vertical axis. I did not include songs from fights where votes were not recorded. Surprisingly, the most common numbers of votes received were 1, 2, 3, and then 0. I expected 0 to be higher ranked than that.

http://sfjukebox.org/songs/charts/votesDist
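
For anyone curious, the tally behind a chart like that can be sketched in a few lines. The sample data below is made up and is not taken from the archive; None stands for songs from fights where votes were not recorded, which are skipped as described above.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def votes_distribution(song_votes: Iterable[Optional[int]]) -> Dict[int, int]:
    """Map each vote count to the number of songs that received it."""
    # Songs with unrecorded votes (None) are left out of the tally.
    counts = Counter(v for v in song_votes if v is not None)
    return dict(sorted(counts.items()))


sample = [0, 1, 1, 2, None, 3, 1, 2, None]  # hypothetical per-song vote counts
print(votes_distribution(sample))
```

The keys go on the horizontal axis (number of votes) and the values on the vertical axis (number of songs).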

Re: Archive data fun

Post by king_arthur »

This is one where it would be interesting to see two lines, one for the "vote for one" era and one for the "vote for all the ones you like" era.

I would guess that that 0 votes vs. 1 vote difference just shows how often we vote for our own songs, even if we know they're crap :-)

Charles (KA)