Artist Consolidation
- Lunkhead
- You're No Good
- Posts: 8175
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene, Merisan, Tiny Robots
- Pronouns: he/him
- Location: Berkeley, CA
Artist Consolidation
Dear FMs, what do you think of consolidating the following artists? I initially came upon two of these by chance, but then it occurred to me to whip up a bit of code to find similar artist names (using Levenshtein distance for now) and found many more. I think it'd be an improvement to the archive. I might just go ahead and consolidate them all on the Jukebox as I've set up a simple form for doing that.
Tobin's Spirit Guide -> Tobin's Spirit Guide (merge the one with the fancy curly apostrophe into the one with the simple straight apostrophe)
Thanks Glenny for the Frisbee -> Thank Glennny for the Frisbee (the second one has three entries and has "Glennny" spelled right)
The Bersfordians -> The Beresfordians (second one shows up in Google search results other than Song Fight! related results)
Big Matt Hyatt and his Rusty Red Riders -> Big Matt Hyatt and His Rusty Red Ryders (second one has two songs, first only has one)
ChinMusic -> Chin Music (second has way more entries)
Chips, Abbott & Ray -> Chips, Abbott, & Ray (serial comma)
A Werkenhorse -> Daj Werkenhorse
Dan Werkenhorse -> Daj Werkenhorse (Daj has a bunch of entries, the others each have only one, they all sound like the same dude)
Dr Spectacular's Power Circus -> Dr. Spectacular's Power Circus
Elastic Wasteband -> Elastic Waste Band (second has two entries, first only one)
Evil E -> Evil-E
Flvxxvm Forvm -> Flvxxvm Florvm
Fortune's Favorite -> Fortune's Favorites
Freddie Love -> Freddielove
Hey, It's Romer -> Hey it's Romer
The Interchangables -> The Interchangeables (although I like the ring of "The Inter-chang-ables" a la Ben Chang from "Community")
JeebasJones -> Jeebas Jones
Links vs. Music -> Link vs. Music
MC Milk Plus -> MC Milk-Plus (Google search results indicate he uses the dash)
Meat Knob -> Meatknob
The Mexican Champanzees -> The Mexican Chimpanzees (although the misspelled one is funny)
Napoleons Toes -> Napoleon's Toes
Nobody, et al -> Nobody, et al.
Project-D -> Project D
RadioShow -> Radio Show
Ratt Poizon -> RattPoizon
rice Henry and the Transformers -> Brice Henry and the Transformers
RioMondo -> Rio Mondo
Ryan Rickenback -> Ryan Rickenbach
Star Crossed Voyager -> Star-Crossed Voyager
Also, dear eclectic spoons guy, wtf? Is it "her" or "the" spoons? Is it "Eclectic" or "@eclectic"? Is it "SpOOns" or "spOOns" or "sp00ns"? Oy!
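For anyone curious, a similar-name search like the one described above can be sketched roughly as follows. This is only an illustration, not the actual code used for the Jukebox: the function names, the lowercasing step, and the `max_distance=2` cutoff are all assumptions.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute (or match)
        previous = current
    return previous[-1]


def similar_pairs(names, max_distance=2):
    """Return (name_a, name_b, distance) for every pair within max_distance.

    Comparison is case-insensitive; max_distance is an arbitrary cutoff
    chosen for this sketch.
    """
    pairs = []
    for a, b in combinations(sorted(set(names)), 2):
        d = levenshtein(a.lower(), b.lower())
        if d <= max_distance:
            pairs.append((a, b, d))
    return pairs


artists = ["ChinMusic", "Chin Music", "Meat Knob", "Meatknob", "Caravan Ray"]
for a, b, d in similar_pairs(artists):
    print(f"{a!r} ~ {b!r} (distance {d})")
```

Comparing every pair is quadratic in the number of artists, which is fine for an archive-sized list, and the candidates still need a human to look them over before merging.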
- Lunkhead
Re: Artist Consolidation
There's also "Space Pub" and "Spacepub", but I'm not sure which way to go with that one. Maybe j$ could clarify. It sounds like him singing on those.
- fluffy
- Eruption
- Posts: 11097
- Joined: Sat Sep 25, 2004 10:56 am
- Instruments: sometimes
- Recording Method: Logic Pro X
- Submitting as: Sockpuppet
- Pronouns: she/they
- Location: Seattle-ish
Re: Artist Consolidation
IANAFM, but sounds good to me, at least for the cases where it isn't someone purposefully slightly changing their name every week for humor value.
What'd be really awesome is a properly normalized scheme where there's an abstract ID that multiple entry names can map onto but it's hard enough getting people to submit correctly as it is.
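To make the abstract-ID idea concrete, here is a minimal sketch of a lookup where several submitted spellings resolve to one artist ID. The class name `ArtistRegistry`, its methods, and the normalization rule (casefold plus collapsed whitespace) are hypothetical; nothing here reflects how the archive actually stores artists.

```python
from dataclasses import dataclass, field


@dataclass
class ArtistRegistry:
    """Maps many submitted spellings onto one stable artist ID (hypothetical design)."""
    _next_id: int = 1
    _alias_to_id: dict = field(default_factory=dict)
    _display_name: dict = field(default_factory=dict)

    @staticmethod
    def _key(name: str) -> str:
        # Lookup key: case-insensitive, runs of whitespace collapsed to one space.
        return " ".join(name.split()).casefold()

    def register(self, display_name: str) -> int:
        """Create a new artist ID with display_name as its canonical spelling."""
        artist_id = self._next_id
        self._next_id += 1
        self._display_name[artist_id] = display_name
        self._alias_to_id[self._key(display_name)] = artist_id
        return artist_id

    def add_alias(self, artist_id: int, alias: str) -> None:
        """Point an alternate spelling at an existing artist ID."""
        self._alias_to_id[self._key(alias)] = artist_id

    def resolve(self, submitted_name: str):
        """Return the canonical display name, or None if the spelling is unknown."""
        artist_id = self._alias_to_id.get(self._key(submitted_name))
        return self._display_name.get(artist_id)


registry = ArtistRegistry()
daj = registry.register("Daj Werkenhorse")
registry.add_alias(daj, "Dan Werkenhorse")
registry.add_alias(daj, "A Werkenhorse")
print(registry.resolve("dan  werkenhorse"))  # prints "Daj Werkenhorse"
```

This does nothing about the hard part, which is getting the right alias attached to the right ID in the first place.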
- Spud
- Hot for Teacher
- Posts: 4770
- Joined: Fri Sep 24, 2004 10:25 am
- Instruments: Bass, Keyboards, eHorn
- Submitting as: Octothorpe
- Location: Seattle
Re: Artist Consolidation
What'd really be awesome would be if people would spell their band name the same every time. You know where these come from, right, people?
Not putting down your idea for an abstract ID, fluffy, that would be cool, but it would require some sort of additional registration on the part of entrants.
- Caravan Ray
- bono
- Posts: 8665
- Joined: Sat Sep 25, 2004 1:51 pm
- Instruments: Penis
- Recording Method: Garageband
- Submitting as: Caravan Ray,G.O.R.T.E.C,Lyricburglar,The Thugs from the Scallop Industry
- Location: Toowoomba, Queensland
Re: Artist Consolidation
"Caravan Ray" has 70 entries, and "Caravan ray" has one. I am guessing that one of those may have been a typo.
Oh - there is also a Caravan Ray 1 and Caravan Ray 2 - which actually were not my typos or the names I used - they came from a special fight for JB where I did 2 songs, and they came up under those names, I assume for voting purposes. They look a bit funny in the archive that way.
- fluffy
Re: Artist Consolidation
Spud wrote:
What'd really be awesome would be if people would spell their band name the same every time. You know where these come from, right, people?
Not putting down your idea for an abstract ID, fluffy, that would be cool, but it would require some sort of additional registration on the part of entrants.
Well, yeah, that's what I was trying to imply with the idea of an abstract ID in the first place. And of course people will completely lose their registration information and whatever and just open a new account and not really help anything.
- Lunkhead
Re: Artist Consolidation
Caravan Ray wrote:
"Caravan Ray" has 70 entries, and "Caravan ray" has one. I am guessing that one of those may have been a typo.
Oh - there is also a Caravan Ray 1 and Caravan Ray 2 - which actually were not my typos or the names I used - they came from a special fight for JB where I did 2 songs, and they came up under those names, I assume for voting purposes. They look a bit funny in the archive that way.
If you go to the profile page for either "Caravan Ray" or "Caravan ray" you'll see that the case doesn't matter and there is only one profile page, with all 71 songs. This is the case on the official archive and the Jukebox. On the Jukebox there can't be both "Caravan Ray" and "Caravan ray" as artist names, but it looks like, even though there is only one profile for both in the official archive, the list of artists still shows both, which seems a little weird.
With the other ones, the numbers are there so people can tell the entries apart on the fight page. To deal with it better would be a bit complicated. There'd have to be a piece of data about every song indicating what number entry it was for that artist for that fight. For most songs, that would just be "1", and in the few odd cases like yours it would be "1" and "2". Then the songs could both be associated with the artist "Caravan Ray" and there'd be some other data to show to distinguish them if necessary. On the Jukebox side, I would add a new column to the songs table in my database and default the value to 1 for all songs, then munge the data either manually or with a script from there. Then I'd have to change the display code to check if an artist has >1 songs in a fight, and if so, show those numbers next to their name... Not sure how it could be handled on the official archive side but it seems like it might potentially be more complicated because of how the data is stored. Just speculating though.
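The Jukebox-side steps above (new column defaulting to 1, a data munge, then a display check for more than one song per artist per fight) could look roughly like this. It is a sketch against an in-memory SQLite database; the table layout, the `entry_number` column name, the fight name, and the song titles are all made up for illustration and are not the real Jukebox schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the songs table.
    CREATE TABLE songs (
        id INTEGER PRIMARY KEY,
        fight TEXT NOT NULL,
        artist TEXT NOT NULL,
        title TEXT NOT NULL
    );
    INSERT INTO songs (fight, artist, title) VALUES
        ('Example Fight', 'Caravan Ray 1', 'First Song'),
        ('Example Fight', 'Caravan Ray 2', 'Second Song'),
        ('Example Fight', 'Example Artist', 'Other Song');

    -- New column, defaulting to 1 for every existing song.
    ALTER TABLE songs ADD COLUMN entry_number INTEGER NOT NULL DEFAULT 1;

    -- The "munge" step, done by hand here for the two odd rows.
    UPDATE songs SET artist = 'Caravan Ray', entry_number = 1
     WHERE artist = 'Caravan Ray 1';
    UPDATE songs SET artist = 'Caravan Ray', entry_number = 2
     WHERE artist = 'Caravan Ray 2';
""")


def fight_listing(conn, fight):
    """Return display lines, showing entry numbers only for multi-entry artists."""
    rows = conn.execute("""
        SELECT s.artist, s.entry_number, s.title,
               (SELECT COUNT(*) FROM songs t
                 WHERE t.fight = s.fight AND t.artist = s.artist) AS n
          FROM songs s
         WHERE s.fight = ?
         ORDER BY s.artist, s.entry_number
    """, (fight,)).fetchall()
    return [f"{artist} {num} - {title}" if n > 1 else f"{artist} - {title}"
            for artist, num, title, n in rows]


for line in fight_listing(conn, "Example Fight"):
    print(line)
```

Both entries now belong to the single artist "Caravan Ray", and the number is only added at display time for fights where it is needed.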
- fluffy
Re: Artist Consolidation
The way the official archive is structured right now it would be impossible to handle it well. A normalized abstract entrant and entry ID (with different display names for the two entries from the same entrant) would work better, although still not perfect for voting purposes.
- fluffy
Re: Artist Consolidation
Also, while normalizing solves one problem, it introduces another - how do we handle collaborations?
- Lunkhead
Re: Artist Consolidation
Yeah, handling those better would be nice. In the case of there being a "primary" artist and a guest/featured/etc. artist, a simple option would be to add a piece of text data to every song that could hold the "featuring So-and-So" info, to keep it out of the primary artist's name, so the artist name could stay clean, and the collaborations could then be consolidated under the "primary" artist. More better of course would be for songs to have a many-to-many relationship to artists. Maybe those relationships would have a "primary" flag or not, maybe a text field holding the text for describing the relationship (in case people didn't want it to just be a list of the artists' names, e.g. if they wanted "Example Artist with So-and-So").
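As a rough sketch of the many-to-many version, assuming a join table with a "primary" flag and an optional free-text credit: every table, column, and function name below is hypothetical, and putting the credit text on the song rather than on each relationship is just one way to do it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE songs (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        credit_text TEXT              -- optional override for how the credit reads
    );
    CREATE TABLE song_artists (       -- the many-to-many link
        song_id INTEGER NOT NULL REFERENCES songs(id),
        artist_id INTEGER NOT NULL REFERENCES artists(id),
        is_primary INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (song_id, artist_id)
    );
    INSERT INTO artists (id, name) VALUES (1, 'Example Artist'), (2, 'So-and-So');
    INSERT INTO songs (id, title, credit_text) VALUES
        (1, 'Solo Song', NULL),
        (2, 'Duet Song', 'Example Artist with So-and-So');
    INSERT INTO song_artists (song_id, artist_id, is_primary) VALUES
        (1, 1, 1), (2, 1, 1), (2, 2, 0);
""")


def display_credit(conn, song_id):
    """Use the custom credit text if present, else list artists, primary first."""
    (credit_text,) = conn.execute(
        "SELECT credit_text FROM songs WHERE id = ?", (song_id,)).fetchone()
    if credit_text:
        return credit_text
    names = [name for (name,) in conn.execute("""
        SELECT a.name FROM song_artists sa
          JOIN artists a ON a.id = sa.artist_id
         WHERE sa.song_id = ?
         ORDER BY sa.is_primary DESC, a.name""", (song_id,))]
    return ", ".join(names)


def songs_for_artist(conn, artist_name):
    """Every song the artist is linked to, whether as primary or as guest."""
    return [title for (title,) in conn.execute("""
        SELECT s.title FROM songs s
          JOIN song_artists sa ON sa.song_id = s.id
          JOIN artists a ON a.id = sa.artist_id
         WHERE a.name = ?
         ORDER BY s.id""", (artist_name,))]


print(display_credit(conn, 2))
print(songs_for_artist(conn, "So-and-So"))
```

The artist names stay clean, and a collaboration shows up on both artists' lists without anyone having to invent a combined name.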
Whenever I think of this kind of stuff now though I think of Google's knowledge graph and how ultimately if you really want to keep breaking things down that's one place where you'd wind up. That's a bit beyond me though.
- Spud
Re: Artist Consolidation
By the way, did I forget to mention that I did in fact do the consolidations, as requested?
SPUD
- Lunkhead
Re: Artist Consolidation
OK, done on the Jukebox side too. Hooray for cleaner data!