Jukebox issue with artist names with entities in them

Let us know when something isn't working correctly, or if you find a typo. Do not post complaints or suggestions here.
fluffy
Eruption
Posts: 11073
Joined: Sat Sep 25, 2004 10:56 am
Instruments: sometimes
Recording Method: Logic Pro X
Submitting as: Sockpuppet
Pronouns: she/they
Location: Seattle-ish

Post by fluffy »

The link to <3 from https://sfjukebox.org/fights/rhymes_with_lucia goes to https://sfjukebox.org/artists/%26lt%3B3 which doesn't work. To be fair this artist name was a giant pain in the ass to get working right on songfight itself too, and entities are always a pain in the butt.
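
For reference, the broken path segment can be unpacked step by step. This is a minimal sketch in Python (not the jukebox's own code), using only standard-library calls; the variable names are made up for illustration. It suggests the artist name is being percent-encoded while still in its HTML-entity form, rather than being decoded first:

```python
from html import unescape
from urllib.parse import quote, unquote

# The last path segment of the link that doesn't work.
broken_segment = "%26lt%3B3"

# Undo the percent-encoding: what is left is an HTML entity, not the name.
entity_form = unquote(broken_segment)
print(entity_form)  # &lt;3

# Decode the entity to get the name as it is displayed.
display_name = unescape(entity_form)
print(display_name)  # <3

# Percent-encoding the decoded name gives a different segment entirely.
print(quote(display_name, safe=""))  # %3C3
```

Whether the jukebox expects the decoded form, the entity form, or a normalized key in its artist URLs is not stated here, so this only shows what the current link contains.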

On the Song Fight! archive proper, band names are normalized to a "band key" using the following algorithm:

Code:

function makeKey($aName)
{
  $aKey = strtolower($aName);

  # strip a leading "a" or "the" as a word
  $aKey = preg_replace('/^(a|the) /','',$aKey);
  # convert spaces to underscores
  $aKey = str_replace(' ','_',$aKey);
  # convert entities to plaintext
  $aKey = html_entity_decode($aKey);
  # convert non-URL-safe characters to %xx
  $aKey = urlencode($aKey);
  # convert all runs of non-allowed characters into a single _
  $aKey = preg_replace('/[^a-zA-Z0-9\-]+/','_',$aKey);

  return $aKey;
}
That might be worth considering for the jukebox as well.
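
For anyone who wants to experiment outside PHP, here is a rough Python transliteration of the steps above. It is a sketch, not the archive's code: the name `make_key` and the sample band name "The Foo Bars" are hypothetical, and the result is not guaranteed to match PHP byte for byte. In particular, `html.unescape` recognizes more entities than `html_entity_decode`, `str.lower` is Unicode-aware, and the percent-encoding here assumes UTF-8.

```python
import re
from html import unescape
from urllib.parse import quote_plus


def make_key(name: str) -> str:
    """Approximate the makeKey() steps from the PHP snippet above."""
    key = name.lower()
    # strip a leading "a" or "the" as a word
    key = re.sub(r"^(a|the) ", "", key)
    # convert spaces to underscores
    key = key.replace(" ", "_")
    # convert entities to plaintext
    key = unescape(key)
    # convert non-URL-safe characters to %XX; PHP's urlencode() also
    # encodes "~", which quote_plus() leaves alone, so handle it by hand
    key = quote_plus(key, safe="").replace("~", "%7E")
    # convert all runs of non-allowed characters into a single _
    key = re.sub(r"[^a-zA-Z0-9\-]+", "_", key)
    return key


print(make_key("&lt;3"))         # _3C3
print(make_key("The Foo Bars"))  # foo_bars
```

Note that the final substitution also rewrites the `%` produced by the encoding step, so `<3` ends up as `_3C3`: the key contains only letters, digits, hyphens, and underscores, and never needs escaping in a URL.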

Presumably there are other bands affected by this, since there's some pretty wild punctuation in the archive.
Lunkhead
You're No Good
Posts: 8140
Joined: Sat Sep 25, 2004 12:14 pm
Instruments: many
Recording Method: cubase/mac/tascam4x4
Submitting as: Berkeley Social Scene, Merisan, Tiny Robots
Pronouns: he/him
Location: Berkeley, CA

Re: Jukebox issue with artist names with entities in them

Post by Lunkhead »

Yeah, I noticed that last week. Everything worked great on my old hosting setup, but I think I didn't set something up properly when I set things up anew, possibly the character encoding handling in Nginx, which I had to use and am not terribly experienced with. Some day I will get around to trying to fix it.

Post by fluffy »

Yeah this stuff is really frustrating and obnoxious and 100% understandable. :) Nothing urgent here of course.

I'm not sure why Nginx would have anything to do with it, but who even knows what moving parts there are in a modern web stack anymore.

Post by Lunkhead »

I suspect Nginx because it's something different in my new hosting setup, and I'm not very good at configuring it. My old setup used Apache as the front end, which my brother managed, and he was very experienced at that. So possibly it was set up in a way that handled character encodings etc. differently?

Behind the HTTP server, the Java webapp is using an embedded Tomcat instance, and afaik nothing would have changed at all about that between setups. Same JDK, same jar file. But maybe somehow that's still involved; I don't know 100% for sure.