Jukebox issue with artist names with entities in them
Posted: Tue Mar 26, 2024 11:21 pm
The link to <3 from https://sfjukebox.org/fights/rhymes_with_lucia goes to https://sfjukebox.org/artists/%26lt%3B3, which doesn't work. The slug there is the URL-encoded HTML entity (&lt;3) rather than the decoded name. To be fair, this artist name was a giant pain in the ass to get working right on songfight itself too, and entities are always a pain in the butt.
Presumably there are other bands affected by this, since there's some pretty wild punctuation in the archive.
On the Song Fight! archive proper, band names are normalized to a "band key", which might be worth considering for the jukebox as well. The algorithm is:
function makeKey($aName)
{
    $aKey = strtolower($aName);
    # strip a leading "a" or "the" as a word
    $aKey = preg_replace('/^(a|the) /','',$aKey);
    # convert spaces to underscores
    $aKey = str_replace(' ','_',$aKey);
    # convert entities to plaintext
    $aKey = html_entity_decode($aKey);
    # convert non-URL-safe characters to %xx
    $aKey = urlencode($aKey);
    # convert all runs of non-allowed characters into a single _
    $aKey = preg_replace('/[^a-zA-Z0-9\-]+/','_',$aKey);
    return $aKey;
}
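For anyone who wants to see what that does to a problem name, here is a rough trace. The function is copied from above so the snippet runs on its own. "The Example Band" is a made-up name, and jukeboxArtistUrl() is purely hypothetical: I don't know how the jukebox actually builds or resolves its artist links, so treat that part as a sketch of the idea, not a description of the site.

```php
<?php
// makeKey() copied from the post above so this snippet is self-contained.
function makeKey($aName)
{
    $aKey = strtolower($aName);
    $aKey = preg_replace('/^(a|the) /', '', $aKey);
    $aKey = str_replace(' ', '_', $aKey);
    $aKey = html_entity_decode($aKey);
    $aKey = urlencode($aKey);
    $aKey = preg_replace('/[^a-zA-Z0-9\-]+/', '_', $aKey);
    return $aKey;
}

// Hypothetical helper: assumes the jukebox could serve artists by band key.
function jukeboxArtistUrl($aName)
{
    return 'https://sfjukebox.org/artists/' . makeKey($aName);
}

// "&lt;3" decodes to "<3", urlencode() gives "%3C3", and the final
// cleanup turns the "%" into "_".
echo makeKey('&lt;3'), "\n";            // prints "_3C3"

// Made-up name: the leading "the " is stripped, the space becomes "_".
echo makeKey('The Example Band'), "\n"; // prints "example_band"

echo jukeboxArtistUrl('&lt;3'), "\n";   // prints "https://sfjukebox.org/artists/_3C3"
```

Note that the key only contains letters, digits, hyphens and underscores, so it never needs further escaping in a URL. The catch is that different names can collapse to the same key, so the jukebox would need to look artists up by key rather than try to reverse it.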