Tag Archives: useless

Which words have the letter P in it and what’s the percentage?

Another code snippet from the useless dept, this time for PHP. If you’ve ever written a post and wondered how many words use the letter P (or any character), here’s something for you in PHP:

function find_words_with_letters($letter, $str) {

	$a = preg_split("#[ \n\r]+#", $str);

	$letterupper = strtoupper($letter);
	$letterlower = strtolower($letter);

	$result = array();
	$result['word_count'] = count($a);
	$result['words_with_letter'] = array();

	foreach($a as $i) {
		if (strpos($i, $letterlower) !== false || strpos($i, $letterupper) !== false) {
			$result['words_with_letter'][] = $i;
		}
	}
	$result['words_with_letter_ratio'] = count($result['words_with_letter']) / count($a);

	return $result;
}

// use it
print_r(find_words_with_letters('p', 'This message has one P'));

By the way, 17.3% of this post have P’s in it!

Javascript snippet to convert raw UTF8 to unicode

For the I-don’t-a-sane-use-for-this department comes this piece of code which takes a stream of raw UTF-8 bytes, decodes it and fromCharCode it, rendering it in a unicode supported browser. A possible use would be if the web page character set is not UTF-8 and you want to display UTF-8. To use it, just put it in a script tag and call utf8decode(myrawutf8string). But seriously, all web pages should be UTF-8 by default nowadays. Here it is, in case anyone wants it:

function TryGetCharUTF8(c, intc, b, i, count)
		{
			/*
			 * 10000000 80
			 * 11000000 C0
			 * 11100000 E0
			 * 11110000 F0
			 * 11111000 F8
			 * 11111100 FC
			 * 
			 * FEFF = 65279 = BOM
			 * 
			 * string musicalbassclef = "" + (char)0xD834 + (char)0xDD1E; 119070 0x1D11E
			 */

			if ((b.charCodeAt(i) & 0x80) == 0)
			{
				intc = b.charCodeAt(i);
			}
			else
			{
				if ((b.charCodeAt(i) & 0xE0) == 0xC0)
				{
					//if (i+1 >= count) return false;
					intc = ((b.charCodeAt(i) & 0x1F) << 6) | ((b.charCodeAt(i + 1) & 0x3F));
					
					i += 1;
				}
				else if ((b.charCodeAt(i) & 0xF0) == 0xE0)
				{
					// 3 bytes Covers the rest of the BMP
					//if (i+2 >= count) return false;
					intc = ((b.charCodeAt(i) & 0xF) << 12) | ((b.charCodeAt(i + 1) & 0x3F) << 6) | ((b.charCodeAt(i + 2) & 0x3F));
					alert(b.charCodeAt(i) + ' '+b.charCodeAt(i + 1) +' '+b.charCodeAt(i + 2));
					i += 2;
				}
				else if ((b.charCodeAt(i) & 0xF8) == 0xF0)
				{
					intc = ((b.charCodeAt(i) & 0x7) << 18) | ((b.charCodeAt(i + 1) & 0x3F) << 12) | ((b.charCodeAt(i + 2) & 0x3F) << 6) | ((b.charCodeAt(i + 3) & 0x3F));
					
					i += 1;
				}
				else
					return false;
			}
window.utf8_out_intc = intc;
window.utf8_out_i = i;
			return true;
		}

function utf8decode(s) {
	var ss = "";
	for(utf8_out_i = 0; utf8_out_i < s.length; utf8_out_i++) {
		TryGetCharUTF8(window.utf8_out_c, window.utf8_out_intc, s, window.utf8_out_i, s.length);
		ss += String.fromCharCode(window.utf8_out_intc);
	}
	return ss;
}