
Javascript snippet to convert raw UTF8 to unicode

From the I-don’t-have-a-sane-use-for-this department comes this piece of code, which takes a string of raw UTF-8 bytes (one byte per character), decodes it, and rebuilds the text with String.fromCharCode so that it renders in a Unicode-capable browser. A possible use would be when the web page character set is not UTF-8 and you want to display UTF-8 text. To use it, just put it in a script tag and call utf8decode(myrawutf8string). But seriously, all web pages should be UTF-8 by default nowadays. Here it is, in case anyone wants it:

[js]
// Decodes the UTF-8 sequence that starts at index i of the byte string b
// (each character of b holds one raw byte, 0x00-0xFF).
// Returns { code: code point, next: index after the sequence },
// or null if the sequence is invalid or truncated.
function TryGetCharUTF8(b, i)
{
    /*
     * Lead byte patterns:
     * 0xxxxxxx 00-7F 1 byte (ASCII)
     * 10xxxxxx 80-BF continuation byte
     * 110xxxxx C0-DF 2 bytes
     * 1110xxxx E0-EF 3 bytes, covers the rest of the BMP
     * 11110xxx F0-F7 4 bytes, beyond the BMP
     *
     * FEFF = 65279 = BOM
     *
     * Beyond the BMP a UTF-16 surrogate pair is needed, e.g.
     * U+1D11E (119070, musical G clef) = 0xD834 0xDD1E
     */
    var lead = b.charCodeAt(i), len, code;
    if ((lead & 0x80) == 0) { len = 1; code = lead; }
    else if ((lead & 0xE0) == 0xC0) { len = 2; code = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; code = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; code = lead & 0x07; }
    else return null; // stray continuation byte or invalid lead byte

    if (i + len > b.length) return null; // sequence is cut off

    for (var k = 1; k < len; k++)
    {
        var next = b.charCodeAt(i + k);
        if ((next & 0xC0) != 0x80) return null; // not a continuation byte
        code = (code << 6) | (next & 0x3F);
    }
    if (code > 0x10FFFF) return null; // outside the Unicode range
    return { code: code, next: i + len };
}

function utf8decode(s) {
    var ss = "";
    var i = 0;
    while (i < s.length) {
        var r = TryGetCharUTF8(s, i);
        if (r === null) {
            // Invalid byte: emit U+FFFD and resynchronise on the next byte
            ss += "\uFFFD";
            i += 1;
        } else {
            if (r.code > 0xFFFF) {
                // fromCharCode only takes 16-bit units, so split into a surrogate pair
                var v = r.code - 0x10000;
                ss += String.fromCharCode(0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF));
            } else {
                ss += String.fromCharCode(r.code);
            }
            i = r.next;
        }
    }
    return ss;
}
[/js]
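
For what it’s worth, current browsers (and Node.js) ship a built-in TextDecoder that can do the same job. The following is only a minimal sketch of that alternative, assuming the input is a string holding one raw byte per character, as with utf8decode. The name utf8decodeNative is made up for this example.

```javascript
// Sketch: decode a "one byte per character" string with the built-in TextDecoder.
// utf8decodeNative is a hypothetical name, not part of the original snippet.
function utf8decodeNative(raw) {
  const bytes = new Uint8Array(raw.length);
  for (let i = 0; i < raw.length; i++) {
    bytes[i] = raw.charCodeAt(i) & 0xFF; // keep only the low byte of each character
  }
  // Invalid sequences are replaced with U+FFFD by default.
  return new TextDecoder("utf-8").decode(bytes);
}

console.log(utf8decodeNative("\xE2\x82\xAC")); // the bytes E2 82 AC are the euro sign
```

The manual decoder is still handy where TextDecoder is not available, which was the case for the browsers this snippet was originally written for.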

Quick Tip: The platform “Windows Mobile 6 Professional SDK (ARMV4I)” is not defined within Visual Studio.

If you are setting up Qt Visual Studio Addin in Vista and get this message when trying to add the Qt folder (in Qt Options > Qt Versions):
The platform “Windows Mobile 6 Professional SDK (ARMV4I)” is not defined within Visual Studio. Make sure you have installed the required SDK.
-or-
The platform “Windows Mobile 5.0 Pocket PC SDK (ARMV4I)” is not defined within Visual Studio. Make sure you have installed the required SDK.

You need to make sure you have Visual Studio running in Administrator mode. You also need to be in Administrator mode when creating your WinCE project.

You can also type in the command line: checksdk -list to check that you have the correct Windows CE SDK installed.

Getting the intersection points of two [path] geometries in WPF

While working on an app that utilises geometries in WPF, I needed a way to get the intersection points of the lines of two arbitrary geometries. A google search didn’t yield any useful hints except a post suggesting to use mathematics. That would be ideal for very simple geometries like lines, but with complex geometries, it becomes tiresome real quick. The framework doesn’t seem to have any built in functions that calculate that, so it’s time for some hack and slash.

After a few days of thinking, I came up with a simple yet effective (though not the most efficient) solution. For those who just want the intersection between two geometries, there is the CombinedGeometry class, which takes two geometries as input. Setting the GeometryCombineMode to Intersect gives a geometry that is the intersection of the two. On the other side of the WPF realm, we have Geometry.GetWidenedPathGeometry(). This method strokes the path with a pen and converts the stroke outline into a geometry. Combining these two concepts, we can intersect the widened versions of the two paths. Wherever the paths cross, the result contains a small figure, and the centre of that figure’s bounding box approximates the intersection point.

[csharp]
public static Point[] GetIntersectionPoints(Geometry g1, Geometry g2)
{
Geometry og1 = g1.GetWidenedPathGeometry(new Pen(Brushes.Black, 1.0));
Geometry og2 = g2.GetWidenedPathGeometry(new Pen(Brushes.Black, 1.0));

CombinedGeometry cg = new CombinedGeometry(GeometryCombineMode.Intersect, og1, og2);

PathGeometry pg = cg.GetFlattenedPathGeometry();
Point[] result = new Point[pg.Figures.Count];
for (int i = 0; i < pg.Figures.Count; i++)
{
Rect fig = new PathGeometry(new PathFigure[] { pg.Figures[i] }).Bounds;
result[i] = new Point(fig.Left + fig.Width / 2.0, fig.Top + fig.Height / 2.0);
}
return result;
}
[/csharp]

The function returns an array of zero or more intersection points. To test it:

[csharp]
Geometry sg1 = Geometry.Parse("M0,0 L100,100");
Geometry sg2 = Geometry.Parse("M0,100 L100,0");
Point[] pts = GetIntersectionPoints(sg1, sg2);
// pts[0] is approximately {50,50}
[/csharp]
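
For comparison, the “use mathematics” route is manageable in the simplest case of two straight segments. The sketch below is not WPF code; it is a standalone JavaScript illustration (segmentIntersection is a hypothetical helper) of the standard parametric line-segment test, and it gives the exact point that the widened-geometry trick approximates for the test above.

```javascript
// Sketch: exact intersection of segment p1-p2 with segment p3-p4.
// Points are plain objects of the form { x, y }. Returns null when the
// segments are parallel (or collinear) or do not touch.
function segmentIntersection(p1, p2, p3, p4) {
  const rx = p2.x - p1.x, ry = p2.y - p1.y; // direction of the first segment
  const sx = p4.x - p3.x, sy = p4.y - p3.y; // direction of the second segment
  const denom = rx * sy - ry * sx;          // 2D cross product r x s
  if (denom === 0) return null;

  // Solve p1 + t*r = p3 + u*s for t and u.
  const t = ((p3.x - p1.x) * sy - (p3.y - p1.y) * sx) / denom;
  const u = ((p3.x - p1.x) * ry - (p3.y - p1.y) * rx) / denom;
  if (t < 0 || t > 1 || u < 0 || u > 1) return null; // crossing lies outside a segment

  return { x: p1.x + t * rx, y: p1.y + t * ry };
}

// Same two lines as the WPF test: M0,0 L100,100 and M0,100 L100,0
console.log(segmentIntersection(
  { x: 0, y: 0 }, { x: 100, y: 100 },
  { x: 0, y: 100 }, { x: 100, y: 0 }
)); // { x: 50, y: 50 }
```

This stops being practical once curves and multi-figure paths are involved, which is where the geometry-based approach earns its keep.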

Hope this helps someone.

The curious case of WindowsFormsParkingWindow

I was debugging a problem the other day involving WebKit.NET, a webkit wrapper control for winforms. When the webkit control was hosted inside winforms, it had a strange problem of always thinking that it is out of focus. This had the effect of drawing all selections as grayed out.

Everything I threw at it (WM_SETFOCUS, WM_ACTIVATE) seemed to go into a void, so it was time to break into the source and figure out what exactly was wrong. After a few breakpoints and step-overs, I found that the WebKit control searches for its top-level parent window and listens to that window’s messages by subclassing it. It listens for the WM_NCACTIVATE message and uses it to determine whether it has focus.

This was working exactly as expected in a normal non-WinForms window, so what was different? Debugging the code, I found that it got WindowsFormsParkingWindow as the root parent window. Why it returned that rather than the actual parent window, I wasn’t entirely sure. Maybe it’s some special super root window like the Application window in Delphi? It wasn’t shown in Spy++ either. Then again, what’s a WindowsFormsParkingWindow? It was something I ought to google. Unfortunately Google wasn’t very forthcoming. Puzzled, I devised a hack: temporarily remove the WS_CHILD style from the immediate parent window of the WebKit control, simply because WebKit’s ancestor search stops when it encounters a non-child window. After the setHostWindow() call, I make the parent window WS_CHILD again. That seemed to work, but something lingered in the back of my mind telling me that there was a better solution.

The next day, as I was about to post my solution to the mailing list and was just getting to the part about this weird WindowsFormsParkingWindow, something in my mind clicked. Suddenly it all made sense. There was a reason WindowsFormsParkingWindow wasn’t in the window ancestry when I looked in Spy++. There’s also a reason for the name. It turns out that on creation, all controls get put into WindowsFormsParkingWindow. It just so happened that the setHostWindow call was made in the constructor, when the parent control was still parked. The solution seemed obvious now: setHostWindow should be called after the parent is correctly set, i.e. in the Load event. So, another case closed.

Compiling WebKit/Cairo on Windows with Visual C++ Express

Just for interest, I decided to build WebKit on Windows. This is supposed to be a painless task, but unfortunately anything that involves build scripts creates drama. The instructions on compiling are a bit scant, giving the impression that it’s super easy. Maybe it is and I’m just a bit dumb, but anyway, after a few hours battling with it I finally got it working. Don’t bother trying with Visual Studio 2008, as you might as well jump off a cliff from the compilation errors. Get the Visual C++ 2005 Express Edition and the SP1 patch. Here are the steps from scratch:

Go to http://nightly.webkit.org/ and get the source tar.bz2.

Extract it somewhere (WinRAR is a good tool for this). This somewhere will now be referred to as {EXTRACTED}.

For compiling on Windows, you need the WebKit support library. Download WebKitSupportLibrary.zip, then copy the zip file to the root of your extracted WebKit folder. Don’t extract it. This zip provides headers such as unicode/uchar.h and pthread.h, which would otherwise be reported missing.

So what’s the difference between webkit/cairo and regular webkit? The regular webkit depends on proprietary Apple libraries such as CoreGraphics. The webkit/cairo build substitutes that dependency with the free Cairo 2D graphics library. The library has a clean C API and contains very comprehensive drawing routines. It also sports multiple output backends, allowing you to create PDFs or SVGs.

Webkit/cairo has more dependencies on open source libraries such as curl. You either build them or get the pre-built ones some nice folks have put online – Get it here and extract it to a place like {EXTRACTED}\requirements.

Now you have to install Cygwin as noted here on point number 3. The important point I want to highlight is that, due to some assumptions made in the VS project files, the directory HAS to be c:\cygwin (or $(SYSTEMROOT)\cygwin). For Vista, you need to perform additional minor gymnastics which I’m not going to outline here. Read about them at the previous link.

You also need to get the QuickTime SDK. Annoyingly, to download the QuickTime SDK you need to register an Apple Developer Connection (ADC) account. After entering details of your daily life, you’re granted access to the file. The good thing is that you don’t need QuickTime or iTunes installed. You should probably keep the default install directory as well.

Next is setting two environment variables. This involves going into the System Properties in your Control Panel. You can just add them under the User variables:

  • WEBKITLIBRARIESDIR – point it to your {EXTRACTED}\WebKit\WebKitLibraries\win
  • WEBKITOUTPUTDIR – point it somewhere like c:\webkitdist (create this directory as well)

You will now need to add some VC++ paths. Open C++ Express and go to Tools > Options > Projects > VC++ Directories

On the top drop down, select show directories for include files and add

  • {EXTRACTED}\requirements\include
  • {EXTRACTED}\requirements\include\cairo
  • {EXTRACTED}\requirements\include\curl
  • {EXTRACTED}\requirements\include\libpng13

then select show directories for library files and add

  • {EXTRACTED}\requirements\lib

Lastly in the Executables files section add

  • C:\cygwin\bin – Put this just above $(PATH) but after all the other directories. This is important: if you put it on top, cygwin’s link.exe will conflict with MSVC’s, and if you put it below $(PATH), any PATH directories containing binaries with the same names as cygwin’s will conflict.

You will also need to get the Platform SDK as noted in the build instructions if you don’t have it already. Any recent version of the SDK should be alright. Add the executable, include and lib directories to the VC++ directories like the above, described here. The instructions about vcprops editing aren’t really important.

For the include path, you also need to add the MFC includes, or the compiler will complain about a missing winres.h.

Due to some bootstrapping scripts, the first time you compile you can’t do it through the Visual C++ IDE. You will need to compile it via the command line in cygwin. Open the cygwin shell and change directory to {EXTRACTED} ie cd c:/webkit/

Then run:

WebkitTools/Scripts/update-webkit

After it completes, you can start building

WebkitTools/Scripts/build-webkit --wincairo --debug

Remove the --debug flag if you want to compile for Release. Note: Previously the wincairo flag was cairo-win32.

After the first successful build, you can open {EXTRACTED}\WebKit\win\WebKit.vcproj in Visual Studio. Make sure the build configuration is set to Debug_Cairo; if it isn’t, set it. You can choose Release_Cairo as well if you so wish. Finally, winlauncher is the test browser you use to try out your new build, so set that as your active project. Don’t forget to copy all the requirements DLLs (e.g. cairo.dll) into the dist\bin folder where winlauncher.exe is, or else it’ll complain about missing DLLs.

Detecting the back (or refresh) button click

While developing a web app, I came across an interesting problem: I had a page with a button to perform an action. When the button is clicked, the action request is sent to the server-side script, which redirects back to the same page but with a message displayed at the top of the page (e.g. “Your post has been submitted”).

If you then navigate to another page and click back, you see the same page with the same message popping up. I wanted to detect that the user clicked back so that the message can be hidden. There are plenty of solutions on Google, but a lot of them involve setting a cookie (what if cookies are disabled?), a server-side script checking the referer (what if the page is still cached?), or comparing the server page load time with the current time and looking for a large difference (what if the client clock is wrong?). Without an ideal solution, I set about finding a new one. Surely it can’t be hard to detect that we’ve already been on that same page. If only there was a way to save a flag just for that page and for the duration of the page session. I tried modifying the DOM, but that gets reverted when you click back. The onload event also gets called again, so you can’t use that to differentiate.

I then remembered that, at least in recent browsers, forms retain their field values if you click back. This is very handy if you’re submitting a post and the connection dies: you can just click back and your long-winded post will be intact.

Solution – Use a hidden form field to detect that we’ve been on this page before

Building on this idea, it’s possible to temporarily store a flag on a hidden form field that says, yep I’ve been on this page before. Here is a code snippet:

[html]<html>
<body>
Try
<a href="http://www.google.com/">jumping to another page</a>

<script>

// The hidden field is written by script; browsers that retain form values
// restore its contents when you come back to the page.
document.write("<form style='display: none'><input name='__detectback' id='__detectback' value=''></form>");

function checkPageBackOrRefresh(load_id) {
    var field = document.getElementById('__detectback');
    if (field.value == load_id) {
        return true; // the flag survived, so we have been here before
    } else {
        field.value = load_id; // first visit: set the flag
        return false;
    }
}

window.onload = function() {
    if (checkPageBackOrRefresh('tt'))
        alert('You clicked back or refreshed the page');
}

</script>
</body>
</html>[/html]

Unfortunately, this solution does not work in browsers where “fast back” (i.e. bfcache in Firefox) is enabled, because fast back stores the scripting state, so onload does not trigger again.

The script should work fine with IE7 and IE8. With Firefox, it only works on certain pages. These seem to be pages that link to heavy JavaScript (e.g. jQuery?).

With bfcache-enabled browsers, a possible solution would be to hide the message in the onbeforeunload event so it will not appear even when clicking back.
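
A rough sketch of how the fast-back case might be handled follows. It registers the onbeforeunload idea from the paragraph above, and additionally uses the pageshow event, whose persisted flag is set when a page is restored from the back/forward cache; that second part is my own addition rather than something tested in the original post. The element id "message" and the function names are assumptions made for the example.

```javascript
// Hides the message banner. "message" is a hypothetical element id.
function hideMessage(doc) {
  const el = doc.getElementById("message");
  if (el) el.style.display = "none";
}

// pageshow fires every time the page is displayed; event.persisted is true
// when the page came back from the back/forward cache rather than a fresh load.
function onPageShow(event, doc) {
  if (event.persisted) {
    hideMessage(doc);
    return true;
  }
  return false;
}

// Only wire things up when running inside a browser.
if (typeof window !== "undefined") {
  window.addEventListener("pageshow", function (e) { onPageShow(e, document); });
  // The approach suggested above: hide the message before leaving the page.
  window.addEventListener("beforeunload", function () { hideMessage(document); });
}
```

Taking the document as a parameter is only there to make the handlers easy to exercise outside a browser.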

Pondering per user accounting in Linux

I’ve spent the better part of the day researching the best way to account for the bandwidth (and CPU/memory) used by a particular user. This is useful if you run a hosting business and give out shell access. At first I was looking for a way to meter SSH. There seems to be an old patch for it, but as I continued reading, an old mailing list post pointed out that there are heaps of ways to generate traffic when you have a shell account (e.g. wget). In fact you don’t even need shell access: any scripting language that can download will consume bandwidth that may not be accounted for.

So began my quest to find the best solution to per user accounting in Linux. The basic concept is that since bandwidth consumption is triggered by a process, and that process is owned by a specific user, we should be able to trace traffic to a user and record it as such. The advantage is even greater if you run peruser mpm apache or suexec’d php.

I began by looking at netfilter/iptables, which has an owner match (-m owner --uid-owner). This works only on the OUTPUT chain and will tell you who sent a packet, but unfortunately doesn’t tell you who a packet was destined for.

iptables has a connection tracking feature that tracks active connections, allowing for stateful packet inspection. If you have the kernel feature enabled, it will also count the traffic, which you can then view in /proc/net/ip_conntrack (or /proc/net/nf_conntrack for newer installations). Cross referencing that with netstat -anp and the process table will give you an idea of which user owns the connection. This is assuming of course that the process doesn’t setuid to change users.
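
To make the cross-referencing idea a bit more concrete, here is a small sketch of pulling the endpoints and byte counters out of one conntrack entry. The sample line is made up and only imitates the key=value layout of /proc/net/nf_conntrack with byte accounting enabled; parseConntrackLine is a hypothetical helper, and the assumption is that the first set of src/dst/bytes fields describes the original direction and the second set the reply direction.

```javascript
// Illustrative entry in the style of /proc/net/nf_conntrack; the addresses,
// ports and counters are invented for this example.
const sampleLine =
  "ipv4     2 tcp      6 431999 ESTABLISHED " +
  "src=192.168.1.2 dst=93.184.216.34 sport=51234 dport=80 packets=10 bytes=1500 " +
  "src=93.184.216.34 dst=192.168.1.2 sport=80 dport=51234 packets=8 bytes=12000 " +
  "[ASSURED] mark=0 use=2";

// Collects key=value tokens: the first occurrence of a key goes to the
// original direction, the second occurrence to the reply direction.
function parseConntrackLine(line) {
  const original = Object.create(null);
  const reply = Object.create(null);
  for (const token of line.trim().split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq <= 0) continue; // not a key=value token (protocol, state, flags)
    const key = token.slice(0, eq);
    const value = token.slice(eq + 1);
    if (!(key in original)) original[key] = value;
    else if (!(key in reply)) reply[key] = value;
  }
  return {
    src: original.src,
    sport: Number(original.sport),
    dst: original.dst,
    dport: Number(original.dport),
    bytesSent: Number(original.bytes || 0),     // 0 when accounting is disabled
    bytesReceived: Number(reply.bytes || 0),
  };
}

console.log(parseConntrackLine(sampleLine));
```

The local address and port from such an entry are what you would then look up in the netstat -anp output to find the owning process, and from there the user.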

But then, how are we going to collect all the data? Polling would be extremely slow and tedious and you might miss short lived connections. It seems that using libnetfilter_conntrack, you can subscribe to an event that notifies when connection states have changed (CONFIG_IP_NF_CONNTRACK_EVENTS). Using that, you can record when connections are opened and when they are closed as they happen.

What about processes? Process accounting can easily be taken care of by the Unix acct tools, which monitor processes as they are created and destroyed, provided you have the correct kernel options enabled. But what if you don’t have this option, e.g. on a VPS? Is there an alternative? The answer is yes, but it’s ugly. You might remember that process information can be accessed via /proc. What if I set up inotify, the file system change notification mechanism, to tell me when /proc has changed? Somebody already thought of this and found it didn’t work quite as expected. The reason is given in the linked thread, but the responders did give a good alternative: using ptrace().

ptrace is a powerful Unix system call that can manipulate the processes it has attached to. It is what the debugger gdb uses to debug running applications. Using ptrace, you can set an option so that the controlling process is notified via SIGTRAP when the ptrace’d process has terminated or forked/exec’d. Using this, you can potentially hook into every process and closely monitor its lifecycle. The downside is that only one tracer can be attached to a process at a time, which means applications like gdb will fail if your monitoring system is active. Since ptrace is primarily used for debugging, it may also degrade the performance of the applications it is attached to. So the bottom line is that it looks too extravagant, and thus the wrong way to go for implementing a process accounting/monitoring system.

It looks like a viable way of doing per user accounting has so far eluded me. Perhaps the old way of accounting individually in every application service (apache/ftp/imap/smtp) is here to stay.

Why are my {mysql, htpasswd, getpass} prompts not showing?

I had a problem on a VM running Linux where password prompts were not showing. The mysql command would just report a wrong password without prompting, even if I added the -p switch. Furthermore, the less command and man pages were broken as well! As a programmer, I set out to narrow down the problem by looking at source code. I traced the offending function to getpass(). After searching everywhere on the web, I finally came across a reference saying the tty might be broken and to run:

mknod /dev/tty c 5 0

That immediately fixed it! Device nodes are a bit of a mystery to me, ttys more so. One day I’ll understand them thoroughly.
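
If you suspect the same problem, a quick sanity check is whether /dev/tty exists and is actually a character device. Below is a minimal Node.js sketch of that check; checkCharDevice is a made-up helper name, and the check only tells you that the node has the right type, not that its major/minor numbers are correct.

```javascript
const fs = require("fs");

// Returns "ok" if the path is a character device, "not a character device"
// if it exists as something else (regular file, directory, ...), and
// "missing" if it cannot be stat'ed at all (absent or inaccessible).
function checkCharDevice(path) {
  let st;
  try {
    st = fs.statSync(path);
  } catch (err) {
    return "missing";
  }
  return st.isCharacterDevice() ? "ok" : "not a character device";
}

console.log("/dev/tty:", checkCharDevice("/dev/tty"));
```

Anything other than "ok" for /dev/tty would be consistent with the symptoms above.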

Stinger Root Certificate for Windows Mobile

What’s a ‘Stinger Root Certificate’? I’m not really sure, but it’s the certificate that Microsoft uses to sign a number of software components that they release for Windows CE and mobile devices, for example the Application Verifier for Windows CE and Windows Mobile 5.0, or MSMQ. The certificate is installed as part of the Windows Mobile 5 emulator image, but on phones like the HTC Diamond it isn’t installed. The Application Verifier needs to load the Stinger-signed kernel module Shim Engine (shimeng.dll), so it would not work on the HTC, which lacks the “Stinger Root Certificate”.

You can actually see the installed certificates by connecting your mobile device and, in Visual Studio (2008), going to Tools > Device Security Manager. There is also the old Security Configuration Manager, which I tried, but it crashed on start-up, so don’t use that. In the Device Security Manager you can see all the certificates installed and you can also change your device security level.

Anyway, I wanted to install the root certificate but I could not find it distributed anywhere on the internet. In the end, using the Device Security Manager, I managed to extract it from the emulator ROM by copying and pasting the certificate’s base64 value. After that it was just a matter of building a cab install package and defining the certificate installation in the setup XML.

The result is available here

Why is my cron not sending emails?

I’ve been having a strange problem where my cron daemon (dcron in this case) was not sending emails with the output of scheduled tasks. Searching Google yielded no useful or relevant results. It turns out that, in my case, I was missing the /usr/lib/sendmail symlink to my exim installation, as cron.log complained.

Problem solved.