11 | July | 2009 | Playing on the frontier

While developing a web app, I came across an interesting problem: I had a page which had a button to perform an action. If the button is clicked, the action request is sent to the server side script and redirected back to the same page but with a message displayed on the top of the page (ie Your post has been submitted).

If you then navigate to another page but click back, you would see the same page with the same message popping up. I want to detect that we’re clicking back so we will hide the message. There are plenty of solutions in google, but a lot of them involved setting a cookie (what if cookies are disabled), or a server side script detecting referer (what if page is still cached?), or using time by detecting if the server page load time and the current time differs by a large amount (what if client time is wrong?). Without an ideal solution, I set about finding a new solution. Surely it can’t be hard to detect that we’ve already been in that same page. If only there was a way to save a flag just for that page and for the duration of the page session. I tried modifying the DOM, but that gets reverted when you click back. The onload event also get called again, so you can’t use that to differentiate.

I then remembered that at least on recent browsers, there exists a functionality in forms that retained form field information if you clicked back – very handy if you’re submitting a post and the connection died, you can just click back and your long winded post would be intact.

Solution – Use a hidden form field to detect that we’ve been on this page before

Building on this idea, it’s possible to temporarily store a flag on a hidden form field that says, yep I’ve been on this page before. Here is a code snippet:

[html]<html>
<body>
Try
<a href="http://www.google.com/">jumping to another page</a>
</body>

document.write("<form style=’display: none’><input name=’__detectback’ id=’__detectback’ value=”></form>");

window.onload = function() {
if (checkPageBackOrRefresh(‘tt’))
alert(‘You clicked back or refreshed the page’);
}

</script>

</html>[/html]

Unfortunately, this solution does not work in some browsers where “fast back” (ie, fbcache in firefox) is enabled, as the fast back stores the scripting state so a onload does not trigger again.

The script should work fine with IE7 and IE8. With Firefox, it only works on certain pages. These pages seems to be pages that link to heavy javascripts (ie jquery?).

With fbcache enabled browsers, a possible solution would be to hide the message at the event onbeforeunload so it will not appear even when clicking back.

I’ve been researching for the better part of the day on what the best method to account for bandwidth (and cpu/memory) used by a particular user is. This is useful if you run a hosting business and give out shell access. At first I was looking for a way to meter SSH. There seems to be an old patch for it, but as I continued reading, a old mailing from a mailing list pointed out that there are heaps of ways to generate traffic when you have a shell account (ie wget). In fact you don’t even need shell access – any scripting language that could download will consume bandwidth that may not be accounted for.

So this began my quest to find the best solution to per user accounting in linux. The basic concept is that since the bandwidth consumption is triggered by a process, and owned by a specific user, we should be able to trace traffic to a user and record as such. The advantage is even greater if you run peruser mpm apache or suexec’d php.

I began looking at netfilter/iptables, which had a match -m owner uid. This works only on the OUTPUT chain and will tell you who sent the packet, but unfortunately doesn’t tell you who a packet was destined for.

iptables has a connection tracking feature, that tracks active connections, allowing for stateful packet inspection. If you have the kernel feature enabled, it will also count the traffic numbers, which you can then view in /proc/net/ip_conntrack (or /proc/net/nf_conntrack for newer installations). Using that, and cross referencing it with the netstat -anp and process table will give you an idea of which user owns the connection. This is assuming of course that the process doesn’t setuid to change users.

But then, how are we going to collect all the data? Polling would be extremely slow and tedious and you might miss short lived connections. It seems that using libnetfilter_conntrack, you can subscribe to an event that notifies when connection states have changed (CONFIG_IP_NF_CONNTRACK_EVENTS). Using that, you can record when connections are opened and when they are closed as they happen.

What about processes? Processing accounting can be easily taken care of by the unix acct tools, which monitors processes as they are created and destroyed, provided you have the correct kernel options enabled. But what if you don’t have this option, ie on a VPS – Is there an alternative? The answer is yes, but ugly. You might remember that process information can be access via /proc. What if I set inotify, the file system change mechanism to tell me when /proc has changed? Somebody already thought of this and found it didn’t work quite as expected. The reason for this was mentioned in the linked thread, but the responders did give a good alternative – using ptrace ().

The ptrace command is a powerful unix system call that can manipulate processes it has attached to. It is what the debugger gdb uses to debug running applications. Using the ptrace function, you can set an option to notify the controlling process via SIGTRAP that the ptrace’d process has terminated, or forked/execed. Using this, you can potentially hook into every process and closely monitor their lifecycle. The downside is that you cannot have two ptrace active on the same process, which means application like gdb will fail if your monitoring system is active. Since ptrace is primarily used for debugging, it may also degrade performance of application it has been attached to. So the bottom line is that it looks like it is too extravagant and thus the wrong way to go for implementing a process accounting/monitoring system.

Looks like my quest to find a viable way of accounting per user accounting has so far eluded me. Perhaps the old ways of individual accounting in every application service – apache/ftp/imap/smtp is here to stay.

S	M	T	W	T	F	S
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30	31

Playing on the frontier

A log of random stuff that interests me.

Daily Archives: July 11, 2009

Detecting the back (or refresh) button click

Pondering per user accounting in Linux