Why you shouldn't rely entirely on an IP
Filtering users by their IP address may seem like a good idea when you're beginning PHP, but even coders who should know better have been caught placing too much trust in an IP. I remember my first PHP project, rMetal, a website dedicated to various metal bands from my younger days. I didn't know about sessions and I didn't care to read up on them, either. As a consequence, my login system boiled down to this: if your IP address is the same as the one in the database, you're OK. Boy, was I in for a shock!
Today I ensure that no trust is placed in the user's IP address. It cannot be trusted. For one, where I live, the ISP uses web proxy servers, so everybody in my local area, stretching across 5+ fairly large towns, has the same IP address when visiting a website, whereas they have unique IPs for other services. This is partly for security and partly to save the tight ISP some bandwidth: pages are simply served from the web proxy unless the target page has been modified. Take the following PHP code into consideration and place it at the forefront of your mind:
PHP Code:
A conservative estimate would be that 5,000 other households share the IP address I have, so relying on the IP address alone is clearly a big no-no! "What's the solution?" you might ask. Well... the honest truth is that there is no solution. We can never be 100% certain that the member visiting our website is truly unique. Sure, we have cookies, IP addresses and sessions (which rely on cookies), but none of these are reliable. Perhaps the best way to identify and tag a visitor is to hash some information together and construct a fingerprint identity for them.
If you remember from this article, practically every header sent from the client to the server is optional (strictly speaking, HTTP/1.1 requires only Host). Put another way, the client's browser decides whether or not to set it. Beyond that, the HTTP protocol only expects the page request. The server extracts the client's IP address from the TCP/IP packet. Although this source address can be spoofed, a client that wants a response must complete TCP's three-way handshake, which means receiving packets at that address, so spoofing it would be a monumental exercise in futility.
HTTP_USER_AGENT is taken from the User-Agent request header, which PHP extracts and places into the $_SERVER predefined array. It contains information on the user's browser and operating system. Mine being:
Quote:
HTTP_USER_AGENT as me, but we cut the chances of a duplicate if we concatenate the IP address to it and then hash the result. Something like the following would give us more possibilities: PHP Code:
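The original snippet did not survive on this page, so here is a minimal sketch of the idea just described: concatenate the IP address and the user agent, then hash the result. The function name visitorFingerprint, the '|' delimiter and the choice of SHA-256 are my own assumptions, not the article's code.

```php
<?php
// Hypothetical helper (not the article's original code, which was lost).
function visitorFingerprint(string $ip, string $userAgent): string
{
    // The '|' delimiter cannot occur inside an IP address, so two different
    // (IP, user agent) pairs cannot collapse into the same input string.
    return hash('sha256', $ip . '|' . $userAgent);
}

// Typical call inside a web request. Both keys may be missing (for example
// on the command line), hence the empty-string fallbacks.
$fingerprint = visitorFingerprint(
    $_SERVER['REMOTE_ADDR'] ?? '',
    $_SERVER['HTTP_USER_AGENT'] ?? ''
);
```

Two visitors behind the same proxy with different browsers get different fingerprints; two visitors with the same proxy and the same browser build still collide, which is the limitation discussed throughout this thread.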
Quote:
There is of course a good chance that if another user and I share point 3 then we will also share point 4, but even so, you can see our chances have improved significantly compared with naively relying on the IP address alone. Sadly, as previously touched upon, the header parameters are optional, so if HTTP_USER_AGENT is empty we'll merely be hashing the IP address on its own, which is a pointless exercise. Unfortunately, there's very little we can do here other than set cookies to rat them out. The good news is that almost every browser, and certainly every new browser, sets HTTP_USER_AGENT, so if identifying a user is crucial, denying access to clients that do not send HTTP_USER_AGENT is a path you may consider: PHP Code:
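Since the original code is missing here, the following is only a sketch of how refusing requests without a user agent might look. The helper name hasUserAgent, the 403 status and the message text are assumptions of mine.

```php
<?php
// Hypothetical helper: true when the request carries a non-empty User-Agent.
function hasUserAgent(array $server): bool
{
    return isset($server['HTTP_USER_AGENT'])
        && is_string($server['HTTP_USER_AGENT'])
        && trim($server['HTTP_USER_AGENT']) !== '';
}

// Only enforce the rule for web requests, so the file can still be loaded
// from the command line without exiting.
if (PHP_SAPI !== 'cli' && !hasUserAgent($_SERVER)) {
    http_response_code(403);
    exit('A User-Agent header is required to use this site.');
}
```

Bear in mind that a client can send any string it likes as its user agent, so this only filters out clients that send nothing at all.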
Quote:
HTTP_USER_AGENT would more often than not be a lot less than the individuals with JavaScript disabled, which is around 6% according to W3Schools' browser statistics. |
Nice article. What do you recommend instead, then? Sessions are not a clear indicator of a unique visitor either, so how can we track unique visitors correctly if a large number of them will not have unique IPs?
|
I'm hoping someone can prove me wrong on this one, but I'm fairly sure there isn't an alternative. There's only so much data we can get our hands on, and unfortunately there just isn't enough of it to determine whether one user is truly unique.
|
I have read some discussion about HTTP_X_FORWARDED_FOR before, so after a little browsing I found this on the php.net website:
PHP Code:
Edit: By the looks of things, HTTP_X_FORWARDED_FOR can contain a comma-delimited list (when a user has been through multiple proxies), so it might be worth exploding that as well. |
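The php.net snippet is not shown above, so this is just a rough sketch of the "explode it" idea from the edit. The name parseForwardedFor is made up for illustration, and dropping entries that are not valid IP addresses is an extra step I added. Keep in mind the header comes from the client or a proxy and can be forged.

```php
<?php
// Hypothetical helper: split an X-Forwarded-For value into valid IP addresses.
function parseForwardedFor(string $header): array
{
    $addresses = [];
    foreach (explode(',', $header) as $part) {
        $candidate = trim($part);
        // filter_var() returns false for anything that is not a valid
        // IPv4 or IPv6 address (e.g. "unknown" or an empty string).
        if (filter_var($candidate, FILTER_VALIDATE_IP) !== false) {
            $addresses[] = $candidate;
        }
    }
    return $addresses;
}
```

For example, parseForwardedFor('203.0.113.7, 198.51.100.2') gives an array of those two addresses in the order they appeared in the header.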
Yeah, I covered that in this article, but unfortunately it is an optional parameter. There really is no way to detect a unique visitor. As Karl mentioned, we only have limited data; the user agent is a good one, but again it's optional.
|
Nice articles, but I don't think fingerprints are the best bet for certain things. Say you're building a script to see how many times a user has visited: if they switch between browsers it becomes pretty ineffective, especially if they're banned.
|
Yeah, you can create a pretty good hash of a user from all the information combined, but to be honest you should never rely on any information sent by the user, including headers, IPs, etc.; the majority of things like that can be spoofed.
It's best to create a session, and simply have the user log in again if they don't have the cookie. |
Quote:
|
Nor, I think you're misunderstanding the point of the fingerprint in this case. This fingerprint would be useful where you want to try and recognize the same IP. For example, if you were creating a rating system, you wouldn't want a user to be able to submit a rating, then switch from FF to IE and submit again.
|
Quote:
Quote:
|
Ah yes, my mistake. I gave the wrong example; the point behind this type of fingerprint is to protect against session hijacking. This method would be especially helpful to stop, for example, one person attempting to hijack another person's session by masking their IP. Assuming they're using different browsers, the fingerprint could detect this type of attack.
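To make the session hijacking case a bit more concrete, here is a rough sketch, assuming the fingerprint was stored in the session under a 'fingerprint' key at login. That key, the function name sessionMatchesFingerprint and the SHA-256 hash are all my own choices, not something from the article.

```php
<?php
// Hypothetical helper: does the fingerprint saved at login match this request?
function sessionMatchesFingerprint(array $session, string $ip, string $userAgent): bool
{
    if (!isset($session['fingerprint']) || !is_string($session['fingerprint'])) {
        return false;
    }
    $current = hash('sha256', $ip . '|' . $userAgent);
    // hash_equals() compares the two strings in constant time.
    return hash_equals($session['fingerprint'], $current);
}
```

On each request you would call it with $_SESSION, $_SERVER['REMOTE_ADDR'] and $_SERVER['HTTP_USER_AGENT'], and make the user log in again on a mismatch. A visitor whose IP legitimately changes between requests would be logged out too, so this is a trade-off rather than a fix.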
And by the way, it doesn't hurt to be polite to people. After all, if I'm going to be hit with a blunt attitude for trying to help, then why should I even bother in the first place? |
There is no surefire way of detecting and tracking a unique visitor; that's just the nature of how things on the web are done. I think we're all in agreement on that. However, one particular issue that I could see cropping up here is when a certain big ISP, who shall remain nameless, rotates their IPs, sometimes on a page-by-page basis. Obviously an extreme case, but in that instance the hash would be different and the visitor would be deemed "unique". Then again, that example shows quite clearly why you shouldn't rely on an IP. :)
|
Quote:
|
So, how would you get the most accurate information about which IP the user is on? And could you combine X_FORWARDED_FOR with HTTP_USER_AGENT into a "best-we-can-get" function which would try to build a "fingerprint" of the user?
|
Well, as Salathe rightly pointed out, some ISPs rotate the IP on every page, and as Salathe is not man enough to name names himself (grins), I'll do it and face the consequences! As AOL rotates the IP on every page, unique visitors can never be identified with certainty.
However, as Bluesaga mentioned, and as I have verified with another source, the comma-separated X_FORWARDED_FOR would be extremely useful. What I couldn't find is the ordering of the comma-and-space-separated X_FORWARDED_FOR values, so if anybody knows, please let me know. Let's assume that the original IP will be at the beginning of the X_FORWARDED_FOR value. The following code would work: PHP Code:
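The code from this post is missing, so what follows is a sketch of a getUserIP function under the assumption stated above, namely that the left-most X_FORWARDED_FOR entry is the original client. Passing the server array in as a parameter, validating with filter_var() and falling back to REMOTE_ADDR are my additions. The header is client-supplied, so the result should be treated as a hint and not as proof.

```php
<?php
// Sketch of the getUserIP idea; the $server parameter is an assumption
// added so the function can be tried without a real request.
function getUserIP(array $server): string
{
    if (!empty($server['HTTP_X_FORWARDED_FOR'])) {
        $ips = array_map('trim', explode(',', $server['HTTP_X_FORWARDED_FOR']));
        // reset() returns the first element, assumed to be the original client.
        $first = reset($ips);
        if (filter_var($first, FILTER_VALIDATE_IP) !== false) {
            return $first;
        }
    }
    // Fall back to the address of the directly connected peer.
    return $server['REMOTE_ADDR'] ?? '';
}

// Mimic a request that passed through two proxies (documentation addresses).
$fake = [
    'HTTP_X_FORWARDED_FOR' => '203.0.113.7, 198.51.100.2, 192.0.2.1',
    'REMOTE_ADDR'          => '192.0.2.1',
];
echo getUserIP($fake), PHP_EOL; // prints 203.0.113.7
```

If the ordering assumption turned out to be the other way round, the only change would be taking the last element of the list (end()) in place of the first (reset()).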
Let's mimic the X_FORWARDED_FOR and add the following code that calls our getUserIP function: PHP Code:
reset function with end. Though knowing the correct order would be much appreciated! |
According to Wikipedia (X-Forwarded-For) the accepted order is to have "the left-most being the farthest downstream client, and each successive proxy that passed the request adding the IP address where it received the request from." So your assumption on the IP order is correct.
P.S. I did say that there is no way of identifying unique visitors and I was even man enough to start off my post with that point. But it's a valid enough point to be hammered home by repeating it. :) |
Quote:
But can't you use, for example, HTTP_USER_AGENT in the function as well? Or wouldn't that work? |
You could use HTTP_USER_AGENT together with the function call, if you wish:
PHP Code:
|
Okay, so how's this?
PHP Code:
|
Looks good to me!
|