help with fscanf
I have to read a very large text file with PHP (about 1 million lines and 45.4 MB) and speed is critical. The file is laid out like so.
Notice the extra space before and after each line.
Code:
 {"one":1727294,"two":2667541,"three":9998168} 
Right now I have it so I can select an entire line if I know the whole thing. Here is what I got for that:
PHP Code:
PHP Code:
|
If you're after just a simple search, you could use fgets to grab the line and strpos to search it.
PHP Code:
|
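The fgets/strpos suggestion above could look roughly like the following. This is only a sketch of the idea, not the code from the original post: `find_line()` is a hypothetical helper, and a small temporary file stands in for the real million-line file.

```php
<?php
// Hypothetical helper: returns the first line containing $needle (trimmed),
// or null if no line matches or the file cannot be opened.
function find_line(string $path, string $needle): ?string
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }
    $found = null;
    // fgets reads one line per call, so only one line is held in memory at a time.
    while (($line = fgets($handle)) !== false) {
        if (strpos($line, $needle) !== false) {
            $found = trim($line);
            break;
        }
    }
    fclose($handle);
    return $found;
}

// Tiny stand-in for the real data file (same layout, space before and after).
$path = tempnam(sys_get_temp_dir(), 'scan');
file_put_contents(
    $path,
    ' {"one":1,"two":2,"three":3} ' . "\n" .
    ' {"one":1727294,"two":2667541,"three":9998168} ' . "\n"
);

// The trailing comma keeps "one":1727294 from also matching "one":17272940.
$match = find_line($path, '"one":1727294,');
unlink($path);
```

Stopping at the first hit means the cost depends on where the line sits in the file; a search for a line near the end still reads everything before it.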
Quote:
Wow that's pretty fast for a file with a million records... |
@allworknoplay: That's because I don't have to load the data into PHP when just using fscanf.
@Salathe: The problem with your code is that it requires me to load every line into PHP and then check it. I need to be able to do this with only fscanf if at all possible. |
Quote:
I don't necessarily trust databases for everything. Sometimes I think people go overboard with databases because of the ease of use for the SQL language, but it doesn't always make sense to use a database for everything. |
Quote:
What if you were to return all of the lines you were searching for anyway, but then once you retrieve the line, you then parse the info? This way you're only parsing what you need AFTER you get the full line, instead of trying to parse it ahead of time which would require much more resources? Does that make sense? |
Quote:
You seem to really know your PHP. This is a bit OT from this thread, but would you happen to know of any good PHP/CSS way of creating really professional-looking charts/graphs? I'm currently using Flash graphs, which look spectacular, but I really want to get away from Flash and just use charts that can be output to PNG or JPG format. I've seen a couple of the popular PHP ways to generate graphs and they just don't look clean and sharp. What I mean is, they look pixelated. I know this is kind of a loaded question, but if you are familiar with any ways to make really neat-looking charts in PHP, please let me know! I can scour around and provide you with the "look and feel" of what I'm looking for and you can let me know if this is possible or not. |
allworknoplay, post a new topic for that.
|
Quote:
The problem is I have to know what the entire line is in order to do that. EDIT: By looking at the comments on http://us.php.net/fscanf I found you can use regex. That could very well solve my problem. |
All fscanf does is parse a single line according to the formatting string provided. Show me your code which finds all matching rows in 80ms and the benchmark you did as my own experiments were nothing like that amount of time.
|
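To illustrate the point about the format string, here is a small sketch using `sscanf`, which applies the same kind of format to a string that `fscanf` applies to each line it reads. The format below is an assumption written for the sample line from the first post, not the format used in the attached scripts.

```php
<?php
// Sample line from the first post, including the leading and trailing space.
$line = ' {"one":1727294,"two":2667541,"three":9998168} ';

// Literal characters must match the input exactly; a space in the format
// matches any run of whitespace (including none); %d reads an integer.
$format = ' {"one":%d,"two":%d,"three":%d}';

// With no extra arguments, sscanf returns the parsed values as an array.
$fields = sscanf($line, $format);
list($one, $two, $three) = $fields;
```

Here `$fields` holds the three integers from the line. The same format passed to `fscanf` would be applied to one line of the file per call.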
Attached are the scripts I used to do the same test on a similar file. Takes me about 60 ms to run huge_select.php
Note that this is being run on a ~8.5 MB, 1 million line file. I also ran these tests on the file listed above. Run huge_create.php first, then huge_select.php. files.php is a file class I made; file::scan() is a wrapper for the fscanf function. The other PHP file is the class used for benchmarking. It worked on my server and my friend's, so I'm not crazy. If it doesn't work for you then you are doing something wrong... That or I'm doing something terribly right. :-D Obviously fscanf was not written in PHP and thus works a lot faster than comparing each string manually with PHP. |
I can only reiterate what was said before, fscanf is only parsing a line into your requested format (if it can). Whether you use fscanf to read a line or fgets, PHP is still reading the file line by line behind-the-scenes.
Your servers must have much faster disk IO than my laptop and cheap shared hosting, which run your tests at over 2 seconds: both with fscanf and fgets. |
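For comparison, a search done with fscanf alone might be sketched as below. `find_by_one()` is a hypothetical helper (it is not the `file::scan()` wrapper from the attachment, which is not shown in this thread), and the known value is placed in the format as a literal so that only the wanted line parses completely. Each `fscanf` call still consumes exactly one line of the file.

```php
<?php
// Hypothetical helper: returns [two, three] for the first line whose "one"
// field equals $one, or null if there is no such line.
function find_by_one(string $path, int $one): ?array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }
    // Known value as a literal; only the unknown fields use %d.
    $format = ' {"one":' . $one . ',"two":%d,"three":%d}';
    $result = null;
    // fscanf returns false once there are no more lines to read.
    while (($fields = fscanf($handle, $format)) !== false) {
        // A line that does not fit the format leaves unparsed fields as null.
        if (is_array($fields) && !in_array(null, $fields, true)) {
            $result = $fields;
            break;
        }
    }
    fclose($handle);
    return $result;
}

// Tiny stand-in for the real data file.
$path = tempnam(sys_get_temp_dir(), 'scan');
file_put_contents(
    $path,
    ' {"one":17272940,"two":5,"three":6} ' . "\n" .
    ' {"one":1727294,"two":2667541,"three":9998168} ' . "\n"
);
$hit  = find_by_one($path, 1727294);
$miss = find_by_one($path, 42);
unlink($path);
```

If the first field were written as `%d` instead of a literal, every well-formed line would parse successfully, so the loop would stop at the first line regardless of its value.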
I don't know what to tell ya man, but I'm just running on a cheap Dell... The program did take over 3 seconds when I used the wrong syntax for fscanf, and when I tried loading and checking every line. But when I got the syntax right it took a little less than 60 ms for me and my friend. Every time, and it still does. And that is on completely different hardware and software too.
I highly doubt it is my hardware, or the fact that I'm running Vista, that is making it so fast. I'll have to run these tests on my web host and my old 2000 XP computer. Could anybody else try running these tests? This is all very interesting indeed. |
What were you doing wrong initially, to take over 3 seconds, and what did you do to fix it? Do you now have a working script doing what you initially wanted (find a matching line)?
|
Initially, I had the format for the fscanf function messed up so it matched all of the lines in the table. It's pretty easy to mess up, like the second piece of code I posted waaay up at the top of this thread:
PHP Code:
|
Quote:
Yes, this is a very interesting issue. I'll test this out myself. I am running a 64-bit Vista laptop, but that part actually doesn't matter. I have VMware running with CentOS 5.2. I can definitely help verify how quickly the script runs, because if I can get it at around 60-80 ms like you, then to me those would be great benchmark speeds since it's running on a virtualized OS... I'll let you know what I discover... |
Tested on an XP machine (32-bit).
Hardware: AMD Turion 64x2, 5400 rpm hard drive (can't remember more details :-P). (Hope I didn't do anything wrong ;-)) |
how many milliseconds equals 1 second?
|
Quote:
@allworknoplay: A millisecond is one thousandth of a second. |