05-08-2008, 07:41 PM
No worries Cory :) No database can handle that type of data out of the box. There is definite tuning / optimization that needs to be done to handle that. I'm almost starting to think I created some type of locking issue with it, rather than it just being from the sheer row count.
I am batch processing in 25; however, it runs a little like this:
Query 25 records, update 25 records with status of 'in-process' (already two queries in the first batch);
Loop over each record and update it with its unique page information, saving the information in an array; this also grabs possible new records from each page it is data mining from;
Update the 25 records with their new information and set status to 'complete';
INSERT IGNORE new possible records, silently failing the insert on duplicate keys
.......
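For anyone following along, the steps above could be sketched roughly like this. This is only an illustration under assumptions: the `pages` table, its columns, and the `fetch_page` helper are hypothetical names, and it uses Python's built-in `sqlite3` so it runs anywhere, which means `INSERT OR IGNORE` stands in for MySQL's `INSERT IGNORE`.

```python
import sqlite3

BATCH_SIZE = 25

def make_db(seed_count):
    """Build a throwaway in-memory table (hypothetical schema) with seed URLs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages ("
        "id INTEGER PRIMARY KEY, url TEXT UNIQUE NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending', info TEXT)"
    )
    seeds = [(f"http://example.com/{i}",) for i in range(seed_count)]
    conn.executemany("INSERT INTO pages (url) VALUES (?)", seeds)
    conn.commit()
    return conn

def fetch_page(url):
    """Hypothetical stand-in for the data-mining step: returns (info, new_urls)."""
    return f"info for {url}", [url + "/next", "http://example.com/shared"]

def process_batch(conn):
    cur = conn.cursor()
    # Query 1: pull up to 25 pending records.
    rows = cur.execute(
        "SELECT id, url FROM pages WHERE status = 'pending' LIMIT ?",
        (BATCH_SIZE,),
    ).fetchall()
    if not rows:
        return 0
    ids = [row_id for row_id, _ in rows]
    marks = ",".join("?" for _ in ids)
    # Query 2: mark the whole batch 'in-process' with a single statement.
    cur.execute(f"UPDATE pages SET status = 'in-process' WHERE id IN ({marks})", ids)
    conn.commit()

    # Loop over each record, collecting results and newly discovered records.
    results, discovered = [], []
    for row_id, url in rows:
        info, new_urls = fetch_page(url)
        results.append((info, row_id))
        discovered.extend((u,) for u in new_urls)

    # Step 3: unique data per record, so this runs one UPDATE per row.
    cur.executemany(
        "UPDATE pages SET info = ?, status = 'complete' WHERE id = ?", results
    )
    # Step 4: duplicates are skipped silently thanks to the UNIQUE key on url.
    cur.executemany("INSERT OR IGNORE INTO pages (url) VALUES (?)", discovered)
    conn.commit()
    return len(rows)
```

Calling `process_batch(make_db(30))` handles one batch of 25 and leaves the rest (plus anything newly discovered) as 'pending' for the next call.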
So for every 25 records that I process, I make 4 queries to the same table. I don't think it can get much more optimized than this, as I need the newly pulled records to be marked as 'in-process'. While updating the 25 records I use a transaction in hopes of better performance... however, I currently believe I am making 25 separate calls to the database during this update (unique data for each unique record), regardless of the transaction... I need to research that further.
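On the 25 separate calls: a transaction groups the commit, but it does not merge statements, so each UPDATE is still sent on its own. One possible way around that, sketched here as an assumption rather than something I have benchmarked, is to fold the per-record values into a single UPDATE with a CASE expression. The `pages` table, its columns, and the `bulk_update` name are hypothetical, and `sqlite3` is used only so the sketch is runnable; the same CASE syntax is also valid in MySQL.

```python
import sqlite3

def bulk_update(conn, results):
    """Apply unique info to many rows with ONE UPDATE statement.

    results is a list of (row_id, info) pairs; returns the number of rows changed.
    """
    if not results:
        return 0
    # One "WHEN id THEN info" arm per record, plus an IN list to limit the rows touched.
    cases = " ".join("WHEN ? THEN ?" for _ in results)
    marks = ",".join("?" for _ in results)
    sql = (
        f"UPDATE pages SET status = 'complete', info = CASE id {cases} END "
        f"WHERE id IN ({marks})"
    )
    # Parameters: id/info pairs for the CASE arms, then the ids again for the IN list.
    params = [value for pair in results for value in pair]
    params += [row_id for row_id, _ in results]
    cur = conn.execute(sql, params)
    conn.commit()
    return cur.rowcount
```

With 25 records per batch this is a statement with 75 bound parameters, which stays well within normal limits; whether it actually helps with the locking is something to measure rather than assume.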
Last edited by drewbee : 05-08-2008 at 08:01 PM.