Problem when pushing to an array in PHP when length is over 1 million


21 replies to this topic

#1 Noitidart
  • Neocodex Co-Founder
  • 23214 posts

Posted 07 September 2011 - 11:09 PM

So I selected like 1.6 million rows from a table. No problem there. Now:
		while ($row = mysql_fetch_assoc($checkUpdateMadeQuery)) {
			//$raw[] = array('id' => $row['id'], 'nut' => $row['neoUpdateTime'], 'cur' => $row['curr'], 'tick' => $row['ticker']);
		}

That works fine. But as soon as you uncomment the $raw[] line, which pushes to the array, it returns a blank page after like a minute. Sometimes it says no data delivered.




Please help me get this pushed into the array.

#2 artificial
  • 186 posts

Posted 08 September 2011 - 05:51 AM

That's poor design. You shouldn't be loading 1.6 million entries into an array like that.

If I were to take a guess, I would say you're either exhausting PHP's configured memory limit or hitting the max page execution time. Try:

ini_set('display_errors', 'On');


at the top of your script. But like I said, it's poor design, and there's likely a better alternative. What exactly are you trying to accomplish?

#3 Waser Lave
  • 25516 posts

Posted 08 September 2011 - 05:52 AM

That's a shit load of data to be putting into memory...

#4 DoNotAnnoyMe
  • 157 posts

Posted 08 September 2011 - 06:45 AM

Why do you need all the rows in memory at once? You probably want to process them one at a time.

In any case, maybe try to elaborate on what you're actually trying to do - I'm sure there's a better way than grabbing a whole database for it.
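For example, a minimal sketch of one-at-a-time processing, reusing the query and column names from the first post (processRow() is just a hypothetical stand-in for whatever per-row work is needed):

while ($row = mysql_fetch_assoc($checkUpdateMadeQuery)) {
	// handle each row as it arrives - nothing accumulates, so memory stays flat
	processRow($row['id'], $row['neoUpdateTime'], $row['curr'], $row['ticker']);
}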

#5 Noitidart
  • Neocodex Co-Founder
  • 23214 posts

Posted 10 September 2011 - 07:52 AM

I'm doing this locally. I'm having to clean up a table on the database. I'm removing data that was received multiple times during calculated periods.

Any other ways to clean it rather than using PHP?

I can do it piecewise (take like 100k rows at a time) and have something keep reloading the page, but I just want to get this done in one shot no matter how long it takes to load.

#6 Waser Lave
  • 25516 posts

Posted 10 September 2011 - 08:11 AM

I'm doing this locally. I'm having to clean up a table on the database. I'm removing data that was received multiple times during calculated periods.

Any other ways to clean it rather than using PHP?

I can do it piecewise (take like 100k rows at a time) and have something keep reloading the page, but I just want to get this done in one shot no matter how long it takes to load.


Can you not just do it in a single SQL query rather than using PHP?

#7 artificial
  • 186 posts

Posted 10 September 2011 - 06:10 PM

If you really have to, increase PHP's memory limit and page execution time. At the top of your script:

ini_set('display_errors', 'On');  // surface errors instead of a blank page
ini_set('memory_limit', '512M');  // raise the memory ceiling
set_time_limit(300);              // allow up to 5 minutes of execution


That sets a memory limit of 512M and a page execution time of 5 minutes. If it has no effect, edit your php.ini file instead.
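If 512M still isn't enough, the same setting accepts larger values, and -1 removes the cap entirely - only sensible for a one-off local job:

ini_set('memory_limit', '1024M'); // go as high as the machine can spare
// ini_set('memory_limit', '-1'); // or remove the limit altogether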

Edited by heartificial, 10 September 2011 - 06:10 PM.


#8 DoNotAnnoyMe
  • 157 posts

Posted 11 September 2011 - 04:09 AM

I'm doing this locally. I'm having to clean up a table on the database. I'm removing data that was received multiple times during calculated periods.

Any other ways to clean it rather than using PHP?

I can do it piecewise (take like 100k rows at a time) and have something keep reloading the page, but I just want to get this done in one shot no matter how long it takes to load.


Can you not just do it in a single SQL query rather than using PHP?


I'd recommend exactly that. E.g. take a look at http://www.xaprb.com...-rows-with-sql/ (probably not the most efficient way to do it, but still a lot better than what you attempted).
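The usual shape of it is a self-join DELETE that keeps one row per duplicate group. A rough sketch - the table and column names here (price_log, ticker, logtime, id) are made up, not from your schema:

mysql_query(
	'DELETE t1 FROM price_log t1
	 JOIN price_log t2
	   ON t1.ticker = t2.ticker
	  AND t1.logtime = t2.logtime
	  AND t1.id > t2.id' // keeps the lowest id in each duplicate group
);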

#9 Noitidart
  • Neocodex Co-Founder
  • 23214 posts

Posted 11 September 2011 - 09:25 AM

I can't do it with a query because, for a row to be declared a dupe, the minute of the log time has to fall in 58-27 or 28-57. Kind of see what I mean? Can MySQL do this?

Thanks xth I'll try that.

#10 Waser Lave
  • 25516 posts

Posted 11 September 2011 - 09:53 AM

Can you not just select the entries which are at 58-27 or 28-57 and then put them in their own table?

#11 Noitidart
  • Neocodex Co-Founder
  • 23214 posts

Posted 11 September 2011 - 09:56 AM

there are duplicate entries in that range though :(

#12 Waser Lave
  • 25516 posts

Posted 11 September 2011 - 09:58 AM

there are duplicate entries in that range though :(


Group them by date too then?

#13 Noitidart
  • Neocodex Co-Founder
  • 23214 posts

Posted 11 September 2011 - 11:50 PM

Shet man, I have no idea. I can't follow what we're talking about. :S Sorry man.

Plus rep to donot and xth and laser for all your help!

If you really have to, increase PHP's memory limit and page execution time. At the top of your script:

ini_set('display_errors', 'On');  // surface errors instead of a blank page
ini_set('memory_limit', '512M');  // raise the memory ceiling
set_time_limit(300);              // allow up to 5 minutes of execution


That sets a memory limit of 512M and a page execution time of 5 minutes. If it has no effect, edit your php.ini file instead.

The execution time definitely has an effect, as I tested it. But I don't know about the memory_limit - is it possible to increase that even higher?
Because I still got a blank page :(

#14 DoNotAnnoyMe
  • 157 posts

Posted 13 September 2011 - 01:09 AM

I can't do it with a query because, for a row to be declared a dupe, the minute of the log time has to fall in 58-27 or 28-57. Kind of see what I mean? Can MySQL do this?

Thanks xth I'll try that.


Yes, SQL can do this just fine.
Simply define a scalar function that groups those times accordingly, e.g. (pseudocode):

function(time) { if (time.seconds > 28) time.seconds = 30; else time.seconds = 0; return time; }

Then, instead of using the column in the SELECT as shown in the article I posted, use something along the lines of "SELECT yourFunction(foo) AS separation FROM bar GROUP BY separation".

Edited by DoNotAnnoyMe, 13 September 2011 - 01:23 AM.


#15 Noitidart
  • Neocodex Co-Founder
  • 23214 posts

Posted 13 September 2011 - 02:19 AM

Yes, SQL can do this just fine.
Simply define a scalar function that groups those times accordingly, e.g. (pseudocode):

function(time) { if (time.seconds > 28) time.seconds = 30; else time.seconds = 0; return time; }

Then, instead of using the column in the SELECT as shown in the article I posted, use something along the lines of "SELECT yourFunction(foo) AS separation FROM bar GROUP BY separation".

Oh WOW! I never knew that was possible!! Can you link me to something that teaches more about this? This is absolutely amazing!
Thanks!

#16 artificial
  • 186 posts

Posted 13 September 2011 - 06:07 AM

I don't know if it's as simple as what he's hinting at. Are you sure that sort of syntax is native to SQL (at a glance I would say not)? I would have thought it would be up to the database system to implement features like that. For instance, in MySQL you can create and store functions/procedures:

http://dev.mysql.com...-procedure.html
http://dev.mysql.com...nes-syntax.html
http://dev.mysql.com...-functions.html

And I'm pretty sure it's different depending on what sort of database system you use.
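For what it's worth, a MySQL stored function along these lines could do the bucketing he described - a sketch only, with the function name and the DATETIME handling being assumptions on my part:

// snap a DATETIME's seconds to :30 or :00, per the pseudocode above
mysql_query(
	"CREATE FUNCTION bucketTime(t DATETIME) RETURNS DATETIME DETERMINISTIC
	 RETURN IF(SECOND(t) > 28,
	           t - INTERVAL SECOND(t) SECOND + INTERVAL 30 SECOND,
	           t - INTERVAL SECOND(t) SECOND)"
);

Then something like "SELECT bucketTime(foo) AS separation FROM bar GROUP BY separation" would group rows into the same buckets.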

#17 DoNotAnnoyMe
  • 157 posts

Posted 13 September 2011 - 08:01 AM

Well yeah, it's not exactly part of SQL, but I think most implementations (e.g. MySQL, SQLite) offer that sort of thing.
Anyway, in your case you probably don't even need a function, as you could probably do it straight away in the query.
(think towards SELECT (time % 60 > 27 ? time - time % 60 : time - time % 60 + 30) AS newtime ...)
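MySQL has no ?: operator, so the literal translation would use IF(). Roughly, assuming time is a unix-timestamp column as in the pseudocode:

mysql_query(
	'SELECT IF(time % 60 > 27,
	           time - time % 60,
	           time - time % 60 + 30) AS newtime
	 FROM bar
	 GROUP BY newtime'
);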

#18 Noitidart
  • Neocodex Co-Founder
  • 23214 posts

Posted 14 September 2011 - 04:52 PM

Thanks all, I'm learning a lot here.
I got all the data, and now my new issue is that I have this huge insert but nothing is getting inserted:
php:
mysql_query('INSERT INTO ibf_cdx_stockmanager_daily (ticker, nst_unix, avg, min, max, open, close) VALUES ' . $insertStr);

$insertStr has about 29k value tuples:
("AAVL",1311058800,20,20,20,20,20),("ACFI",1311058800,8.58,7,9,8,9),......................


#19 artificial
  • 186 posts

Posted 14 September 2011 - 09:13 PM

Try breaking them up into separate queries, loop through and log all that fail (or alternatively log all that succeed). You should be able to write a small script within a few minutes that ensures each entry is successfully added.
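A sketch of that - $tuples (holding one "(...)" value list per entry) and the log file name are made up:

foreach ($tuples as $tuple) {
	$sql = 'INSERT INTO ibf_cdx_stockmanager_daily (ticker, nst_unix, avg, min, max, open, close) VALUES ' . $tuple;
	if (!mysql_query($sql)) {
		// message type 3 appends to the given file; mysql_error() says why it failed
		error_log(mysql_error() . ' -- ' . $tuple . "\n", 3, 'failed_inserts.log');
	}
}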

#20 DoNotAnnoyMe
  • 157 posts

Posted 15 September 2011 - 12:28 AM

Thanks all, I'm learning a lot here.
I got all the data, and now my new issue is that I have this huge insert but nothing is getting inserted:
php:

mysql_query('INSERT INTO ibf_cdx_stockmanager_daily (ticker, nst_unix, avg, min, max, open, close) VALUES ' . $insertStr);

$insertStr has about 29k value tuples:
("AAVL",1311058800,20,20,20,20,20),("ACFI",1311058800,8.58,7,9,8,9),......................


Try - instead of creating one huge insert - creating many inserts in one transaction, using a prepared statement for better performance (the query only has to be parsed once but is executed many times with different values).

I.e. something along the lines of this (using PDO):
$data = array(/* array(first data set), array(second data set), ... */);
$connection = new PDO('mysql:host=yourdbhost;dbname=yourdbname', $user, $pass);
// make PDO throw exceptions so the catch block below actually fires
$connection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$query = $connection->prepare('INSERT INTO ibf_cdx_stockmanager_daily (ticker, nst_unix, avg, min, max, open, close) VALUES (?,?,?,?,?,?,?)');
try
{
  $connection->beginTransaction();
  for ($i = 0; $i < count($data); ++$i)
  {
	$query->execute($data[$i]); // one parse, many executions
  }
  $connection->commit();
}
catch (Exception $e)
{
  $connection->rollBack();
  echo $e->getMessage();
}

Note that the data doesn't have to be available all at once; e.g. you could also fetch it between the beginTransaction and the commit by some means and execute the query as the data comes in.
Also, you can use named placeholders instead of positional ones. In that case the "(?,?,...)" would change to something like "(:ticker,:time,:avg,...)", and instead of a simple array for each data set you'd have an associative array with keys matching your placeholders: array('ticker' => "whatever", 'time' => "...", ...).
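For example, the named-placeholder variant might look like this (same hypothetical connection as above; the sample values are the first tuple from your post):

$query = $connection->prepare('INSERT INTO ibf_cdx_stockmanager_daily (ticker, nst_unix, avg, min, max, open, close) VALUES (:ticker, :nst_unix, :avg, :min, :max, :open, :close)');
$query->execute(array(
	':ticker'   => 'AAVL',
	':nst_unix' => 1311058800,
	':avg'      => 20,
	':min'      => 20,
	':max'      => 20,
	':open'     => 20,
	':close'    => 20,
));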

#21 Noitidart
  • Neocodex Co-Founder
  • 23214 posts

Posted 16 September 2011 - 04:56 PM

Thanks xth and dnam!! I will try and get back to you soon. :)

#22 Noitidart
  • Neocodex Co-Founder
  • 23214 posts

Posted 20 September 2011 - 12:25 AM

It worked thanks guys!

