Thursday, 13 August 2015

Increasing INSERT Performance in Django For Many Records of HUGE Data

I've been trying to solve this for a while now and can't find a way to speed up inserts with Django, despite the many suggestions and tips found on StackOverflow and through many Google searches.

Basically, I need to insert a LOT of data records (~2 million) through Django into my MySQL DB, each record being a whopping 180 KB. I've scaled my testing down to 2,000 inserts, yet I still can't get the running time down to a reasonable amount. 2,000 inserts currently take approximately 120 seconds.
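To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It only uses the figures quoted above; the assumptions are that 1 KB means 1,000 bytes and that the insert time scales linearly from the 2,000-record test to the full 2 million records, which may not hold in practice.

```python
# Figures quoted in the post.
sample_records = 2_000      # size of the scaled-down test
sample_seconds = 120        # measured running time for that test
total_records = 2_000_000   # size of the real job
record_kb = 180             # assumption: 1 KB = 1,000 bytes

# Average cost of a single insert in the test run.
seconds_per_insert = sample_seconds / sample_records

# Effective write throughput during the test, in MB/s.
throughput_mb_per_s = sample_records * record_kb / 1_000 / sample_seconds

# Projected duration of the full job, assuming linear scaling.
projected_hours = total_records * sample_seconds / sample_records / 3_600

# Total volume of data to be written, in GB.
total_gb = total_records * record_kb / 1_000_000

print(f"{seconds_per_insert * 1000:.0f} ms per insert")
print(f"{throughput_mb_per_s:.1f} MB/s effective throughput")
print(f"{projected_hours:.1f} hours for the full job")
print(f"{total_gb:.0f} GB in total")
```

So the test works out to roughly 60 ms per insert and about 3 MB/s, and the full job would be around 360 GB taking a bit over 33 hours at that rate.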

So I've tried ALL of the following (and many combinations of each) to no avail:

  • "Classic" Django ORM create model and .save()
  • Single transaction (transaction.atomic())
  • bulk_create()
  • Raw SQL INSERT in for loop
  • Raw SQL "executemany" (multiple value inserts in one query)
  • Setting SQL attributes like "SET FOREIGN_KEY_CHECKS=0"
  • SQL BEGIN ... COMMIT
  • Dividing the mass insert into smaller batches
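For reference, this is roughly the shape of the combined "single transaction + executemany + smaller batches" attempt. It is only a minimal sketch, not the actual project code: it uses the standard-library sqlite3 module as a stand-in for the Django/MySQL connection so that it runs on its own, and the `record` table, the `payload` column, and the `batched` and `insert_in_batches` helpers are all hypothetical names. The payloads are also shrunk to 1 KB to keep the demo light.

```python
import sqlite3
from itertools import islice


def batched(rows, batch_size):
    """Yield successive lists of at most batch_size rows from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(connection, rows, batch_size=500):
    """Run one executemany call per batch and commit once at the very end."""
    inserted = 0
    cursor = connection.cursor()
    for batch in batched(rows, batch_size):
        # Hypothetical table and column; sqlite3 uses "?" placeholders,
        # whereas MySQL drivers use "%s".
        cursor.executemany("INSERT INTO record (payload) VALUES (?)", batch)
        inserted += len(batch)
    # A single commit keeps all batches inside one transaction.
    connection.commit()
    return inserted


# Stand-in database: in-memory SQLite instead of MySQL (assumption).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE record (id INTEGER PRIMARY KEY, payload BLOB)")

# 2,000 rows as in the scaled-down test, but with 1 KB payloads
# instead of 180 KB so the example stays small.
rows = ((b"x" * 1024,) for _ in range(2000))
count = insert_in_batches(connection, rows, batch_size=500)
print(count, "rows inserted")
```

In Django itself the closest built-in equivalent is `Model.objects.bulk_create(objs, batch_size=...)`. With 180 KB records the batch size also decides how large each statement gets (500 rows would be about 90 MB), so it presumably has to be chosen with MySQL's `max_allowed_packet` limit in mind.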

Apologies if I forgot to list something, but I've tried so many different things at this point that I can't even keep track, haha.

Would greatly appreciate a little help here in speeding up performance from someone who maybe had to perform a similar task with Django database insertions.

Please let me know if I've left out any necessary information!



via Chebli Mohamed
