Out Of Memory Exception on DataAdapter.Fill(Dataset)

Pho3nix

The Legend
Joined
Jul 31, 2009
Messages
30,589
Hi all,

Been struggling with this for the last couple days and maybe someone can help me out a bit.
Pulling data from a PosgreSQL box using Npgsql in C# as a web service.
Lately OOM exceptions have been popping up which is a problem :p

16 tables, 3 have 4 million rows.
Any smarter way to do this VS what I can currently doing and yes, I need all the columns.

Code:
List<DataSet> jTables = new List<DataSet>();
NpgsqlDataAdapter da = new NpgsqlDataAdapter();
try
{
                da.SelectCommand = new NpgsqlCommand("SELECT *  FROM \"projectversion\";", conn);
                DataSet projectversionTable = new DataSet();
                projectversionTable.DataSetName = "projectversionTable";
                da.Fill(projectversionTable);  --------------------------------------------> breaks here
                jTables.Add(projectversionTable);
}

/inb4 code nazi's.

Working on creating classes for the tables in the interim to get around this but your thoughts and opinions would be appreciated.
 

me_

Senior Member
Joined
Oct 11, 2013
Messages
830
Firstly, DataSets are really inefficient. They store the data twice (original value and current value) and store a lot of other additional data. I would rather use dataReaders populating custom DTOs that are then loaded to domain objects that contain any logic.
EF or another ORM mapper would probably work better.
Why do you need all the rows? What are you doing that you need all current data all the time? Would it not be sufficient to use an aggregated view of some of the data?
 

Spacerat

Expert Member
Joined
Jul 29, 2015
Messages
1,328
Running 64bit or 32bit?
How many columns in the table?

But seriously, please get away from dataset/datatable. They are bloated and slow. Farking terrible to use. Use a lightweight ORM to return a list of object i.e. IList<MyObject>.

you'll thank me later :whistling:
 
Last edited:

gkm

Expert Member
Joined
May 10, 2005
Messages
1,519
Why on earth are you reading the whole table in a single query. You should supply where clauses to only get the data you need.

If you actually need to deal with the whole table, you probably need to use some form of pagination like the stuff discussed here: http://stackoverflow.com/questions/109232/what-is-the-best-way-to-paginate-results-in-sql-server

Modern computers has a lot of memory, but it is not infinite. Also, database resources are also not infinite. And even if you have a huge database and tons of memory (which would be extremely wasteful), then you still do not really want to return millions of rows over the poor network from a webservice call. It will be slower than mud. And if you webservice client is lucky, it will time out the call, if not it might also OOM.
 
Last edited:

Spacerat

Expert Member
Joined
Jul 29, 2015
Messages
1,328
eh, get that daily. Our production server does ~20m a day and peaked over 100m easily.

Well yes we handled 300m rows a day in the telecoms billing domain. But you eat them in bite size chunks. You dont query them all to a resultset in one go....

@ Pho3nix: I think you might need to relook at your design and get rid of datasets
 

CamiKaze

Honorary Master
Joined
May 19, 2010
Messages
14,846
You would want to use a "using" for that connection string.

a SProc for that SQL command.

A data reader as well.
 

Pho3nix

The Legend
Joined
Jul 31, 2009
Messages
30,589
Well yes we handled 300m rows a day in the telecoms billing domain. But you eat them in bite size chunks. You dont query them all to a resultset in one go....

@ Pho3nix: I think you might need to relook at your design and get rid of datasets

Legacy code :p
Already started re-deving
 

Pho3nix

The Legend
Joined
Jul 31, 2009
Messages
30,589
Running 64bit or 32bit?
How many columns in the table?

But seriously, please get away from dataset/datatable. They are bloated and slow. Farking terrible to use. Use a lightweight ORM to return a list of object i.e. IList<MyObject>.

you'll thank me later :whistling:

5 columns. Moving away from it. Returning a list as per your example now :p
 

Pho3nix

The Legend
Joined
Jul 31, 2009
Messages
30,589
Why on earth are you reading the whole table in a single query. You should supply where clauses to only get the data you need.

If you actually need to deal with the whole table, you probably need to use some form of pagination like the stuff discussed here: http://stackoverflow.com/questions/109232/what-is-the-best-way-to-paginate-results-in-sql-server

Modern computers has a lot of memory, but it is not infinite. Also, database resources are also not infinite. And even if you have a huge database and tons of memory (which would be extremely wasteful), then you still do not really want to return millions of rows over the poor network from a webservice call. It will be slower than mud. And if you webservice client is lucky, it will time out the call, if not it might also OOM.

Everything is needed as this was for a data warehouse :p. Timeout was set at 50000000 and connection closed every 5 tables.

Will give the link a gander. Thanks :)
 

Solarion

Honorary Master
Joined
Nov 14, 2012
Messages
21,886
Select * could also be a problem. You are basically select a lot of hidden objects too.
 

Solarion

Honorary Master
Joined
Nov 14, 2012
Messages
21,886
Hmm.. Good point!

See how it goes. I used to use select * and stopped a while back, I think for similar reasons to yours can't remember but something bad happened along the way. Now even if I do need all the columns I specify each one.
 

etienne_marais

Honorary Master
Joined
Mar 16, 2008
Messages
15,093
As has been stated you will need to clarify what you are trying to achieve and limiting to batches of results at a time using a where clause (also stated).

I had a scenario where I needed to populate a grid in the UI with very large datasets retrieved via a webservice, but it entailed a searchbox in which you could type parts of words/sentences and match/filter out the relevant data in the grid. The grid was populated incrementally by repeatedly calling the webservice which returned data in batches (eg. 50 rows at a time) with enough idle-time in-between to prevent loss of UI interactivity / UI-Thread locking. If the user should type something in the searchbox it would make an additional webservice call to ensure that all rows matching that specific criteria are populated in the grid (checking of course if the data is not already in the grid before adding it).

Probably beyond your needs, I further made use of Reflection to dynamically build classe(s)representing the table(s) (or subset of the table(s)) and then creating instances of this dynamic type populated with the retrieved data. This allowed for the algorithm to retrieve any table(s) and building collections which could be assigned as the datasource of the grid.

I Used DataReader if I remember correctly.
 

Spacerat

Expert Member
Joined
Jul 29, 2015
Messages
1,328
As has been stated you will need to clarify what you are trying to achieve and limiting to batches of results at a time using a where clause (also stated).

I had a scenario where I needed to populate a grid in the UI with very large datasets retrieved via a webservice, but it entailed a searchbox in which you could type parts of words/sentences and match/filter out the relevant data in the grid. The grid was populated incrementally by repeatedly calling the webservice which returned data in batches (eg. 50 rows at a time) with enough idle-time in-between to prevent loss of UI interactivity / UI-Thread locking. If the user should type something in the searchbox it would make an additional webservice call to ensure that all rows matching that specific criteria are populated in the grid (checking of course if the data is not already in the grid before adding it).

Probably beyond your needs, I further made use of Reflection to dynamically build classe(s)representing the table(s) (or subset of the table(s)) and then creating instances of this dynamic type populated with the retrieved data. This allowed for the algorithm to retrieve any table(s) and building collections which could be assigned as the datasource of the grid.

I Used DataReader if I remember correctly.

this this this...

The Dapper ORM will do all this for you. If you want to aggregate DW data then doing it in the DB with a cursor is a better idea. Like etienne says, it all depends on what your goal is. The way he describes the search functioality is exactly how I do it. None of that paging rubbish.
 

Pho3nix

The Legend
Joined
Jul 31, 2009
Messages
30,589
Firstly, DataSets are really inefficient. They store the data twice (original value and current value) and store a lot of other additional data. I would rather use dataReaders populating custom DTOs that are then loaded to domain objects that contain any logic.
EF or another ORM mapper would probably work better.
Why do you need all the rows? What are you doing that you need all current data all the time? Would it not be sufficient to use an aggregated view of some of the data?

this this this...

The Dapper ORM will do all this for you. If you want to aggregate DW data then doing it in the DB with a cursor is a better idea. Like etienne says, it all depends on what your goal is. The way he describes the search functioality is exactly how I do it. None of that paging rubbish.

PostgreSQL OLEDB drivers aren't free sadly where I've checked and need those for DB implementation.
Pulling data from that to a MSSQL box thus the web service.
 

me_

Senior Member
Joined
Oct 11, 2013
Messages
830
PostgreSQL OLEDB drivers aren't free sadly where I've checked and need those for DB implementation.
Pulling data from that to a MSSQL box thus the web service.

? - I'm not following. Your using Npgsql which has an EF extension.
Also, you can do standard data readers from any ADO.NET provider.
 

Pho3nix

The Legend
Joined
Jul 31, 2009
Messages
30,589
? - I'm not following. Your using Npgsql which has an EF extension.
Also, you can do standard data readers from any ADO.NET provider.

Was replying to Spacerat but it seems to have tagged you as well.
In regards to Npgsql. Running service in .net3.5 which means I can't use the latest version of Npgsql or EF6 either :(

Think I have sorted this out with an ORM though...
 
Top