Python vs C#

bchip

So I wrote an app in C# that does some heavy calculations. I'm happy with the end product and everything works, but then
I had an idea to convert the app to Python for some other reasons (like certain libraries that I needed, etc.)

Long story short, I ran the same calc in my C# app and in my Python app, with the same data (and with the Python calcs being slightly more optimized).

Run time for 253 tests:
C#: 12 sec
Python: 52 min 44 sec

o_O o_O o_O
 
You would expect C# to be loads faster as it's a compiled language, but that time difference is a shocker.
What calculations were you performing?
 
Python isn't exactly known for its speed. For scenarios requiring high performance, it's more of a rapid prototyping tool. You probably want to port your code to something like C once you have it working.
 
Don't be silly. You slap that Python code in a K8s pod with an Envoy sidecar and scale it until it is fast enough :whistling:
 
Even with its bytecode compilation and auto-boxing bloat, this seems excessive...

Try rewriting in assembler as punishment for whatever you did wrong...
 
Yeah, it does seem a bit rough. I did some benchmarks a while back and Python came out 15 times slower than the C equivalent. Obviously this isn't a standard, but I'd start wondering what I'm doing wrong when things are 300 times slower.
 
I don't have Python experience, but I kind of wonder if the problem is the way you wrote it in Python rather than Python itself? Sounds very odd to me.

I'm sure most of you are thinking that it's completely different code, but it's not.
The code does exactly the same logic. I've tested this on smaller samples of data.
I am sure that the code could be written more efficiently, but that would then apply to both applications.

In C# it uses LINQ functionality, and in Python it uses the Pandas DataFrame sum, mean, max, etc.
If I use an object to store values in C#, then I use an object to store values in Python.
If I use an if statement in C#, I use the same if statement in Python.
I have slightly optimized the Python code though, by leaning a bit more on the dataframes, but overall it is 95% the same.

It could obviously be the fact that I reference dataframes a lot, like df['Val1'], vs arrays in C#, which will take more instructions to compile.
That would be one major difference.
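
For anyone following along, here is a rough sketch of why a line-by-line port can behave so differently in pandas. It is not the code from the app above: only the column name 'Val1' comes from the post, while the data, the threshold and both function names are made up for illustration. It compares a C#-style row loop with the same calculation expressed as a single column operation.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only the column name 'Val1' is taken from the post.
rng = np.random.default_rng(0)
df = pd.DataFrame({"Val1": rng.random(10_000)})


def mean_above_rowwise(frame, threshold):
    """Row-by-row translation of a C#-style loop: one Python-level step per row."""
    total = 0.0
    count = 0
    for _, row in frame.iterrows():
        if row["Val1"] > threshold:
            total += row["Val1"]
            count += 1
    return total / count if count else float("nan")


def mean_above_vectorized(frame, threshold):
    """Same calculation pushed into pandas: one boolean mask, one mean."""
    col = frame["Val1"]
    return col[col > threshold].mean()


slow = mean_above_rowwise(df, 0.5)
fast = mean_above_vectorized(df, 0.5)
print(slow, fast)
```

Both functions should return the same number up to floating-point rounding. Timing them (for example with `timeit`) on your own data would show whether row-level access is where the time goes in your case.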

[Attachment: 1602678032865.png]
 
Running it on the 253 datasets, the first couple of dataframes take less than 6 seconds in Python (and in C#).
Suddenly it becomes exponentially slower in Python after that.

In C# I destroy the data specifically as I finish the calc and get the results, whereas I'm not sure what Python does.
I reuse the variable names (in a for loop creating a temp variable), hoping that it destroys the data so it doesn't have some kind of memory leak.
Not sure if that has any effect on it.

* Using Spyder - Python 3.7
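
On the question of what Python does when a name is reused in a loop, a small sketch may help. In CPython, rebinding a name drops the reference to the old object, and the object is freed once nothing else refers to it. The `Dataset` class and the `kept` list below are hypothetical stand-ins, not the app's code. The weak references only observe whether each object is still alive.

```python
import gc
import weakref


class Dataset:
    """Stand-in for one loaded dataset (hypothetical; the real app uses DataFrames)."""

    def __init__(self, n):
        self.values = list(range(n))


def run(keep_results):
    refs = []  # weak references observe object lifetime without extending it
    kept = []  # simulates results that accidentally hold on to the whole dataset
    for _ in range(5):
        temp = Dataset(1000)  # the name 'temp' is rebound on every iteration
        refs.append(weakref.ref(temp))
        if keep_results:
            kept.append(temp)  # a lingering reference keeps the data alive
    del temp
    gc.collect()  # not needed for refcounting in CPython, but harmless
    alive = sum(r() is not None for r in refs)
    return alive, kept


alive_without, _ = run(keep_results=False)
alive_with, held = run(keep_results=True)
print(alive_without, alive_with)
```

So reusing the variable name is enough on its own, provided nothing else (a results list, a dict, a DataFrame that keeps growing) still points at the old data. Whether that is what is happening in the app can't be told from the description.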
 
I saw an 88x improvement moving from C# to Python. But to be fair, the language was targeting the CUDA runtime.
 
BTW, depending on how you wrote the C# test, the compiler may detect that the loop iterations are redundant and are generating the same result and only run 1 iteration. If the result can be computed at compile time, it’s also possible that it’s not running anything live. One has to be careful to defeat the optimizer. How fast is the debug C# build vs optimized?
 
Then you probably just saw something go from native CPU code to a bit of Python + Native GPU code.
 
Another gotcha is that you may be using entirely different data structures in one vs the other. From the above it sounds like list or dict vs array (O(1) compiled vs O(log n) interpreted can be a huge perf disparity). Can’t you use an np array instead?
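
A minimal sketch of the np array suggestion, assuming the calculation really does need a step-by-step loop. The exponential moving average, the data and both function names are invented for illustration. The idea is to pull the column out once with `to_numpy()` and index the array inside the loop, instead of going through the pandas Series on every step.

```python
import math

import numpy as np
import pandas as pd

# Made-up values; only the column name 'Val1' comes from the thread.
df = pd.DataFrame({"Val1": np.arange(1.0, 2001.0)})


def ema_via_series(frame, alpha):
    """Path-dependent loop that indexes the pandas Series on every step."""
    s = frame["Val1"]
    ema = s.iloc[0]
    for i in range(1, len(s)):
        ema = alpha * s.iloc[i] + (1 - alpha) * ema
    return ema


def ema_via_array(frame, alpha):
    """Same loop, but the column is extracted to a NumPy array once up front."""
    arr = frame["Val1"].to_numpy()
    ema = arr[0]
    for i in range(1, len(arr)):
        ema = alpha * arr[i] + (1 - alpha) * ema
    return ema


a = ema_via_series(df, 0.1)
b = ema_via_array(df, 0.1)
print(a, b)
```

Both versions perform the same arithmetic in the same order, so the results should agree. Only the per-element access cost differs, and that is worth measuring on real data before rewriting anything.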
 
Sounds like a memory issue of some kind. Perhaps monitor how much memory the process is using. But yeah, it could be that the paradigm you are using is wrong for Python, and you need to do something slightly different.

I'd expect something 10-20 times slower, not much more than that.
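
One way to monitor this from inside the script is the standard-library `tracemalloc` module together with a timer around each dataset. This is a sketch only: `process` is a placeholder for the real calculation and the datasets are made up. Note that `tracemalloc` only reports memory allocated through Python's allocators after `start()` is called.

```python
import time
import tracemalloc


def process(dataset):
    """Placeholder for the real per-dataset calculation (hypothetical)."""
    return sum(x * x for x in dataset)


def profile_runs(datasets):
    """Record elapsed time and traced memory after each dataset."""
    tracemalloc.start()
    rows = []
    for i, data in enumerate(datasets):
        t0 = time.perf_counter()
        result = process(data)
        elapsed = time.perf_counter() - t0
        current, peak = tracemalloc.get_traced_memory()
        rows.append((i, elapsed, current, peak, result))
    tracemalloc.stop()
    return rows


datasets = [list(range(1000)) for _ in range(5)]  # made-up data
report = profile_runs(datasets)
for i, elapsed, current, peak, _ in report:
    print(f"dataset {i}: {elapsed:.4f}s, current={current} B, peak={peak} B")
```

If the `current` figure keeps climbing from one dataset to the next, something is holding on to earlier data. If memory stays flat while the time per dataset grows, the slowdown is more likely in the algorithm itself.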
 
I think you're right that the data structures probably have a big influence on this. I'm sure moving to np.array would increase the efficiency. Will play around a bit.

Regarding the C# comment... the app has been tested quite extensively. I've been using it for a few months and each result is printed out.
It was just this week that I decided to convert it to Python, because I feel there's too much tech involved.
I write code in EasyLanguage, then transfer it to C# for testing on a wider universe, and finally write the code in Python so it can send me emails on signals.
So far I've found that Python is extremely useful for data collection, but not for the processing. I was hoping to cut out the C# portion, but I think I might stick with it for a bit longer.
 
As far as I know, if you want to do heavy number crunching in Python, you generally implement it in C and then call it from Python. But cguy knows this area much better than me.

EDIT: Possibly a stupid question, but are you sure that you get identical results for identical input?
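
A quick way to check that, sketched under the assumption that both apps can dump their per-test results as plain lists of floats (the numbers below are invented, and `compare_results` is a hypothetical helper): compare them with a tolerance rather than exact equality, since the two runtimes may round slightly differently.

```python
import math


def compare_results(csharp_results, python_results, rel_tol=1e-9, abs_tol=1e-12):
    """Return indices where the two result lists disagree beyond the tolerance."""
    if len(csharp_results) != len(python_results):
        raise ValueError("result lists have different lengths")
    return [
        i
        for i, (a, b) in enumerate(zip(csharp_results, python_results))
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Made-up numbers standing in for exported results from each app.
cs = [1.0, 2.5, 3.333333333333]
py = [1.0, 2.5000000000001, 3.4]
mismatches = compare_results(cs, py)
print(mismatches)
```

An empty list would mean the two implementations agree within the chosen tolerance. Any index returned points at a test worth inspecting by hand.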
 