Data Analytics/Machine Learning

eternaloptimist

Well-Known Member
Joined
Jul 10, 2013
Messages
175
hey all,
Just finished a module in data analytics and wondering if there are any data scientists on here? We covered some common models (linear regression, logistic regression, random forests, naive Bayes, etc.) in scikit-learn and statsmodels. I'm curious as to what people in industry actually do with them.
thanks!
 

cguy

Executive Member
Joined
Jan 2, 2013
Messages
8,527
Yeah, I've used most of those. I can't go into detail, but I use them to predict market movement.
 

flippakitten

Expert Member
Joined
Aug 5, 2015
Messages
2,486
As with cguy, can't really say what we use it for exactly but one of the uses has to do with detecting abuse in networks.

Summary: It's used everywhere. That's where a lot of the features on social networks are coming from (Google Photos' assistant is an example): face recognition, voice commands, beauty filters, ad suggestions... All the kind of stuff people interact with daily.
That's just the easy to spot stuff. It's extremely huge in large companies.

The downside is it's used a lot to replace humans.

Anyway, Machine learning and Data Science is a really good choice at the moment.
 

eternaloptimist

Well-Known Member
Joined
Jul 10, 2013
Messages
175
thanks. would there be a lot of jobs going for it? do you use R or Python? I have to do a summer project and must choose between data analytics/machine learning, android app and web app ... I'm worried that I'll do something that's super niche and end up unemployed.
 

animal531

Expert Member
Joined
Nov 12, 2013
Messages
2,728
If you can be taught machine learning/data analytics and use it to build even some basic systems then you're not going to end up unemployed (simply because it means that you're already more capable than a large segment of the population and can probably pick up other types of development by yourself).

However, having said that, SA IT jobs are very much about run-of-the-mill business app development. Smaller companies don't see the need or have the finances for serious data/ML, so you have to find placements at larger companies, which restricts the field somewhat.

But as flip says, it's one of the big areas that a lot of interest is being poured into right now, so having it as part of your resume will always look good.

Look at e.g.: https://jacquesmattheij.com/sorting-two-metric-tons-of-lego
He built his own little lego sorter with a camera/computer vision/conveyor belt etc which sorts the 2 tons of lego he bought by type, a fun little project (albeit one with a hardware component which might be difficult for a pure software guy)
 

semaphore

Honorary Master
Joined
Nov 13, 2007
Messages
15,194
thanks. would there be a lot of jobs going for it? do you use R or Python? I have to do a summer project and have to choose between data analytics/machine learning, android app and web app ... I'm worried that I'll end up doing something that's super niche and end up unemployed.

You won't be unemployed, machine learning is used in so many fields. I've used it in trend analysis and transaction fraud detection.
 

cguy

Executive Member
Joined
Jan 2, 2013
Messages
8,527
thanks. would there be a lot of jobs going for it? do you use R or Python? I have to do a summer project and must choose between data analytics/machine learning, android app and web app ... I'm worried that I'll do something that's super niche and end up unemployed.

Data analytics/machine learning have much higher potential, but it really depends on how good you are relative to the competitive landscape. One of the potential risk factors is that you may have stiff competition from stats/physics/app. maths/maths PhDs. Sometimes coding skills can compensate for not having the same level of heavy maths, but it is something to consider.

I personally use R, Python and C++. I don't do analytics in Python, I just use it to organize testing, deployment, resource management, GUIs, etc.

I would rather do DA/ML over mobile app development. Anyone can pick up mobile app development from a few online tutorials. It is much harder to gain a foundation in DA/ML.
 

flippakitten

Expert Member
Joined
Aug 5, 2015
Messages
2,486
^^ This

Data science is the new sliced bread, even the 'new' db technologies show how important it's becoming.
 

eternaloptimist

Well-Known Member
Joined
Jul 10, 2013
Messages
175
cheers for the replies. @cguy what you said about competition is exactly what I was also thinking about. Most people on Kaggle have really strong stats backgrounds. I'll give it a go anyways, thanks!
 

XennoX

Expert Member
Joined
Nov 15, 2007
Messages
2,205
> I am studying applied maths and statistics.
> I am studying machine learning.
> I am in the data space.
> Goal is to become a data scientist.

One of the posters mentioned that in SA, smaller companies don't see the need for analytics beyond descriptive analytics - emphasis is my addition.

I'd go so far as to say, depending on the industry, even larger companies do not really care about anything beyond descriptive analytics. My company's clients fall into that category. They are fairly large companies and all they're really interested in, is their operational reports. This leads to a lot of frustration from my side, as it is bland work. Sure the problem solving is fun, but I want to explore more.

If you want to do (what I deem proper) data science in this country, you need to get into the finance and insurance industry, where you have enormous amounts of data that can train models for many different purposes. One large entry barrier for a lot of data science positions is that they're looking for candidates with a minimum of an MSc in mathematics and/or statistics.

There was a job advert that I saw the other day for a junior data scientist, for a company in Sandton. Their requirements were incredibly demanding. Some of the items were:

  • PhD in applied mathematics, statistics, risk analysis or actuarial.
  • Firm understanding of artificial neural networks.
  • Firm understanding of regression/classification modelling (logistic, random forest, naive Bayes, etc)
  • Firm understanding of MapReduce.
  • Required technologies: R, Hadoop, Python, SQL, NoSQL, SAS.
  • 1 year work experience.

They were offering R25k per month. I honestly laughed at the absurdity of the requirements and the amount they were offering. Surely they cannot be serious offering that salary with all of those competencies, and worse still: expecting a person with 1 year work experience to understand all of those concepts.
 

cguy

Executive Member
Joined
Jan 2, 2013
Messages
8,527
Heh, someone meeting those qualifications could easily make 10-20x that in the US.
 

animal531

Expert Member
Joined
Nov 12, 2013
Messages
2,728
Wow, 25k. You won't find those skills in a new graduate, so the ad will only really apply to someone with the skills who has swopped to data science (for at least a year). And even locally that 25k is a serious slap in the face.


Another interesting thing with ML that I'm seeing is a push to rather teach it to domain experts who can then apply it as a skill, rather than hiring a pure ML person that you have to teach the domain to (as in the past). But I imagine that if your company is large enough to have some of both sets then you'll be even better off.
 

XennoX

Expert Member
Joined
Nov 15, 2007
Messages
2,205
Another interesting thing with ML that I'm seeing is a push to rather teach it to domain experts who can then apply it as a skill, rather than hiring a pure ML person that you have to teach the domain to (as in the past). But I imagine that if your company is large enough to have some of both sets then you'll be even better off.

To be fair, a data scientist is someone that is able to combine statistical and mathematical methods with domain knowledge to derive business insights. Thus I think business intelligence candidates are perhaps the best suited to make the leap into data science.
 

battletoad

Expert Member
Joined
Mar 10, 2009
Messages
1,451
One of the posters mentioned that in SA, smaller companies don't see the need for analytics beyond descriptive analytics - emphasis is my addition.

Some large open source projects "suffer" from this too... implementing dashboards and the like but not moving on to predictive (or better yet, prescriptive/decision-making) analysis. Even after getting the buy-in from higher-ups, it's pretty much a question of resources, where largely only the well-heeled would be able to go that far.
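To make the descriptive vs predictive distinction a bit more concrete, here is a minimal sketch in plain Python. The monthly sales figures and the function names are invented purely for illustration, and the straight-line least squares fit is just the simplest stand-in for a predictive model, not what any particular project or company uses.

```python
from statistics import mean

# Hypothetical monthly sales figures, invented purely for illustration.
months = [1, 2, 3, 4, 5, 6]
sales = [100.0, 110.0, 125.0, 130.0, 145.0, 150.0]


def describe(values):
    """Descriptive analytics: summarise what has already happened."""
    return {"total": sum(values), "average": mean(values), "max": max(values)}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Predictive analytics: estimate a value that has not been observed yet."""
    return slope * x + intercept


report = describe(sales)                    # the "dashboard" numbers
slope, intercept = fit_line(months, sales)  # learn the trend from past data
forecast = predict(7, slope, intercept)     # extrapolate to the next month

print(report)
print(f"Forecast for month 7: {forecast:.1f}")
```

The report dictionary is roughly what an operational dashboard shows; the forecast is the step beyond that. Prescriptive analysis would go one further and recommend an action based on the forecast.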

You will probably have to run on passion in the meantime while plotting a course for DA/ML professionally. Maybe even join an open source project on some subject you like, with an active and open community.

As for me, began development on a learning analytics application which attempts to personalize content using a scaffolding approach. Plotting out a probabilistic graph theoretic approach for the engine, so to say.
 

XennoX

Expert Member
Joined
Nov 15, 2007
Messages
2,205
Re: Open source projects.

I've searched, and maybe my Google-Fu is absolute ****, but I cannot find open source projects in the conventional sense of "contribute to this project by doing x-y-z" - it's more of a "help us fix issues with the ML libraries everyone uses!"

Do you perhaps know of any projects where it's a "please help us analyse this data" type thing?
 

battletoad

Expert Member
Joined
Mar 10, 2009
Messages
1,451
Eh, I see it's surprisingly difficult to find projects... I seem to think there's an open source version of anything :D

Kaggle, like OP suggested, looks like a good place to start, especially for the discussions around datasets.

Otherwise, I guess it's best to start with an open dataset of interest and move from there to track down discussion forums (and the like) involving the analysis thereof. Here are a few lists which may be worthwhile:
Uber Movement looks particularly interesting, although it's not open to the general public yet. It's a natural progression to use data gleaned from transportation apps for urban development or logistics, so good on Uber for committing to opening it up. Will keep my eye on this one.

Finally, check out some of the governmental agencies' datasets, say NOAA or NASA on climate change, and the subsequent peer-reviewed articles linked to them.
 