If you want to become a data scientist, be careful.

Data scientists use their knowledge of advanced data science, including statistics, machine learning, and deep learning, to solve business problems and create value. Machine learning engineers implement AI and data science in applications and systems. As a result, both are expected to contribute in the following ways:

  1. Replacing work that depends on personal experience and intuition.
  2. Replacing labor, including intellectual work, that can be expressed as explicit rules.
  3. Deriving patterns and insights from large, complex data that humans cannot find on their own.

It is often said that a company’s competitive advantage no longer comes from economies of scale, but from how well it can achieve exponential growth through the use of AI and data science.

Against this background, and as someone who aspired to be a data scientist and now holds a job with that title, I would like to caution you against aiming to become a data scientist lightly. Of course, it is each person’s own life, and I am in no position to tell anyone what to do. But if you do not understand the premises in light of the current situation, you risk losing your way when reality turns out to differ from your initial assumptions.

Here are three reasons why I think so.

  1. Data science and AI as an add-on (only one of several means): Unless you are a top-level researcher or development engineer, data science on its own will become commoditized; what matters more is data science as something extra on top of other strengths.
  2. End of the data science/AI bubble (excessive boom): The data science boom itself is reaching saturation, and unless it triggers the next major change, little first-mover advantage remains.
  3. Data scientist obsolescence (emergence of alternatives): With the rise of UI-based automation software, the value a data scientist adds in most tasks, such as data processing and model building, will fade.

Those who aspire to become data scientists need to be aware of these trends and think about survival strategies. They should take into account the possibility that data scientist roles will become scarce, that it will be hard to make a living on data science alone, and that some business areas, disappointed by the results, will have no need for data scientists at all.

Technology is advancing quickly, and the field of AI and data is essentially on a mission to replace what people do. As a result, the field itself will be the one most affected by its own progress. With the development of open source modules offering a wide range of functions, along with convenient AutoML and automated AI software, a large part of the coding work of a data scientist (or machine learning engineer) is being replaced: processing complex databases with SQL, Spark, and similar tools, building models in Python or R, and then deploying those models to production.

Once data preparation and model building are automated, the added value lies in translating business issues into data science problems, designing the analysis, and interpreting the results correctly. This kind of capability is very useful in the verification and development phase of a data science project. When it comes to actually running the business, however, the need for a data scientist shrinks. It may well be better for someone on the business side, who already has the right domain and business knowledge, to acquire the minimum knowledge of statistics and machine learning needed to handle the subject correctly.

Up until a year or two ago, model building was increasingly being replaced by automated AI, while data preparation was still done by people. Recently, a wave of automation has reached data preparation as well.

For example, I am aware of several automated or semi-automated enterprise software products and tools. With them, users who cannot program can do everything from data preparation to model building through intuitive UI operations, without writing code. Of course, we can also expect full automation through APIs. Once that happens, data scientists may no longer be needed, or it may be enough for a company to have one senior-level data scientist who acts as the gatekeeper of AI and data science.