5 Minutes with a Data Scientist: Alejandro Correa Bahnsen of Easy Solutions
Lead data scientist Alejandro Correa Bahnsen develops machine learning algorithms for fraud detection. He described for Upside the basic skills and personality traits he believes are necessary to succeed in data science.
- By James E. Powell
- November 9, 2016
Alejandro Correa Bahnsen serves as the lead data scientist for Easy Solutions, Inc., a security provider focused on detecting and preventing electronic fraud. He has several years of experience in banking analytics, applying data mining models in a variety of areas from advertising to credit risk, and holds a Ph.D. in machine learning from Luxembourg University.
In his current role, he modifies state-of-the-art machine learning algorithms to detect credit card fraud and develops advanced models for intrusion detection, context-based user authentication, and phishing classification. He recently spoke to Upside about his experiences in data science.
UPSIDE: What’s the one thing you wish people knew about your job?
Alejandro Correa Bahnsen: It’s important for people to understand that data science is a mix of different disciplines -- in particular, math and statistics, computer science, and business. Successful data science solutions always involve an equal mix of those three disciplines. Historically, the challenge was that few formal academic programs taught the skills expected of a data scientist. The good news is this is changing.
What’s your favorite part about being a data scientist? Your least favorite part?
Without a doubt, the interdisciplinary aspect of the field is one of the more interesting parts. When dealing with a data science problem, you must first think from a business perspective about the right solution, then develop the mathematical model, and in the best case invent a new model. You must also think about the right implementation of the solution, which usually involves interesting deployments of modern big data technologies.
If you could go back in time, what’s the one thing you would tell yourself as a new data scientist?
Learn about software engineering -- including fundamental concepts such as the time complexity of algorithms and what certain data structures mean. Also learn the basics such as testing, collaboration, and code review. Data scientists must be able to think from a software engineer's perspective.
Sometimes I propose a very cool (and complex) mathematical solution to a given problem, only to find out that it cannot be implemented. I have to think about the mathematical model and the computational resources at the same time so that the final solution succeeds.
What’s a personality trait you think people need to succeed at your job?
Passion is definitely one of the most important personality traits you need to be successful doing data science. Very often you must quickly learn a new skill in order to solve a problem. What I have found is that data scientists who are always reading, learning, and contributing to new technologies, algorithms, or libraries are more successful in their roles.
What’s a typical day like for you? Do you work mostly with a team or mostly alone? Which do you prefer?
During a typical day, I try to read and study something new for about an hour. This is very important, and I try to do it regardless of my workload. Afterwards, I split my time between business discussions and data analytics. Normally, I have to spend a lot of time understanding, collecting, and cleaning data before I can start building the actual machine learning models.
I always prefer working with a team. Data science projects usually include a period of time when you must work alone, which is always difficult. With my team, we try to talk as much as possible about our current projects, even if they are projects being worked on by only one team member. As data science is a fast-paced discipline, discussing solutions that other team members are using for their problems helps us to excel as a team.
Where is data science headed in the next few years?
In the next few years we will see increased use of data science in more fields outside of technology. Most successful applications of data science nowadays are from tech-intensive companies, which have been traditionally more open to new technologies. There are still huge opportunities in other industries.
Moreover, we will see wider adoption of the newest machine learning models, particularly deep learning. To date, these models have been limited to companies with enough computing power to train complex models, but I expect that with increased commoditization of such specialized computers, other companies will have the chance to start using these powerful models.
James E. Powell is the editorial director of TDWI, including the Business Intelligence Journal and Upside newsletter.