About Myself and Data Dunk
Ever since I can remember, two things have piqued my interest in learning more than anything: sports and numbers. I can talk about sports with anyone who will listen, and growing up I was obsessed with collecting sports cards and analyzing players’ stats in my head. I also love numbers, from solving complex equations in my head to learning about the quantitative methodologies behind statistical tests and regression models. It probably helped that my dad is a math professor!
As a kid, I thought I could play in the NBA when I grew up, and after not making varsity basketball as a junior, I thought about coaching or working in sports in some management capacity. However, when I pursued this career path on its own, something seemed to be missing. I loved learning about and debating sports topics in school as an initial Sports Administration major at the University of Cincinnati, yet I quickly came to miss the challenges and joys of learning more “academic” subjects like math and English. I was at a crossroads, as I could not visualize an enjoyable career that would encompass both my love of sports and my love of general academic learning. Soon after, I took a couple of Business Analytics and Technology courses as a freshman and quickly became inspired by the plethora of opportunities to crunch numbers as a data scientist. Learning about Artificial Intelligence and Machine Learning opened up parts of my brain that I did not even know existed, and I fell in love with the concept of using data to better describe and predict all sorts of outcomes. At the same time, sports analytics was starting to take off in all sports, and especially basketball (not just baseball, finally!), so I spent much of my free time reading as many articles and tweets about basketball analytics as I could.
Having graduated from the University of Cincinnati almost a year ago with a double major in Business Analytics and Information Systems, I have tried to take advantage of every opportunity I can to break into the data science field. Currently, I work as a data analyst at JPMorgan Chase & Co. and will be attending Northwestern University to complete their MS program in Data Science with a specialization in Artificial Intelligence. Working at a large company like JPMorgan has given me many opportunities to network with and learn from seasoned data scientists, many of whom pushed me to apply data science and advanced analytical methods to projects I am working on both within and outside of my 9-5 job. Higher education matters, they stressed, but real-world experience building models and learning statistical techniques is equally important in the search for your dream job in this field.
Thus, I am writing this blog to practice different machine learning and data analytics techniques in the realm of basketball analytics. I love strengthening my basketball knowledge as a fan through data-driven takeaways, so I decided that I may as well take a stab at building out my own models from scratch. So much basketball data is publicly available year after year, which makes it easy to access for analysis and lets machine learning models be tweaked as each new season’s data comes in. Using R (and maybe Python as well), I will mine various college and NBA datasets to build out my own basketball prediction models, applying real-world data science skills in combination with Tableau for data visualization.
This blog is meant to be an interactive, fun portfolio of my work and growth as a data scientist. My goal is to draw topics from friends, family, and other readers as a way to generate challenging and enjoyable projects to build out. As this is a learning experience for me, I want all of my analysis to be transparent and collaborative. Therefore, I will post all of my source code with detailed commentary and explain my logic for choosing the methodologies that I do. To me, data scientists seem to keep their code and methodologies private (almost as intellectual property), but I think that I can learn best and improve my skillset by fully publicizing and explaining my techniques. This way, I can push myself to write better code, consider my own biases as well as the biases in the data itself, and challenge myself with new data science techniques over time.
On that note … let the data dunks begin!