kylie_jenner = pandas.DataFrame(['KUWTK', 'Kylie Kosmetics', 'Tyga4ever', 'why is there a python in this picture?'])
Kylie Jenner said that 2016 was the year of realizing things, but I’d bet Cooper Union’s sticker price (TOO SOON) that she wasn’t referring to the illuminating experience of learning Python for fun over winter break. Yes, I realize that Kylie has lip kits and white Ferraris to focus on, but girl should check out pandas dataframes if she really wants to live.
In an attempt to get over myself, along with the self-doubt and stubbornness that made me think I wasn’t capable of programming and left me terrified of failing at it, I spent the last three weeks crash-coursing myself in Python and all of its very awesomely intuitive data science packages.
EdX is great – check out some of their ‘Python for Data Science’ courses if you’re trying to teach yourself to code and have some solid self-discipline to keep you going.
Now that I’m proficient in numpy, pandas, matplotlib, and scikit-learn, I’ve seen the light that is data manipulation/machine learning with Python, and I have all the regrets about trying to do my Statistical Learning assignments last semester in MATLAB. *shudders*
This is cool. Now I should make some cool things that attempt to answer some cool questions.
So if you read my aptly-titled ‘Brain Barf’ post, you know that I have all the feels about doing projects that fulfill my arbitrary standard of what is valuable and useful. Are those feels (and that post) just my thinly-veiled insecurities about never being good enough? Probably. Like I said, working on getting over myself.
I’ve come to terms with the fact that right now it is most valuable for me to practice my skills on projects that challenge the way that I think; doing significant things that change the world will come later when my skill level and mental elasticity get there.
So right now, I’m planning on doing projects related to some questions that I’ve jotted down in my notes recently:
What will happen if Congress defunds Planned Parenthood?
Yes, I realize that this is a massive question, but I’m curious about the relationships between maternal death rates, infant and fetal mortality, and crime rates, among other things, and how they’ve changed since Planned Parenthood started offering abortion services in 1970. I also wonder if there’s a significant difference between the biological sexes in the trends of graduation rates, suicide rates, and quality of life over that period (namely the male and female sexes, as intersex data is largely unavailable).
The hardest part of this project will probably be the data collection. Some of the features that I’m interested in analyzing are readily available in nice clean datasets, but many (including some that I have yet to think of) are not.
What type of brown ale should we brew next?
For those of you following along at home, some important context:
- I’m a senior studying electrical engineering at Cooper Union (More About Me!)
- I helped start an interdisciplinary independent study in beer brewing last semester.
- We brewed some delicious stouts (milk and imperial), an IPA (session), a blonde ale, and a brown ale.
This semester, we’re continuing to brew for fun even though there aren’t credits involved, and we’re trying to refine our process and clone our favorite beers.
BeerAdvocate, a noted beer review website that we use for reference, has reviews for 2677 different brown ales alone. As tempting as it may be, it’s not feasible for our class of 5 people to try 2677 different brown ales before deciding which one to clone.
Enter data science! My brewing professor found a gigabyte’s worth of scraped beer reviews (YASSSSS I don’t have to deal with scraping!!!) that I can do some text analysis on. The preliminary plan is to look for themes in the reviews and pick out the ones that match our class’s verbal description of the type of brown ale we’d like to brew.
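To make that plan a little more concrete, here’s a rough sketch of the matching step, assuming the simplest thing that could possibly work: turn each review and our class description into word counts, then rank beers by cosine similarity. The beer names, review text, and description below are invented placeholders (not from the real dataset), and `tokenize`, `cosine_similarity`, and `rank_reviews` are names I made up for illustration. A real pass would probably swap the raw counts for something like scikit-learn’s TF-IDF features.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_reviews(description, reviews):
    """Return (score, beer_name) pairs, best match first."""
    target = Counter(tokenize(description))
    scored = [
        (cosine_similarity(target, Counter(tokenize(text))), name)
        for name, text in reviews.items()
    ]
    return sorted(scored, reverse=True)


# Invented placeholder reviews, NOT real BeerAdvocate data.
reviews = {
    "Hypothetical Brown A": "nutty caramel malt with a smooth toasty finish",
    "Hypothetical Brown B": "very bitter piney hops and a thin watery body",
}

# Stand-in for our class's verbal description of the beer we want.
description = "smooth nutty malt with caramel"

ranking = rank_reviews(description, reviews)
for score, name in ranking:
    print(f"{name}: {score:.2f}")
```

Raw word counts treat filler words like “with” the same as flavor words like “nutty”, which is exactly the kind of thing a weighting scheme would clean up once I’m working with the real reviews.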