Thoughts on starting a new job while watching the 1999 Classic, Dick

Goal of this blog post: to prove to myself (and other overly ambitious data scientists or those interested in television analytics) that personally curated content can lead to weird thought patterns.

NOTE: I am not in any way affiliated with Roku. Apparently I’m just very passionate about their product when I’m running on little sleep.


  • old beat up refurbished iphone 6 with a plan which until very recently was paid for by my former company. Sim card changed, but still questionably secure.
  • 4K Roku TV that an amazing sales person at Costco convinced us we wanted. Turned out to be my favorite tv of all time. And for the low low price of half the price of a smart tv.


What is this new TV streaming app I’ve never seen before? (sidenote, when I just went back to this screen the This is Us picture was replaced with an announcement that Boy Meets World is now on Hulu. Smartly played Hulu)


Free TV for me? As in just for me? As in aggregating my user preferences and clustering me with other users that have similar watching behavior? Genius, that’s pretty much my idea, but in execution so of course it’s better.


Whaaaat Roku has a handy search feature that searches across all streaming sites? Let’s search this movie that I remember being hilarious and seems super relevant to the current state of the world but no one has ever heard of. Is Roku going to judge me for using the search term “Dick”? Wait, is this movie titled Dick so it can fly under the radar? That’s my idea, to use loaded terms to fly under the search radar. Wait, the Koch Brothers do that so much better than I ever could, is that why their last name is a homophone with a company that is basically America’s greatest accomplishment?


NOTE: Since I watched this movie (twice in two days) it has been taken off my Roku channel. Coincidence? Probably.

NOTE p2: When it was on the Roku Channel, Ryan Reynolds and Will Ferrell were first billed. Will Ferrell plays Bob Woodward; Ryan Reynolds’s part was so small I had no memory of him being in the movie. So I turned to imdb.

Screen Shot 2017-10-08 at 6.58.40 AM.png

This is terrible search engine optimization. What happened to you imdb, who owns you? 

Screen Shot 2017-10-08 at 6.59.54 AM.png

Oh that’s right.

Okay let’s get to the actual movie. Kirsten Dunst and Michelle Williams play ditsy 15 year old girls who get to the bottom of the Watergate Scandal by stumbling their way into a friendship with Richard Nixon after getting lost on a class field trip. Best side plot line ever, Kirsten Dunst’s brother hides his pot in a walnut jar, hidden in plain sight in their kitchen, tells her it’s just the walnut leaves, she and Michelle Williams make cookies with the contaminated walnuts and they feed the resulting cookies to everyone in the White House.

IMG_5932 2.JPG

Am I on to something? Will I be rewarded for finding out about this amazing Roku channel that is pretty much talking to me right now by finding this movie and watching it? I’ve googled “what streaming sites is Dick on” with very terrible results enough times to conceivably trigger some sort of watch list.


Ha Dick on TV, that’s a funny image.

Wait, I’m really on to something, all this deep study of the Bachelor is finally paying off. Politics is basically the bachelor. Richard Nixon = Dean Unglert, celebrity status sucks. And if Dean had to sign an iron clad contract not to disparage the Bachelor after his unflattering edit I can’t imagine what Richard Nixon had to sign. Maybe if I take a picture of it sideways the robots at the CIA won’t realize I’m on to them. 


The CIA drives around in a van that says Plumbers? That’s not suspicious at all, flying under the radar. Oh wait, Plumbers is a superset of my new team name (Player Lifecycle Marketing, the u is for fun). Do I work for the CIA?

Pretty much. I have to take this video sideways and from far away because it’s so relevant to my life it has to be proprietary. 


Baby Ryan Reynolds, I remember you, how could anyone forget? Oh wait, I forgot you were in this movie. 


Are burps the only thing that are real? Is Elon Musk right, are we really just living in a simulation? Is this movie at this time a break in the simulation? Or am I just being personally recruited by Elon Musk after googling him so many times?

DEAN?!?! DEAN UNGLERT? Are you in on this too? Is the Bachelor just a front for the CIA? Do they have a hit out on Dean for being so vocal and candid about his experience on the show? Do they pick exotic locations because some Bachelor producer (or possibly Mike Fleiss) has to travel around the world to kill operatives? I knew Chuck Barris wasn’t crazy for writing Confessions of a Dangerous Mind.


Oh I am just high? It’s possible. I do live in Seattle. Yeah the Bachelor probably has no connection to the CIA. 

**You’re so Vain by Carly Simon starts playing for the closing montage. This is the moment the whole movie has been leading up to, Nixon’s resignation. Brought down by two 15 year old girls just trying their best to be helpful.**

This song is so iconic, there’s no way they could have messed with this song. Wait what is gavotte?? Are they cutting up the flag?? I guess they can, they pretty much saved America by getting Nixon stoned enough to stop the war. I better shake this video as I’m taking it so it can’t be traced by robots owned by the CIA. Or just because I don’t want to have to pay for clips from this movie when I present my case to an interested party. Is that how movie rights work?


Clouds in my coffee? Carly Simon, you’re deep. Maybe this is just a really great movie made by someone who likes puns. I didn’t think puns were funny, but maybe they’re just too high brow for me to understand because this is probably my favorite still from a movie, punny sign and all.


SUPER DRUGS ARE REAL?!?! Or maybe the message is just follow your bliss and it doesn’t matter if you make money. Or maybe I should just invest my money in big pharma. Was this really a documentary hidden in plain sight with amazing actors a la HBO’s hit movies, Game Change and Too Big to Fail? Only one way to check, let’s consult the most reputable source I know. 

Screen Shot 2017-10-08 at 7.48.12 AM.png

Yep, basically a documentary. 


In hindsight this maybe wasn’t the best thing to watch instead of sleeping.

Bachelor Week 2 – Maps and Nick Hate

This was an especially crazy week for the Bachelor (and for the US). Corinne continued to be the perfect Bachelor villain or just very drunk, it’s too early to differentiate. We heard Liz repeat the same soundbite about her and Nick hooking up at Jade and Tanner’s wedding after every commercial break, and when Nick finally reveals the truth about their relationship on the group date we get hit with a “to be continued…”

Let’s see what people are talking about:

Screen Shot 2017-01-15 at 7.31.50 PM.png

Timeline notes provided by my dad. His complete detailed minute by minute notes are here.

  1. Nick enters I am confident as I have ever been to fall in love
  3. Brittany IS half naked and she is freaking me out said Corinne
  4. Liz talking about her past with Nick — she has a knowing smile
  5. One of you did outstanding — Corinne you are the lucky winner
  6. A rose is on the table
  7. Danielle M Gets a date card for the first solo date — Liz is excited for her, but she thought it was going to be her
  8. Corinne said if you did not want to be interrupted then why are you here
  9. Corinne interrupts again
  10. Corinne is pissed — the way I go about things is very classy
  11. Corinne gets rose — Corinne this is very awkward, wow
  12. in a Helicopter one on one date with Nick and Danielle M
  13. Liz and Christen talking by the pool about things that will come out on the show
  14. Danielle M telling her history –5 years ago my fiance overdosed on drugs and I found him. I did not know he was an addict
  15. Kissing on the Ferris wheel — all the way around
  16. Everyone is going to be breaking up with nick as part of a symposium at the museum
  17. The end of a chapter for us, but a new beginning for you [Liz breakup speech]
  18. Liz and nick walk off the set to chat
  19. Nick — I don’t think we have a future — it is best if we just say goodby tonight


This week I wanted to look at location data. About 1% of the tweets I pulled had full location data, so to look at a bigger set I extracted state information from the user location description. Since the location description is freeform text I had to do some extra work to extract usable data. I made a dictionary of state names, abbreviations, and big unambiguous cities (no Springfields or Auburns) that mapped to each state. After this process I ended up with location data for a little over 50% of tweets. And then I made a ton of maps.
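The dictionary lookup described above could be sketched roughly like this. Every name here (`STATE_NAMES`, `STATE_ABBREVIATIONS`, `CITIES`, `extract_state`) is made up for illustration, and the lookup tables are tiny stand-ins for a full list of states and hand-picked cities, so treat it as a sketch of the idea rather than the code behind the maps.

```python
import re

# Hypothetical, heavily truncated lookups: a real version would cover every
# state plus a hand-picked list of big, unambiguous cities.
STATE_NAMES = {"texas": "TX", "washington": "WA", "alabama": "AL", "south carolina": "SC"}
STATE_ABBREVIATIONS = {"TX", "WA", "AL", "SC"}
CITIES = {"seattle": "WA", "houston": "TX", "nashville": "TN"}


def extract_state(location):
    """Return a two-letter state code for a freeform location, or None."""
    if not location:
        return None
    lowered = location.lower()
    # Full state names and known cities, matched on word boundaries.
    for name, code in {**STATE_NAMES, **CITIES}.items():
        if re.search(r"\b" + re.escape(name) + r"\b", lowered):
            return code
    # Abbreviations only count when written in upper case, which keeps
    # ordinary words such as "al" or "in" from matching.
    for token in re.findall(r"\b[A-Z]{2}\b", location):
        if token in STATE_ABBREVIATIONS:
            return token
    return None
```

Anything that returns `None` lands in the no-location pile. A lookup this simple still gets fooled by names that mean two things (“Washington, DC” would come back as WA here), which is the same reason the ambiguous cities were left out.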

NOTE: This is not a random sample. If the other 50% of tweets had extractable locations, it could completely change these maps. There are also not as many west coast tweeters during the time I collect tweets, so that influences the numbers.

Total Number of Tweets (Retweets Included)

Screen Shot 2017-01-15 at 9.20.28 PM.png

Screen Shot 2017-01-15 at 9.20.39 PM.png

Number of Tweeters

Screen Shot 2017-01-15 at 9.21.17 PM.png

Screen Shot 2017-01-15 at 9.21.23 PM.png

In the above two visualizations, states in the south, particularly Alabama and South Carolina, have some of the most dramatic decreases from week 1 to week 2. These numbers could have been affected by the CFP National Championship, which also aired at 8 pm EST; the two teams playing were Alabama and Clemson.

Number of Verified Tweeters

Screen Shot 2017-01-15 at 9.21.59 PM.png

Screen Shot 2017-01-15 at 9.22.05 PM.png


Most Talked about Contestant 

Screen Shot 2017-01-15 at 8.02.54 PM.png

Screen Shot 2017-01-15 at 8.01.20 PM.png

Screen Shot 2017-01-15 at 9.23.14 PM.png

Screen Shot 2017-01-15 at 9.23.19 PM.png

Contestants with prominent story lines were excluded from analysis.


I spent a lot of time last week looking at what people were saying about the women, so this week I wanted to look at what people are saying about Nick. Almost none of it was flattering. I’m still a fan of Nick, but for one reason or another (producers) he is making some questionable decisions.

Last week I used n-grams to pull out popular phrases about the top women. This works well when everyone is saying similar things because it just uses frequency. People were saying a lot of different things about Nick, and I had some ideas about the topics, but I wasn’t quite sure where to start. With a term frequency method some of the smaller topics (still 300-500 tweets) would have gotten buried, and I would have never gotten to read all of the tweets about Nick’s camo button down.
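A frequency-based phrase count boils down to something like the sketch below. The function name `top_ngrams` and the very crude tokenizer are assumptions for illustration, not the tool used for last week’s analysis.

```python
from collections import Counter
import re


def top_ngrams(tweets, n=3, k=5):
    """Count word n-grams across tweets and return the k most frequent."""
    counts = Counter()
    for text in tweets:
        # Crude tokenizer: lowercase runs of letters and straight apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        # Slide a window of n words across the tweet.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts.most_common(k)
```

Because only the top `k` phrases survive, a phrase repeated across thousands of tweets crowds out a topic that shows up in a few hundred, which is the burying problem described above.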

To find out the kinds of things people are saying about Nick, I used LDA (Latent Dirichlet Allocation) to pull out popular topics. Here are some of the more well-defined topics that came out of the analysis and a few tweets that summarize each topic:

Nick and Danielle M:

Danielle is an absolute angel. Nick doesn’t deserve her NO ONE DOES -@summer95
Danielle M is the not only the best but as my mom said brings out the best in Nick #TeamDanielleM #hopeforthisshow -@Feeeeeeney
Danielle is too good for Nick but can we please introduce her to LukePell  -@haleyrgeorge

Nick and the Boob Hold:

Nick held my boobs today LIKE HE HELD MY BOOBS. This girl is literally the reason I hate everyone -@edem_ily
Nick held my boobs today Like my BOOBS Great commentary Corrine -@BrittJayne28
Nick held my boobs today My BOOBS  -Corinne future CEO and successful business woman -@Swainsch

Nick’s Outfit:

911 yes hello Nick is wearing a camo button down -@monica_aldean
Oh man Forgot how much Nick likes to wear jeggings -@Uve_Been_Duped
Nick looks like a motor biker if all motor bikers were characters from Westside Story  -@NealLovesYou

Nick and Kissing:

Well were only 10 minutes into TheBachelor and Nick has already kissed 6 girls -@Sammy10101
Nick is kissing everyone tonight He would have kissed Chris Harrison if he was on right now -@rhlederer
Nick is bored with this though Hes like can we kiss or something -@paigeDav

Nick giving Corinne the group date rose:

I have been so Team Nick but giving Corrine the rose makes me think twice -@liz_lolol
Nick gave the date rose to Corinne. Is he trying to prove hes scummy? -@Kinabutterjelly
Nick whyy I stood up for you. WHYYYY HER #byecorrine -@CadieNGetz

Liz and Nick, the perfect drinking game to blackout:

Did Liz hook up with Nick at Jade and Tanner's wedding?  I can’t tell -@expecthexpectd
I'm taking a shot every time Liz mentions sleeping with Nick at the wedding #cirrhosis -@toni_nic0le

General Nick Criticism:

The robots on westworld have more natural conversations than Nick & the women on #TheBachelor... -@seabass5555
Why are all these girls literally like 13 years younger than Nick -@whitmv_
Oh Nick pls try to act like you're looking for a wife and not your extra 15 minutes boy bye #ICant -@ CarolinaGirlToo


So all of this criticism raises the question: would Luke have been a better bachelor? There are a lot of people who think so:

If Liz had just given Nick her number all those months ago then LUKE could've been our rightful bachelor TheBachelor -@emilynrutt
Luke wouldn't have given the rose to Corinne -@ daniip13
Danielle and Luke would have been the cutest couple -@RachelJ614
I can't even watch the bachelor anymore -worst two episodes I’ve ever seen. Should have chosen Luke #trash -@lindseyytanner

I assumed that these people would be more focused in the south, but they are pretty spread out across the country with a huge hub in Texas. The below map shows the breakdown of tweets about Luke by state (retweets included).


Screen Shot 2017-01-15 at 8.34.45 PM.png

We’ll see if Nick wins over the naysayers in weeks to come.

A few of the women have won over the haters this week:

Screen Shot 2017-01-15 at 8.54.35 PM.png

Raven, Taylor, and Christen have really come up in the sentiment rankings. Danielle M is still a fan favorite, Corinne and Liz are still very hated, and we didn’t see much of anyone else this week.

Just a note: the next two episodes will have a combined analysis because I’ll be out of the country next week. Hopefully I don’t miss the chance to watch The Most Dramatic Episode Ever live, but with the Liz drama probably wrapping up this week I think I’m safe.


Bachelor Tweets Week 1

Week one of The Bachelor did not disappoint. It was an episode filled with moments that are bound to provoke deep conversation, contestants with very clear villain and fan favorite edits, and an A+ bachelor. Let’s look at the data.

I collected ~168k tweets in a period of 4 hours and 15 minutes. I took all retweets out of the analysis this week, which left me with a set of 95,075 tweets. Below is the number of tweets by minute over the period my tweet streamer was active (times in EST).
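Dropping retweets and counting what is left by minute could look roughly like this. The `text` and `created_at` fields, the `"RT @"` prefix check, and the sample tweets with their timestamps are all assumptions made for the sketch, not details of the actual streamer.

```python
from collections import Counter
from datetime import datetime


def is_retweet(tweet):
    # Assumption: classic retweets start with "RT @"; a real payload may
    # also carry an explicit retweet field that would be more reliable.
    return tweet["text"].startswith("RT @")


def tweets_per_minute(tweets):
    """Drop retweets, then count the remaining tweets in each minute."""
    counts = Counter()
    for tweet in tweets:
        if is_retweet(tweet):
            continue
        # Truncate the timestamp to the minute it falls in.
        minute = tweet["created_at"].replace(second=0, microsecond=0)
        counts[minute] += 1
    return dict(sorted(counts.items()))


# Made-up sample data, just to show the shape of the input.
sample = [
    {"text": "Shark or dolphin??", "created_at": datetime(2017, 1, 1, 21, 0, 20)},
    {"text": "RT @someone: Shark or dolphin??", "created_at": datetime(2017, 1, 1, 21, 0, 41)},
    {"text": "That is a shark", "created_at": datetime(2017, 1, 1, 21, 0, 55)},
    {"text": "Dolphin. Obviously.", "created_at": datetime(2017, 1, 1, 21, 1, 5)},
]
per_minute = tweets_per_minute(sample)
```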

Screen Shot 2017-01-08 at 12.30.16 PM.png

Obviously the west coast needs to step up its live tweeting game. Because they showed such a disappointing live tweet performance, I eliminated their tweets from the tweet count timeline. If you’re offended by the lack of west coast topics, tune in next week when I take a closer look at regional differences between tweet content.

Screen Shot 2017-01-08 at 12.32.36 PM.png

In case you don’t remember exactly what was happening in the episode to cause the peaks of tweets, my dad took extremely detailed notes. It was the first episode he’s ever seen, but I think his timeline really captures what people are tweeting about at these popular times. Here are some excerpts of his timeline that correspond to the peak tweet times.

 8:01:20 I'm nick and I'm the bachelor -- tastes funny coming out of the  mouth
 8:10:53 Nick gets advice from Ben Chris and Sean
 8:12:57 Trust yourself....Be Nick
 8:25:14 She [Corrine] has a nanny
 8:38:34 Yellow dancer -- Christine [Christien]
 8:39:13 Taylor in maroon
 8:43:50 Hailey short intro
 8:44:06 Lame joke [Do you know what a girl wearing underwear says?]
 8:56:36 Red dresses
 9:00:20 Shark or Dolphin??
 9:12:47 Girl talk --what a hoe [Corinne]
 9:22:59 Liz and Nick are talking
 9:34:10 Gives the first impression rose to Rachel
 9:46:34 Hailey gets rose
 9:46:55 Whitney get rose
 9:51:28 Stay tuned for exciting highlights
 9:59 ??? [Timeline cuts off at 9:51]

Obviously my Bachelor rookie dad didn’t realize the best part of the premiere is the “This season on the Bachelor…” promo.

A quick look at the data answers the question of what people were tweeting about at 9:59

My vagiiiiiineeeee is platinum DYING -@NikNacNicky
Her vageeennn is platinum  -@HollywoodTony28
Omg Did Corinne just say that  -@ JMP119
My heart is gold but my vajeen is platinum. Corinne what would your nanny think -@monicakwatson

Oh right, the highlight of the premiere:

platinum vagine.png

Oh Corinne, I don’t know what happened to you to make you the perfect Bachelor contestant. This prompts the question, what is the proper spelling of the word “vagine”? To answer I made a word cloud.


I also wanted to look at what people thought of this season’s contestants and determine some early fan favorites.

I used text tagging to separate out tweets by contestant. With text tagging I could define rules to classify each of the contestants’ tweets. Many people don’t tweet about the contestant by name, so I had to dive into the personality of each contestant.

Example rule for a singular name:

  • Christen: tweets that contain either Christen or virgin.

To separate the two Danielles I used different rule sets.

  • Danielle M: tweet contains Danielle M or syrup or neonatal or nashville.
  • Danielle L: tweet contains Danielle L or tweet contains Danielle and boobs.

Even with this deep analysis there were still tweets that didn’t fall into Danielle L or Danielle M’s categories, so there is also an Ambiguous Danielle tag.
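The rules above translate fairly directly into code. This is a minimal sketch with made-up helper names (`has`, `tag_tweet`), assuming case-insensitive whole-word matching; the tagging tool actually used may match differently.

```python
import re


def has(text, term):
    """Case-insensitive whole-word (or whole-phrase) match."""
    return re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE) is not None


def tag_tweet(text):
    """Return the list of contestant tags that apply to a tweet."""
    tags = []
    if has(text, "Christen") or has(text, "virgin"):
        tags.append("Christen")
    danielle_m = any(has(text, t) for t in ("Danielle M", "syrup", "neonatal", "nashville"))
    danielle_l = has(text, "Danielle L") or (has(text, "Danielle") and has(text, "boobs"))
    if danielle_m:
        tags.append("Danielle M")
    if danielle_l:
        tags.append("Danielle L")
    # A Danielle mention that neither rule set claimed stays ambiguous.
    if has(text, "Danielle") and not (danielle_m or danielle_l):
        tags.append("Ambiguous Danielle")
    return tags
```

A tweet can pick up more than one tag, so “Danielle is a host from westworld” comes back as Ambiguous Danielle while a tweet about the virgin and the maple syrup girl gets both Christen and Danielle M.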

To determine fan favorites and contestants fans love to hate I used Sentiment Analysis. Sentiment Analysis scores each tweet based on a dictionary of positive and negative words and adds the word scores to determine if a tweet is positive, negative, or neutral.

Here are some examples of the scoring process:

Vanessa looks very fertile  -@StevenWoahdick
Shark girl got skills And gills -@SarahJulson

Positive tweets have scores greater than 0. These tweets are positive because the words skills and fertile each have a score of 1. Since both tweets have one positive word, they both have a score of 1.

I do not like Josephine. Do not. Not at all. She's crazy. Mark my words AND THIS BITCH BROUGHT A HOT DOG OMG I TOLD YOU CRAZY -@aMyLyNn1984

Negative tweets have scores less than 0. This tweet is negative and has multiple words with a score of -1: crazy x 2, not like, bitch. If you add these -1 scores together you get the score of the tweet, -4.

Danielle is a host from westworld  -@jvandegriff92
Corinne is gorgeous and successful but her nasty attitude will be her downfall -@paper_canyon

Neutral tweets have scores of 0. The first tweet has no words in the sentiment dictionary so its default score is 0. The second tweet has positive and negative words whose scores add up to 0: gorgeous 1, successful 1, nasty -1, downfall -1.
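The scoring in these examples can be reproduced with a few lines of code. This is a sketch under assumptions: the `LEXICON` below contains only the words scored in the examples, and `score_tweet` and `label` are hypothetical names, not the sentiment tool actually used.

```python
import re

# Toy lexicon built only from the words scored in the examples above.
LEXICON = {
    "skills": 1, "fertile": 1, "gorgeous": 1, "successful": 1,
    "crazy": -1, "not like": -1, "bitch": -1, "nasty": -1, "downfall": -1,
}


def score_tweet(text, lexicon=LEXICON):
    """Sum the scores of every lexicon word or phrase found in the tweet."""
    lowered = text.lower()
    total = 0
    for term, value in lexicon.items():
        # Whole-word matching, counting repeats (so "crazy" twice counts twice).
        hits = re.findall(r"\b" + re.escape(term) + r"\b", lowered)
        total += value * len(hits)
    return total


def label(score):
    """Map a numeric score to positive, negative, or neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With this toy lexicon the Josephine tweet scores -4 and the Corinne tweet nets out to 0, matching the hand-worked numbers.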

There is not one perfect sentiment dictionary for all situations. Different types of text need different sentiment dictionaries, and for bachelor tweets the dictionary needed a lot of adjustments. In the base sentiment dictionary, words like dope, shark, and damn had negative scores. In the world of bachelor tweets none of these words are necessarily negative, so I added and subtracted words from the base sentiment dictionary to make a custom bachelor dictionary.

Here are a few examples of words and scores that I added:

(‘kween’, 1), (‘queen’, 1), (‘my girl’, 1), (‘front runner’, 1), (‘sociopath’, -1), (‘ho’, -1), (‘giving me life’, 1), (‘smh’, -1), (‘da fuq’, -1), (‘bachelorette’, 1), (‘yaaaas’, 1), (‘yass’, 1), (‘trump supporter’, -1), (‘psycho’, -1)
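Customizing a dictionary like this is mostly a matter of removing entries and layering new ones on top. In the sketch below, `base_lexicon` and its scores are hypothetical (the post only says that dope, shark, and damn were negative in the base dictionary), and `build_custom_lexicon` is a made-up helper name.

```python
# Hypothetical base scores; the post only says these words were negative.
base_lexicon = {"dope": -1, "shark": -1, "damn": -1, "nasty": -1, "gorgeous": 1}

# Words that are not necessarily negative in Bachelor tweets.
removed = {"dope", "shark", "damn"}

# A few of the additions listed above.
added = {
    "kween": 1, "queen": 1, "my girl": 1, "front runner": 1,
    "sociopath": -1, "ho": -1, "giving me life": 1, "smh": -1,
}


def build_custom_lexicon(base, removed, added):
    """Copy the base lexicon, drop the removed words, then apply additions."""
    custom = {word: score for word, score in base.items() if word not in removed}
    custom.update(added)
    return custom


bachelor_lexicon = build_custom_lexicon(base_lexicon, removed, added)
```

Short entries like ‘ho’ only behave if the scorer matches whole words; a plain substring match would also fire on “who” and “hot dog”.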

I ran sentiment analysis on all of the tweets with contestant tags, put it in Tableau, and made the below visualization which shows number of tweets and the sentiment breakdown by contestant. Red is negative sentiment, orange is neutral, and green is positive.


I also extracted some popular phrases for some of the most talked about contestants. I didn’t include some sentiments on some contestants because there wasn’t enough consensus. For example, Vanessa did have negative feedback, but people didn’t give the same negative feedback, so there weren’t popular negative phrases associated with her.

Screen Shot 2017-01-08 at 2.46.48 PM.png

Screen Shot 2017-01-08 at 2.47.03 PM.png

Alexis was by far the most talked about contestant, her appearance sparked the Shark or Dolphin debate heard ’round the world. I wouldn’t call her a fan favorite because of the fairly equal amount of positive and negative tweets.

Early front runners seem to be Rachel, Vanessa, and Danielle M. Danielle L, Sarah and Raven also look like early fan favorites. They weren’t as talked about, but they do have a very high proportion of positive feedback and low negative feedback. It’s likely the next bachelorette will come from this group of women; Rachel in particular is getting a lot of bachelorette buzz.

Corinne, Liz, and Josephine seem to be the most controversial with higher negative tweets than positive. Taylor could potentially join that group in the future, her positive tweets barely beat out the negative.

If you have some analysis you would like to see in the future, think my analytics are total crap, or want to know my thoughts about the shark vs dolphin debate, leave a comment or email me at


Since I learned that analytics could be applied to television I knew that’s what I wanted to do, but I quickly learned interesting data about television is hard to acquire for a nonprofessional television and analytics enthusiast. A few months ago I got put on a project at work pulling and analyzing tweets for a company around the same time as speculation started about who was going to be the next bachelor. It finally occurred to me that I had access to a huge repository of data about television: social media, and Twitter in particular.

I wanted to know what people were actually saying about the next bachelor and break it down a little more than the one sided view I was getting on the internet. At that time people were mostly tweeting about Bachelor in Paradise with the #TheBachelor hashtag and with the Twitter API limits if I just used the standard hashtag I would get a very small number of tweets about the next Bachelor. Fortunately Bachelor creator Mike Fleiss started tweeting clues about the next bachelor, so I pulled all of the tweets in response to his clues, did some simple sentiment analysis, and came up with this:



Not exactly earth shattering stuff, I didn’t even standardize the axes, but I had a lot of fun doing it. I currently work in a consulting type role, so the data I analyze is from a variety of industries. I never get the opportunity to be a subject matter expert in any of the data that I work with. I would consider myself a Bachelor Franchise subject matter expert, and it was exciting to actually go into a data set and know what I was looking at right away. I loved it and I wanted to do more of it, this time with bigger data sets and more analysis.