Bachelor Tweets Week 1

Week one of The Bachelor did not disappoint. It was an episode filled with moments that are bound to provoke deep conversation, contestants with very clear villain and fan favorite edits, and an A+ bachelor. Let’s look at the data.

I collected ~168k tweets in a period of 4 hours and 15 minutes. I removed all retweets from this week’s analysis, which left me with a set of 95,075 tweets. Below is the number of tweets by minute over the period my tweet streamer was active (times in EST).
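For anyone who wants to try the same kind of count, here is a minimal sketch in Python. It is an illustration under assumptions, not necessarily the exact method used for this post: each tweet is assumed to be a dict with hypothetical `text` and `created_at` fields, retweets are assumed to be recognizable by the classic "RT @" prefix, and the sample tweets are made up.

```python
from collections import Counter
from datetime import datetime


def drop_retweets(tweets):
    """Keep only tweets whose text does not start with 'RT @' (assumed retweet marker)."""
    return [t for t in tweets if not t["text"].startswith("RT @")]


def tweets_per_minute(tweets):
    """Count tweets in each minute, keyed by 'HH:MM'."""
    counts = Counter()
    for t in tweets:
        ts = datetime.strptime(t["created_at"], "%Y-%m-%d %H:%M:%S")
        counts[ts.strftime("%H:%M")] += 1
    return counts


# Made-up sample data, only to show the shape of the input.
sample = [
    {"text": "Shark or dolphin??", "created_at": "2017-01-02 21:00:20"},
    {"text": "RT @someone: Shark or dolphin??", "created_at": "2017-01-02 21:00:25"},
    {"text": "That is a shark costume", "created_at": "2017-01-02 21:00:41"},
    {"text": "First impression rose time", "created_at": "2017-01-02 21:34:10"},
]

originals = drop_retweets(sample)
per_minute = tweets_per_minute(originals)
```

Sorting `per_minute` by key gives the series to plot as a tweets-by-minute timeline.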

Screen Shot 2017-01-08 at 12.30.16 PM.png

Obviously the west coast needs to step up its live tweeting game. Because they showed such a disappointing live tweet performance, I eliminated their tweets from the tweet count timeline. If you’re offended by the lack of west coast topics, tune in next week when I take a closer look at regional differences in tweet content.

Screen Shot 2017-01-08 at 12.32.36 PM.png

In case you don’t remember exactly what was happening in the episode to cause the peaks of tweets, my dad took extremely detailed notes. It was the first episode he’s ever seen, but I think his timeline really captures what people were tweeting about at these popular times. Here are some excerpts of his timeline that correspond to the peak tweet times.

 8:01:20 I'm nick and I'm the bachelor -- tastes funny coming out of the  mouth
 8:10:53 Nick gets advice from Ben Chris and Sean
 8:12:57 Trust yourself....Be Nick
 8:25:14 She [Corinne] has a nanny
 8:38:34 Yellow dancer -- Christine [Christen]
 8:39:13 Taylor in maroon
 8:43:50 Hailey short intro
 8:44:06 Lame joke [Do you know what a girl wearing underwear says?]
 8:56:36 Red dresses
 9:00:20 Shark or Dolphin??
 9:12:47 Girl talk --what a hoe [Corinne]
 9:22:59 Liz and Nick are talking
 9:34:10 Gives the first impression rose to Rachel
 9:46:34 Hailey gets rose
 9:46:55 Whitney get rose
 9:51:28 Stay tuned for exciting highlights
 9:59 ??? [Timeline cuts off at 9:51]

Obviously my Bachelor rookie dad didn’t realize the best part of the premiere is the “This season on the Bachelor…” promo.

A quick look at the data answers the question of what people were tweeting about at 9:59:

My vagiiiiiineeeee is platinum DYING -@NikNacNicky
Her vageeennn is platinum  -@HollywoodTony28
Omg Did Corinne just say that  -@JMP119
My heart is gold but my vajeen is platinum. Corinne what would your nanny think -@monicakwatson

Oh right, the highlight of the premiere:

platinum vagine.png

Oh Corinne, I don’t know what happened to you to make you the perfect Bachelor contestant. This prompts the question: what is the proper spelling of the word “vagine”? To answer it, I made a word cloud.


I also wanted to look at what people thought of this season’s contestants and determine some early fan favorites.

I used text tagging to separate out tweets by contestant. With text tagging I could define rules that classify each tweet by the contestant it is about. Many people don’t tweet about the contestant by name, so I had to dive into the personality of each contestant.

Example rule for a singular name:

  • Christen: tweets that contain either Christen or virgin.

To separate the two Danielles I used different rule sets.

  • Danielle M: tweet contains Danielle M or syrup or neonatal or nashville.
  • Danielle L: tweet contains Danielle L or tweet contains Danielle and boobs.

Even with this deep analysis there were still tweets that didn’t fall into Danielle L or Danielle M’s categories, so there is also an Ambiguous Danielle tag.
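The rules above can be sketched in a few lines of Python. This is a hedged illustration, not the tagging tool used for this post: the function names `contains` and `tag_tweet` are hypothetical, and matching on whole words while ignoring case is an assumption about how the rules should behave.

```python
import re


def contains(text, phrase):
    """True if phrase appears in text as a whole word or phrase, ignoring case."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def tag_tweet(text):
    """Return the list of contestant tags whose rule matches the tweet."""
    tags = []

    # Christen: tweet contains either Christen or virgin.
    if contains(text, "Christen") or contains(text, "virgin"):
        tags.append("Christen")

    # Danielle M: tweet contains Danielle M or syrup or neonatal or nashville.
    is_m = any(
        contains(text, kw) for kw in ("Danielle M", "syrup", "neonatal", "nashville")
    )
    # Danielle L: tweet contains Danielle L, or contains Danielle and boobs.
    is_l = contains(text, "Danielle L") or (
        contains(text, "Danielle") and contains(text, "boobs")
    )

    if is_m:
        tags.append("Danielle M")
    if is_l:
        tags.append("Danielle L")
    # Fallback: a Danielle tweet that neither rule set could claim.
    if contains(text, "Danielle") and not (is_m or is_l):
        tags.append("Ambiguous Danielle")

    return tags
```

For example, `tag_tweet("Danielle is a host from westworld")` falls through both rule sets and lands in the Ambiguous Danielle tag. Whole-word matching keeps "Danielle makes" from counting as "Danielle M".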

To determine fan favorites and contestants fans love to hate I used Sentiment Analysis. Sentiment Analysis scores each tweet based on a dictionary of positive and negative words and adds the word scores to determine if a tweet is positive, negative, or neutral.

Here are some examples of the scoring process:

Vanessa looks very fertile  -@StevenWoahdick
Shark girl got skills And gills -@SarahJulson

Positive tweets have scores greater than 0. These tweets are positive because the words “fertile” and “skills” each have a score of 1. Since each tweet has one positive word, each has a score of 1.

I do not like Josephine. Do not. Not at all. She's crazy. Mark my words AND THIS BITCH BROUGHT A HOT DOG OMG I TOLD YOU CRAZY -@aMyLyNn1984

Negative tweets have scores less than 0. This tweet is negative and has multiple words with a score of -1: crazy x 2, not like, bitch. If you add these -1 scores together you get the score of the tweet, -4.

Danielle is a host from westworld  -@jvandegriff92
Corinne is gorgeous and successful but her nasty attitude will be her downfall -@paper_canyon

Neutral tweets have scores of 0. The first tweet has no words in the sentiment dictionary so its default score is 0. The second tweet has positive and negative words whose scores add up to 0: gorgeous 1, successful 1, nasty -1, downfall -1.
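Here is a minimal sketch of that scoring process in Python. It is an illustration only: the dictionary holds just the words named in the examples above (the real dictionary is much larger), and the names `SENTIMENT`, `score_tweet`, and `label` are hypothetical. Entries can be phrases such as “not like”, so the sketch counts whole-phrase matches instead of splitting the tweet into single words.

```python
import re

# Tiny excerpt of a sentiment dictionary: only the words from the examples.
SENTIMENT = {
    "skills": 1, "fertile": 1, "gorgeous": 1, "successful": 1,
    "crazy": -1, "not like": -1, "bitch": -1, "nasty": -1, "downfall": -1,
}


def score_tweet(text, dictionary=SENTIMENT):
    """Add up the score of every dictionary word or phrase found in the tweet."""
    lowered = text.lower()
    total = 0
    for term, value in dictionary.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        # Each occurrence counts, so "crazy" twice contributes -2.
        total += value * len(re.findall(pattern, lowered))
    return total


def label(score):
    """Positive above 0, negative below 0, neutral at exactly 0."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


josephine = (
    "I do not like Josephine. Do not. Not at all. She's crazy. "
    "Mark my words AND THIS BITCH BROUGHT A HOT DOG OMG I TOLD YOU CRAZY"
)
corinne = (
    "Corinne is gorgeous and successful but her nasty attitude "
    "will be her downfall"
)

print(score_tweet("Vanessa looks very fertile"))  # 1
print(score_tweet(josephine))                     # -4
print(score_tweet(corinne))                       # 0
```

One caveat with this simple approach: if the dictionary held both “like” and “not like”, the phrase “not like” would match both entries, so a fuller version would need to handle overlapping entries.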

There is not one perfect sentiment dictionary for all situations. Different types of text need different sentiment dictionaries, and for bachelor tweets the dictionary needed a lot of adjustments. In the base sentiment dictionary, words like dope, shark, and damn had negative scores. In the world of bachelor tweets none of these words are necessarily negative, so I added and removed words from the base sentiment dictionary to make a custom bachelor dictionary.

Here are a few examples of words and scores that I added:

(‘kween’, 1), (‘queen’, 1), (‘my girl’, 1), (‘front runner’, 1), (‘sociopath’, -1), (‘ho’, -1), (‘giving me life’, 1), (‘smh’, -1), (‘da fuq’, -1), (‘bachelorette’, 1), (‘yaaaas’, 1), (‘yass’, 1), (‘trump supporter’, -1), (‘psycho’, -1)
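Building a custom dictionary like this amounts to dropping a few entries from the base dictionary and merging in new ones. A minimal Python sketch follows, with loud assumptions: the `base` dictionary is a made-up stand-in for a real base sentiment dictionary, and the -1 scores given to dope, shark, and damn are assumed (the text only says they were negative). The additions are the pairs listed above.

```python
# Hypothetical stand-in for a base sentiment dictionary.
base = {
    "dope": -1,      # assumed score; negative in the base dictionary
    "shark": -1,     # assumed score
    "damn": -1,      # assumed score
    "gorgeous": 1,
    "nasty": -1,
}

# Words that are not necessarily negative in Bachelor tweets.
to_remove = {"dope", "shark", "damn"}

# Bachelor-specific words and phrases, as listed above.
additions = {
    "kween": 1, "queen": 1, "my girl": 1, "front runner": 1,
    "sociopath": -1, "ho": -1, "giving me life": 1, "smh": -1,
    "da fuq": -1, "bachelorette": 1, "yaaaas": 1, "yass": 1,
    "trump supporter": -1, "psycho": -1,
}

# Drop the misleading entries, then merge in the custom ones.
bachelor_dictionary = {w: s for w, s in base.items() if w not in to_remove}
bachelor_dictionary.update(additions)
```

Because `update` overwrites existing keys, a word that appears in both the base dictionary and the additions ends up with the custom score.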

I ran sentiment analysis on all of the tweets with contestant tags, put the results in Tableau, and made the visualization below, which shows the number of tweets and the sentiment breakdown by contestant. Red is negative sentiment, orange is neutral, and green is positive.


I also extracted some popular phrases for some of the most talked about contestants. I left out a sentiment for a contestant when there wasn’t enough consensus. For example, Vanessa did get negative feedback, but people didn’t give the same negative feedback, so there weren’t popular negative phrases associated with her.

Screen Shot 2017-01-08 at 2.46.48 PM.png

Screen Shot 2017-01-08 at 2.47.03 PM.png
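The counts behind a per-contestant sentiment breakdown can be tallied with a short Python sketch like the one below. This is an illustration, not the Tableau workflow described here: the function name `sentiment_breakdown` is hypothetical, the input is assumed to be (contestant tag, tweet score) pairs, and the sample scores are made up.

```python
from collections import Counter, defaultdict


def sentiment_breakdown(scored_tweets):
    """Map each contestant to a Counter of positive, negative, and neutral tweets."""
    breakdown = defaultdict(Counter)
    for contestant, score in scored_tweets:
        if score > 0:
            sentiment = "positive"
        elif score < 0:
            sentiment = "negative"
        else:
            sentiment = "neutral"
        breakdown[contestant][sentiment] += 1
    return breakdown


# Made-up (contestant, score) pairs, only to show the shape of the input.
sample = [
    ("Alexis", 1), ("Alexis", -1), ("Alexis", 0),
    ("Rachel", 1), ("Rachel", 2),
    ("Corinne", -4),
]

result = sentiment_breakdown(sample)
```

Summing a contestant's counter gives the total number of tweets, and dividing each count by that total gives the proportions to compare across contestants.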

Alexis was by far the most talked about contestant. Her appearance sparked the Shark or Dolphin debate heard ’round the world. I wouldn’t call her a fan favorite because of the fairly equal amount of positive and negative tweets.

Early front runners seem to be Rachel, Vanessa, and Danielle M. Danielle L, Sarah, and Raven also look like early fan favorites. They weren’t as talked about, but they do have a very high proportion of positive feedback and low negative feedback. It’s likely the next bachelorette will come from this group of women. Rachel in particular is getting a lot of bachelorette buzz.

Corinne, Liz, and Josephine seem to be the most controversial, with more negative tweets than positive. Taylor could potentially join that group in the future, since her positive tweets barely beat out the negative.

If you have some analysis you would like to see in the future, think my analytics are total crap, or want to know my thoughts about the shark vs dolphin debate, leave a comment or email me at


Ever since I learned that analytics could be applied to television, I knew that’s what I wanted to do, but I quickly learned that interesting data about television is hard to acquire for a nonprofessional television and analytics enthusiast. A few months ago I got put on a project at work pulling and analyzing tweets for a company, around the same time that speculation started about who was going to be the next bachelor. It finally occurred to me that I had access to a huge repository of data about television: social media, Twitter in particular.

I wanted to know what people were actually saying about the next bachelor and break it down a little more than the one-sided view I was getting on the internet. At that time people were mostly tweeting about Bachelor in Paradise with the #TheBachelor hashtag, and with the Twitter API limits, just using the standard hashtag would have gotten me a very small number of tweets about the next Bachelor. Fortunately, Bachelor creator Mike Fleiss started tweeting clues about the next bachelor, so I pulled all of the tweets in response to his clues, did some simple sentiment analysis, and came up with this:



Not exactly earth-shattering stuff, and I didn’t even standardize the axes, but I had a lot of fun doing it. I currently work in a consulting-type role, so the data I analyze is from a variety of industries. I never get the opportunity to be a subject matter expert in any of the data that I work with. I would consider myself a Bachelor Franchise subject matter expert, and it was exciting to actually go into a data set and know what I was looking at right away. I loved it and I wanted to do more of it, this time with bigger data sets and more analysis.