Statistics in Movies: Never use a Bowling Scene

If your next movie idea includes a bowling scene, then you should give up, change your name and move to Greece. I hear their economy is thriving.

Why? A recent article in the New York Times states that Hollywood will start using data analysis to predict what viewers want to see.

“For as much as $20,000 per script, Mr. Bruzzese and a team of analysts compare the story structure and genre of a draft script with those of released movies, looking for clues to box-office success…Bowling scenes tend to pop up in films that fizzle, Mr. Bruzzese, 39, continued. Therefore it is statistically unwise to include one in your script.” 

Apparently Mr. Bruzzese is legit and has worked on over 1,000 films. So, when he says not to include a bowling scene, I listen.


The Big Deal

In a world of Big Data, most companies are now using mathematical models to minimize risk and predict the success of their next product. Think Netflix, Pandora, Amazon and other data-driven companies.

Among these math tools is a silly thing called statistics. Basically, statistics allows you to analyze data and do a few calculations to determine whether there is any relationship between your inputs and your outputs. For example: the relationship between the number of A-list stars (input) and gross profit (output).

Does it work? YES and NO. The real answer = IT DEPENDS. (IT DEPENDS is an answer you can use for any question in any MBA class.)

But seriously, it depends. Statistics is powerful, but just like most mathematical models it needs human judgement. One of its weaknesses is that it relies on past data to make predictions about the future.

During a statistics class at Chapman University, my group did a quick study to see if we could beat the system and find the exact variables needed for a blockbuster formula. Turns out that we were super close. OK, so maybe not close at all. In our defense, there was a lot of subjectivity in some of our variables. Our findings could have benefitted from a more detailed ranking list with specific numerical values.

Don’t worry, we will not give up until we find a perfect formula. By that time we will all be famous producers or starving P.A.s. In the meantime, scrap your bowling movie and start uploading grumpy cat videos.


Below is a summary of our research. If you are interested in the entire study, just comment below.


We analyzed the effects of a series of 8 inputs on one output, gross profit. We chose what we defined as box office hits (i.e., the top 100 movies per the list found on BoxOfficeMojo.com) for the year 2011. The inputs used for analysis include the following: Opening Weekend, Number of Theaters, MPAA Rating, Number of Male Stars, Number of Female Stars, Derivatives, and the Number of Key Creatives.

After running the simple linear regression for all of the inputs against the output, we found four (Opening Weekend, Theaters, Derivatives, and Number of Male Stars) to be significant at the 95% confidence level. Opening Weekend absorbed the impact that any of the other inputs could have on the dependent variable.

Our final formula includes only the most significant inputs that we were able to group.
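For the curious, here is a rough sketch of what a simple linear regression does under the hood: it fits a straight line to one input and one output using ordinary least squares. This is not our actual study code. It assumes each input is regressed on its own, the function name is made up for illustration, and it only fits the line (it does not run the significance test).

```python
from statistics import mean


def simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Hypothetical helper for illustration; not the code used in the study.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    x_bar, y_bar = mean(xs), mean(ys)

    # Spread of the input, and how the input and output move together.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Made-up toy numbers, NOT data from the study:
    # theater counts and gross profit in millions of dollars.
    theaters = [1000, 2000, 3000, 4000]
    profit_millions = [10, 30, 50, 70]
    intercept, slope = simple_linear_regression(theaters, profit_millions)
    print(f"profit = {intercept:.2f} + {slope:.4f} * theaters")
```

A positive slope means the output tends to rise with the input. Whether that relationship is statistically significant is a separate test that a stats package would handle.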

X1 = Number of Male Stars      X2 = Number of Theaters

Y = -72,183,733.77 + 8,805,216 (X1) + 39,156.07 (X2)
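If you want to play with the formula, here is a minimal sketch that plugs numbers into it. The coefficients come straight from the formula above. The function and constant names are my own, and the example inputs are made up.

```python
# Coefficients copied from the formula above.
INTERCEPT = -72_183_733.77
PER_MALE_STAR = 8_805_216.0
PER_THEATER = 39_156.07


def predicted_gross_profit(male_stars: int, theaters: int) -> float:
    """Return the formula's predicted gross profit in dollars.

    Hypothetical helper name; it only evaluates the fitted line and
    says nothing about how reliable the prediction is.
    """
    return INTERCEPT + PER_MALE_STAR * male_stars + PER_THEATER * theaters


if __name__ == "__main__":
    # Made-up example: 2 male stars, 3,000 theaters.
    print(f"${predicted_gross_profit(2, 3000):,.2f}")
```

With 2 male stars and 3,000 theaters, the formula predicts roughly $62.9 million. Keep in mind the fit came from the top 100 movies of 2011 only, so plugging in values far outside that range (say, 50 theaters) gives nonsense such as a large negative profit.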

For more movie statistics fun, check out this awesome site: BoxOfficeQuant
