Fill Those Seats

Serpico’s excellent post yesterday got me thinking.

A professional baseball franchise has two goals which sometimes conflict: winning as many games as possible and drawing in fans. You might think those two go hand in hand but, as Serp pointed out, swapping out new talent every season makes it hard for the fans to invest in the team.

“Well, yes,” I thought, “but you get to save so much money.”

Then I started to wonder - how much money? And what’s the tradeoff?

So I went to ESPN.com and Sportsline.com and I got two figures:

  • Total player salaries by team;
  • Average home game attendance as a percentage of stadium capacity

And I made a graph in Excel.

Seats vs Salary


Some interesting findings:

  • In the lower half, you get two sudden spikes at the San Diego Padres and the Milwaukee Brewers. They get White Sox level attendance despite playing like, well, the Padres and the Brewers. Where’s the draw? What did the Brewers do last season that I and the rest of the world missed?
  • The Boston Red Sox had 101.4% attendance on average in 2006. That’s not seats sold; that’s actual home game attendance. Look it up yourself. It pleases me to know that John Henry will admit more fans than the stadium has seats; anything for revenue.
  • The trendline continues upward pretty clearly except for one embarrassing drop by the Baltimore Orioles. They spent $93.55 million in 2006 on player salaries but only filled 57.1% of their seats on average. They’re spending St. Louis Cardinals money to get Toronto Blue Jays attendance.

I don’t know whether this data supports my thesis or Serpico’s. It may be too soon to draw that kind of conclusion. But I do know that it’s really interesting.

Moneyball: Con

Today’s post is Part Two of Two, the “Con” argument in an ongoing and mostly friendly NerdsOnSports debate over “Moneyball,” the stats-driven baseball management popularized by Oakland A’s General Manager Billy Beane.

My esteemed friend and fellow Nerds on Sports contributor Perich laid out a series of perfectly reasonable, incontestable facts (the rules of baseball - a team needs to score more runs, each team has 27 outs to do it, fixed number of players to acquire, etc) to begin his conversation.  He then, keeping those facts in mind, laid out the crux of Moneyball: certain statistics mean more than others and by finding and properly weighting those statistics, a GM can better evaluate potential than his competitors.  It all makes a lot of sense, given the nature of the game and how the machinations of baseball scouting work.

My issue is not with any of these facts or assertions surrounding Moneyball, per se, but rather with the strategic implementation of the science. Allow me to explain. Paying for undervalued or overlooked talent is fantastic, as is paying for any undervalued commodity in the marketplace. In 2006, Beane compiled the 5th best record in baseball with the 21st highest salary. Seems like a series of sound investments. Expanding it back to prior seasons: in 2005, Beane got the 10th best record with the 21st highest salary, and in 2004, it was the 9th best with the 16th. Solid year in and year out. The A’s, with this strategy, have produced far more than their salary level would suggest. Seems like Moneyball is working.

But I’d like to reveal another set of numbers - 26th, 19th, 19th. That’s where the A’s finished in game attendance in 2006, 2005 and 2004. Of note, they’re on pace for 26th place again this year. That’s a downward trend in the number of folks that are interested enough in the A’s to go shell out money to watch them play. In that, I believe, lies one of the hidden costs of Moneyball. It is tough for a fanbase to get behind a team with such a revolving door concept of talent. Miguel Tejada left town before the 2004 season, Jermaine Dye before 2005 and Zito and Frank Thomas before 2007. The A’s, in keeping with their spending and scouting strategy, got what they could out of these players while they were still reasonably priced and were forced to jettison them after the market got smart. It’s part of the game that Beane has to play with his budgetary constraints. It’s a chicken-and-egg scenario though. Beane doesn’t have the money to retain big name talent, so fans get disgusted and don’t attend games with regularity, and empty seats prevent Beane from getting the money to retain big name talent. Sure, you might be able to get more VORP per Dollar with Nick Swisher than Jermaine Dye, but is that what the ticket-buying, jersey-wearing fans care about? Or do they care about having a masher they’ve heard of drilling them into the seats at McAfee?
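For anyone who wants the two sets of rankings side by side, here is a minimal Python sketch. The rankings are the ones quoted above; the names `AS_RANKS` and `rank_gaps`, and the idea of subtracting one ranking from another, are just my own illustration and not an established metric.

```python
# Oakland A's league rankings as quoted above (1 = best or highest).
# Each tuple is (record rank, payroll rank, attendance rank).
AS_RANKS = {
    2006: (5, 21, 26),
    2005: (10, 21, 19),
    2004: (9, 16, 19),
}


def rank_gaps(ranks):
    """Return {season: (payroll rank - record rank, attendance rank - record rank)}.

    A positive number means the A's finished that many places better in
    the standings than they did in payroll (or in attendance).
    """
    return {
        season: (payroll - record, attendance - record)
        for season, (record, payroll, attendance) in ranks.items()
    }


for season, (vs_payroll, vs_attendance) in sorted(rank_gaps(AS_RANKS).items()):
    print(season, vs_payroll, vs_attendance)
```

Both gaps are positive in every season: the team outperforms its payroll, and it also outperforms its own gate.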

Baseball pundits and average fans alike don’t generally believe the A’s are going to ever mount a big-time run deep into the playoffs.  Certainly not this year (they’re sub-.500) and most likely not next year.  And if they have, they’ve been quiet about it.  The A’s, using their strategy, can be a middling team at the price of a basement team.  From a financial perspective, it’s wonderful because they always beat expectations.  But from a baseball perspective, there’s just no fire there.  The object of a baseball game is to score more runs than the opposing team.  But the object of a baseball season is to win a championship.  Moneyball can help the A’s accomplish the first against teams with a mightier payroll.  But until I see them in a World Series Game, I’ll be skeptical of the second.  Haven’t St. Louis and Arizona been getting there on the cheap lately?  I wonder what their GMs are using.

Moneyball: Pro

Today’s post is Part One of Two, the “Pro” argument in a NerdsOnSports exclusive debate over “Moneyball” - or stats-driven baseball management. Serpico takes the opposing side elsewhere.

My argument is that a manager can derive superior value in his team by managing based on statistics, rather than what are commonly called “intangibles.”

Consider the following:

(1) The object of a baseball game is to score more runs than the opposing team.

(2) Baseball does not have a clock; it ends when each team suffers 27 outs.

(3) Given 1 and 2, the team that can score more runs while suffering fewer outs will win a ballgame.

(4) Players earn runs by advancing along the basepaths. This can be done either by hitting or by being advanced through a walk or pitcher error (balk, etc).

(5) There is a fixed pool of available players for any given season. There are a fixed number of positions in the starting lineup - nine, to be precise.

As Kevin Bacon said in A Few Good Men, these are the facts of the case, and they are not in dispute. Those are the rules of baseball. All of the above are objectively true.

From that, I will assert the following:

(6) Given #4, a statistic which measures all the ways that a player can advance along the bases (for instance, on-base percentage) will be a more useful tool in evaluating a player than a statistic which does not (for instance, batting average).

That right there is the core of Moneyball - the idea that many traditional statistics, such as stolen bases, RBIs and batting average are not as useful as OBP, slugging or VORP.

Consider: RBI is the number of runs a player bats in. But in order to hit in a run, another player needs to have advanced to scoring position. So your RBI stat hinges on the scoring ability of the player before you in the lineup. This changes every time the lineup is altered, or every time you change teams, but no one thinks to qualify RBI with a little asterisk.

Batting average is neat, too, but it doesn’t measure the times that a player will advance a base through being walked. And for the big hitters like David Ortiz, Rickey Henderson or Joe Morgan, bases on balls constitute a significant percentage of their run production.
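To make the difference concrete, here is a small Python sketch comparing the two stats. The on-base percentage formula is the standard one (hits, walks and hit-by-pitches over at-bats, walks, hit-by-pitches and sacrifice flies); the two hitters and their stat lines are invented purely for illustration.

```python
def batting_average(hits, at_bats):
    """Hits per at-bat; walks do not appear anywhere in this formula."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base divided by the plate appearances that count toward OBP."""
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / chances


# Two hypothetical hitters with identical batting averages.
# These stat lines are made up for illustration only.
free_swinger = dict(hits=150, walks=20, hit_by_pitch=0, at_bats=500, sac_flies=0)
patient_hitter = dict(hits=150, walks=100, hit_by_pitch=0, at_bats=500, sac_flies=0)

for name, line in (("free swinger", free_swinger), ("patient hitter", patient_hitter)):
    avg = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{name}: AVG {avg:.3f}, OBP {obp:.3f}")
```

Both made-up hitters bat .300, but the patient one reaches base far more often, and batting average alone cannot tell them apart.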

(7) Given #5, teams with less money to spend will not be able to outbid teams with more money. As such, the only way to maintain a competitive edge over those teams is to find undervalued statistics - stats which point the way to potential runs without seeming to.

The Oakland A’s do not have as much money to throw around as the New York Yankees (the most lucrative sports franchise in the world after Arsenal Football). Oakland will never beat New York in a bidding war over a hot free agent. What they can do, however, is search for run-generating players who New York overlooks. They do this by mining statistics that no one else looks at (such as OBP, or pitches broken down by ballpark) and turning up players like Scott Hatteberg and Kevin Youkilis.

That, right there, is the core of the Moneyball contention. There are certain statistics which illuminate a player’s potential more than others. If those statistics remain overlooked, a money-savvy manager can scoop up big-hitting talent at bargain prices. Such a case seems indisputable.

Now that you’ve read my argument, go read Serpico’s counter.

Nerds on Sports University: LUCK

In the last Nerds on Sports University, I gave you a very technical and involved statistic. This time, I’m going for something a little more lighthearted. Today, I will tell you about the pitching stat LUCK. LUCK isn’t an acronym for anything; it’s just luck.

Before I get into LUCK itself, let’s look at a pitcher’s Expected Win-Loss [E(W) & E(L)]. We all know that Wins and Losses depend heavily on how the rest of the pitcher’s team and bullpen perform, especially if you’ve been watching Bronson Arroyo this year (LUCK -6.31). To calculate E(W) and E(L), we look at the pitcher’s innings pitched and runs allowed for each game and compare that to how often the same pitching line has produced wins and losses historically. So if a pitcher went 6 innings and gave up 5 runs, you would expect them to get a win 30% of the time. So the E(W) for that start is .3 and the E(L) is .7.

To get LUCK, we compare the expected numbers to the actual numbers. Taking the two differences between actual and expected and adding them together, (W-E(W)) + (E(L)-L), gives LUCK. So a pitcher with a high LUCK is lucky and their team is helping them out. I hope you enjoy LUCK, and I will end this class with some current LUCK stats.
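Here is a rough Python sketch of that arithmetic. Only the 6-innings, 5-runs entry (a 30% win rate) comes from the text above; the other entries in the lookup table, the names `WIN_RATE` and `luck`, and the sample starts are placeholders I made up so the example runs. It also assumes every start ends in a win or a loss, so no-decisions are ignored.

```python
# Historical win rate for a given (innings pitched, runs allowed) line.
# Only the (6, 5) -> 0.30 entry comes from the text; the rest are
# hypothetical placeholders.
WIN_RATE = {
    (6, 5): 0.30,
    (7, 2): 0.70,  # hypothetical
    (5, 4): 0.35,  # hypothetical
}


def luck(starts, win_rate=WIN_RATE):
    """LUCK = (W - E(W)) + (E(L) - L) over a list of starts.

    Each start is (innings, runs_allowed, decision), with decision "W" or "L".
    Following the text, a start's expected loss is 1 minus its expected win.
    """
    wins = sum(1 for _, _, decision in starts if decision == "W")
    losses = sum(1 for _, _, decision in starts if decision == "L")
    expected_wins = sum(win_rate[(ip, runs)] for ip, runs, _ in starts)
    expected_losses = sum(1 - win_rate[(ip, runs)] for ip, runs, _ in starts)
    return (wins - expected_wins) + (expected_losses - losses)


# A pitcher who got the win despite a 6-inning, 5-run start was lucky...
print(round(luck([(6, 5, "W")]), 2))
# ...and one who lost a 7-inning, 2-run start (hypothetical rate) was not.
print(round(luck([(7, 2, "L")]), 2))
```

The first call comes out positive and the second negative, which matches the reading above: high LUCK means the team bailed the pitcher out.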

Nerds on Sports University: VORP

Welcome to Nerds on Sports University, class is in session. I am professor Willis, and though I may not be a Notorious Ph.D. professor of critical studies, I have read some books. Today’s topic is VORP which means Value Over Replacement Player. VORP was created by Keith Woolner of Baseball Prospectus to find a way to value a player’s contribution that factored in playing time and position (and park factors).

Step one to deciphering this stat is to figure out the RP or Replacement Player. Replacement level is a complicated calculation that can be summed up easily. Say a team gets hit with an injury to an everyday player, then the team may be stuck playing a utility bench player, bringing up a AAA player, or finding some other journeyman to fill the gap. That player is basically the replacement level.

[Picture: Baseball Bell Curve]

That’s the idea of Replacement Level, now the hard part: how do we calculate that level so we can compare it to all players? First off Major League baseball players don’t fall into a pretty bell curve, they fall more into the front half of a bell curve (see picture) where there are very few players with the most baseball skill (right side of chart) and a ton of players that play in softball leagues across the world (democratic side).

Before I go more into the calculation of “RP” Replacement Player, let me cover the “V” Value used. No matter where you look to read about VORP, one of the first things they remind you of is that baseball is a zero-sum game. Meaning that every game (except the All-Star game) has a winner and a loser. And as you know, runs are what determine the winner of a baseball game. So the V in VORP is measured in Runs. Now back to calculations.
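To show the shape of the idea, here is a deliberately simplified Python sketch. It assumes a hitter’s value is expressed as runs produced, that playing time is measured in outs used, and that replacement level is given as a runs-per-out rate. The function name and all of the numbers are invented for illustration; this is not Woolner’s actual formula, which also adjusts for position and park.

```python
def vorp(player_runs, player_outs, replacement_runs_per_out):
    """Runs a player produced beyond what a replacement-level player
    would have produced while using up the same number of outs."""
    replacement_runs = replacement_runs_per_out * player_outs
    return player_runs - replacement_runs


# Invented numbers: a hitter credited with 100 runs over 400 outs,
# measured against a replacement level of 0.15 runs per out.
print(round(vorp(100, 400, 0.15), 1))
```

The point of the subtraction is that a player is not compared to zero or to the league average, but to the kind of fill-in a team could find on short notice.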

And five points for sticking the landing

Megalomaniac Scott Boras is pushing for a new baseball stat to recognize strong defensive skills. Now while defensive skills do tend to be overlooked in judging baseball players’ worth, the attempt to quantify them through such an ill-defined metric as “exceptional play” does not really indicate anything beyond a player’s ability to show up on Baseball Tonight’s Web Gems.

The official scorer would be asked to distinguish between an exceptional play and a routine one in the same way he is asked to distinguish between a hit and error.

Now, the distinction between a hit and an error is usually clear-cut. The fielder misjudged, dropped, or bobbled the ball. Difficult to mistake one for the other. But what makes a fielding play “exceptional”? Distance? Style? Degree of difficulty? Should we have fielding judges giving out scores like in diving? If player A makes a diving catch, but player B is fast enough to already be in position to make an “ordinary” catch, don’t they deserve the same amount of praise?

Other comically stupid ideas mentioned were the nine-game world series with the first two at neutral sites. I don’t see how this would make anybody happy, and I have no idea why anyone would want to do this. Scott Boras needs to stick with inflating contracts and stay away from how the game is actually played.

Crunching The Numbers

One of my favorite non-sports blogs, CoyoteBlog, linked me to this little gem the other day: a breakdown of expected runs given runners on base and outs already recorded.

RE 99-02    0 outs   1 out    2 outs
Empty       0.555    0.297    0.117
1st         0.953    0.573    0.251
2nd         1.189    0.725    0.344
3rd         1.482    0.983    0.387
1st_2nd     1.573    0.971    0.466
1st_3rd     1.904    1.243    0.538
2nd_3rd     2.052    1.467    0.634
Loaded      2.417    1.650    0.815

So, for instance, if you have a runner on third and 1 out, you can expect 0.983 runs in this inning. Runner on 1st and 2 outs, you can expect only 0.251 runs. This kind of data can occupy me for hours. It looks relatively unimpressive - such a small table! - but there’s so much implied in those numbers.

There’s a line in The Hunt for Red October where the Russian sub’s navigator boasts about being able to “fly a plane in the Alps with no windows” with a compass and a map. Well, I could manage a professional baseball team in an underground bunker with no windows given a spreadsheet with the team’s stats, and this table.

Here’s an example, taken from the Coyote himself:

You can actually calculate what percentage chance of success you need to justify stealing second. Lets again take man on first, no outs. The RE is 0.953. If he steals successfully, the RE goes to 1.189. If he gets thrown out, the RE goes to 0.297 (bases empty, one out). If X is the probability of stealing success, then 1.189X+0.297(1-X)>0.953. X must be about 74% or greater.
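The Coyote’s inequality can be solved for X directly, and the same calculation works with one or two outs. Here is a small Python sketch using the relevant rows of the table; the function names are my own, and it assumes a caught-stealing with two outs ends the inning with zero expected runs.

```python
# Run expectancy, 1999-2002, copied from the table above.
# Keys are (bases occupied, outs).
RE = {
    ("Empty", 0): 0.555, ("Empty", 1): 0.297, ("Empty", 2): 0.117,
    ("1st", 0): 0.953, ("1st", 1): 0.573, ("1st", 2): 0.251,
    ("2nd", 0): 1.189, ("2nd", 1): 0.725, ("2nd", 2): 0.344,
}


def run_expectancy(bases, outs):
    """Expected runs for the rest of the inning; zero once there are three outs."""
    return 0.0 if outs >= 3 else RE[(bases, outs)]


def steal_second_break_even(outs):
    """Success rate X at which stealing second (man on first only) breaks even.

    Solves success * X + fail * (1 - X) = stay for X.
    """
    stay = run_expectancy("1st", outs)
    success = run_expectancy("2nd", outs)
    fail = run_expectancy("Empty", outs + 1)
    return (stay - fail) / (success - fail)


for outs in range(3):
    print(outs, round(steal_second_break_even(outs), 3))
```

With no outs the break-even rate comes out to roughly 73.5%, in line with the “about 74%” quoted above, and it stays in the same neighborhood with one or two outs.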

I open it up to the forum. What other exciting facts or predictions does this matrix make for you?

Everybody’s Got Something To Hide Except Me and Joe Morgan

There’s already a site doing it better, but I’d like to weigh in on something terribly stupid Joe Morgan said during last Sunday’s Red Sox / Yankees game.

He said, and I paraphrase, “guys like Ted Williams didn’t get to be hitting champions by getting walked a lot. People talk all the time about drawing walks, but Ted Williams didn’t get a lot of walks.”

Even without access to a laptop, the Internet and a century of baseball statistics at the time, I knew in my heart that that was false. First, because Joe Morgan was saying it with authority. And second, because, well, when you’re pitching to a guy who hits .318 on a bad season, you’ll occasionally throw a few outside.

However, I’d be no better than Stumbling Joe himself if I didn’t find the facts to back me up. So here, in an easy to read chart, are the all-time career walk leaders:

Rank  Player             Years  AB     BB
1.    Barry Bonds        21     9507   2426
2.    Rickey Henderson   25     10961  2190
3.    Babe Ruth          22     8399   2062
4.    Ted Williams       19     7706   2019
5.    Joe Morgan         22     9277   1865
6.    Carl Yastrzemski   23     11988  1845
etc.

(Edited to clean up HTML and revise figures that suggested Rickey Henderson was one of the “giants in the earth [...] mighty men which were of old, men of renown” (Genesis 6:4))

It’s no longer shocking that Joe Morgan has such little respect for statistical analysis that he’d be flat-out, incontrovertibly wrong about whether Teddy Ballgame drew a lot of walks or just a few. That’s par for the course. The man wouldn’t be doing his job if he were right more than half the time.

But you’d think that, given that Ted Williams is #4 and Joe Morgan himself is #5, he’d at least remember that number. That he might have heard his own name brought up in that context before. That Joe Morgan might at least be cognizant of a record he’s really really close to Ted Williams on.

Ted Williams drew 154 more walks than Joe Morgan did, over 1571 fewer at-bats. That tells me that, yeah, better hitters draw more walks, regardless of how counter-intuitive that might strike the dumbest man to talk about baseball since Tim McCarver. It also tells me that Joe Morgan not only knows nothing about statistics - he knows nothing about his own career.
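If you want to check that arithmetic, or see who on the chart walked most often per at-bat, here is a quick Python sketch. The figures are copied from the chart above; walks per at-bat is just my shorthand for comparing rates, not an official stat, and the names in the code are my own.

```python
# (player, at-bats, walks), copied from the chart above.
WALK_LEADERS = [
    ("Barry Bonds", 9507, 2426),
    ("Rickey Henderson", 10961, 2190),
    ("Babe Ruth", 8399, 2062),
    ("Ted Williams", 7706, 2019),
    ("Joe Morgan", 9277, 1865),
    ("Carl Yastrzemski", 11988, 1845),
]


def walks_per_at_bat(at_bats, walks):
    """Walks drawn for every official at-bat."""
    return walks / at_bats


# Sort the six leaders by walk rate instead of raw walk total.
by_rate = sorted(
    WALK_LEADERS,
    key=lambda row: walks_per_at_bat(row[1], row[2]),
    reverse=True,
)

for player, at_bats, walks in by_rate:
    print(f"{player}: {walks_per_at_bat(at_bats, walks):.3f}")
```

Sorted this way, Ted Williams moves from fourth to first among the six, at about 0.26 walks per at-bat against about 0.20 for Morgan.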
