Sunday, September 10, 2006

I'm sick of politics too, so let's talk baseball. Two items on the agenda: a very good book and a very good team.

Let's start with the book. Baseball Between the Numbers is an anthology of 27 articles by the guys at Baseball Prospectus. It's in the Bill James tradition of statistical analysis and, although it lacks James's idiosyncratic writing style, it's without doubt the best book of this genre ever written. Nevertheless, it's not without flaws, as I'll point out below.

The common theme is the attempt to capture baseball performance in a single statistic such as VORP or EqA. Ultimately, these attempts are unsatisfying for a number of reasons.

1. While it is apparent that the core of these numbers is some normalized linear sum of elementary stats, the formulas are not actually given anywhere in the book, not even in the notes or the glossary.

2. Presumably, one advantage of these numbers over something simple like OPS (OBP+SLA) is that they take into account all manner of marginalia like baserunning and defense. But as it happens, the most interesting results in the book are negative results. Baserunning hardly matters at all, there's no such thing as a clutch hitter, batting order is irrelevant, catchers' ability to "handle" pitchers is a myth, and most of the stuff managers call "strategy" (intentional walks, sacrifice bunts, saving the "closer" for the ninth inning) doesn't help and often hurts. Given that, why not just stick with OPS? (As I read this over, it's beginning to sink in that the best stuff in this book is all stuff that Thorn and Palmer did decades ago. Still, these guys have so much more data to work with that it was worth discovering all these insights again.)

3. The authors uniformly inflate the importance of normalizing their stats. In other words, after they get their initial formula (i.e., OPS jazzed up by tons of insignificant crap), they make a big fuss of doing some linear mapping so that the average player (or the "replacement-level" player, i.e., the dime-a-dozen throwaway guys) maps to zero. Why they think this is a crucial point eludes me.

4. Although the point is made here and there, the book does not sufficiently emphasize that the only perishable commodity on offense is outs. You get 27 of them in a game. At bats you can get more of, so the right denominator for offensive statistics is outs, not at bats (or plate appearances). Thus, as I've argued before, OPS/(1-OBP) is the right simple stat for measuring offensive performance. Indeed, if you look at team stats, no other stat correlates better with runs scored by a team.

5. There is still no good stat for capturing fielding performance. The authors try to rig some stuff up that looks at how many balls in play a fielder reaches and then tries to account for effects of pitcher, stadium and so forth. It's a start. One implication of the numbers they come up with is that the difference between a good fielder and an average one is way less significant than the difference between a good hitter and an average one. If you consider that the difference between a good hitter and an average one is one hit every four games or so, you might question that conclusion.
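For anyone who wants to play with the OPS/(1-OBP) stat from point 4, here is a minimal Python sketch. It assumes the usual textbook definitions of OBP and slugging, which the post itself never spells out. The BattingLine container, the function names and the sample stat line are all made up for illustration; they are not real player data and not anything from the book.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Counting stats for one hitter or team (hypothetical container)."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int = 0
    sac_flies: int = 0


def obp(b: BattingLine) -> float:
    """On-base percentage, assuming the standard definition."""
    times_on_base = b.hits + b.walks + b.hit_by_pitch
    chances = b.at_bats + b.walks + b.hit_by_pitch + b.sac_flies
    return times_on_base / chances


def slg(b: BattingLine) -> float:
    """Slugging average (SLA in the post): total bases per at bat."""
    singles = b.hits - b.doubles - b.triples - b.home_runs
    total_bases = singles + 2 * b.doubles + 3 * b.triples + 4 * b.home_runs
    return total_bases / b.at_bats


def ops(b: BattingLine) -> float:
    """OPS = OBP + SLA, as defined in point 2."""
    return obp(b) + slg(b)


def ops_per_out(b: BattingLine) -> float:
    """OPS/(1-OBP): OPS rescaled by the fraction of chances that end in outs."""
    return ops(b) / (1.0 - obp(b))


# Made-up stat line, purely for illustration.
sample = BattingLine(at_bats=500, hits=150, doubles=30, triples=0,
                     home_runs=20, walks=50)
print(round(obp(sample), 3), round(slg(sample), 3), round(ops_per_out(sample), 3))
```

Since 1-OBP is the share of plate appearances that turn into outs, dividing by it rewards hitters who stretch those 27 outs further, which is the whole point of using outs as the denominator.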

Read the book. If you're into baseball stats, you won't regret it.

And now, I owe Omar Minaya an apology. Time and again, I've questioned his judgement and every time he turned out to be right. Jae Seo's second half in 2005 was indeed a fluke and getting Duaner Sanchez for him was brilliant. Kris Benson will never amount to much and getting John Maine alone for him would have been worth it, let alone getting nut job Jorge Julio, whom Minaya then turned into Orlando Hernandez. Picking up Chavez and Valentin for nothing also turned out to be brilliant. (The jury is still out on Cameron for Nady for Perez and Hernandez.) Most importantly, he didn't panic at the deadline and trade serious prospects for some guy who'll be a free agent at the end of the year. Now if he can trade Victor Zambrano for Scott Kazmir, I'll really be impressed.

Just so I don't get too gushy, I should point out that when the Mets were out of starters, they should have given Aaron Heilman a shot at starting rather than bringing in Jose Lima.

Hey, don't blame me. You said enough with the politics.

5 comments:

  1. Thanks, I thoroughly enjoyed that... especially as it did away with the need for my usual night-cap to settle me down before bed.

    However, I wouldn't be too shocked to hear that the Sominex people served you with papers for unethical business practices and unfair competition. :-)

  2. Well, then, in some odd way we're even. Cuz after reading your how-to post for terrorists, I'm unlikely to be sleeping for at least a week...

  3. garik,
    Quite right. While the earlier post focused on NOPS, in the fine print there I also mention the formula cited in yesterday's post. I refer to that one now because it is simpler and works just as well as NOPS.

  4. I get the picture. Trep says I put him to sleep, garik says I can't even quote myself correctly, and moc says my work is boring math crap. I should really do these baseball posts more often.
    (Fear not fellows, I'm quite thick-skinned. Keep it coming.)

  5. No really, I didn't mean boring in a bad way... ;-) BTW, I have a couple of nice selections of bourbon for us to sample Thursday evening.
