Scouts vs. Stats and Searching for Synergies
The Pain Guy at VEB always seems to stir up the scouts vs. stats debate. The current version stems from a post analyzing Pagnozzi’s swing. My thoughts are in the comments there. What I’d like to discuss here, however, is how stats guys can leverage the scouting world, either hypothetically (because we don’t have access to scouts) or actually (because we have PITCHf/x data and the Fans Scouting Report, or FSR), to improve the way we stat-heads see the game.
Before I dig too deep into that question, I need to take a slight tangent to talk about how sabermetricians do their projecting/forecasting. The basic formula is to take a weighted average of past data, regress it toward some population average, and apply an aging curve. So where do scouts come into play? I think there are opportunities to leverage scouting data in all three steps. I’ll address them in order.
- Getting a weighted average – Generally, projection systems take 3-4 years of data, weighting the most recent season most heavily and gradually decreasing the weight the further back the data goes. Scouting data can be added at this step by pointing out opportunities to over- or underweight recent data because of things like mechanical or philosophical changes, or injuries. Now, this is a slippery slope, as overweighting recent results based on philosophical changes can get you a Kyle Lohse extension, but used correctly there could be some value there.
- Regression to the mean – In my opinion, this is where the saberist can get the most bang for his buck by leveraging scouting information. The question with regression to the mean is which mean to regress to. You want to regress to the mean of a population that the player belongs to; that population could be all of MLB (like MARCEL), players with similar builds and histories (like PECOTA), or pitchers with similar stuff (like Nick Steiner did). I think that using scouting data the way Nick did for pitchers is likely the next step in projections. I don’t know of any projections that are currently as in-depth as what Nick did, but there are a few that use fastball velocity (MGL’s, for example). I do something similar in my defensive projections, using the Fans Scouting Report as a proxy for actual scouting reports. I wonder if a similar thing could be done for hitters using “swing type” buckets.
- Aging – Undoubtedly, players of different skill sets and types age differently. The problem becomes binning players into certain types. I’d guess that having scouts’ input on this grouping process would be helpful.
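To make the three steps concrete, here is a rough Python sketch of where a scouting input could plug into each one. Everything in it is my own illustration, not any real system’s code: the function names, the 5/4/3 season weights, the 1200-PA regression ballast, the peak age of 29, the 0.3%-per-year aging rate, and the `recent_boost` knob are all assumed placeholder values.

```python
def weighted_rate(seasons, weights):
    """Playing-time-weighted average of a rate stat.

    seasons: list of (rate, playing_time) tuples, most recent season first.
    weights: one weight per season, most recent first.
    Returns (weighted rate, weighted playing time).
    """
    num = sum(w * pt * rate for (rate, pt), w in zip(seasons, weights))
    den = sum(w * pt for (_, pt), w in zip(seasons, weights))
    return num / den, den


def regress(rate, effective_pt, population_mean, ballast):
    """Pull the rate toward population_mean by adding `ballast` units of
    playing time at the population average."""
    return (rate * effective_pt + population_mean * ballast) / (effective_pt + ballast)


def age_adjust(rate, age, peak_age=29, per_year=0.003):
    """Simple linear aging: a small bump below peak_age, a small penalty above.
    A scout-informed player type could supply its own peak_age and per_year."""
    return rate * (1 + (peak_age - age) * per_year)


def project(seasons, age, population_mean,
            weights=(5, 4, 3), ballast=1200, recent_boost=1.0):
    """Weighted average -> regression -> aging.

    recent_boost > 1 overweights the most recent season (hypothetical knob
    for a scout-flagged mechanical change); < 1 underweights it (say, an
    injury year). population_mean is where a scouting-based bucket
    (similar stuff, swing type, FSR grade) would replace the league mean.
    """
    adjusted = (weights[0] * recent_boost,) + tuple(weights[1:])
    rate, effective_pt = weighted_rate(seasons, adjusted)
    return age_adjust(regress(rate, effective_pt, population_mean, ballast), age)


if __name__ == "__main__":
    # Made-up hitter: (OBP, PA) for the last three seasons, most recent first.
    seasons = [(0.350, 600), (0.330, 500), (0.320, 550)]
    print(project(seasons, age=29, population_mean=0.330))
    print(project(seasons, age=29, population_mean=0.340))  # different bucket mean
    print(project(seasons, age=29, population_mean=0.330, recent_boost=1.5))
```

The point of the sketch is just that the scouting information never replaces the stats. It changes one input per step: how much the latest season counts, which population mean you regress to, and which aging curve you apply.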
I know that the Cardinals say they leverage scouts in their analytical models, which makes me happy. Hopefully we can get Mo to pay a little more attention to the analytical department (I’m looking at you, Feliz and Miles).