Looking for site with semi-accurate NCAAF play by play
durito
Senior Member
I used to scrape Yahoo, but they got rid of play by play and everything else when they destroyed their scoreboard and stats and replaced them with that ugly black disaster.
ESPN is riddled with errors.
ncaa.com looks decent, but I don't really want to deal with the JavaScript to scrape it. Any other suggestions?
Comments
Are you trying to scrape in-game play by play or do it after the game?
I've never found one site that was accurate enough to bother. I use the school sites. They have the official boxscore.
After the game. I actually won't even need it till after the season, but usually have something that updates weekly and don't want to fall behind.
I've seen you mention that before. Are we talking 300 different scripts to hit each ncaab team site? Would certainly be worth it if they are more accurate.
It can get a little crazy year to year, but once the year is set up, it is OK. There are always a handful of schools that change their format. Most of the time, it changes to something I already have from another school. What I did was set up a database with each team's web address, boxscore web format and boxscore format. Many schools are nice enough to have a basic format for their boxscores, something like game12.html at the end, so I can create their address automatically and put it in a file. For others, it's a couple of clicks to the boxscore, then I copy the address to the file I created. After that, I push a button and it will run through the file, check the format of the boxscore and use the appropriate template to scrape it. A few schools use PDFs, so I copy the PDF to my computer, convert it to text and scrape the converted file (this was a lifesaver for the NFL PDF gamebooks). I would much rather they all go to PDFs because it is so easy and the format is always the same after I convert it.
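For anyone wanting to try the same thing, here is a rough Python sketch of that setup: one record per team holding its address pattern and boxscore format, plus one template (parser) per format. Everything in it is an assumption for illustration, not the actual scripts: the `Team` fields, the example.edu address, the format keys "a" and "b", and the two toy parsers are all made up. Fetching the page (or converting a PDF to text) is left out; `scrape` just takes the text you already have.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Team:
    name: str
    url_pattern: str  # hypothetical, e.g. "https://example.edu/stats/game{n}.html"
    box_format: str   # key into PARSERS; says which template to use


def parse_format_a(text: str) -> List[str]:
    """Hypothetical template: one play per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_format_b(text: str) -> List[str]:
    """Hypothetical template: plays separated by '|'."""
    return [part.strip() for part in text.split("|") if part.strip()]


# One template per boxscore format; many schools can share the same one.
PARSERS: Dict[str, Callable[[str], List[str]]] = {
    "a": parse_format_a,
    "b": parse_format_b,
}


def boxscore_url(team: Team, game_number: int) -> str:
    """Build the boxscore address from the team's pattern."""
    return team.url_pattern.format(n=game_number)


def scrape(team: Team, raw_text: str) -> List[str]:
    """Pick the template matching the team's format and run it on the text."""
    try:
        parser = PARSERS[team.box_format]
    except KeyError:
        raise ValueError(f"no template for format {team.box_format!r}") from None
    return parser(raw_text)
```

When a school changes its format to one another school already uses, only the `box_format` value on its record needs to change; a new parser is needed only for a genuinely new layout.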
You won't need 125. I have 12 or 13 for CFB but I don't think they are all being used right now. The worst part is when they get sloppy with their coding.
Doesn't sound too bad. I do a bunch of Euro sports where each country has its own league and none are close to the same and half aren't in English lol.
I'm a habitual pbp reader. It's more efficient, for me, to "watch" games, to get a "real" score, by reading the pbp (and I've not quite worked out a program that I trust better than my own judgment, but that's true of my handicapping generally, though I do put numbers on everything).
But the in-game pbp at ESPN is lol. Why? HTF hard can it be? As has been noted, the college sites themselves have access to the proper game summaries, so the info is out there. They also have accurate starting lineups and game participation lists, and other good material (game notes and such). Us CFB junkies go there and to the team's public forums (you know you're a sicko when you're reading the UTSA forum looking for info on whether the starting DB is in or not).
My dream is for a PFF of CFB. Some of that data (precise player # of plays, for example) is out there for major programs. But I suppose there's edge to be had from the mere fact that the data isn't well aggregated, so maybe I shouldn't wish for change in this.
We'll see. Other sources suck.