I've actually been working on it on and off for a couple years, funny enough.
In my experience, predictions of up to a week or so can be accurate enough to be useful, but beyond that it gets very hard to keep a classifier's output inside any meaningful confidence interval. Predictions for a single trading day are quite accurate but hard to use in any practical way.
Scraping and parsing news/Twitter can help some, but it's extremely noisy. I only use it to rule stocks out because, in my opinion, good businesses don't need to make waves to make steady profits. Trying to use news to actively purchase puts you in the same boat as people who are trying to guess the market. Don't compete with them; they're just going to fail.
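To make the "rule out only" idea concrete, here's a rough sketch. It assumes you already have a list of candidate tickers and a flat list of tickers mentioned in scraped headlines; the function name, the `max_mentions` threshold, and the tickers are all made up for illustration, not what I actually run.

```python
from collections import Counter


def rule_out_noisy(candidates, mentions, max_mentions=2):
    """Drop candidates that show up in the news more than max_mentions times.

    News is only ever used to remove tickers here, never to add them.
    The threshold is an arbitrary placeholder (assumption).
    """
    counts = Counter(mentions)
    return [ticker for ticker in candidates if counts[ticker] <= max_mentions]


# Hypothetical tickers and mention data.
candidates = ["AAA", "BBB", "CCC"]
mentions = ["BBB", "BBB", "BBB", "AAA"]
remaining = rule_out_noisy(candidates, mentions)
```

Note that a stock with no news at all passes straight through, which is the point: quiet is fine, noisy gets cut.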
One thing I want to try in the near future is using news to feed a semantic web parser. The semantic web is essentially a large, heavily interconnected network of names and their relationships, like businesses, their key members, where they are, etc. With a mix of semantic lookup/parsing, network traversal, and an enormous heap of patience, it should be possible to connect news events to stocks through more distant tangents. This is still, of course, focused on eliminating potentially risky purchases, but this approach is much more likely to put together that it was the wife of Oracle's CEO who got that DUI, for (fictitious) example, and thus preemptively rule out purchasing that stock.
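The network traversal part could look something like the sketch below. I haven't built this yet, so treat it as a toy: the relationship graph is hand-written with invented people and an invented company, where a real one would come from a semantic web source, and `linked_tickers` and `max_hops` are names I made up. It's a plain breadth-first search from a person named in a news item out to any company with a ticker.

```python
from collections import deque

# Hypothetical relationship graph: entity -> [(relation, other entity)].
RELATIONS = {
    "Jane Doe": [("spouse_of", "John Doe")],
    "John Doe": [("spouse_of", "Jane Doe"), ("ceo_of", "ExampleCorp")],
    "ExampleCorp": [("has_ceo", "John Doe"), ("located_in", "Springfield")],
    "Springfield": [],
}

# Hypothetical mapping from company entities to ticker symbols.
TICKERS = {"ExampleCorp": "EXMP"}


def linked_tickers(start, relations, tickers, max_hops=3):
    """Breadth-first search from a news entity.

    Returns {ticker: path}, where path alternates entity and relation names
    and is the shortest chain (in hops) from start to that company.
    """
    found = {}
    seen = {start}
    queue = deque([(start, 0, [start])])
    while queue:
        entity, hops, path = queue.popleft()
        if entity in tickers and tickers[entity] not in found:
            found[tickers[entity]] = path
        if hops >= max_hops:
            continue  # don't wander further than max_hops from the news item
        for relation, neighbor in relations.get(entity, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1, path + [relation, neighbor]))
    return found


# A news item names "Jane Doe"; which stocks does that touch?
hits = linked_tickers("Jane Doe", RELATIONS, TICKERS)
```

Any ticker that comes back in `hits` would go on the rule-out list, and the stored path tells you why. The hop limit matters because in a heavily interconnected graph everything is eventually connected to everything.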