On the Authors Guild Brou-ha-ha

Most folks would probably not wish to disagree with Neil Gaiman, or even mildly disagree with Wil Wheaton. (Seriously, I met Wil Wheaton once while Shawn King suckered him into playing a "Shock the shit out of you" game at Macworld, and both he and Neil are so relentlessly nice that you want to give them both hugs and hot cocoa.)

However, in this case, I kind of have to disagree with them (Neil more than Wil), and really, with most people who think that the TTS (Text-To-Speech) issue the Authors Guild raised is stupid, and that TTS will never even come close to the kind of performance you can get from human-read audiobooks. Luckily, both Neil and Wil are intelligent in their opinions, so unlike disagreeing with Enderle, the whole thing can be quite mellow and civilized.

There are a few problems with the whole "TTS is harmless" issue, the major one being that the Authors Guild was mind-bogglingly stupid in how they brought this up. They came out of the gate in an alarmist fashion, 5 years too early. By being alarmist and, honestly, stupid, they created a shit-ton of noise about this that is helping no one. However, they were completely correct to bring this issue up now. The thing is, the main argument against treating TTS like audiobook rights is based on the (currently) correct fact that mainstream TTS sucks. It still sounds like a drunken Swede, and it hasn't gotten much better since the Talking Moose days, at least not in mainstream usage.

However...mainstream. Right now, who are the mainstream users for TTS? The visually impaired, and teenagers goofing with the drunken Swede saying "cocksucker". The latter group wants things to suck; the hur-hur-hur factor is much higher that way. The former group...well, face it folks, at least in this country, the attitude's been "Hey, be glad you got shitty screen readers and braille, much less good voices." When it comes to things like helping the handicapped, "barely good enough" has been the standard, and that sucks.

However, that's not to say that no one is doing research, or that it isn't getting better, albeit slowly. Check out some of the work from IBM, AT&T, and the Festival Online Demo from the Centre for Speech Technology Research at the University of Edinburgh.

If all you're used to is what you get on your Mac, the stuff from IBM and the Festival Demo will be rather eye-opening. That brings us to the central problem with the "TTS is harmless" argument: It's based on the theory that mainstream TTS is state of the art, and that it will be years and years and years before this stuff gets better enough to be good enough. Well, state of the art is a damned sight better than mainstream now, to where it is almost "good enough", and if someone, say, a major publishing house, threw a ton of money at the various research projects, state of the art could get to "good enough" really fast, and maybe a darned sight better than "good enough" a lot faster than people think.

"TTS is harmless" is betting against human ingenuity and technology improvements, and as someone who predates the home computer revolution by ten years, and has seen the insane pace of improvements in this area, I can tell you, um...that's a very bad idea. If you're in the San Jose Computer Museum some time, see if they still have the Atari 800 there. In the early 80s, just over 25 years ago, that 48K of RAM was the tits, along with slow floppy drives, etc. Compare that to the current state of the art in desktop computers. Now, take a look at what the state of the art was in 2003 and compare it to now.

That shit moves fast, people. Betting that TTS will suck for years to come? Sucker bet. Seriously. Now, will it be able to completely replace humans, especially trained actors, anytime soon? No. But here's the thing: it doesn't have to.

This is the other problem with "TTS is harmless". People are trying to pose the argument that TTS would have to be capable of human-level performance to be a viable commercial replacement for audiobooks. That, dear readers, is ignoring the world around us. Face it, we will not only accept "good enough", we will pay for it in spades.

Does anyone think the initial version of the iTunes Music Store was giving you the same quality as you got on CD? No. No one did. In fact, some folks said that 128Kbps audio would be a real problem for iTMS, as it was so bad.

Yeah.

Turns out, it was "good enough" and a bit more. Listen to that IBM "expressive" demo a few times. It's not perfect, but it's not a dead flat monotone either. It's really close to "good enough" if it isn't there already. Throw some money and interest at that, and it will hit "good enough" and quickly.

Then what happens?

Okay, who here, especially anyone who's dealt with publishing houses before, thinks that they're a charity?

Anyone?

Right. Now, pretend you run a publishing house, and you realize that with just a slight improvement, TTS can hit "good enough" for the vast majority of your books. You could easily charge, say, two bucks to make a given e-book "TTS allowed" or some better branding. "ULTRA SPEECH!™" Whatever. Now, that's less than what, say, Audible charges for their fine, human-read books that are of far higher quality. But who makes money with Audible and other audiobooks?

The reseller.
The publishing house.
The author.
The people reading it.

That's a lot of people taking their cut of that $7.50 per unit, and you have to negotiate audiobook contracts, which means it takes longer for it to be profitable...what a pain in the ass. Now, with "ULTRA SPEECH!™", there's only two dollars involved, but who makes that money?

The publishing house.

Hmm...100% of a little or <smaller percent> of slightly more. No lawyers. No rights negotiation. And, if you're really smart, you work it so that the people who create the tech you're using have an exclusive license with you for say...two years? For two years, you get to decide who uses "ULTRA SPEECH!™" Amazon wants to put it in the Kindle? They have to pay you per unit. Now, you're making money off the Kindle and it doesn't matter if anyone ever listens to an "ULTRA SPEECH!™" book. Because for two years, you make that licensing fee. At the end of that period, oh sure, you're not exclusive. But, you have a head start, you have the main brand, and face it, unless you get greedy or someone comes up with something way better that makes it worth the effort to reprogram the silicon, you'll probably collect that "ULTRA SPEECH!™" license for a long time. All those other publishing houses? They have to scramble and find non-infringing TTS tech that's as good as, or close to as good as "ULTRA SPEECH!™", and they're behind you from the start, and their tech won't be on every Kindle sold for the next two years. Sucks to be them.
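If you want to see the "100% of a little" math spelled out, here is a rough sketch. The $7.50 and $2.00 figures are the ones used above; the publisher's share of an audiobook sale is not something I have real numbers for, so every percentage in the loop is a made-up placeholder, and the function names are just for illustration.

```python
# Rough per-unit math for the publisher.
# The $7.50 and $2.00 prices come from the post. The share percentages
# tried below are hypothetical placeholders, not real contract terms.

AUDIOBOOK_PRICE = 7.50   # per unit, split between reseller, publisher, author, reader
TTS_ADDON_PRICE = 2.00   # per unit, kept entirely by the publisher


def publisher_take(unit_price: float, publisher_share: float) -> float:
    """Money the publisher keeps from one sale, given its share (0.0 to 1.0)."""
    if not 0.0 <= publisher_share <= 1.0:
        raise ValueError("publisher_share must be between 0 and 1")
    return unit_price * publisher_share


def breakeven_share(audiobook_price: float, addon_price: float) -> float:
    """Audiobook share at which the publisher earns the same per unit either way."""
    return addon_price / audiobook_price


if __name__ == "__main__":
    cutoff = breakeven_share(AUDIOBOOK_PRICE, TTS_ADDON_PRICE)
    print(f"Break-even audiobook share: {cutoff:.1%}")
    for share in (0.10, 0.20, 0.30, 0.40):  # assumed shares, for illustration only
        audiobook = publisher_take(AUDIOBOOK_PRICE, share)
        addon = publisher_take(TTS_ADDON_PRICE, 1.0)
        winner = "TTS add-on" if addon > audiobook else "audiobook"
        print(f"share {share:.0%}: audiobook ${audiobook:.2f} vs add-on ${addon:.2f} -> {winner}")
```

Under these assumptions, any publisher whose cut of an audiobook sale is below the break-even share makes more per unit from the two-dollar add-on, and that is before counting the lawyers, the rights negotiation, or the per-device licensing fee.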

That's a lot of money, and it's allll yours. All you have to do, right now, is "support the people who agree that the Authors Guild is being stupid". Then quietly work to make a lot of money. Because face it, at least in the United States, we will jump for "good enough" if it's cheap and convenient. If we cared about high quality and all the rest, Wal*Mart would be a hick convenience store in bumfuck Arkansas. So you have all your books supporting "ULTRA SPEECH!™", and you jack up the cost for audiobook companies to be able to have humans read this stuff. "Real" audiobooks become boutique items, and sure, everyone makes more money per unit, but the customer has to pay more, so they are making a little more money off a lot fewer sales. Big win for the publishing house, and they'd be stupid not to seriously consider this, at least from a fiscal POV. Like it or not, the fiscal POV is a valid, albeit cold-hearted, way to look at things. Especially in a shit economy.

So yeah...I think the Authors Guild was dead on to bring this up now (even if they are going about it as stupidly as possible), and I think the authors who are dismissing this issue might want to do some research on it. I do agree with Neil's point that reading aloud to an audience should not be considered the same as an audiobook, but I think that it is far better to perhaps jump the gun a bit, and deal with the details of this now, rather than pulling an RIAA, ignoring it, and then bringing out the lawyers in a panic because you just realized that technology has passed you by. Playing catch-up always sucks.

Categories:     Other, Technology
Posted by John C. Welch at 10:56 | Permalink


