Category Break the Internet

The French fancy making life hard for Google, but are they kidding themselves?

After the revelation that withdrawing from Google News seems to do little (if any) damage to publishers, Eric Schmidt has been in France trying to persuade the President not to allow news publishers to charge Google for including their content on Google News.

Google says such a move would “threaten the very existence” of Google. A feeble protest, and an overblown threat. As if anyone thinks such a thing could kill Google; but even if it could, why should anyone care? If Google isn’t smart enough to know how to innovate their way past challenges then maybe their days are numbered anyway.

Google also say that if the French persist with this they will just stop including French content in Google News. Based on the Brazilian experience that’s not much of a threat, since the French publishers probably wouldn’t feel much impact at all.

More intriguing is why the publishers don’t just withdraw their content rather than ask their government to get involved. They can do it any time they like; nobody forces them to be included in any Google search.

It might sound like a wizard wheeze to get the law changed to force payment, but there’s a flipside. If such a condition is imposed by law rather than negotiation, it could end up making Google’s access to their content a right, as long as payment is made.

I think control should stay with publishers: they should set terms and prices, and the government should provide the framework within which they do so and then stand well back.

As soon as governments start interfering, treating different categories of content differently, setting prices or terms or anything else, bad things happen. The market, such as it is, gets locked in to a particular way of working and it destroys future innovation and competition. And this market hasn’t even got started yet; we shouldn’t force old age on it quite yet.

I know French (and German, and other) newspaper industries are desperate for revenues, and easy quick ways of getting them are attractive, but this sort of thing is a last resort, traditionally reserved for when everything else has failed.

There are a few things to try first. Here are some suggestions for beleaguered newspapers trying to work out how to deal with search.

Be brave.

Withdraw your content from Google News. Maybe even from Google search (leave enough behind so people searching for your title can find it). And other search engines too. Since you get so little money from those sources, you’ll be risking little. And you can turn it back on easily enough.

If a search engine offers to make it worth your while to include your content in their product, negotiate with them. Do a deal which works for you – payment, helping sell subscriptions, ad share, whatever.

Tell your readers about it, why your content is in one place and not another. Point out the gap in the results they get from the search engines which don’t want to do a deal.

If none of them want to pay you, use them to deliver what you need, not what they need. Put enough stuff in them to attract the attention you need, and no more. Experiment with the best way to do that, and constantly refine your approach. Use other channels and relationships to attract users. Ask your users to pay, and work hard to make sure your product is worth paying for. Spend your SEO budget on other kinds of marketing, or just save it.

Just do something. Stand up for yourselves and the value of what you do.

Make a market.

Stop being so impotent and stop asking governments to load the dice in your favour.

The law you need is already there; just start using it.

What is a “temporary copy” and who cares?

An obscure and technical piece of copyright law has been stretched out of recognition by the aspirations of entrepreneurs. What is the “temporary copying exception” to copyright and what was it really supposed to do?

I sometimes wonder whether the history we are taught would be recognised by the people who were actually there.

Recently, perhaps due to age or perhaps due to the pace of change, I have heard people talking authoritatively about things I personally was involved with, and getting it completely wrong.

One such thing is “temporary copies”. This is a concept in copyright law which makes certain kinds of copying legal even when there is no explicit licence, and which featured in the NLA’s web licensing case with Meltwater. The claim that the legal exception for temporary copies covers paid-for media monitoring was rejected by the courts – and some people are outraged. Browsing has been rendered illegal, they say. The internet will break if the law stands.

Of course it’s fine to say that you think the law is wrong and should be changed – and equally fine for people like me to disagree. But to say that the law will destroy the internet is, aside from being self-evidently untrue, also a rather dishonest way of trying to post-rationalise poor business and legal judgements of the past.

The temptation of the entrepreneurs

The legal concept of temporary copies solves a lot of problems for entrepreneurs. Building a business involving copying other people’s work, but without the need to get permission from them, makes otherwise impossible businesses viable. If you can make your idea fit within the scope of “temporary copies” you have a business; if you can’t, you don’t. Since some of the biggest businesses on the internet, such as Google, have been built on the idea of making copies without asking first, the prospect is tantalising and it’s easy to lull yourself into thinking you’re covered.

So it’s easy to see why the law on temporary copies has been subject to rather optimistic interpretation by those who need to stretch it to cover their business, and rather narrower interpretation by those who would rather avoid loopholes which reduce the control they have over their content. I come from the narrow interpretation side of that argument, and I actually had a small involvement in the process which led up to the law in question being enacted.

The rather less tantalising reality

But back to the law. What, according to it, are temporary copies?

Here’s what article 5.1 of the Copyright Directive (officially and pithily known as “Directive 2001/29/EC of the European Parliament and of the Council of 22 May 2001 on the harmonisation of certain aspects of copyright and related rights in the information society”) says:

1. Temporary acts of reproduction referred to in Article 2, which are transient or incidental [and] an integral and essential part of a technological process and whose sole purpose is to enable:
(a) a transmission in a network between third parties by an intermediary, or
(b) a lawful use
of a work or other subject-matter to be made, and which have no independent economic significance, shall be exempted from the reproduction right provided for in Article 2.

This is the clause whose drafting I got peripherally involved with, the little bit of history I glimpsed in the making. It is transposed, more or less word-for-word, into section 28A of the UK Copyright Designs and Patents Act.

I guess it’s easy to see how, by simply glancing at this wording, you could persuade yourself that your service – for example your media monitoring service – might fall within it.

It’s a little harder if you look at the wording carefully. Even if you can persuade yourself that “transient and incidental” applies to you, and that because your business depends on technology anything you do is automatically “an integral and essential part of a technological process” (and I would say neither applies to a business like media monitoring), it’s kind of tricky to get past the overarching stipulation that your activity has “no independent economic significance” when your whole business depends on it.

But what was the intention of the law?

Even if you do manage to convince yourself it’s all OK looking at the text, the Directive provides some explanations in the form of recitals which are designed to help interpretation.

Recital 33 says:

The exclusive right of reproduction should be subject to an exception to allow certain acts of temporary reproduction, which are transient or incidental reproductions, forming an integral and essential part of a technological process and carried out for the sole purpose of enabling either efficient transmission in a network between third parties by an intermediary, or a lawful use of a work or other subject-matter to be made. The acts of reproduction concerned should have no separate economic value on their own. To the extent that they meet these conditions, this exception should include acts which enable browsing as well as acts of caching to take place, including those which enable transmission systems to function efficiently, provided that the intermediary does not modify the information and does not interfere with the lawful use of technology, widely recognised and used by industry, to obtain data on the use of the information. A use should be considered lawful where it is authorised by the rightholder or not restricted by law.

This makes things a little trickier. It’s more explicit that the exception is designed to cover only very low-level technical things rather than whole business processes. It reminds us that anything with “separate economic value on [its] own” isn’t covered. It specifically states that acts which enable browsing ARE included, making any hyperbolic claims that this law outlaws browsing rather feeble. And it points out that if something isn’t authorised then it isn’t covered either, which makes it hard to depend on this law if you haven’t asked permission and harder still if you have actually been asked to stop.

If you (or your lawyers) thought hard about it, you would probably conclude that a court is the last place you want to have this argument. But it has been forced into court anyway, and it’s hard to see how they could have reached any different conclusions, given that courts decide cases based on what the law actually says rather than what people wish it would say.

How did it get written that way?

As it happens, this particular clause was subject to an incredibly long-winded and arduous process of negotiation, discussion and debate before it was finalised. One thing it is not is ill-considered. My small part was on the side of content owners; I worked for a newspaper company and participated in some meetings on behalf of them and a media industry trade group.

The heart of the issue as I remember it was a tension between ISPs (mostly at the time dial-up providers and the large telcos who provided the bandwidth and interconnections for them) and content owners.

Content owners were keen to maintain control over content and ensure that the law didn’t create loopholes for infringement to take place.

The telcos were worried about the copies which were very often made as an unavoidable part of the technical process of sending data around the internet, such as in routers, where technically data is copied, forwarded and then instantly deleted. They argued that these should not be regarded by the law as infringing copies just because they weren’t specifically licensed.

Everyone was sympathetic to each other’s concerns; the question was how to get it worded in such a way that it didn’t create huge loopholes or unintended barriers. In other words, turning a clear understanding about the intention into workable language. Equally, using language which was too specific to the technical issues of the day would quickly make the wording obsolescent, along with the technology it referred to, so it had to try to find generic language which would still be relevant in the future.

The important thing to note is that this clause was intended to address a very small and narrow issue. This is reflected in the wording. Read it again, but now think about data packets passing through routers and switches, or caches being created by ISPs, rather than media monitoring services being set up without the irritating need to ask permission to exploit people’s stuff.

It was a long time ago but I have some memories of the discussion of some of these phrases:

“transient and incidental”. This was really about the copies made in routers. Technically speaking, data is copied, but only for as long as needed for the router to function. The copy is really an irrelevance, fleeting in duration and nobody ever sees it. It can also apply to cached copies which hang around a little longer but are not necessarily infringing (see below).

“an integral and essential part of a technological process”. There was a big discussion about caching here (among other things). At the time most internet access was dial-up and the biggest players provided services for free to users. To save money some of them operated large caches of popular content, serving their users directly from the cache rather than fetching the content from the original site’s servers. This caused some consternation, because it meant the owners of the sites never knew their content had been accessed and couldn’t charge for ads, old content was sometimes served instead of newer updates, and so on. However, there is a technical way to control caching, using a setting in the (invisible) HTTP headers which are served along with content. As long as ISPs respected these settings (which were integral to the technological process of serving web pages) their caches were fine; as soon as they started ignoring them they weren’t. In other words the site owner should always have control.

“whose sole purpose is to enable a transmission in a network between third parties by an intermediary”. I email a file to you. The file goes from me to my ISP, my ISP to any number of routers operated by any number of third parties, then to your ISP and finally to you. Lots of copies are created, most of them in systems which have no direct relationship with either of us. These copies should not need their own licence so the law creates an exception for them.

“whose sole purpose is to enable a lawful use”. I look at a webpage. My computer creates a copy in memory and maybe on my hard disk. These copies are just allowing me to look at the webpage and so should not need their own separate licence (although I think it’s implied in any case). So the law created an exception for them.

“which have no independent economic significance”. This one seems to be one of the most wilfully misinterpreted. I have heard the argument made, with a straight face, that a company which keeps complete copies of entire websites in their servers in order to use them for their business is covered by this exception. The logic seems to be that although they keep copies of the entire content, and they depend on them to do business, they don’t make more than small snippets available to their users and so the copies in their servers have no economic significance. Since this is self-evidently asinine and self-justifying I don’t think it needs a lengthy deconstruction – it’s obviously absurd.
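To make the caching point above a little more concrete, here is a minimal sketch of how a shared cache (such as an ISP's proxy) might decide whether it may keep a copy of a page. It assumes the "setting in the HTTP headers" is something like the standard `Cache-Control` header; the function names are invented for illustration, and a real cache follows many more rules than this.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control header into a dict of directive -> value (or None)."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def shared_cache_may_store(header_value):
    """Return True if a shared cache (e.g. an ISP proxy) may keep a copy."""
    directives = parse_cache_control(header_value)
    # "no-store" forbids keeping any copy; "private" forbids shared caches.
    return not ("no-store" in directives or "private" in directives)


def freshness_lifetime(header_value):
    """Seconds a shared cache may reuse its copy without rechecking, else 0.

    Assumption for this sketch: with no explicit lifetime we return 0,
    which is more conservative than real caches need to be.
    """
    directives = parse_cache_control(header_value)
    if "no-cache" in directives:
        return 0  # must revalidate with the origin before every reuse
    # s-maxage is aimed at shared caches and takes priority over max-age.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name)
        if value is not None and value.isdigit():
            return int(value)
    return 0
```

A site that sends `private` or `no-store` is telling intermediaries not to keep a copy, which is the kind of control by the site owner that the caching discussion was about.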

The legalities drag expensively on…

The NLA and Meltwater litigation rumbles pointlessly on, and so all this will be subject to even more scrutiny by the courts.

Fortunately for them, they have copious sources which can help them understand the process which led up to the wording. As well as the law in its final form, and the recitals explaining some of the intent, the whole official and political process was documented as it went along. There are also plenty of people who participated who can help round out the picture if necessary. The courts won’t need to use the forensic skill of the ancient historian to determine what the law was intended to achieve – they can get the first-hand version. I find it hard to see how they could change the conclusion of the lower courts whose judgement, in my view, reflects the letter and intent of the law.

Meanwhile back in the real world, more sensible things are happening. Meltwater has agreed a licence with the NLA. They’re doing business, their clients are getting a service, so are the clients of their rivals who are on a level playing field. The internet is still there, it’s not broken. Browsing is still legal. A few angry businessmen, put out by the idea that someone else’s property isn’t available as a free resource for them, continue to scream and shout and look foolish.

Move along now, nothing to see. Time for a nice cup of tea.

Disclosure: I am a former chairman of the NLA and still do occasional freelance work with them and their members.

The internet wants to be open, but some internets are more open than others

Sergey Brin of Google had a discussion with The Guardian and talked about his vision for the future of the internet, alongside his concerns about threats to that vision.

It’s an incredible insight into his (and Google’s) world view, which seems to be from a truly unique perspective. There is nobody else who sits astride the internet like Google and it seems that from the top, the sense of entitlement to be the masters of all they survey is strong.

Take this quote, from towards the end of the piece:

If we could wave a magic wand and not be subject to US law, that would be great. If we could be in some magical jurisdiction that everyone in the world trusted, that would be great … We’re doing it as well as can be done

I’m not sure what this “magical jurisdiction” would be but it doesn’t sound like Sergey wants it to be based on US law, and there’s no sign that Google has any greater love for any other existing jurisdiction. I wonder if he’s thinking that perhaps it should be a Google-defined jurisdiction? After all, Google is fond of saying that the trust of users is their key asset – they presumably consider themselves to be highly trusted. I wonder if the magic wand is in development somewhere deep in their bowels? Perhaps one of their robotic cars can wave it when the time comes! Google can declare independence from the world…

But why should we trust them? There’s almost nothing they do which you can’t find fierce critics to match their army of adoring fans. Without deconstructing them all, surely the point is this: whenever a single entity (be it a government, company or individual) has complete control over any marketplace, territory or network, bad things tend to happen. Accountability, checks-and-balances, the rule of law, democratically enacted, are all ways of trying to ensure that power does not achieve its natural tendency to corrupt.

Google asks us to just trust it. And many people do.

Another quote:

“There’s a lot to be lost,” he said. “For example, all the information in apps – that data is not crawlable by web crawlers. You can’t search it.”

The phrasing is interesting. Is it really true that because data in apps is not crawlable it is “lost”? I use apps all the time, and the data appears to be available to me. I don’t think the fact that it’s not available to Google means it’s “lost” (except I suppose to Google). Defining something that is not visible to Google as “lost” suggests not just that Google considers that it should be able to see and keep everything that exists online, but also that they have an omniscient role that should not be subject to the normal rules of business or law. Like people being able to choose who they deal with and on what terms. Or being able to choose who copies and keeps their copyright works.

The “lost” app data could, of course, easily be made available to Google if the owner chose. Brin’s complaint seems to be that Google can’t access it without the owner deciding it’s OK – there is a technical obstacle which can’t simply be ignored. Yet all they have to do, surely, is persuade the owners to willingly open the door: hardly a controversial challenge in the world of business. It’s called doing a deal, isn’t it?

Here’s what he had to say in relation to Facebook:

You have to play by their rules, which are really restrictive … The kind of environment that we developed Google in, the reason that we were able to develop a search engine, is the web was so open. Once you get too many rules, that will stifle innovation.

Another telling insight. Too many rules stifle innovation. Rules are bad.

Hard to agree with even as a utopian ideal (utopia isn’t usually synonymous with anarchy), but even less so when you consider the reality of dealing with Google. I have visited various Google offices at various times and have always been asked to sign in using their “NDA machine” at reception. Everyone has to do it. You have to sign an NDA simply to walk into their offices. The first rule of Google is you can’t talk about Google. Hardly the most open environment – they are the only company I have ever visited which insists on this.

Of course, Google is no stranger to rules either. They set their own rules and don’t offer room for discussion or adjustment. When they crawl websites, for example, they copy and keep everything they find, indefinitely. They have an ambition to copy and keep all the information on the internet, and eventually the world. Their own private, closed, internet. This is a rule you have to play by.

Even if you ban crawling on some or all of your site using robots.txt, they crawl it anyway but just exclude the content from search results (this was explained to me by a senior Google engineer a few years ago and as far as I know it has not changed). If you want to set some of your own rules, using something like ACAP or just by negotiating with them, good luck: they refuse to implement things like ACAP and rarely negotiate.
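For readers who have not met robots.txt, here is a minimal sketch of the check a compliant crawler is expected to make before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content and the `news.example` URLs are made up for illustration. The protocol is voluntary: the code shows what the file asks for, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary publisher: Googlebot is kept
# out of the archive only, while every other robot is asked to stay away.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /archive/

User-agent: *
Disallow: /
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks this question before each request.
print(parser.can_fetch("Googlebot", "https://news.example/today/story.html"))
print(parser.can_fetch("Googlebot", "https://news.example/archive/2001.html"))
print(parser.can_fetch("SomeOtherBot", "https://news.example/today/story.html"))
```

Note how little the file can say: allowed or disallowed, per robot and per path. It has no way to express terms such as how long a copy may be kept or how it may be used.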

“You have to play by their rules, which are really restrictive”

Here’s an interesting story. A while ago, Google refused to include content in their search results if clicking on the link would lead a user to a paywall. They said it was damaging to the user experience if they couldn’t read the content they had found with Google (another Google rule: users must be able to click on links they find and see the content without any barriers or restrictions). However it also meant users couldn’t find content they knew they wanted, for example from some high-profile newspapers like the FT and Wall Street Journal.

So Google introduced a programme called “First Click Free”. It set some rules (more rules!) for content owners to get their content included in Google search even if it was “restricted” behind a paywall. It doesn’t just set rules for how to allow Google’s crawlers to access the content without filling in a registration form, but also the conditions you have to fulfill – primarily that anybody clicking a link to “restricted” content from Google search needs to be allowed to view it immediately, without registration or payment.

This is a Google rule which you have to play by, unless you are willing to be excluded from all their search results. Not only is it technically demanding, it also fails to take account of different business models and the need for businesses to be flexible.

Unfortunately it was also wide open to abuse. Many people quickly realised they could read anything on paid sites just by typing the headline into a Google search.

Eventually Google made some changes. Here’s how they announced them:

we’ve decided to allow publishers to limit the number of accesses under the First Click Free policy to five free accesses per user each day 

They have “decided to allow” publishers to have a slightly amended business model. Publishers need permission from Google to implement a Google-defined business model (or suffer the huge impact of being excluded from search), and now they are allowed to vary it slightly.
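To show what complying with the amended rule involves on the publisher's side, here is a minimal sketch of the metering logic. It is an illustration only, not Google's specification or any publisher's real code: the class name, the referrer check and the in-memory counter are all assumptions, and the only figure taken from the announcement is the limit of five free accesses per user per day.

```python
from collections import defaultdict
from datetime import date
from urllib.parse import urlparse

DAILY_FREE_LIMIT = 5  # the figure quoted in Google's announcement


class FirstClickMeter:
    """Hypothetical per-user, per-day counter for free views from search."""

    def __init__(self, limit=DAILY_FREE_LIMIT):
        self.limit = limit
        self._counts = defaultdict(int)  # (user_id, day) -> free views used

    @staticmethod
    def came_from_google(referrer):
        # Simplifying assumption: only google.com and its subdomains count;
        # country-specific domains are ignored in this sketch.
        host = urlparse(referrer or "").hostname or ""
        return host == "google.com" or host.endswith(".google.com")

    def allow_free_view(self, user_id, referrer, today=None):
        """Return True if this click should bypass the paywall."""
        if not self.came_from_google(referrer):
            return False
        key = (user_id, today or date.today())
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True
```

Even this toy version needs to identify users, inspect referrers and keep daily state, which gives a flavour of why the scheme was technically demanding for publishers.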

For a company which objects to the idea of having to play by someone else’s rules, they’re not too bothered about imposing some of their own.

Which brings me back to trust. If Google want a world in which they have access to scan, store and use all “data” from everywhere, where they don’t have to play by the “restrictive” rules or laws (like copyright) set by others – even their own government – don’t they need to start thinking about their demand for openness both ways round? Rather than rejecting rules which don’t suit them (such as “US law”) shouldn’t they try to get them changed; argue and win their case or accept defeat graciously? Shouldn’t they stop imposing rules on those whose rules they reject, ignore or decry?

Google is a very closed company. Little they do internally is regarded by them as being “open”, and they build huge and onerous barriers to protect their IP, secrets and data. Even finding out what Google know about you, or what copies of your content they have, is virtually impossible; changing or deleting it even harder.

They ask us to trust them. We would be unwise to do so, any more than we trust any monopolies or closed regimes which define their own rules. It wouldn’t matter so much but for their huge dominance, influence and reach. They have, it is said, personal data on more than a billion people all of whom are expected to trust them unquestioningly.

Surely the first step to earning, rather than simply assuming, that trust is that they need to start behaving towards others in the way they demand others treat them.

Openness cuts both ways, Sergey. How about starting by practising what you preach and opening Google up fully?

breaktheinternet.info

While I was writing the previous post, it occurred to me that gathering hyperbolic claims like the claim that something will “Break the Internet” in one place might help anyone trying to make sense of them. To assist with this task I would like to invite you to join me.

I am setting up a new website, embryonic for the moment but it’s up, called www.breaktheinternet.info. Its purpose is to document, with a quick note and links, any hyperbolic and seemingly catastrophic claims that some proposed change will end free speech, break the internet or cause some other sweeping disaster.

By documenting them we can perhaps help inform a real, honest and sensible debate about important issues. There are two sides to every argument, and if you’re the referee you need to hear them both. I have put a few links there to get it going.

So rather than allowing either side to simply dress their commercial interests up as existential threats to the internet, democracy or humanity, let’s try to get everyone to raise their game and be a bit more honest.

If you have seen anything worth mentioning, please put a comment with a link here or on the other blog… or maybe on Twitter with #breaktheinternet.

Breaking the Internet, one absurd claim at a time

I’m not much of a geek, so I can’t pretend to understand the technical minutiae of the internet intimately.

But one thing I do know is that it was designed to be fault-tolerant, decentralised and robust. The basic technology was developed by the US Defense Department, some say to survive nuclear war but certainly to survive dodgy connections, and it seems to have worked.

While we all have our frustrations with the internet sometimes, and whole countries have been affected by interference from their governments, I have never heard of the whole internet breaking down. Even as bits of it fail, the rest carries on regardless.

The internet, by design, is hard to break.

Which means it’s hard to imagine something which would “Break the Internet”.

Yet that phrase, “Break the Internet” is one I have heard with increasing frequency. It is used as a dire threat, a prediction of doom, the ultimate and unimaginably awful unintended consequence of a terrible and naïve mistake.

Often, it is used as a way of explaining to policymakers, who by-and-large are even less geeky than me, why they should not do something they have proposed.

I first heard it when I was involved with the ACAP project. ACAP is a simple way of making content permissions machine-readable, thereby solving the problem of how automated services like Google are supposed to comply with terms of use.

We were on a trip to the USA to introduce ACAP to various industry and government people. It was going down well, in Europe as well as the USA. It was seen as a way of solving a sticky problem without having to legislate and avoided lots of awkward issues like DRM.

Google, who had initially been keen on ACAP and even delegated one of the search engineers to a committee defining its technical development, had turned against it. Presumably, although they never said this, they realised that if they were aware of terms of use they might have to comply with them.

Public statements were made by the likes of Eric Schmidt saying that there were technical problems with ACAP (even though Google had helped design the technical aspects of it) but implying that once they were solved Google would support ACAP. In fact they never engaged with ACAP to try to solve the supposed technical issues, nor explained what they were.

Anyway, the first time I heard the phrase “Break the Internet” was on that US trip. We had visited Google, and privately, on the way to dinner, I was told that the distinguished engineers were saying internally that ACAP would “Break the Internet”. So however polite they were being, the engineers did not support it and there was little chance of getting much progress.

Obviously such a dire consequence would be cataclysmic, and nobody could knowingly support something which would lead to it.

But we were surprised because we couldn’t think of how ACAP could possibly do such a thing. How could ANYTHING do such a thing? My conversation was an informal one with a non-technical person (a lesser species at Google) and he was unable to explain what it meant – but it sounded bad.

We asked more technical people at Google but they were unable or unwilling to explain. Silence was the stern reply, and the dialogue pretty much dried up after that.

However we did hear the phrase “Break the Internet” again. This time it came from government officials, who told us that while they liked the idea of ACAP they had been told that it would “Break the Internet”.

We asked if this warning had come with an explanation; they said no. When we suggested that it would be a good idea to set up a meeting to discuss this with whoever had said it so that, once we had established the problem, we could fix it, they agreed. ACAP, after all, was about the end not the means. But the meetings never happened.

I reached the conclusion that ACAP was not some terrible time-bomb ticking under the internet. Quite clearly it couldn’t break anything at all (not least because technically it didn’t really do anything more than a copyright notice in a book – all it did was make licences machine-readable).
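To illustrate what "machine-readable" means here, the following is a minimal sketch of permissions expressed as data that an automated service could check before using content. It is emphatically not ACAP's actual syntax or vocabulary: the `Permissions` class, its fields and the use names are all invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permissions:
    """Hypothetical machine-readable terms attached to a piece of content."""

    allowed_uses: frozenset = frozenset()  # e.g. {"index", "snippet"}
    max_snippet_chars: int = 0             # 0 means no snippet may be shown


def may_use(permissions, use):
    """Return True if the publisher has granted this kind of use."""
    return use in permissions.allowed_uses


def snippet_for(permissions, text):
    """Return the longest snippet the terms allow, or an empty string."""
    if not may_use(permissions, "snippet"):
        return ""
    return text[: permissions.max_snippet_chars]


# Example terms: indexing and short snippets are fine, archiving is not.
terms = Permissions(allowed_uses=frozenset({"index", "snippet"}), max_snippet_chars=20)
print(may_use(terms, "archive"))
print(snippet_for(terms, "Publishers should set their own terms and prices."))
```

Like a copyright notice, the data does nothing by itself; it only states the terms in a form software can read and then choose to honour.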

What it MIGHT have broken, or at least changed a little bit, is one aspect of Google’s business rationale. The bit which justifies them accessing any website, and using content by default for their various search products, without asking first, without paying any attention to restrictions or conditions which those sites might have specified in their terms of use and without paying money or offering anything other than traffic in return.

But the damage was done. Every politician and policy-maker wanted to be friends with the internet and with Google. All of them wanted to appear progressive and technically ept. None of them wanted to go down in history as the person who unwittingly “Broke the Internet”, and none of them were geeky enough to ask even the simplest questions to explore the substance of this ludicrous claim, or willing to facilitate a conversation which might lead to an answer.

So, even though they liked the idea of ACAP, they were scared of supporting it in case something bad happened. Google’s rivals didn’t want to implement it if Google did not. The well-intentioned and, in my mind, quite benign effort which ACAP represented became controversial and demonised.

The politicians and officials, I get the impression, just looked the other way, and hoped that in time everyone would learn to just be friends.

Something rather good was lost, temporarily at least, as the result of a silly catchphrase – “Break the Internet”.

Anyway… it turned out that the absurd, hyperbolic and completely false assertion, in private, that ACAP would “Break the Internet” worked so well that the phrase caught on.

Taking advantage of the fact that many people seem to regard Google and everyone who works for it as some sort of super-species of superior intelligence and insight, unattainable by normal humans, the phrase came out in relation to other “threats” to Google’s (and others’) interests.

Recently David Drummond, Google’s chief lawyer, told an audience at Davos that the European proposals on privacy, specifically the “right to be forgotten” would – yes – “Break the Internet”. Again, clearly absurd, but seemingly taken seriously by those without the confidence to challenge it.

In relation to PIPA and SOPA there were numerous articles and blog posts making, spookily, the same prediction. These pieces of legislation, designed to reduce copyright piracy and help media organisations survive, would “Break the Internet”.

We can all chuckle at this, but it’s not funny. However little this claim stands up to scrutiny, those it is made to rarely if ever have the confidence to challenge it. Its preposterousness is exceeded only by its effectiveness. It is a crazy, disingenuous, self-interested, untruthful and alarmingly potent claim.

So I want to challenge it, and other equally absurd claims like “the end of free speech” which runs a close second when it comes to silly predictions, and I want to show it up for the dishonest and false allegation it invariably is.

I want to appeal to everybody, especially policymakers and their staff, to not just disregard it but positively reject it as you would any other obviously ridiculous claim. Put it to the test, probe and enquire, find out what is really meant and if you discover that the reality doesn’t live up to the claim then you should deprecate not just the claim but all the evidence or claims put forward by that source.

Demand honesty, demand rigour, demand truth and punish those who would seek to deceive you by ignoring them.